As artificial intelligence gets better and better, the quality of what it can produce is also increasing. Here we take a look at a tool called Bark, created by Suno AI.

Bark is a transformer-based text-to-audio model. It can generate highly realistic, multilingual speech as well as other audio such as music, background noise and sound effects. The model can also produce nonverbal sounds like laughing, sighing and crying. It’s actually quite remarkable how far audio-based AI has come.

If you are interested in setting this up for yourself, you can use the Google Colab notebook or run it on your own hardware. Here’s how:

You’ll need Python. I highly recommend Anaconda if you have an interest in AI or data science, as it comes preloaded with Jupyter notebook support and other handy tools.

  • To install Bark, you’ll need to grab it using Python’s pip. There are two ways to do this, but I’ll show you the easiest. Do not use pip install bark, as that installs a package not managed by Suno. Instead, use the following: pip install git+https://github.com/suno-ai/bark.git which pulls it from GitHub first and then installs it.
  • Next, you’ll need to install the Transformers library from Hugging Face. Bark is available in Transformers from version 4.31.0 onward and needs only minimal dependencies and few additional packages.
  • Install the Transformers library by running: pip install git+https://github.com/huggingface/transformers.git
  • Now you can generate basic audio! Just run the following Python code:
from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")
voice_preset = "v2/en_speaker_6"
inputs = processor("Hello, I am an Artificial Intelligence, I will skynet earth", voice_preset=voice_preset)
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()

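If the import above fails because BarkModel cannot be found, the installed Transformers is probably older than 4.31.0. As a rough sketch, the check below compares version strings using only the standard library. The helper names parse_version, meets_minimum and check_transformers are made up for this example, and the parsing deliberately ignores suffixes such as .dev0 or rc1.

```python
from importlib import metadata

MIN_TRANSFORMERS = "4.31.0"  # first release that ships Bark, per the text above


def parse_version(text):
    """Turn '4.31.0' or '4.32.0.dev0' into a 3-part tuple of integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes like 'dev0' or '0rc1'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)  # pad so '4.31' compares equal to '4.31.0'
    return tuple(parts[:3])


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_transformers():
    """Describe whether the installed transformers package is new enough."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} is new enough for Bark"
    return f"transformers {installed} is older than {MIN_TRANSFORMERS}"


if __name__ == "__main__":
    print(check_transformers())
```

Running this before loading the model saves you from downloading the weights only to hit an import error.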

You can listen to the generated samples in a Python notebook or save them as a .wav file.

To listen in a notebook:

from IPython.display import Audio

sample_rate = model.generation_config.sample_rate
Audio(audio_array, rate=sample_rate)

Or to save them as a .wav file using a third-party library such as SciPy:

import scipy.io.wavfile  # import the submodule explicitly so it is available on older SciPy versions too

sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)
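SciPy writes a floating-point array as a floating-point WAV, which some players may not open. If you run into that, or would rather not install SciPy just for this, one option is to write 16-bit PCM with Python's built-in wave module. This is only a sketch: write_wav_int16 is a hypothetical helper name, and it assumes mono samples in the range -1.0 to 1.0.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for value in samples:
        clipped = max(-1.0, min(1.0, float(value)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # assumption: single-channel audio
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(bytes(frames))
    return len(frames) // 2  # number of samples written


# Usage with the variables from the snippets above:
# write_wav_int16("bark_out_int16.wav", audio_array, sample_rate)
```

The loop is slower than a vectorised NumPy conversion, but for clips of a few seconds that is rarely noticeable.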

Alternatively, here is how a basic Python file looks when you use the bark package from the first install step directly, instead of going through Transformers:

from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
from IPython.display import Audio

# download and load all models
preload_models()

# generate audio from text
text_prompt = """
     Hello, my name is Suno. And, uh — and I like pizza. [laughs]
     But I also have other interests such as playing tic tac toe.
"""
audio_array = generate_audio(text_prompt)

# save audio to disk
write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)

# play text in notebook
Audio(audio_array, rate=SAMPLE_RATE)

Obviously, guys, these are the most basic of instructions. For more in-depth information, check out the Bark documentation or the Google Colab notebook, which documents everything quite nicely while letting you play around with Bark. You can also check out the Bark GitHub repository at https://github.com/suno-ai/bark.

Hope you’ve enjoyed this quick dive into Bark AI! Let me know what creations you’ve made with AI in the comments!

By Zeekz