Introducing Spleeter, a TensorFlow-based Python library that extracts voice and sound from any music track

On Monday, Deezer, a French online music streaming service, released Spleeter, a music source separation engine. It comes in the form of a Python library built on TensorFlow. Explaining the motivation behind Spleeter, the researchers state, “We release Spleeter to help the Music Information Retrieval (MIR) community leverage the power of source separation in various MIR tasks, such as vocal lyrics analysis from audio, music transcription, any type of multilabel classification or vocal melody extraction.”

Spleeter can also be used to train source separation models, or to fine-tune the pre-trained ones with TensorFlow, if you have a dataset of isolated sources.

Deezer benchmarked Spleeter against Open-Unmix, another recently released open-source separation model, and reported slightly better performance at higher speed. Spleeter can separate an audio file into four stems 100x faster than real time when running on a GPU.

You can use Spleeter straight from the command line as well as directly in your own development pipeline as a Python library. It can be installed with Conda or pip, or used via Docker. Spleeter's creators mention a number of potential applications of a source separation engine, including remixes, upmixing, active listening, educational purposes, and pre-processing for other tasks such as transcription.
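As a rough sketch of the command-line workflow (flags and model names as documented in the Spleeter repository at release; the input file name is a placeholder):

```shell
# Install Spleeter (also available through Conda or a Docker image)
pip install spleeter

# Separate a track into two stems (vocals + accompaniment) using the
# pretrained 2-stems model; results are written under the output/ directory
spleeter separate -i song.mp3 -p spleeter:2stems -o output

# The 4-stems model additionally isolates drums and bass
spleeter separate -i song.mp3 -p spleeter:4stems -o output
```

The same separation can be driven from Python through the library's `Separator` class, which is how it would slot into a larger processing pipeline.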

Spleeter received mostly positive feedback on Twitter, as people experimented to separate vocals from music.

"Spleeter is the Deezer source separation library with pretrained models written in Python and uses TensorFlow". Meaning it separates vocals and different types of accompaniments (eg, Piano). Looks very interesting. https://t.co/IIW0pCPD7r

Only yesterday I was wondering are there any tools to separate/extract different components of music tracks. Well, this will allow you to create tools to extract voice, piano, drums, etc. from any music track using Machine Learning. https://t.co/8ceIdiGj01

Wavy.org also ran several songs through the two-stem filter and evaluated the results in a blog post, trying a variety of soundtracks across multiple genres. The audio quality was much better than expected, although vocals sometimes sounded robotically autotuned. The amount of bleed was strikingly low compared to other solutions, surpassing any available free tool as well as rival commercial plugins and services.

This new blazingly-fast open-source library isolates vocals from music, built on a TensorFlow model trained on tens of thousands of songs. I tested it on Lizzo, Billie Eilish, Lil Nas X, Marvin Gaye, and others—listen to the results here. https://t.co/HGgUYI7MT7