Often the vocal is mixed in the center, i.e. equally present in the left and right channels, while other instruments and sounds are often panned a bit toward either side. If that is the case and you invert the phase of one channel and mix it with the unaltered other channel into a mono track, I believe you pretty much eliminate the vocal. Obviously this will affect, and possibly ruin, other parts of the music, so it might not be a solution for you, but it may be good to know for someone.
– Ulf Åkerstedt, May 23 '12 at 23:08

4 Answers

As a friend of mine once explained, "It's like paint. If you mix together several colors of paint, you can't un-mix it and get the original separate colors of paint back."

Vocal removal software, as mentioned in other answers here, is only of limited usefulness due to laws of physics that cannot be circumvented. All such products use the principle of applying phase cancellation to a 2-channel stereo recording.

Vocal removal software can totally remove the lead vocal from a mixed recording only if the recording was mixed under a precise combination of conditions: it is 2-channel stereo, it is mixed with a wide stereo field, the lead vocal is positioned exactly in the center, and no reverberation or echo is applied to the lead vocal. Furthermore, the process always removes any other instruments or musical elements that are also positioned in the exact center of the stereo field, which is where the bass instrument is usually positioned. So removing the lead vocal usually also removes most of the bass instrument, the kick drum, and a great deal of the bass frequencies at any point in the recording where the vocal removal effect is applied.
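To see why the center-panned bass disappears along with the vocal, here is a simplified model. The symbols are my own assumptions for illustration: $V$ is the vocal and $B$ the bass (both identical in the two channels), while $I_L$ and $I_R$ are whatever else is in the left and right channels.

```latex
\begin{align*}
L &= V + B + I_L \\
R &= V + B + I_R \\
L - R &= (V + B + I_L) - (V + B + I_R) = I_L - I_R
\end{align*}
```

Everything common to both channels cancels, not just the vocal, and only the difference between the side content survives. Real mixes are messier than this model (reverb, stereo effects, slightly off-center panning), so treat it as a sketch of the principle only.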

If the recording from which you wish to remove the lead vocal was not mixed according to these exact conditions, then you will encounter only a partial reduction in the volume of the lead vocal. If the lead vocal was processed with a stereo reverb effect, you might be able to remove the lead vocal but you will still hear the stereo reflections (echoes) of the lead vocal through the reverb effect. Regardless, you will probably notice undesirable artifacts and changes in the frequency spectrum.
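As a rough illustration of the "partial reduction" case, the sketch below estimates how much of the vocal is left after subtracting the channels when the vocal is not exactly centered. The function name `vocal_residual_db` and the choice of the louder channel as the reference level are my own assumptions, and the model ignores reverb and any other processing.

```python
import math

def vocal_residual_db(gain_left, gain_right):
    """Level of the vocal left in (L - R), in dB relative to the louder channel.

    gain_left and gain_right are the linear gains the vocal has in each
    channel (hypothetical parameters for this sketch).
    """
    reference = max(abs(gain_left), abs(gain_right))
    if reference == 0:
        raise ValueError("the vocal is not present in either channel")
    residual = abs(gain_left - gain_right)
    if residual == 0:
        return -math.inf  # exactly centered: the vocal cancels completely
    return 20 * math.log10(residual / reference)

centered = vocal_residual_db(1.0, 1.0)         # perfect cancellation
slightly_off = vocal_residual_db(1.0, 0.9)     # roughly 20 dB of reduction
hard_panned = vocal_residual_db(1.0, 0.0)      # no reduction at all
```

Under this model, even a 10% level difference between the channels limits the reduction to roughly 20 dB, so the vocal stays audible.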

Furthermore, the process of vocal removal won't work at all with mono recordings or, to the best of my knowledge, 5.1 or other configurations of multi-channel surround sound.

As you can see there, the answer is no. There's no good way for software to tell what is voice and what is not for any arbitrary voice and song combined into a single waveform. As you note it can be done to an extent with varying effects on the song, but there is nothing that can do it very well yet.

A quick-and-dirty way is to split the stereo track into its two channels, invert the phase of one of them, and combine them to mono. Things mixed in the center (generally voices) get phased out, while things that are mixed wide just get squished to mono. It's not perfect, but it'll get you closer, and it is easy to do with free tools like Audacity.
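If you'd rather see the arithmetic than click through an editor, here is a minimal sketch of the same trick in plain Python. The function name `cancel_center` and the synthetic sine-wave "vocal" and "guitar" are my own assumptions for illustration; real audio would come from a decoded stereo file.

```python
import math

def cancel_center(left, right):
    """Invert the right channel and mix it with the left into mono.

    Mixing L with an inverted R is simply (L - R); the 0.5 factor only
    keeps the result within the original level range.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [0.5 * (l - r) for l, r in zip(left, right)]

# Synthetic test signal (an assumption, not real music).
RATE = 8000
n = RATE // 10
vocal = [math.sin(2 * math.pi * 440 * t / RATE) for t in range(n)]   # centered
guitar = [math.sin(2 * math.pi * 660 * t / RATE) for t in range(n)]  # hard left

left = [0.5 * v + 0.5 * g for v, g in zip(vocal, guitar)]
right = [0.5 * v for v in vocal]

mono = cancel_center(left, right)
```

Here `mono` contains only a quieter copy of the hard-left guitar; the centered vocal is gone. On a real recording the result is rarely this clean.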

The only clean way to do this, assuming you don't have access to the pre-mixed individual audio tracks, might be to completely recreate the music without the vocal tracks. Perhaps try to transcribe the music and instrumentation extremely accurately (including all the subtle timing and pitch variations), feed this transcription to a MIDI synthesizer that has all the same instrument sounds, then use a sound editor to tweak things until everything lines up with the original closely enough for your purposes. There might be research software that attempts to do some of this semi-automatically, but I'm not sure what state it's in.