In an audio file containing both music and speech, I need to attenuate the music (a volume reduction of, say, 90%) between each piece of speech. But when the volume drops instantaneously from 100% to 10% at the start of the music and is then restored from 10% to 100% at the start of the speech, the transition is unpleasant because it is too abrupt.

Is there any way to smooth these transitions in Sound Forge, perhaps with a plugin?


Shoot, I should have asked this first: are the speech and music on separate tracks or together in one? Depending on which it is, my answer will be completely different. Also, I'm guessing your goal is to have just the speech parts, correct? Do you want the music taken out too? Have you tried a gradual change instead of an instantaneous one?
–
Travis Dtfsu Crum Oct 3 '12 at 20:29

Yes, same track. What do you mean by "gradual change"? Is that a Sound Forge feature?
–
drake035 Oct 3 '12 at 21:44

Can't you just change the volume transition to be a curve? You should be able to do this.
–
Magrangs Oct 4 '12 at 8:41

1 Answer

For starters, I don't recommend using Sound Forge to do what you are attempting. Compared to other DAWs, it is limited.

What I would do, if possible, is make the volume change gradual instead of instantaneous. In most other DAWs this is done by applying an automation or envelope to the gain/volume, using either a curve or a simple straight line that represents how much the level changes over the allotted time frame. It looks like this in Logic, like this in Ableton, and like this in FL Studio. This may require you to zoom in and draw each gradual transition by hand, which would be a huge pain in the butt.
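To make the idea of a gain envelope concrete, here is a minimal sketch in plain Python. It is not how Sound Forge or any particular DAW implements envelopes; the function names, the `ramp` length, and the sample indices are all made up for illustration. It builds a per-sample gain that sits at 1.0 during speech, drops to 0.1 during music, and moves between the two along a straight line instead of jumping.

```python
def ducking_envelope(n_samples, music_regions, low_gain=0.1, ramp=4):
    """Return a list of per-sample gains.

    Gain is 1.0 outside the music regions and `low_gain` inside them,
    with a linear ramp of `ramp` samples at each edge of a region.
    All names and defaults here are illustrative assumptions.
    """
    gains = [1.0] * n_samples
    for start, end in music_regions:  # end index is exclusive
        for i in range(start, end):
            # Distance in samples from the nearest edge of this region.
            edge = min(i - start, end - 1 - i)
            # Fraction of the ramp completed, clamped to at most 1.
            t = min(1.0, (edge + 1) / (ramp + 1))
            gains[i] = 1.0 + t * (low_gain - 1.0)
    return gains


def apply_gain(samples, gains):
    """Multiply each sample by its gain."""
    return [s * g for s, g in zip(samples, gains)]


# Toy example: 20 samples, music occupying samples 5..14.
gains = ducking_envelope(20, [(5, 15)], low_gain=0.1, ramp=4)
largest_step = max(abs(b - a) for a, b in zip(gains, gains[1:]))
ducked = apply_gain([1.0] * 20, gains)
```

In real audio the ramp would span tens or hundreds of milliseconds (thousands of samples at 44.1 kHz) rather than 4 samples, and a curved ramp can sound smoother than a straight line.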

The closest thing I found after some googling was adjusting the volume envelope in Sound Forge.

Also note that such a vast difference in volume (a 90% change!) makes any abruptness in the transition much more noticeable.
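For a sense of scale, editors often express level changes in decibels rather than percent. The sketch below assumes the percentages in the question refer to linear amplitude, which the question does not actually state; `gain_to_db` is just an illustrative helper name.

```python
import math


def gain_to_db(gain):
    """Convert a linear amplitude factor (1.0 = unchanged) to decibels."""
    return 20.0 * math.log10(gain)


# Print a few amplitude percentages alongside their decibel equivalents.
for percent in (100, 50, 10):
    print(f"{percent:3d}% amplitude = {gain_to_db(percent / 100):6.1f} dB")
```

Under that assumption, dropping to 10% is a cut of about 20 dB, which is a large jump to make in a single step.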

If you want to completely separate the speech from the music, without any of the song in the background, there are a few different options as well. The first is a technique that requires you to have an exact copy of the song in instrumental form. This technique is explained in this eHow tutorial. While it is doable in Sound Forge, I think it is easier to do in Audacity (which is free). The quick how-to is to invert the waveform of the instrumental track and overlay it onto the original, so that the music in the two signals cancels out and only the vocal elements remain.
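The cancellation trick is easy to see on synthetic data. The following is a minimal sketch in plain Python, using sine waves as stand-ins for "speech" and "music" (the frequencies, levels, and sample rate are arbitrary assumptions). It only works this cleanly because the instrumental copy is sample-for-sample identical and perfectly aligned with the music in the mix; with real recordings any offset or level mismatch leaves audible residue.

```python
import math


def invert(samples):
    """Flip the polarity of a signal."""
    return [-s for s in samples]


def mix(a, b):
    """Sum two equal-length signals sample by sample."""
    return [x + y for x, y in zip(a, b)]


rate = 8000  # samples per second (arbitrary for this toy example)
speech = [0.3 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
music = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]

original = mix(speech, music)               # what the mixed file contains
recovered = mix(original, invert(music))    # overlay the inverted instrumental

# Largest difference between the recovered signal and the true speech.
residual = max(abs(r - s) for r, s in zip(recovered, speech))
```

Here `residual` is down at the level of floating-point rounding error, meaning the music has cancelled and only the "speech" is left.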

My favorite way of doing this is with the amazing software known as Melodyne, because of its ease of use and the fact that it was designed to do tasks like this. Melodyne is freaking awesome at isolating different parts of songs and audio files.

IF I HAD TO DO IT, I WOULD DO IT WITH MELODYNE BECAUSE IT WOULD COMPLETELY REMOVE THE BACKGROUND SONG.

Lastly, if you are extremely limited in your options, you could always do it with Audacity, since it's free. Here is a video explaining how to use the volume envelope.

If you are completely set on using Sound Forge, here are some videos explaining how to do what you ask: video 1, video 2.

Wow, thanks, that's quite an answer! The SF volume envelope does the job, though it will be time-consuming this way. I have Melodyne too, but how to achieve this task with it is a mystery to me. Could you explain how? Otherwise I'll create a new question.
–
drake035 Oct 5 '12 at 21:41

I like to go all out on my answers ;)
–
Travis Dtfsu Crum Oct 6 '12 at 5:30

Thanks a lot! Indeed, this program is terrific. However, I suspect it's easier to work on a song, as in the tutorial, than on a film or documentary. I'm working on a documentary excerpt, and most music blobs are very, very hard to distinguish from speech blobs; it's a total mess! The app is very impressive, though.
–
drake035 Oct 8 '12 at 16:43

Ah, that makes sense. Did you try switching to the note assignment mode to specify which blobs were the vocals and which weren't? The default detection level doesn't always isolate all the sounds well. If you slide the right bracket of the detection slider to the right, that increases the number of possible blobs.
–
Travis Dtfsu Crum Oct 8 '12 at 16:52