Tuesday, February 23, 2016

Vocaloid Tutorial: Using Velocity and Dynamics to Improve Realism

Hello, everyone! Satoshi here, back with another Vocaloid tutorial. Last time, I talked about using pitch bend and pitch bend sensitivity to stretch out lyrics over several notes. Today, I'm going to talk about using the Velocity (VEL) and Dynamics (DYN) parameters to make your vocals sound more human. Megumi thinks her singing is fine as is, but I always like to tweak her singing after she lays down a track--just don't tell her!

So what do VEL and DYN do? VEL controls the attack of the note. Keyboard players know what I'm talking about. Imagine playing the piano. If you strike the key fast, it makes a very different sound than if you strike the key slowly. Note that this is different from how loud the sound is--that is controlled by DYN. So VEL is basically how quickly or slowly the note is sounded out, while DYN is how loud or soft the note actually is--meaning, the volume. VEL and DYN go hand in hand.

You're probably wondering, in terms of vocals, what does this mean? DYN for vocals is pretty easy to understand--it's how loudly or softly you are singing. But for VEL? Imagine you are singing slowly. You are stretching the sounds of each word as you sing. This is especially pronounced with consonants. Try singing the word "say" slowly. You'll notice that you are slurring that "s" sound at the beginning--sssssay.

Looking at the waveform of a real singer, you can see how the volume swells up and down

By default, every note entered in Vocaloid has the same VEL and DYN values. Both parameters range from 0 to 127, and the default is right in the middle--at 64. Obviously, people don't sing like that. They might slow down or speed up in some parts, and they will certainly sing some parts louder and some parts softer.

A good way to practice using VEL and DYN is to take a real singer's performance and try to duplicate it with Vocaloid. In fact, you're all probably familiar with this since most of you use Vocaloid to do covers of songs. But many of you may be concentrating on the phonetics, trying to get the pronunciation right. Of course, that's the most important thing, but tweaking parameters like VEL and DYN goes a long way toward making the performance sound more human.

Lowering the values for VEL will slow down the sound of the notes, especially the beginning consonant sounds

Let's take a real-world example. This clip is of Emmy Rossum singing All I Ask of You from The Phantom of the Opera. What is noticeable is how slow and controlled her singing is. So right away, we know we can use VEL to adjust the notes. Hear how a lot of the consonants are slurred, like at the beginning of the lyrics "say" and "head"?

We can go to the Menu Bar, select View, and choose Control Parameters to display the parameter grid at the bottom. Select VEL (the word will turn aqua) and use the trusty pencil tool to adjust the height of the VEL bars for each lyric. I've lowered the values across the board, but especially for words like "say" and "head" to emphasize the slurring of the beginning consonant sound.
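If it helps to plan the numbers before picking up the pencil tool, here is a rough Python sketch of the same idea: lower VEL a little for every lyric, and lower it further for the slurred words. This is only a planning aid, not something Vocaloid runs. The function name `plan_velocities` and the drop amounts (10 and 20) are made up for illustration; only the 0 to 127 range and the default of 64 come from the tutorial.

```python
DEFAULT_VEL = 64  # default value from the tutorial; the range is 0 to 127


def clamp(value, low=0, high=127):
    """Keep a parameter value inside the 0-127 range."""
    return max(low, min(high, value))


def plan_velocities(lyrics, base_drop=10, slur_words=("say", "head"), slur_drop=20):
    """Return one (lyric, VEL) pair per note.

    base_drop and slur_drop are hypothetical amounts chosen for this
    example; tune them by ear against the real recording.
    """
    plan = []
    for word in lyrics:
        vel = DEFAULT_VEL - base_drop      # lowered across the board
        if word.lower() in slur_words:
            vel -= slur_drop               # extra drop for slurred consonants
        plan.append((word, clamp(vel)))
    return plan


for word, vel in plan_velocities(["Say", "you'll", "love", "me"]):
    print(f"{word:8s} VEL {vel}")
```

The printed values are just starting points to draw in the VEL lane; your ears have the final say.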

We can also hear how the volume changes with slight swells, especially when she sings "summertime." That's a long note, and we can hear it get a little softer in the middle and then louder again towards the end when she sings "time" before she trails off. So we know we can use DYN to adjust for that.

Again, using the pencil tool and making sure that DYN is selected, we can add in some subtle volume swells to make the vocals more realistic

You can see from the picture that I've drawn in some volume swells. For some lyrics, I've made it louder at the beginning, and for other lyrics, I've made it louder towards the end, especially if it's a long, drawn-out lyric. I probably overdid it at the start because when Emmy sings, "Say you'll love me," her volume is pretty steady across the board, so that part really only needed some very minor adjustments.
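To get a feel for what a gentle swell looks like as numbers, here is a small Python sketch that produces DYN values rising from a starting level to a peak and back again along a half sine wave. It is an illustration of the shape only, not how Vocaloid draws curves internally. The function name `dyn_swell` and the peak of 90 are assumptions for the example; the 0 to 127 range and the default of 64 are from the tutorial.

```python
import math


def dyn_swell(steps, start=64, peak=90):
    """Return `steps` DYN values that rise from start to peak and fall back.

    The half-sine shape and the peak of 90 are illustrative choices,
    not values taken from Vocaloid or from the recording.
    """
    if steps < 2:
        raise ValueError("a swell needs at least two points")
    values = []
    for i in range(steps):
        phase = math.pi * i / (steps - 1)           # 0 at the start, pi at the end
        level = start + (peak - start) * math.sin(phase)
        values.append(max(0, min(127, round(level))))  # stay inside 0-127
    return values


print(dyn_swell(5))
```

For a lyric that should be louder towards the end, one option is to keep only the first half of the curve; for very steady passages, a peak just a few points above the starting level is probably enough.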

Give it a try! Download the Emmy Rossum part and try to duplicate it in Vocaloid and let me know how you did! Oh, one other thing--I also adjusted the vibrato in several spots. Emmy uses a lot of vibrato, and you can really hear it in the words "head" and "talk."