I'm interested in the musicality of the human voice. Normally, it's hard to study because one gets distracted by the meanings of the words.

I'd like to modify recorded human speech so that it sounds, as much as possible, like a musical instrument -- I'd hear the notes and rhythms, without distinct vowel or consonant sounds. It's probably more important to blur consonants than vowels, but I'm not certain of that.

Please suggest a good combination of effects or filters for this purpose. I'll buy a commercial plug-in if necessary, though of course I'd prefer to do it for free if possible.

That's a good effort, and I appreciate it, but, no, that's not what I want.

I want to study the musicality of familiar voices, speaking English.

For instance, I'm interested to learn if I can recognize the voices of familiar speakers from the melody and rhythm of their speech, while obscuring the other qualities of their voices.

Voice pitches typically slide up and down as they go from one syllable to the next, so I don't think patching in sampled human voices would do the trick.

If there's a way to get sampled sounds to follow the pitch and rhythms of a recorded voice, that might work well, but I doubt it's possible and I have no idea how to do that. That's why I was thinking about effects and filters.

If there's a way to get sampled sounds to follow the pitch and rhythms
of a recorded voice, that might work well, but I doubt it's possible
and I have no idea how to do that. That's why I was thinking about
effects and filters.

I don't think you can do what you want strictly with effects and
filters, but there are some vocal tract models that do kind of what
you want.

Vocoders used for telephony typically analyze the speech signal and
separate it into a pitch and a set of parameters that can be
interpreted as describing the shapes and positions of various elements
of the speaker's vocal tract.

These models allow effects like resynthesizing the speech at a
different pitch or with modified vocal tract characteristics.
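For anyone who wants to see what the "pitch" half of that analysis might look like, here is a minimal sketch. It uses plain autocorrelation on one short frame of samples, which is a much cruder method than a real telephony vocoder uses. The function name `estimate_pitch`, the 60-400 Hz search range, and the voicing threshold are all assumptions chosen for illustration.

```python
import math

def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0, voicing_threshold=0.3):
    """Estimate the fundamental frequency (Hz) of one frame of samples.

    Returns None when the frame is silent or does not look periodic.
    Hypothetical helper: the name and default values are illustrative only.
    """
    # Remove any DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    energy = sum(s * s for s in x)
    if energy < 1e-9:
        return None  # effectively silent

    # Only search lags that correspond to plausible speaking pitches.
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(x) - 1)

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # How well does the frame match itself shifted by `lag` samples?
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    # Weak self-similarity suggests an unvoiced sound (hiss, plosive, noise).
    if best_lag is None or best_score / energy < voicing_threshold:
        return None
    return sample_rate / best_lag
```

Calling this on successive short frames (say 30 to 50 ms each) would give a pitch contour over time, with `None` marking the frames where no clear pitch was found.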

In order to be useful for telephony, the models also have to handle
non-pitched sounds like sibilance, plosive attacks, etc. It's not
clear what you would want to hear in place of those things in the
modified version of speech you're aiming for.

I think what you're talking about is a bit of a research project, not
something that can be built entirely from canned stuff that already
exists. But vocoder techniques will probably help, as you could
substitute modified filter parameters for the voiced portions of the
speech while leaving the pitch signal alone, thereby causing the same
rise and fall of vocal intonation with different phonemes.
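As a rough illustration of that idea, the sketch below goes one step further than swapping filter parameters: it throws the vocal tract filter away entirely and plays the pitch contour back as a bare sine tone. That is a deliberate simplification, not a vocoder. It assumes you already have a per-frame pitch track (Hz, or `None` for unvoiced frames) and a per-frame loudness value from some earlier analysis; the name `resynthesize` and its parameters are hypothetical.

```python
import math

def resynthesize(f0_track, amp_track, sample_rate, hop):
    """Render a pitch and loudness contour as a plain sine tone.

    f0_track  : per-frame pitch in Hz, or None for unvoiced/silent frames
    amp_track : per-frame amplitude (0.0 to 1.0)
    hop       : number of output samples per frame
    Hypothetical helper for illustration; all names are assumptions.
    """
    out = []
    phase = 0.0  # carried across frames so pitch glides do not click
    for f0, amp in zip(f0_track, amp_track):
        for _ in range(hop):
            if f0 is None:
                out.append(0.0)  # leave a gap where the speech was unpitched
            else:
                out.append(amp * math.sin(phase))
                phase += 2.0 * math.pi * f0 / sample_rate
    return out
```

Because the phase accumulates continuously, the tone slides between frame pitches instead of restarting on each frame. The amplitude still steps at frame boundaries, so interpolating `amp_track` between frames would be a sensible refinement.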

More suggestions welcome.
Tim

Great research subject, Tim

You might try (a demo of) Melodyne http://www.celemony.com/cms/index.php?id=358
It analyzes all kinds of characteristics of a sound and makes it possible to transfer these. You can then replace a singer's voice with an instrument, for example.

"Melodyne is intelligent: it recognizes the music in an audio file and represents it directly as notes"

One problem: it certainly isn't free, though the demo is. I have been planning to investigate this for some time, but have not yet found the time.

Vocoders used for telephony typically analyze the speech signal and
separate it into a pitch and a set of parameters that can be
interpreted as describing the shapes and positions of various elements
of the speaker's vocal tract.

These models allow effects like resynthesizing the speech at a
different pitch or with modified vocal tract characteristics.

That's cool, Robert. Unfortunately, I know nothing about this sort of thing. Is there some commercial or shareware vocoder product I could mess around with? Preferably on my Macintosh? (I do have a Windows XP machine, only for dire emergencies.)

Quote:

I don't think you can do what you want strictly with effects and
filters, but there are some vocal tract models that do kind of what
you want.

Understood. Nevertheless, considering the cost of commercial products, such as Melodyne, I'd like to try -- see how far I can get. If I can reduce speech intelligibility, while preserving melody and rhythm, that would be a step in the right direction.

I wonder if there's a way to prolong decay and attack on short, high amplitude sounds. Those would be consonants, I suppose. Compressing the dynamic range might help, though I've never been able to figure out how to use a compressor. Severely attenuating frequencies above a certain point might help. I'm not very good at this sort of thing, so I haven't gotten very far with it yet, though I'm experimenting.
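The "attenuate everything above a certain point" idea is a low-pass filter. A minimal sketch, assuming the audio is already available as a list of float samples, is a one-pole filter like the one below. The function name `one_pole_lowpass` is made up for illustration, and a real plug-in would normally use a much steeper filter.

```python
import math

def one_pole_lowpass(samples, sample_rate, cutoff_hz):
    """Gently attenuate content above cutoff_hz (about 6 dB per octave).

    Hypothetical helper for illustration; a steeper filter would blur
    consonants more aggressively than this one does.
    """
    # Smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for s in samples:
        # Each output moves only part of the way toward the new input,
        # so fast changes (high frequencies) are smoothed away.
        y += alpha * (s - y)
        out.append(y)
    return out
```

With a cutoff of a few hundred Hz, the fundamental of most speaking voices should pass while much of the hiss and click of consonants is reduced. Running the signal through the filter several times in a row makes the roll-off steeper.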

What about something comparable to a "Gaussian blur," as in Photoshop? Is there a way to do something like that?
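One rough audio analogue, offered as a sketch rather than a recipe: convolving a sequence of values with a bell-shaped kernel. Applied directly to the waveform, this is simply another low-pass filter. Applied to a slowly varying curve such as the loudness envelope, it smears sharp attacks over time, which is closer to the visual idea of blurring. The names `gaussian_kernel` and `gaussian_blur` are hypothetical, and `sigma` is measured in samples (or frames) of whatever sequence is passed in.

```python
import math

def gaussian_kernel(sigma):
    """Return normalized Gaussian weights covering about three sigma each side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1, so levels are preserved

def gaussian_blur(values, sigma):
    """Blur a 1-D sequence (e.g. a loudness envelope) with a Gaussian kernel."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp at the edges so the ends reuse the first/last value.
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out
```

A larger `sigma` spreads each sudden peak over a longer stretch of time, so a sharp consonant burst in the envelope becomes a gentle swell.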

GerardBik wrote:

Quote:

You might try (a demo of) Melodyne http://www.celemony.com/cms/index.php?id=358
It analyzes all kinds of characteristics of a sound and makes it possible to transfer these. You can then replace a singer's voice with an instrument, for example.

Thanks a bunch, Gerard. It's gratifying to know that someone is interested in this topic. It's rather obscure, isn't it?

Melodyne might be just the ticket. I haven't had a chance to download the demo and try it yet. Will do soon, and will report back. Hooh, boy! The purchase price is high. Will the VST plugin work with Amadeus and OS X?

The only way to be sure is to try out the demo, but a priori I don't see
any reason for it not to work.

Regarding the main topic of the thread, this sounds indeed like a nice
research project. Looking into vocoders would be a good first step (there
are free ones around as VST plug-ins I think), but it is not clear that the
result would really be what you expect... Regards,