Vocaloid: What does a Japanese Vocaloid's singing sound like to native Japanese speakers?

I don't understand Japanese, so I've always wondered.
Does a Japanese Vocaloid's singing actually sound like accurate Japanese? Or does it kinda sound distorted, like how a lot of English voice banks do to English speakers like me?
And even though I put this under Vocaloid, does the same go for a decent-quality UTAU like Yamine Renri or Namine Ritsu etc.?
If not, is it possible that a UTAU like Namine Ritsu can pronounce Japanese better than a Vocaloid like Gumi, or vice versa?

Tbh I think something like this can be detected even by a non-native Japanese speaker. Listen to how the syllables/words are being sung/spoken, and whether they sound intelligible when a Vocaloid sings them. UTAU and Vocaloid produce different sounds, and some UTAU are more realistic than others and even sound more realistic/human than Vocaloids, so I'm sure many can produce clear vocals that make it easier for listeners to make out whatever it is they're saying.

Like with English Vocaloids, I'd suspect that a Japanese Vocaloid's, well, Japanese would be decipherable, but have that slurred sound (hard to describe; the sound Vocaloid outputs in general) with decent/no tuning done. Not everything will sound entirely clear, which also helps make sense of the fact that just about all Vocaloid songs have subtitles. With really spectacular tuning, however, a Vocaloid can sound more human-like, and therefore may be easier to understand.

I imagine that a Japanese speaker finds vocaloids to be uncanny like an English speaker would with libraries aimed at the language: the pronunciation is often either too perfect and precise or just a little "off" from how an actual human vocalist would approach a word/phrase (sometimes the "off" sound is due to dialect/accent that one's ear may not be used to; other times, it's with tuning or lack thereof; with older (V1-V3) vocaloids, the "off" sound is also due to missing phonetic data and/or lack of editing: no schwa, no tap, every consonant being pronounced rather than utilization of glottal endings).

The main difference that makes UTAU sound a little more human is that it - usually - captures variations in recordings better. With Vocaloid, you don't hear much difference between a normal library and its power variation because the consonants are consistent, whereas with UTAU, you can have a library that has normal recordings as well as aspirated recordings. To give a Vocaloid example: UNI - being a Korean library - has ㄱ (g/k), ㄲ (kk), ㅋ (k). The difference between all three is that g/k is practically unvoiced (a non-aspirated g like "good"), kk is a non-aspirated k like in "skip", and k is aspirated like in "kite." Otherwise, only English libraries have this level of difference in consonants. The closest thing Japanese libraries have to a nonstandard pronunciation is Sachiko, and that's due to specialization.

As in any other language, it depends on how much work you put into making the crossfades and syllables sound realistic. I otherwise have not gotten a straight answer to this question, mostly because I don't know how to ask it correctly.

I'm only half Japanese, but I can understand most Japanese Vocaloid songs without lyrics, as long as there's no weird "tuning" of consonants where there shouldn't be: no extra "n" syllable everywhere (*cough* Kyaami's tuning tuto), no "s" sound made extra long for no good reason, and no stopping of "k/t/p" where it shouldn't be.

In Japanese there's a difference between "kite/きて" and "kitte/きって" (the little "tsu" makes a noticeable small pause between "ki" and "te", unlike in "kite", which is just continuous). Many overseas users who know no Japanese often put that little "tsu" in places where it should NOT be, and it sounds WEIRD and UNNATURAL to me. Sometimes people do this for "accent" purposes, but for me it's just... annoying. (IDK what I should compare this to, but I guess it's as annoying as someone who cannot say "r" in English.)

And then, oh boy, there are some Vocaloids that were recorded(?) in such a funky way that they're quite difficult to understand most of the time (without hardcore tuning/mixing): Arsloid, Yumemi Nemu, and Gachapoid Ryuto. I have no idea why, but Nemu's voice sounds like her VP was far away(?) from the microphone or tried too hard to "accent" her voicebank. Arsloid, duh, I'm still disappointed in him; too many quality issues. Gachapoid, eh, do I need to explain this? His voice is extra froggy and not human-like at all.

I do understand Sachiko most of the time, but because she's genre-specific, her consonants are recorded more strongly than in a normal Vocaloid. With fast songs she sounds very unnatural on k/t/p sounds because they're pronounced so... harshly. If she had a "normal k/t/p" sound variation, I bet she would be a much easier Vocaloid to use and would have more singable genres.

Then there are a few Vocaloids with a Japanese VB and a non-native VP: Yohioloid, SeeU and Luo Tianyi (coming this year).

I'm being super honest here, but Yohioloid sounds the most "native-like" of all 3. Maybe he has a mild Swedish accent, but it's relatively SMALL; I can hear that his VP definitely KNOWS Japanese and has worked on it.

I do not hate SeeU, but her Japanese sounds like her Korean; I cannot spot the difference (please don't take offense at this, I really cannot hear a difference). Like... I have heard Korean UTAU users pronounce Japanese with a better accent than SeeU's Japanese. SeeU's Japanese sounds like "just started Japanese" level to my ears.

Luo Tianyi Japanese doesn't have many demos yet, but from what I have heard, her Japanese is not bad for a Chinese speaker (Chinese people struggle a lot with Japanese r, u, ts, ch and sometimes b/g/d too, from what I have heard from Chinese UTAU). She still has a strong-ish "Chinese melody" in her voice that makes me need to focus more to understand her, but her pronunciation is better than most Chinese UTAU with a Japanese VB, in my opinion (this does not mean that I dislike Chinese UTAU; I'm sure most users are doing their best, but it's sad that only a handful of them are actually very understandable without lyrics).

You could probably compare SeeU's Japanese to Miku's English, where they had to learn as they went. I honestly wouldn't be surprised if Kim Dahee's level of Japanese was/is beginner level. Korean groups generally get prepped for international promotions by being taught Japanese and English (best example is probably BoA - iirc, Japanese speakers couldn't tell that she wasn't native and her English album wasn't really accented), but GLAM more than likely didn't receive that kind of investment. They were axed with only 3 or 4 promotional singles to their name, so... more of a one-off group than anything.

I'm convinced they didn't actually record any new samples for her JP VB, but just recycled the nearest KR sample. (eg, か=카) SeeU's Japanese accent is literally straight up Korean.

No wonder then, wow, that's super lazy. At least for Yohioloid and Luo, they made separate Japanese recordings, which I'm glad about (Chinese sounds are so different compared to Japanese anyway, so they had to make new recordings for that anyway. Like... Chinese cannot have [ e ] [ n ] sound alone, no ki/gi, no y-glides etc. While Korean technically has all the basic sounds needed for Japanese plus an extra amount of y-glide sounds, so...)

I don't know that they definitely did that, but it certainly sounds like it. Or at best they just had Dahee read off a Japanese reclist written in Korean. Like, I can hear SeeU pronouncing ん as 응.

Shortcuts wouldn't surprise me - honestly speaking - given how ambitious their goals were at the time... Bilinguals - especially the first time through - always sound awkward af or are done lazily to meet deadlines (Yohioloid may be the 1 exception). That's not unique to SeeU.
- Luka English V2 had a lot of missing data (at the time, it was more like a gimmick and made for Japanese users only). V4 had more data, but it was poorly programmed, even for power vocal standards (power vocaloids always have some degree of choppiness, but Luka V4 took the cake)
- Macne Nana... the less said the better. I could use a Japanese library and not tell the difference. That's how accented she is.
- Miku Eng: it's gotten better with V4, but still awkward due to accent issues (r/l/w).
- Dare I mention the bad experiment that was Sonika - and that was a native vocal. Sounds like they tried to experiment with "How cheaply can we make a vocaloid, using a singer that doesn't have access to a studio?" She's not a bilingual, but I do put her in the "ambitious for the time" category. Non-studio vocals have gotten better over time.