Japanese Language Stack Exchange is a question and answer site for students, teachers, and linguists wanting to discuss the finer points of the Japanese language.

Descriptions of Japanese phonology (such as Wikipedia's) usually describe high vowels between voiceless consonants (or word-finally) as "devoiced". For example, the pronunciations of ⟨圧⟩ 'pressure' and ⟨悲観⟩ 'pessimism' are described as:

/aꜜtu/ → [átsu̥]

/hikaɴ/ → [çi̥kãɴ́]
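
As a rough illustration, the environment described above (a high vowel between voiceless consonants, or word-finally) can be written as a small rule. This is only a sketch under simplifying assumptions: the phoneme sets are abbreviated, the function name `devoicing_sites` is made up for this example, word-final devoicing is assumed to require a preceding voiceless consonant, and pitch accent, speech rate, and runs of consecutive devoiceable vowels are all ignored.

```python
# Simplified phonemic symbols; these sets are assumptions for illustration.
HIGH_VOWELS = {"i", "u"}
VOICELESS_CONSONANTS = {"p", "t", "k", "s", "h"}


def devoicing_sites(phonemes):
    """Return indices of high vowels in a textbook devoicing environment.

    A high vowel qualifies when it follows a voiceless consonant and is
    followed either by another voiceless consonant or by the end of the word.
    """
    sites = []
    for i, segment in enumerate(phonemes):
        # Skip non-high vowels and word-initial vowels (no preceding consonant).
        if segment not in HIGH_VOWELS or i == 0:
            continue
        after_voiceless = phonemes[i - 1] in VOICELESS_CONSONANTS
        is_final = i == len(phonemes) - 1
        before_voiceless = not is_final and phonemes[i + 1] in VOICELESS_CONSONANTS
        if after_voiceless and (is_final or before_voiceless):
            sites.append(i)
    return sites


# The two words from the question as phoneme lists (accent mark omitted).
atu = ["a", "t", "u"]
hikan = ["h", "i", "k", "a", "N"]
print(devoicing_sites(atu), devoicing_sites(hikan))
```

The rule only says where devoicing is expected; it says nothing about whether the vowel is phonetically whispered or absent, which is the actual question here.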

But it often sounds to me like the supposedly devoiced vowel is actually dropped, with the preceding consonant expanding to fill 1 mora's time. Similarly, I have seen word-final devoiced vowels dropped in written pronunciations, such as "/desu/ → [des]" rather than "→ [desu̥]". In this analysis, the pronunciations would be

/aꜜtu/ → [áts] or [áts̩] or [átsː]

/hikaɴ/ → [ç̩kãɴ́] or [çːkãɴ́]

Which of these better describes the actual contemporary Japanese pronunciation? (Further, does this vary by dialect or speech register?)

I don't think the vowels are dropped, and I was taught in Japanese class that they are not dropped.
– Amanda S, Jul 14 '11 at 5:32

I've seen transcriptions for total deletion too, particularly [des] where [s] then becomes prosodically lengthened to fill the intended timing tier. Though です is such a frequent word I wouldn't be surprised if it had an exceptional pronunciation.
– taylor, Sep 21 '12 at 18:15

@AmandaS You're right, they're not dropped in the sense that at the lexical level they are always most definitely present. When devoicing occurs, either the voiceless segment has the same duration as it would if it were voiced, or there is prosodic compensation from a neighbouring segment. In other words, even if the vowel is phonetically absent, its prosodic weight must still be realized phonologically and phonetically. So it's the inviolability of this prosodic weight that native speakers intuit as the incapacity to be "dropped".
– taylor, Sep 21 '12 at 19:00

This is really bugging me. Speech recognition textbooks usually assume the signal-to-symbol conversion has already been done, and so don't address the spectral computation needed to match signal to (candidate) phoneme. I recently found a textbook for analysing the raw speech spectra, but it will be a long while until I can come back here myself and give an answer with actual numerical evidence (you did after all ask for a phonetic description). I encourage anyone to provide a more satisfactory answer than mine! This is probably too specialized to be a bounty question.
– taylor, Sep 21 '12 at 19:14

3 Answers

I've got an old PDF folder full of papers on Japanese, and I managed to pull up two which might be helpful. (I've been searching for a full, detailed phonetic study of Japanese. Add a comment if you know of other technical resources!) The first is the open paper Processing missing vowels: Allophonic processing in Japanese (Ogasawara and Warner, 2009), from the journal Language and Cognitive Processes; the second, which I think needs to be purchased, is Vowel devoicing and the perception of spoken Japanese words (Cutler, Otake & McQueen, 2008). If you can take a look at those papers, they have spectrograms of the voiced/voiceless sounds and are quite informative.

Ogasawara has a very informative diagram at the beginning of the paper which displays an oscillogram and spectrogram of two phonetic variants of the non-word /hokita/: one in which the speaker obeys the devoicing (reduction) convention, and one in which the speaker deliberately avoids it. So there are two phonetic forms, [hokʲi̥ta] and [hokʲita]. The observation made is that the reduced vowel has no periodic wave or frequency mass resembling a formant, so that the vowel [i̥] appears to be virtually deleted. What is found after the [kʲ] in [hokʲi̥ta] then is "frication noise following the plosive burst of /k/, so that the vowel [i̥] consists of only low amplitude palatalized voiceless noise."

The paper is free, so take a look at the two spectrograms corresponding to the [-voiced] and [+voiced] forms at the bottom of the figure. The characteristic voice bar, which indicates frequency mass at low frequencies (characteristic of phonation), is absent in the devoiced vowel. So, clearly there is some veracity to the term "devoicing". For the [-voiced] vowel, I can't even see any noise in the waveform, so it appears as if [kʲ] is just an extra-prolonged stop. This seems to agree with Ogasawara's view that "phonetically, so-called devoiced vowels are often deleted (Vance, 1987, in press; Yuen, 2000), or a short, low-amplitude vowel may remain (Yuen, 2000)."
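
To make the "no periodic wave" observation a bit more concrete, here is a minimal sketch of one common way to quantify periodicity in a short frame: the peak of the normalized autocorrelation over lags that correspond to plausible pitch periods. This is not the method used in either paper. The function name `voicing_score`, the 75-400 Hz search range, and the 8 kHz / 40 ms frame are all assumptions, and the two test signals are synthetic stand-ins (a sine wave and white noise), not speech.

```python
import math
import random


def voicing_score(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Peak normalized autocorrelation over lags of plausible pitch periods.

    Values near 1 suggest a periodic (voiced) frame; values near 0 suggest
    aperiodic noise. The 75-400 Hz search range is an assumed default.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove any DC offset
    min_lag = max(1, int(sample_rate / f0_max))
    max_lag = min(int(sample_rate / f0_min), n - 1)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        head, tail = x[: n - lag], x[lag:]
        num = sum(a * b for a, b in zip(head, tail))
        # Normalize by the energy of both overlapping windows.
        denom = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        if denom > 0.0:
            best = max(best, num / denom)
    return best


SAMPLE_RATE = 8000  # Hz (assumed)
FRAME_LEN = 320     # 40 ms at 8 kHz

# Synthetic stand-ins, not recordings: a 150 Hz sine for a voiced vowel,
# seeded white noise for voiceless frication.
voiced = [math.sin(2 * math.pi * 150 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

print(voicing_score(voiced, SAMPLE_RATE), voicing_score(noise, SAMPLE_RATE))
```

On real recordings a score like this would only indicate whether vocal fold vibration is present in a frame; it cannot by itself separate a whispered vowel from a lengthened consonant.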

It seems that some authors (Ogasawara, Vance) call it reduced instead of devoiced to skirt the debate between devoicing vs. deletion. However, that is not the end of the phonetic story for the unvoiced vowel, as it has a coarticulatory impact:

Japanese vowels, both unreduced and reduced, cause coarticulation in the
preceding consonant, which allows identification of the /i/ or /u/ even if the
vowel itself is deleted (Ostreicher & Sharf, 1976).

This suggests that the best way to recognize a devoiced vowel is to look for consonants bearing the acoustic cue of this coarticulation as opposed to measuring and comparing for "low amplitude palatalized voiceless noise". Importantly, Ogasawara points out the simple reason why on phonological grounds a vowel must be there in the underlying form:

The reduced vowels are considered to be present at least in the underlying form, because they cause coarticulation, and because, if they were not present at all, this would leave consonant clusters (e.g., /kt/ in [k(i)ta] ‘North’) that are otherwise
phonotactically impossible in Japanese.

And other interesting ideas/observations:

Cutler et al. (in press) find that listeners do not restore reduced
vowels at an early, automatic stage of processing. Furthermore, they find
that the number of words in the lexicon containing a given string with a
reduced vowel affects how likely listeners are to assume a reduced vowel is
present. They conclude that Japanese listeners’ restoration of reduced vowels
happens during lexical, rather than prelexical, processing.

Reduced vowels are acoustically weak, which might make them
harder to process. However, phonotactic knowledge (which indicates that a
vowel must be present because of the consonant cluster) should facilitate the
recognition even of reduced vowels. Moreover, language-specific knowledge
of the allophonic alternation should facilitate recognition of reduced and
unreduced vowels in their appropriate environments (e.g., [(i)] in [k(i)ta]
‘North’ and [i] in [itʃigo] ‘strawberry’).

Although devoicing is not obligatory, analyses of
the Corpus of Spontaneous Japanese (Maekawa, 2003) show
that it is highly probable (over 98% in some environments;
Kondo, 2005; Maekawa and Kikuchi, 2005).

And on the perceptibility of the devoiced vowel, Cutler says:

The effect of this devoicing is the creation of sequences
of consonants not separated by the periodic articulation
normally associated with vowels. In contrast, insertion of
a vowel into a consonant cluster (e.g., fillum for film) makes
recognition easier, in part because the consonants in the
cluster indeed become easier to identify if separated.

So, this might not have been all the information you were looking for, but beyond this you start to get into statistical analysis of signals and techniques from experimental design, which I don't fully understand.

You could always take a program such as Praat (fon.hum.uva.nl/praat), record Japanese speakers, and generate the data and analysis yourself. Interpreting the analysis and actually testing hypotheses would be the difficult part.
– taylor, Jul 16 '12 at 17:03

I guess that it depends on the dialect, but when the vowels /i/ and /ɯ/ are “devoiced” in the Tokyo dialect, they are actually dropped and the preceding consonant fills the mora. Moreover, if the vowel is /i/, the consonant is palatalized.
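
The claim that the consonant fills the mora could in principle be checked against duration measurements. The sketch below shows one hypothetical way to set up that comparison, assuming segment boundaries have already been annotated by hand. The helper names are invented for this example and the durations are placeholder values made up for illustration, not measurements.

```python
def mora_duration(mora):
    """Total duration in ms of a mora given as (label, duration_ms) pairs."""
    return sum(duration for _label, duration in mora)


def compensation_ratio(moras, target_index):
    """Duration of one mora relative to the mean duration of the others.

    A ratio near 1 is what the "consonant fills the mora" account predicts;
    a ratio well below 1 would suggest the mora is simply shortened.
    """
    target = mora_duration(moras[target_index])
    others = [mora_duration(m) for i, m in enumerate(moras) if i != target_index]
    return target / (sum(others) / len(others))


# PLACEHOLDER durations invented for illustration; these are not measurements.
# /hi.ka.N/ with the first mora realized as a lone [ç].
hikan = [
    [("ç", 140)],
    [("k", 60), ("a", 90)],
    [("ɴ", 150)],
]
print(round(compensation_ratio(hikan, 0), 2))
```

A single word proves nothing either way; the comparison would need many tokens and speakers, and segment durations in Japanese also vary with position in the word and speech rate.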

What about when the preceding consonant is a stop, e.g. in /kiku/?
– Mechanical snail, Jul 14 '11 at 23:27

@Mechanicalsnail: I do not know the answer myself. Let me quote the page I referred to (translation by me, but be careful because I do not fully understand this description): “無声化した 「キ」、「ク」 の発音については、母音のかわりに、激しい気息をともない、語末では、摩擦子音 [ç], [x] で拍を延長するというクセがあるようです。” (When ki and ku are devoiced, it seems that the following tendency exists. The vowels are replaced by strong aspiration. At the end of a word, the mora is prolonged by using fricative consonants [ç] and [x].) (more)
– Tsuyoshi Ito, Jul 15 '11 at 0:05

(Cont’d) Personally, I do not know how to distinguish “devoiced /i/ and /ɯ/” from the combination of aspiration and [ç]/[x]. (For example, when whispering, I think that I pronounce /i/ and /ɯ/ exactly as this combination.) This is why I cannot be sure about the answer to your question.
– Tsuyoshi Ito, Jul 15 '11 at 0:09

I see. The distinction is very subtle: [ç]/[x] involves frication (audible noise due to turbulent airflow), versus just a whispered vowel. But practically, they sound almost identical. See en.wikipedia.org/wiki/….
– Mechanical snail, Jul 15 '11 at 0:13

Linguists know better than to say the sounds of a language are dependent on any written form it may have. Every language was a spoken language before it was a written language. Written Japanese has undergone well-documented, relatively recent adaptations and reforms which could just as easily have dealt with such issues. Who are these linguists anyway?
– hippietrail, Jul 14 '11 at 13:39

@hippietrail: Some (Japanese) guy I met a few years ago. I agree that Japanese was spoken before written, but, wasn't the writing chosen to match the speech then? There is a "n" sound, and a ん character. But there seems to be no "s" sound, as there is no concept of the consonant alone. That's why many often don't realise that "し" sounds irregular to us…
– Axioplase, Jul 14 '11 at 14:35

Does that “linguist” claim that because an “s” sound is always followed by a vowel in Japanese, the vowel in す should always be pronounced? That sounds like circular reasoning.
– Tsuyoshi Ito, Jul 15 '11 at 3:06

@Tsuyoshi: Basically, that was it. The writing was chosen according to the speech, and when writing became ubiquitous, it imposed that pronunciation on everyone. Not circular, but restricted evolution. I think it's quite reasonable, though.
– Axioplase, Jul 15 '11 at 3:39

The only problem with that theory is that in the actual evolution, some dialects chose to drop vowels, even without consulting the opinion of a linguist.
– Tsuyoshi Ito, Jul 15 '11 at 3:44