As evidenced by a couple of recent posts, many Japanese learners make it surprisingly far into their studies before the shocking discovery that pitch accent exists in Japanese, and that learning to produce it correctly is essential for speaking Japanese without an obvious accent.

Whether or not to dive into this subject (and when) is apparently slightly controversial, but for those who have decided to take the plunge, here’s a place to share resources, strategies, and questions. I’ll keep a running list of all the resources, so please share what you know!

We do have a couple users with deep knowledge of the subject like @jjatria, and I’m fairly sure Dogen lurks in these forums…

Scripts/Plug-ins

References with pitch accent info

Apple J-2-J - if you are on a Mac, open up the Dictionary app and turn on the Japanese-to-Japanese dictionary, and you’ll find one of the clearest references for pitch-accent information available… the information comes from the 大辞林, and you can see the charts here.

Forvo (Forvo.com) – get recordings of native speakers pronouncing almost any word, and request items they don’t have (I’ve used the request feature and got a pronunciation within a day).

Japanese Accent Study Website
(A great little online pitch-accent resource where you can look at lists of words by groups (names, counters, verbs, etc) with pitch-accent indicated. A good introductory resource to catch up on learning pitch accent for words you might already know!)

(just a random reddit post but very clearly written – please confirm accuracy!)

Here are the things I think are not entirely accurate:

i. English uses stress.
Stress contains three elements:
A rise in pitch.
A rise in volume.
A slight lengthening of the syllable.

English also uses vowel reduction

English also uses a feature called “vowel reduction”, which basically reduces the phonetic information in non-important vowels, making the accented vowel more prominent by comparison.

Compare the second /o/ in <photographer> with that in <photograph>. In the first one, in which it is accented, the sound is much more o-like, while in the second, where it is unaccented, it becomes what is called a schwa, or a central, neutral vowel.

This is important because Japanese does not do vowel reduction, which basically means that vowels in Japanese do not generally change depending on whether or not they are accented.

iii. What is pitch?
It is simply making your voice higher or lower.

Pitch does not technically depend on you

This might be nit-picking, but here goes.

Pitch is the perception of a sound as being higher or lower. Since it is a perceptual feature, it exists only in the brains of those that listen to a sound. It is not an acoustic feature, nor a feature of the sound: it is a feature of the perceived sound.

What is a feature of the sound is its frequency, and frequency is correlated with pitch. But pitch depends on the ability of the listener to perceive those changes as significant differences. A sound can be produced with a higher frequency and still be perceived to have the same pitch.

Think of people that are “tone-deaf”: they can hear two musical notes that are acoustically, factually different, but they will have difficulty identifying them as having a different pitch (because for them, they won’t).

iv. Most important rules

Japanese has two tones: high and low.

The first two moras of a word must be different tones.

[See below]

There can only be one pitch drop in a word. (I.e. if a low mora comes after a high mora, then there cannot be another high tone in the word.)

3. The default pitch pattern for words is for the first syllable to be low and the rest to be high.

Default accent position

This changes depending on the historical origin of the word. It is true that most やまと言葉 are unaccented (low and then high, see photo), but the vast majority of loan words (words that have been borrowed from another language) are indeed accented.

For words that are accented, the default position is for them to be accented on the third mora from the end.

This is from my dissertation, and includes much more detail about this:

An easier way to remember this is that the accent is the drop. If the word is unaccented, there is no drop. So the entire word must be high (because it never falls). But the first and second mora must be different, so the first one has to be low (because the second, and all the others, are high). So you get LHHH…
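To make that reasoning concrete, here is a minimal Python sketch that turns a mora count and a drop position into a string of H and L. The function name `pitch_pattern` and the numbering are assumptions made for this example: `accent=0` stands for an unaccented word (no drop), and `accent=k` means the drop comes right after the k-th mora. It only encodes the rules stated in this thread (two tones, at most one drop, first two moras different), so treat it as an illustration rather than a full model of Tokyo accent.

```python
def pitch_pattern(n_moras: int, accent: int) -> str:
    """Return one letter (H or L) per mora.

    accent = 0 -> unaccented, no drop inside the word (assumed numbering)
    accent = k -> the pitch drops right after the k-th mora (1-based)
    """
    if n_moras < 1 or not 0 <= accent <= n_moras:
        raise ValueError("accent must be between 0 and n_moras")
    if accent == 1:
        # The drop follows the first mora, so the word starts high
        # and every later mora is low.
        return "H" + "L" * (n_moras - 1)
    # Otherwise the first mora is low and the word stays high
    # until the drop (or until the end if there is no drop).
    last_high = n_moras if accent == 0 else accent
    return "L" + "H" * (last_high - 1) + "L" * (n_moras - last_high)


print(pitch_pattern(4, 0))  # unaccented four-mora word
print(pitch_pattern(4, 2))  # drop after the second mora
print(pitch_pattern(3, 1))  # drop after the first mora
```

Note that inside the word itself, `accent=0` and `accent=n_moras` produce the same letters in this sketch, since a drop after the last mora has nothing left in the word to fall onto.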

vii. What counts as a mora in terms of accent marking?

Obvious stuff: ka ki sa shi su tsu etc.

N is always counted as its own mora.

Kyo, kyu, jyu etc. are one mora.

Long vowels. (I.e. Kyuu = 2 moras: kyu + u.)

[…]

(Technically these “syllables” are actually called morae, but people are generally more familiar with the word syllable, so I went with that for this 2 minute overview.)

Syllables and moras

This is a pretty big difference, so I don’t think it is useful to “dumb it down”.

With notable exceptions, the Japanese accent system (and certainly the accent system of standard Tokyo Japanese) is not syllable-based: it is mora-based. This is also true for traditional Japanese meter: haiku count 5-7-5 moras, not syllables.

A mora is not “a special kind of syllable”. They are entirely different things. The best definition I’ve found for a mora is that a mora is “something of which a long syllable consists of two and a short syllable of one”. It is a metric unit that sits between individual sounds and syllables.

In the case of Japanese, this is easy to remember because the kana alphabets are not syllabic, but moraic: anything that has its own kana (or kana digraph) is a mora.

This includes the katakana vowel lengthener 「ー」, the long vowels, the 「ん」, the 「っ」, etc. All of those are individual moras, and in traditional Japanese meter they should all have the same duration.
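The "one kana (or kana digraph) per mora" rule is simple enough to sketch in a few lines of Python. The helper name `split_moras` is made up for this example, and it assumes the input is already written purely in kana (no kanji, no romaji). Small ゃ/ゅ/ょ and the small vowels attach to the kana before them to form a digraph; everything else, including 「ん」, 「っ」 and 「ー」, counts as its own mora.

```python
# Small kana that combine with the preceding kana into a single mora.
# The small っ/ッ is deliberately NOT in this set: it is a mora of its own.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_moras(kana: str) -> list[str]:
    """Split a kana-only string into moras (hypothetical helper)."""
    moras: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and moras:
            moras[-1] += ch   # digraph such as きゅ or チョ
        else:
            moras.append(ch)  # ordinary kana, ん, っ, ー, long-vowel kana
    return moras


print(split_moras("きゅう"))    # kyu + u
print(split_moras("がっこう"))  # the small っ and the final う each count
print(split_moras("コーヒー"))  # each ー counts
```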

So, when you read that the default accented pattern is “three from the back” (「後ろから三番目」), those are always going to be moras.
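As a small worked example of "three from the back", the sketch below takes a word that has already been split into moras and reports which mora the default would land on. The function name `default_accented_mora` is hypothetical, and the sketch only covers the plain default described here: it ignores the exceptions mentioned above, and it returns `None` for words shorter than three moras because this thread gives no rule for them.

```python
from typing import Optional


def default_accented_mora(moras: list[str]) -> Optional[tuple[int, str]]:
    """Return (1-based position, mora) for the third mora from the end.

    Returns None for words with fewer than three moras, since the
    default described in the text does not say what happens there.
    """
    if len(moras) < 3:
        return None
    position = len(moras) - 2  # third from the end, counted from the front
    return position, moras[-3]


# チョコレート split by hand into its five moras.
print(default_accented_mora(["チョ", "コ", "レ", "ー", "ト"]))
```

This only gives the default position; an individual word can of course be listed with a different accent in a dictionary.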

Your note must have a field called “Expression” (the content of this field is what the plug-in will look up) and another called “Pronunciation” (this is where it will put the accent pattern).

If those are set up and the pitch-accent plug-in is installed, then if you type a word that the plug-in recognizes into the “Expression” field and then press tab, the plug-in should automatically put the pitch-accent pattern into the “Pronunciation” field.
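If it helps to picture what is going on, here is a rough Python sketch of the behaviour described above. It is not the plug-in's actual code and does not use Anki's API: the note is modelled as a plain dictionary, `ACCENT_DB` is a tiny hypothetical stand-in for whatever data the plug-in ships with, and the output format (reading plus a bracketed number) is an assumption for illustration only.

```python
# Hypothetical stand-in for the plug-in's accent data.
ACCENT_DB = {
    "箸": "はし [1]",
    "橋": "はし [2]",
}


def fill_pronunciation(note: dict) -> dict:
    """Fill "Pronunciation" from "Expression" if the word is recognized."""
    for field in ("Expression", "Pronunciation"):
        if field not in note:
            # Mirrors the requirement that both fields must exist.
            raise KeyError(f'note has no field called "{field}"')
    word = note["Expression"].strip()
    if word in ACCENT_DB:
        note["Pronunciation"] = ACCENT_DB[word]
    # Unrecognized words leave the note untouched.
    return note


print(fill_pronunciation({"Expression": "箸", "Pronunciation": ""}))
```

The point of the sketch is just that the field names have to match exactly; a note whose field is called something other than "Expression" gives the lookup nothing to work with.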

Pitch is the perception of a sound as being higher or lower. Since it is a perceptual feature, it exists only in the brains of those that listen to a sound. It is not an acoustic feature, nor a feature of the sound: it is a feature of the perceived sound.

What is a feature of the sound is its frequency, and frequency is correlated with pitch. But pitch depends on the ability of the listener to perceive those changes as significant differences. A sound can be produced with a higher frequency and still be perceived to have the same pitch.

Think of people that are “tone-deaf”: they can hear two musical notes that are acoustically, factually different, but they will have difficulty identifying them as having a different pitch (because for them, they won’t).

This sort of blew my mind – I have a friend who wrote a textbook for composers about writing for percussion instruments, and in the process of editing it we spent a year arguing about pitch terminology. In the percussion world, many instruments are commonly referred to as “un-pitched” when the reality is that they actually just have pitch which is unclear/hard to identify, or not notated for various reasons. We wanted to throw out this term because we felt it was misleading (“but… that woodblock isn’t un-pitched… it’s a D#!”).

The concept that pitch is a psychological phenomenon whereas frequency is a measurable scientific fact never entered the discussion – we always saw the two terms as just different ways of describing the exact same thing. That may have been a mistake!

So besides human tone-deafness, are there other situations that could cause one person to hear an upstep and another to actually hear a downstep in language?

In music certain instruments with lots of upper partial content can exhibit this: play a low C and a high C on a toy piano and some people will hear an upstep while others will hear a downstep. And if you’ve ever tried to transcribe IDM or music with lots of layered synths, you’ll know how frustrating this phenomenon can be.

So besides human tone-deafness, are there other situations that could cause one person to hear an upstep and another to actually hear a downstep in language?

That would be quite dramatic. I’ve never heard of language differences that would make some people hear an upstep and others hear a downstep. But it is relatively common to see people that do not hear a difference where others do, and vice-versa.

There are also levels at which you can approach this:

One is simply being able to tell whether two sounds (which might not even be speech sounds) are different, and that is related to the notion of the JND, or Just Noticeable Difference: the minimum physical difference that is actually perceived as such by the brain. In intonation, this is most commonly reported to be between 1.5 and 2 semitones. This would be a phonetic difference.

Another is whether the differences you hear are interpreted as meaningful, or linguistically relevant (in other words, a phonological difference). This question also has a lot of applications for second language learning, and there have been a lot of studies on cross-linguistic perception. Basically: how difficult is it for speakers of A to acquire the phonological distinctions of B? And why?

The most famous are probably studies done with French and Spanish speakers. In French the position of the stress is neutralised (= it doesn’t change), but in Spanish it is a fundamental part of the language. So the question was whether French native speakers would be able to hear those differences. The initial study found that they were terrible at it, and proposed the notion of “stress deafness”. But this has been softened quite a bit since then, as people have realised that it is not so clear cut (French speakers can learn to perceive those differences, and some of them do not need to be taught at all).
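Going back to the phonetic level for a moment: since the JND above is given in semitones while measurements usually come in hertz, a quick conversion can be handy. The sketch below uses the standard relation of 12 semitones per doubling of frequency. The function names are made up for this example, and the default threshold of 1.5 semitones is simply the lower end of the range quoted above, not a hard perceptual constant.

```python
import math


def semitone_distance(f1_hz: float, f2_hz: float) -> float:
    """Signed distance from f1 to f2 in semitones (12 per octave)."""
    return 12 * math.log2(f2_hz / f1_hz)


def exceeds_jnd(f1_hz: float, f2_hz: float, jnd_semitones: float = 1.5) -> bool:
    """True if the two frequencies differ by at least the assumed JND."""
    return abs(semitone_distance(f1_hz, f2_hz)) >= jnd_semitones


print(semitone_distance(220.0, 440.0))  # one octave
print(exceeds_jnd(200.0, 210.0))        # a 5% rise in frequency
print(exceeds_jnd(200.0, 220.0))        # a 10% rise in frequency
```

Under these assumptions a 5% rise in frequency stays below the threshold while a 10% rise clears it, which is one way to see how two sounds can differ in frequency without necessarily being heard as different in pitch.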

Sorry for the intrusion, but: why? I mean, is the goal only to mask your accent, with no difference in meaning?

Is improving your ability to be easily understood by native speakers an unworthy goal? The closer you get to natural Japanese, the smoother communication becomes. There are meaningful differences between homonyms that are distinguished by pitch, and even if context can solve it, it is distracting to a listener if your intonation is all over the place.