Japanese language

Japanese language, a language isolate (i.e., a language unrelated to any other language) and one of the world’s major languages, with more than 127 million speakers in the early 21st century. It is primarily spoken throughout the Japanese archipelago; there are also some 1.5 million Japanese immigrants and their descendants living abroad, mainly in North and South America, who have varying degrees of proficiency in Japanese. Since the mid-20th century, no nation other than Japan has used Japanese as a first or a second language.

General considerations

Hypotheses of genetic affiliation

Japanese is the only major language whose genetic affiliation is not known. The hypothesis relating Japanese to Korean remains the strongest, but other hypotheses also have been advanced. Some attempt to relate Japanese to the language groups of South Asia such as the Austronesian, the Austroasiatic, and the Tibeto-Burman family of the Sino-Tibetan languages. Beginning in the second half of the 20th century, efforts were focused more on the origins of the Japanese language than on its genetic affiliation per se; specifically, linguists attempted to reconcile some conflicting linguistic traits.

An increasingly popular theory along that line posits that the mixed nature of Japanese results from its Austronesian lexical substratum and the Altaic grammatical superstratum. According to one version of that hypothesis, a language of southern origin with a phonological system like those of Austronesian languages was spoken in Japan during the prehistoric Jōmon era (c. 10,500 to c. 300 bce). As the Yayoi culture was introduced to Japan from the Asian continent about 300 bce, a language of southern Korea began to spread eastward from the southern island of Kyushu along with that culture, which also introduced to Japan iron and bronze implements and the cultivation of rice. Because the migration from Korea did not take place on a large scale, the new language did not eradicate certain older lexical items, though it was able to change the grammatical structure of the existing language. Thus, that theory maintains, Japanese must be said to be genetically related to Korean (and perhaps ultimately to Altaic languages), though it contains Austronesian lexical residues. The Altaic theory, however, is not widely accepted.

Dialects

The country’s geography, characterized by high mountain peaks and deep valleys as well as by small isolated islands, has fostered the development of various dialects throughout the archipelago. Different dialects are often mutually unintelligible; the speakers of the Kagoshima dialect of Kyushu are not understood by the majority of the people of the main island of Honshu. Likewise, northern dialect speakers from such places as Aomori and Akita are not understood by most people in metropolitan Tokyo or anywhere in western Japan. Japanese dialectologists agree that a major dialect boundary separates Okinawan dialects of the Ryukyu Islands from the rest of the mainland dialects. The latter are then divided into either three groups—Eastern, Western, and Kyushu dialects—or simply Eastern and Western dialects, the latter including the Kyushu group. Linguistic unification has been achieved by the spread of the kyōtsū-go “common language,” which is based on the Tokyo dialect. A standardized written language has been a feature of compulsory education, which started in 1886. Modern mobility and mass media also have helped to level dialectal differences and have had a strong effect on the accelerated rate of the loss of local dialects.

Literary history

Written records of Japanese date to the 8th century, the oldest among them being the Kojiki (712; “Records of Ancient Matters”). If the history of the language were to be split in two, the division would fall somewhere between the 12th and 16th centuries, when the language shed most of its Old Japanese characteristics and acquired those of the modern language. It is common, however, to divide the 1,200-year history into four or five periods: Old Japanese (up to the 8th century), Late Old Japanese (9th–11th century), Middle Japanese (12th–16th century), Early Modern Japanese (17th–18th century), and Modern Japanese (19th century to the present).

Grammatical structure

Through the centuries, Japanese grammatical structure has remained remarkably stable, to the degree that with some basic training in the grammar of classical Japanese, modern readers can readily appreciate such classical literature as the Man’yōshū (compiled after 759; “Collection of Ten Thousand Leaves”), an anthology of Japanese verse; the Tosa nikki (935; The Tosa Diary); and the Genji monogatari (c. 1010; The Tale of Genji). Despite that stability, however, a number of features distinguish Old Japanese from Modern Japanese.

Phonology

Old Japanese is widely believed to have had eight vowels; in addition to the five vowels in modern use, /i, e, a, o, u/, the existence of three additional vowels /ï, ë, ö/ is assumed for Old Japanese. Some maintain, however, that Old Japanese had only five vowels and attribute the differences in vowel quality to the preceding consonants. There is also some indication that Old Japanese had a remnant form of vowel harmony. (Vowel harmony is said to exist when certain vowels call for other specific vowels within a certain domain, generally, within a word.) That possibility is stressed by the proponents of the theory that Japanese is related to the Altaic family, where vowel harmony is a widespread phenomenon. The wholesale shift of p to h (and to w between vowels) also took place relatively early, such that Modern Japanese has no native or Sino-Japanese word that begins with p. The remnant forms with the original p are seen among some Okinawan dialects; e.g., Okinawan pi ‘fire’ and pana ‘flower’ correspond to the Tokyo forms hi and hana.

Japanese syntax also has remained relatively stable, maintaining its characteristic subject–object–verb (SOV) sentence structure. A notable change in that domain is the obliteration of the distinction between the conclusive form—the finite form that concludes a sentence—and the noun-modifying form exhibited by certain predicates. For example, in early Japanese otsu and tsuyoshi were conclusive forms, respectively, of the verb ‘to drop’ and the adjective ‘to be strong.’ When these words were used as noun modifiers, the forms were inflected as otsuru and tsuyoki. The distinction between conclusive forms and noun-modifying forms played an important role in the phenomenon of syntactic concord that, for example, called for the noun-modifying forms of predicates even in concluding the predication when a subject or some other word was marked by particles such as the emphatic zo or the interrogative ka or ya. That system of syntactic concord deteriorated in Middle Japanese, and the distinction between the conclusive forms and the noun-modifying forms was also lost, the latter dominating the former. Such modern forms as ochiru ‘to drop’ and tsuyoi ‘to be strong’ are the descendants of the earlier noun-modifying forms.

The single most important development in the history of Japanese is the acquisition of the nativized writing systems that took place between the 8th and the 10th centuries. The Japanese vocabulary has been constantly enriched by loanwords—from Chinese in earlier times and from European languages in more recent history.

Linguistic characteristics of modern Japanese

In Japanese phonology, two suprasegmental units—the syllable and the mora—must be recognized. A mora is a rhythmic unit based on length. It plays an important role especially in the accentual system, but its mundane utilization is most familiar in the composition of Japanese verse forms such as haiku and waka, in which lines are defined in terms of the number of moras; a haiku consists of three lines of five, seven, and five moras. A word such as kantō ‘gallantly’ consists of two syllables kan and tō, but a Japanese speaker further subdivides the word into the four units ka, n, to, and o, which correspond to the four letters of kana. In poetic compositions kantō is counted as having four, rather than two, rhythmic units and would be equivalent in length to a four-syllable, four-mora word such as murasaki ‘purple.’ While ordinary syllables include a vowel, moras need not. In addition to the moraic nasal seen in kantō above, there are several consonantal moras. These are the first of the double consonants—e.g., kukkiri ‘distinctly,’ sappari ‘refreshing,’ katta ‘bought.’ In the traditional phonemic analysis, the moraic nasal is analyzed as /N/ and the nonnasal moraic consonant as /Q/, and their phonetic values are determined by the following consonant (e.g., /kaNpa/, pronounced kampa, ‘cold wave,’ /kaNtoo/, pronounced kantoo, ‘gallantly,’ /kaNkoo/, pronounced kaŋkoo, ‘sightseeing,’ /haQkiri/, pronounced hakkiri, ‘clearly,’ /yaQpari/, pronounced yappari, ‘as expected’), except for an /N/ in final position, which is pronounced as a nasalized version of the preceding vowel (e.g., /hoN/, pronounced hoõ, ‘book,’ /seN/, pronounced seẽ, ‘thousand’). Long vowels count as two moras, and thus ōkii ‘big’ is a two-syllable (ō-kii), four-mora (o-o-ki-i) word.
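The mora division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_moras is hypothetical, the input is assumed to be a romanized word in which long vowels carry a macron or are written double, and a moraic n that precedes a vowel or y is assumed to be marked with an apostrophe (as in Man’yōshū). It treats every vowel letter as its own mora, every n not followed by a vowel or y as the moraic nasal, and the first of two identical consonants as a consonantal mora.

```python
import unicodedata

VOWELS = set("aiueo")
# Long vowels written with a macron are expanded to two vowel letters.
MACRONS = {"ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo"}
APOSTROPHES = {"'", "\u2019"}


def split_moras(word):
    """Split a romanized Japanese word into moras (illustrative sketch only)."""
    text = unicodedata.normalize("NFC", word).lower()
    text = "".join(MACRONS.get(ch, ch) for ch in text)
    moras = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in APOSTROPHES:
            # The apostrophe only marks a preceding moraic n; it is not a mora.
            i += 1
        elif ch in VOWELS:
            # A bare vowel, including the second half of a long vowel.
            moras.append(ch)
            i += 1
        elif ch == "n" and nxt not in VOWELS and nxt != "y":
            # Moraic nasal: n before a consonant, an apostrophe, or the word end.
            moras.append("n")
            i += 1
        elif ch == nxt:
            # First half of a double consonant.
            moras.append(ch)
            i += 1
        else:
            # Ordinary mora: one or more consonant letters plus a vowel.
            j = i
            while j < len(text) and text[j] not in VOWELS:
                j += 1
            moras.append(text[i:j + 1])
            i = j + 1
    return moras


print(split_moras("kantō"))     # ['ka', 'n', 'to', 'o']
print(split_moras("murasaki"))  # ['mu', 'ra', 'sa', 'ki']
print(split_moras("kukkiri"))   # ['ku', 'k', 'ki', 'ri']
print(split_moras("ōkii"))      # ['o', 'o', 'ki', 'i']
```

Counting the items in each list gives the rhythmic length used in verse: kantō and murasaki both come out at four moras, although the first has two syllables and the second four.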

Both moras and syllables play an important role in the Japanese accentual system, which can be characterized as a word-pitch accent system, in which each word (as contrasted with each syllable as in the prototypical tone languages of Southeast Asia) is associated with a distinct tone pattern. In Tokyo, for example, hashi with a high-low (HL) tone denotes ‘chopstick,’ but with a low-high (LH) tone it denotes ‘bridge’ or ‘edge, end.’ In Kyōto, on the other hand, hashi with a high-low tone means ‘bridge,’ and with a low-high tone it means ‘chopstick,’ whereas the word for ‘edge, end’ is pronounced with a flat high-high tone. The accentual system is one of the features that distinguishes one dialect from another, as each dialect has its own system, though certain dialects in the Tohoku region of northeastern Honshu and in Kyushu and some other areas show no pitch contrast.

In the majority of dialects, the pitch change occurs at the mora, not the syllable, boundary. The Tokyo form kan is a monosyllabic word, but, because it is dimoraic, pitch may change from high to low at the mora boundary, yielding kan (spoken with a high-low tone), which means ‘official,’ or (spoken with a low-high tone) ‘sense.’ Syllables, however, are units that determine the number of potential accentual distinctions, so that, given the possibility of unaccented forms, one-syllable words make two potential distinctions, two-syllable words three potential distinctions, and so forth. Thus, a monosyllabic word such as e can be either accented or unaccented and can be realized as a high-tone word (if accented) or as a low-tone word (if unaccented). The distinction, however, can be observed only when the form in question is followed by a particle such as the nominative particle ga; e-ga (LH) means ‘handle [nominative]’ and e-ga (HL) ‘picture [nominative].’ Since the number of potential distinctions is determined by the number of syllables in a word, words that are monosyllabic but dimoraic make only two potential distinctions. Thus, while there are accented kan-ga (high-low–low) ‘official [nominative]’ and unaccented kan-ga (low-high–high) ‘sense [nominative],’ there is no word pronounced with a low-high–low pitch. In other words, in the Tokyo dialect the number of potential accentual contrasts equals the number of syllables plus one. The absence of stress accent of the English type, the sequences of high-pitched moras as well as those of low-pitched moras, rather than alternating stressed and unstressed syllables, and the mora-timed characteristic together render Japanese speech rather monotonous compared to a stress-accent language like English or a true tone language like Chinese.
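The “syllables plus one” count can be made concrete with a small Python sketch. The function names tokyo_pitch and all_patterns are hypothetical, and the pitch rule is a common textbook simplification assumed here rather than taken from the passage: pitch falls after the first mora of the accented syllable, and the first mora is low unless it carries the accent. A word is described by the number of moras in each of its syllables, and one particle mora (such as ga) is appended.

```python
def tokyo_pitch(syllable_moras, accented_syllable, particle_moras=1):
    """Return an H/L string, one letter per mora, for word plus particle.

    syllable_moras: mora count of each syllable, e.g. [2] for kan, [1, 1] for hashi.
    accented_syllable: 1-based syllable index, or 0 for an unaccented word.
    """
    total = sum(syllable_moras) + particle_moras
    if accented_syllable == 0:
        fall_after = None  # unaccented: pitch never falls
    else:
        # The accent sits on the first mora of the accented syllable.
        fall_after = sum(syllable_moras[:accented_syllable - 1]) + 1
    pitches = []
    for m in range(1, total + 1):
        if fall_after is None:
            pitches.append("L" if m == 1 else "H")
        elif m <= fall_after:
            # Initial mora is low unless it is itself the accented mora.
            pitches.append("L" if (m == 1 and fall_after > 1) else "H")
        else:
            pitches.append("L")
    return "".join(pitches)


def all_patterns(syllable_moras):
    """One pattern per possible accent position, plus the unaccented one."""
    return [tokyo_pitch(syllable_moras, k) for k in range(len(syllable_moras) + 1)]


print(all_patterns([1]))     # e-ga:     ['LH', 'HL']
print(all_patterns([2]))     # kan-ga:   ['LHH', 'HLL']
print(all_patterns([1, 1]))  # hashi-ga: ['LHH', 'HLL', 'LHL']
```

Because the accent can attach only to a syllable, the dimoraic monosyllable kan yields two patterns and never low-high–low, whereas a two-syllable word of the same mora length yields three.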

Japanese has the following phonemes: 5 vowels /i, e, a, o, u/, 16 consonants /p, t, k, b, d, g, s, h, z, r, m, n, w, j, N, Q/. The high back vowel u is unrounded [ɯ]. That and the other high vowel i tend to be devoiced between voiceless consonants or in final position after a voiceless consonant. The most pervasive phonological phenomena are palatalization and affrication, which turn t, s, d/z, and h into [tʃ], [ʃ], [dʒ], and [ç] before i, respectively, and t and d/z into [ts] and [dz] before u, respectively. The phoneme h also changes to [ɸ] before u. The effects of these processes are seen in inflected forms of verbs as well as in foreign loans—e.g., /kat-e/ ‘win [imperative],’ /kat-anai/ ‘win [negative],’ /kat-oo/ ‘win [cohortative],’ /katʃ-imasɯ/ ‘win [polite],’ /kats-ɯ/ ‘win [present]’; the English word tool becomes /tsɯɯrɯ/, ticket becomes /tʃiketto/, and single becomes /ʃiŋgɯrɯ/.
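The palatalization and affrication rules lend themselves to a simple lookup, sketched below in Python. The function name realize and the two tables are hypothetical, and the sketch assumes an input spelled with plain phoneme letters (with optional hyphens at morpheme boundaries). It covers only the changes listed above plus the unrounding of u, and it does not model vowel devoicing.

```python
# Consonant realizations before the high vowels, as listed in the text.
BEFORE_I = {"t": "tʃ", "s": "ʃ", "d": "dʒ", "z": "dʒ", "h": "ç"}
BEFORE_U = {"t": "ts", "d": "dz", "z": "dz", "h": "ɸ"}


def realize(phonemic):
    """Map a phonemic spelling to an approximate phonetic one (sketch only)."""
    text = phonemic.replace("-", "")  # drop morpheme-boundary hyphens
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt == "i" and ch in BEFORE_I:
            out.append(BEFORE_I[ch])   # palatalization before i
        elif nxt == "u" and ch in BEFORE_U:
            out.append(BEFORE_U[ch])   # affrication (or h to ɸ) before u
        elif ch == "u":
            out.append("ɯ")            # unrounded high back vowel
        else:
            out.append(ch)
    return "".join(out)


print(realize("kat-e"))      # kate
print(realize("kat-imasu"))  # katʃimasɯ
print(realize("kat-u"))      # katsɯ
print(realize("tuuru"))      # tsɯɯrɯ
print(realize("tiketto"))    # tʃiketto
```

The same stem kat- thus surfaces with [t], [tʃ], or [ts] depending only on the vowel that follows, which is the pattern the verb forms above illustrate.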

Grammatical structure

The first major part-of-speech division in Japanese falls between those elements that express concrete concepts (e.g., nouns, verbs, adjectives) and those that express relational concepts (particles and suffixal auxiliary-like elements). The former elements may stand alone, constituting one-word sentences, whereas the latter always are attached to nouns and verbs and express grammatical concepts such as tense, the grammatical relations of subject and object, and the speaker’s attitudes toward the proposition and toward the listener. Japanese verbs and adjectives conjugate and function as predicates without involving a copula (linking verb), whereas non-conjugating nouns and adjectival nominals (e.g., ganko ‘stubborn’) require the copula da in their predication function—e.g., Tarō-ga ringo-o kau (literally, Taro-[nominative] apple-[accusative] buy [present]) ‘Taro buys an apple,’ Yama-ga taka-i (literally, mountain-[nominative] high-[present]) ‘The mountain is high,’ Tarō-wa sensei-da (literally, Taro-[topic] teacher-[copula present]) ‘Taro is a teacher,’ Tarō-wa ganko-da (literally, Taro-[topic] stubborn-[copula present]) ‘Taro is stubborn.’ Predicates show no agreement for person, number, or gender. Nouns do not decline and do not indicate number or gender, while case distinctions are marked by enclitic particles (that is, particles attached to the end of the previous word), as in the examples above.

Japanese, as a consistent subject–object–verb (SOV) language, places modifiers before the modified, so that adjectives and relative clauses precede the modified nouns and adverbs come before verbs. A predicate complex consists of the stem followed by various suffixal elements expressing relational concepts. The order of these and other end-of-sentence, or sentence-final, elements reflects the ordering of meaning types from concrete to subjective to interpersonal; e.g., Ik-ase-rare-ru darō ka ne (literally, go-[causative]-[passive]-[present], [conjecture], [question particle], [final particle]) ‘Will (I) be made to go? What do you think?’

Elements recoverable from the context are freely omitted from Japanese, so that conversation abounds with sentence fragments, which may convey various meanings depending on the context—e.g., Kaita (literally, write [past]) can mean ‘I (he/she/they) wrote (a book, letters, etc.),’ Tarō to Jirō desu (literally, Taro and Jiro [copula polite]) can mean ‘Taro and Jiro came/played,’ ‘I met Taro and Jiro,’ and so on. Some clues for recovering missing elements are provided by honorific forms. When, for example, the verb kaku ‘to write’ is used in its subject honorific form—kakareru or o-kaki-ni naru—the writer referred to is not the speaker, but someone honoured as superior to the speaker. On the other hand, when the humble form o-kaki suru is used, the referent is likely to be the speaker. The addressee honorific form kakimasu is an index of the social relationship of the speaker to the listener, whereas the plain form kaku is used in addressing an equal, a social inferior, or an indefinite audience (as would be used, for example, in newspaper articles and books). The use of honorifics extends to the forms of personal address; one especially avoids use of anata ‘you,’ even in its honorific form anata-sama, when addressing a superior. The reference is usually omitted altogether, and the subject honorific form of the verb in combination with the addressee honorific form may simply be used, as in O-iki-ni narimasu ka ‘Are [you] going?’ If one must address a superior, that person’s title or a kinship term is used, as in Sensei-wa o-iki-ni narimasu ka ‘Are you going, Teacher?’ Personal terms referring to the first person and particles that end the sentence also indicate the speaker’s sex; opposed to the sex-neutral term watakushi for the first person are male forms boku and ore and typically female forms watashi and atashi. Ze and zo are final particles used by male speakers, while wa and wa yo are used exclusively by females.

The Japanese language exhibits a number of characteristic grammatical constructions not found in English and other European languages. An English sentence such as John came translates into two different expressions in Japanese. The sentence exhibiting the topic construction John-wa kita (John-[topic] came) contrasts with the basic sentence John-ga kita (John-[nominative] came), and the former is used when the referent of the wa-marked nominal (i.e., John) is the topic of discussion, whereas the nontopic sentence simply describes the event in a neutral manner. The structure A-wa B-da (A-[topic] B-[copula]) bears a heavy functional load in Japanese. In addition to its basic identificational function (e.g., Kore-wa hon-da ‘This is a book’), the construction, supported by its context, is used to express a variety of meanings; e.g., Boku-wa ringo-da (literally, ‘I am an apple’) can mean ‘I have decided to eat an apple,’ ‘I am going to pick apples,’ and so on; Boku-wa Kōbe-da (literally, ‘I am Kōbe’) can mean ‘I am going to Kōbe,’ ‘I am from Kōbe,’ ‘I am a fan of Kōbe,’ ‘I live in Kōbe,’ ‘I get off the train in Kōbe,’ and so on.

Repetitive expressions abound in Japanese, and they profoundly affect both morphology and syntax. Examples of repetition include the use of syllable reduplication in various onomatopoeic expressions (e.g., ton-ton symbolizes a light knocking sound, don-don symbolizes a heavy banging noise), the formation of plurals for certain nouns (e.g., yama-yama ‘mountains,’ hito-bito ‘people’), and the use of doubling in adverbial phrases for emphasis (e.g., hayaku-hayaku ‘quickly, quickly’). Additionally, the repetition of phrases yields a number of characteristic constructions of Japanese—e.g., yome-ba yomu-hodo omoshiroi (literally, read-if read-to-the-extent interesting) ‘the more (I) read, the more interesting it is,’ katta-ra katta-de ato-ga komaru (literally, bought-if bought-at afterward-[nominative] suffer) ‘If [I] bought [it] after all, then it would become troublesome afterward [I would regret it].’

Japanese vocabulary consists of four lexical strata: native vocabulary, Sino-Japanese words, foreign loans, and onomatopoeic expressions. Each stratum is associated with phonological and semantic characteristics. The native vocabulary reflects the socioeconomic concerns of traditional Japanese society, which were centred on farming and fishing. The words associated with rice, a staple food in Japan, clearly delineate the form or state of the rice to which they refer; the rice plant is ine, raw rice is kome, and cooked rice is either gohan or meshi. Both gohan and meshi are used to refer to meals in general, as an English speaker might use the word bread in the phrase ‘our daily bread.’ Another example of native vocabulary is the variety of names given to certain types of fish according to their size.

Some Chinese words are generally believed to have been introduced into Japan during the 1st century ce, or possibly before that. A systematic introduction of the Chinese language, however, occurred about 400 ce, when Korean scholars introduced Chinese books to Japan. Sino-Japanese words now constitute slightly more than 50 percent of the Japanese vocabulary, a proportion comparable to that of Latinate words in the English vocabulary. Both Chinese or Chinese-based words in Japanese and Latin or Latin-based words in English are also similar in their tendency to express abstract concepts and to make up a great part of the academic vocabulary. Contrary to what is suggested by the term kan-go ‘Chinese word,’ a large number of Sino-Japanese words were actually coined in Japan, using existing Chinese characters. Forms such as shakai ‘society’ and kagaku ‘science’ have been borrowed back into Chinese and adopted by Korean through the medium of shared Chinese characters.

Loanwords other than those constituting the stratum of Sino-Japanese words are lumped together as gairai-go, literally ‘foreign-coming words.’ In the contemporary Japanese vocabulary, English words dominate that category, with slightly more than 80 percent. Also evident are the linguistic legacies of 16th-century Portuguese, Spanish, and, in particular, Dutch missionaries and traders, as in such Modern Japanese words as pan ‘bread’ (from Portuguese pão), tabako ‘tobacco’ (from Portuguese tabaco), tenpura ‘[English tempura, a deep-fried dish]’ (from Portuguese tempero), biiru ‘beer’ (from Dutch bier), penki ‘paint’ (from Dutch pek), and orugōru ‘music box’ (from Dutch orgel). As illustrated in the last example, foreign loans are phonologically fully Japanized, with vowels appropriately inserted or appended and with occasional consonantal adjustments, although an initial p, which is lacking in Japanese, is left intact.

In fact, only the vocabularies of the native and the Sino-Japanese strata of Modern Japanese lack an initial p. It occurs quite frequently in the onomatopoeic vocabulary—e.g., pachi-pachi (referring to hand-clapping sounds), piku-piku (referring to a slight repetitive movement of an object), piri-piri (referring to a state of annoyance or irritation). As these examples suggest, Japanese sound symbolism encompasses not only mimetic expressions of natural sounds but also those that depict states, conditions, or manners of the external world as well as those symbolizing mental conditions or sensations. Sound-symbolic words permeate Japanese life, occurring in animated speech and abounding in literary works of all sorts.

Writing systems

The earliest attempts to write Japanese involved the use of not only Chinese characters but also Classical Chinese grammar, as is evident in the preface to the 8th-century Kojiki. Within some 50 years, by the time the Man’yōshū was completed, the Japanese had begun to use the sounds of Chinese character names to write Japanese phonetically. For example, the Japanese word yama ‘mountain’ was written phonetically by using the character sounding like ya with another character sounding like ma. Although there are earlier examples of the phonetic use of Chinese characters (such as in the songs of the Kojiki itself), it is known among Japanese grammarians as man’yō-gana, because its expression is most diversified in the Man’yōshū.

Two kinds of kana, or syllabic writing, developed from man’yō-gana. Katakana, which is angular in appearance, developed from the abbreviation of Chinese characters, and hiragana, rounded in appearance, by simplifying the grass (cursive) style of writing. Originally used as mnemonic symbols for reading Chinese characters, kana were eagerly adopted by women with literary aspirations; these women had been discouraged from learning Chinese characters, which belonged to the male domain of learning and writing. Murasaki Shikibu’s 11th-century Genji monogatari, considered by many to be Japan’s greatest literary achievement, was written almost entirely in hiragana. In contemporary Japanese writing, Chinese characters (kanji) and hiragana are used in combination, the former for content words and the latter for words such as particles and inflectional endings that indicate grammatical function. Katakana are used largely for foreign loanwords, telegrams, print advertising, and certain onomatopoeic expressions.

The use of kana made it possible to write a word in two ways. The Japanese word for ‘mountain’ could be written in kana (phonetically) by using two characters—that for ya and that for ma—or in kanji (by using the Chinese character meaning ‘mountain’). That possibility helped to establish a relation between the Chinese character and its Japanese semantic equivalent and led to the practice of assigning a dual reading to Chinese characters: the Sino-Japanese reading (called on-yomi), based on the original Chinese pronunciation, and the Japanese reading (kun-yomi). Thus, the Chinese character originally meaning ‘mountain’ could be read as both san in on-yomi and yama in kun-yomi. Because Chinese words and their pronunciations were borrowed from different parts of China as well as during different historical periods, Modern Japanese includes many characters having more than one on-yomi reading.

[Figure: Kanji characters. Encyclopædia Britannica, Inc.]

The complexity of reproducing the strokes for each character and the multiple readings associated with it have stimulated movements to abolish Chinese characters in favour of kana writing or even more radical movements for completely romanizing the Japanese language. All these, however, have failed. Despite their complexity, Chinese characters retain a number of advantages over phonetic writing systems. For one thing, many homophonous words are visually distinguishable. For another, the meanings of unknown words written in Chinese characters can be surmised through the ideographic nature of these characters. That semantic transparency and the characteristic configurations of characters enable easy recognition and understanding of a passage. These strengths and Japan’s high literacy rate make the abolishment of Chinese characters very unlikely.

Nevertheless, the shapes of Chinese characters have been simplified, and the number of commonly used characters has been limited. In 1946 the Japanese government issued a list of 1,850 characters for that purpose. Revised in 1981, the new list (called Jōyō kanji hyō “List of characters for daily use”) contains 1,945 characters recommended for daily use. That basic list of Chinese characters is to be learned during primary and secondary education. When newspapers use characters not on the list, they also supply the reading in hiragana.