exclusive classification

Is there any consensus among historical linguists on what statements such as the following mean?

"English is a Germanic language [and not a Romance/Hellenic/non-IE language]"

"Hungarian is a Uralic language [and not an IE/Turkic/etc. language]"

"Tok Pisin is an English Creole language [not a Papuan/IE/etc. language]"

The only criteria I know of for making these kinds of statements are the Swadesh list and similar lists of "core" vocabulary/morphemes. There seem to be sets of words and other morphemes that are more resilient than other words/morphemes (though never perfectly resilient, that I know of) when they are passed on from person to person.

However, Swadesh-type sets don't constitute anything like the totality of the language spoken by any individual (I don't use only the Swadesh list of English words, people in Hungary don't use only the Swadesh list of Hungarian, etc.). What's the rationale, then, for classifying these totalities in an exclusive way, such that English can only be considered Germanic and not also Romance (despite the vocabulary English has acquired through French, etc.), Hungarian can only be considered Uralic and not also Turkic (despite the vocabulary Hungarian has acquired through Turkish), and so on?

Or, are exclusive classifications primarily a relic of earlier periods in historical linguistics, and not felt to have any validity (except as a practical tool) by most researchers today?

I'm not assuming that the majority of historical linguists think in these terms, but some of the most common terminology still in use (inherited word, loanword, family, cognate) seems to imply exclusivity, and I wonder if this terminology doesn't sometimes affect the thinking of the people who use it.

Words are freely, promiscuously borrowed: in the dictionary, English is half Romance, Japanese is half Chinese, and Arabic has massively influenced everything it's ever touched. But mostly the grammar is untouched. English has entirely Germanic grammar, Japanese bears scarcely any resemblance to Chinese at all, Malay has resolutely Austronesian syntax and morphology, and so on. There are plenty of instances of minor grammatical borrowings and influences, but it's very rare for there to be reshaping on anything like the scale of word borrowing.

Words are freely, promiscuously borrowed: in the dictionary, English is half Romance, Japanese is half Chinese, and Arabic has massively influenced everything it's ever touched. But mostly the grammar is untouched. English has entirely Germanic grammar, Japanese bears scarcely any resemblance to Chinese at all, Malay has resolutely Austronesian syntax and morphology, and so on. There are plenty of instances of minor grammatical borrowings and influences, but it's very rare for there to be reshaping on anything like the scale of word borrowing.

Granting that this is the case, it's still not clear why this makes English a Germanic (not Romance) language, Malay an Austronesian (not Sinitic) language, and so on. Are grammatical infixes/affixes more essential to the nature of a language than the roots/stems they attach to, and if so, why?

(I don't know if you're taking the position that they are more essential, but it seemed valid to bring up this question regardless.)

I am not an expert on historical linguistics and I see your point as far as the Swadesh list is concerned. However, I would assume that the classification relies on other types of evidence too. Syntax is one point. English does not display a generalized verb second constraint, but there are remnants of it that match the patterns found in Old English (which was clearly Germanic). Phonology is perhaps another point. Stress assignment, syllable structure and certain pronunciation features put English in the same group as the other Germanic languages.

I am not an expert on historical linguistics and I see your point as far as the Swadesh list is concerned. However, I would assume that the classification relies on other types of evidence too. Syntax is one point. English does not display a generalized verb second constraint, but there are remnants of it that match the patterns found in Old English (which was clearly Germanic).

Was it? It had vocabulary from Latin, Celtic, and probably other languages that wouldn't normally be called Germanic. Perhaps it was a purer Germanic than modern English is, but we still have to define what completely pure Germanic would be (even in theory).

Phonology is perhaps another point. Stress assignment, syllable structure and certain pronunciation features put English in the same group as the other Germanic languages.

Again, what makes these features more central to the identity of a language than others? The modern-day Germanic languages share many phonetic features with one another but also differ on many features (e.g., voicing is not contrastive in Icelandic, some of the Scandinavian languages have a pitch accent, etc.).

You could start from any features as a basis for classification: you would however find that syntax, morphology, and phonology, being inherited (with modification), actually give you a consistent classification, and to a large extent agree with each other (allowing for drift over time), whereas words don't. Nothing groups Turkish, Persian, Malay, and Swahili together apart from lots of shared words, whereas all their grammatical and phonological differences make them seem completely different. Grouping by lexical borrowing turns out to be shallow and uninformative grouping.

Was it? It had vocabulary from Latin, Celtic, and probably other languages that wouldn't normally be called Germanic. Perhaps it was a purer Germanic than modern English is, but we still have to define what completely pure Germanic would be (even in theory).

I am not sure a bunch of loanwords from other languages is enough to change linguistic genetic affiliation. If that were the case, it would be impossible to establish language families.

Again, what makes these features more central to the identity of a language than others? The modern-day Germanic languages share many phonetic features with one another but also differ on many features (e.g., voicing is not contrastive in Icelandic, some of the Scandinavian languages have a pitch accent, etc.).

I am not saying that these features by themselves are more central to the identity of a language. I am just saying that a bundle of these features (the Swadesh list, phonology, syntax, morphology etc) provides a good basis for claiming that some languages are related.

You could start from any features as a basis for classification: you would however find that syntax, morphology, and phonology, being inherited (with modification), actually give you a consistent classification,

What do you mean by a consistent classification? I don't think we can expect a language's classification to be completely uniform over time, unless we choose to define certain features as the "essence" of a language and everything else as secondary.

I agree that certain morphemes, vocabulary items and phonetic features tend to be more persistent through time (as they're passed from individual to individual) than others; I agree that these features can be studied as a group and labelled "Indo-European", "Austronesian", etc. But what's the basis for deciding that a given language "is" these features, and only these features (and therefore, that this language must be classified only on the basis of these features)?

and to a large extent agree with each other (allowing for drift over time), whereas words don't. Nothing groups Turkish, Persian, Malay, and Swahili together apart from lots of shared words, whereas all their grammatical and phonological differences make them seem completely different. Grouping by lexical borrowing turns out to be shallow and uninformative grouping.

Uninformative in regards to what? Are you thinking of how (for example) some linguistic groupings may tell us more about the ethnic history of the people who speak (or have historically spoken) these languages than other groupings would?

I am not sure a bunch of loanwords from other languages is enough to change linguistic genetic affiliation.

The only sense in which loanwords change genetic affiliation is by making it more complex. Old English didn't cease to be Germanic when it adopted some vocabulary from Latin: it became partly Italic as well as Germanic.

What do you mean by a consistent classification? I don't think we can expect a language's classification to be completely uniform over time, unless we choose to define certain features as the "essence" of a language and everything else as secondary.

I agree that certain morphemes, vocabulary items and phonetic features tend to be more persistent through time (as they're passed from individual to individual) than others; I agree that these features can be studied as a group and labelled "Indo-European", "Austronesian", etc. But what's the basis for deciding that a given language "is" these features, and only these features (and therefore, that this language must be classified only on the basis of these features)?

A language is in fact defined by these three components. There is a lexicon and there are restrictions on the shape of lexical items (phonology). Finally, there are restrictions on how these lexical items can be combined (syntax). What kind of features do you suggest that we use apart from a lexicon, a phonology, a syntax, etc.? The number of speakers? The demography of its speakers? Its geographical location?

The only sense in which loanwords change genetic affiliation is by making it more complex. Old English didn't cease to be Germanic when it adopted some vocabulary from Latin: it became partly Italic as well as Germanic.

Are you proposing that a given language should be considered 80% Germanic, 18% Romance and 2% Slavic, reflected in the number of lexical items found in that language?

I don't think loanwords can change how a particular language is classified. Languages are classified as complete linguistic systems, though the main criterion is the essential grammatical structure inherited from the ancestor languages. It seems obvious to me, even on an intuitive level, that English is a Germanic language, even if 50% of its words were of other origins, just as Polish is a Slavic language although it has many words borrowed from other language groups. I don't think classification into language groups is the same as classification into synthetic and analytic languages, for example, where many languages exhibit features of both. I have never heard of a language classified as both Germanic and Romance. I think it is either/or.

Grouping languages by borrowings gives you a number of trans-linguistic isoglosses, but knowing which languages use reflexes of kitab for "book" doesn't even give you much predictive power about which have taken, say, manzil for "house" (as Persian did) or dunya for "world" (Swahili and Turkish, maybe others). Other isoglosses are the very large television one, as opposed to a handful of hold-outs like Fernsehen and (Icelandic) sjónvarp. No doubt these are interesting and worth studying, but they're not by any stretch a significant classification.

A language is in fact defined by these three components. There is a lexicon and there are restrictions on the shape of lexical items (phonology). Finally, there are restrictions on how these lexical items can be combined (syntax). What kind of features do you suggest that we use apart from a lexicon, a phonology, a syntax, etc.? The number of speakers? The demography of its speakers? Its geographical location?

I said, "certain morphemes, vocabulary items and phonetic features", not all such features. For example, in English, morphemes such as -s (plural), words such as man, and phonetic features such as initial stress and aspirated stops (tH, kH, pH) are seen as part of its essential, inherited legacy; but morphemes such as -er (agent), words such as count, and final-syllable stress (as seen in words like create, devote, etc.) are seen as things that merely "happened" to English over the course of its history.

If you were to take the first set of features out of English, many people would say that the resulting language is no longer English. If you were to take the second set of words out of English, I think many people would say that the resulting language is still "essentially" English, even if no English speaker alive today could understand more than a tiny fraction of it.

Are you proposing that a given language should be considered 80% Germanic, 18% Romance and 2% Slavic, reflected in the number of lexical items found in that language?

No, I'm just proposing that a given language not be considered 100% Germanic. I don't think we can know the exact percentages involved: when a word comes into English via another language, it may influence the language in subtler ways than just occupying a new entry in the dictionary.

Grouping languages by borrowings gives you a number of trans-linguistic isoglosses, but knowing which languages use reflexes of kitab for "book" doesn't even give you much predictive power about which have taken, say, manzil for "house" (as Persian did) or dunya for "world" (Swahili and Turkish, maybe others).

What predictive power does the classification of English as Germanic have, outside of Swadesh-type vocabulary?

In terms of grammatical morphemes, English doesn't have many left to begin with, and at least one of the most important morphemes in English, 3sg. -s, doesn't regularly correspond (as far as I know) to any morphemes in other Germanic languages.

Other isoglosses are the very large television one, as opposed to a handful of hold-outs like Fernsehen and (Icelandic) sjónvarp. No doubt these are interesting and worth studying, but they're not by any stretch a significant classification.

I think a significant classification could be built on (to give one example) the large range of Greek and Latin words that have spread through most of Europe (including countries that don't speak IE languages), but not to more distant areas like China, or at least, not in the same degree. Along with the Greek/Latin vocabulary, one could include words/phrases that have originated in English, French, German etc. but have been widely calqued into other European languages.

However, Swadesh-type sets don't constitute anything like the totality of the language spoken by any individual (I don't use only the Swadesh list of English words, people in Hungary don't use only the Swadesh list of Hungarian, etc.). What's the rationale, then, for classifying these totalities in an exclusive way, such that English can only be considered Germanic and not also Romance (despite the vocabulary English has acquired through French, etc.), Hungarian can only be considered Uralic and not also Turkic (despite the vocabulary Hungarian has acquired through Turkish), and so on?

I think a significant classification could be built on (to give one example) the large range of Greek and Latin words that have spread through most of Europe (including countries that don't speak IE languages), but not to more distant areas like China, or at least, not in the same degree. Along with the Greek/Latin vocabulary, one could include words/phrases that have originated in English, French, German etc. but have been widely calqued into other European languages.

Your original question has been answered. The Swadesh list is of course not enough but it is supported by syntactic and phonological evidence. That sounds like a pretty solid basis for linguistic affiliation to me. You are calling the traditional classification into question by asking for the rationale behind it. And then you propose something you call a "significant classification" based on random loanwords without the support from syntax and phonology. I am sorry but I really fail to see the rationale behind that.

My question was, simply put, why can't a language belong to two or more families at once? I may be missing something, but I don't see where in this thread the question was answered.

The Swadesh list is of course not enough but it is supported by syntactic and phonological evidence. That sounds like a pretty solid basis for linguistic affiliation to me. You are calling the traditional classification into question by asking for the rationale behind it. And then you propose something you call a "significant classification" based on random loanwords

I don't see the basis for calling the enormous number of Latin/Greek terms used throughout Europe (with the exception of Iceland) "random loanwords".

Also, I wasn't proposing a classification based on these Lat./Greek terms as a replacement for other (Swadesh-based) classifications of European languages: the two classifications can co-exist without difficulty.

Being told that "this language is Germanic" tells me a lot more than "this language has borrowed the word for biology from Greek".

If I heard that a language was classified as Germanic, and I knew nothing else about it, all that this would tell me is that it probably shares most of its Swadesh vocabulary and some of its inflectional morphemes with English/German/etc. (exactly which vocabulary and morphemes, I wouldn't know), that it possibly has initial-syllable stress, and perhaps a few other possibilities (e.g., maybe it has no synthetic form for the past imperfect). Again, this is only a tiny fraction of what there is to know about any given language.

In the case of languages (e.g. English) that have more loanwords than inherited words, knowing the major sources of the loans may well provide more information about the language than knowing what ancestral group it has (traditionally) been classified in.

I think that Gavril is wanting the classification of languages by families to provide the same sort of information as the classification of animals and plants where at any given rank it can be stated what characteristics can be attributed to the members of a taxon. However, the classification of languages is about establishing at any given level whether two or more languages have an immediate common ancestor. Languages change quickly so that two languages with a common ancestor may soon become mutually unintelligible and each may develop or lose features so that eventually the point may be reached where any connection between the two is not readily apparent. The classification of languages by families is about showing demonstrable relationships between languages and that involves looking at their history, or deducing it by applying generally accepted "laws" about how languages change. Whether the tree or wave model is used, we are never going to learn from it what the languages covered have in common. They may differ as to the degree of synthesis or analysis, method of word formation, phonemic inventory, branching and favoured order of subject, object and verb.

However, the classification of languages is about establishing at any given level whether two or more languages have an immediate common ancestor. Languages change quickly so that two languages with a common ancestor may soon become mutually unintelligible and each may develop or lose features so that eventually the point may be reached where any connection between the two is not readily apparent.

My question had nothing to do with how clear or apparent the relationships between languages are. Myslenka seemed to be saying (unless I misunderstood) that the classification of a language as "Germanic" provided more information about that language than classification according to (for ex.) what loanword area the language is in. I questioned whether the first type of classification really does explain more than the second.

It may be misleading to speak of relationships between languages to begin with: the relationships that historical linguistics deals in are (as far as I can see) relationships between individual words and morphemes, not between "languages" as unified/coherent entities. Is this recognized by proponents of the wave model?

In terms of grammatical morphemes, English doesn't have many left to begin with, and at least one of the most important morphemes in English, 3sg. -s, doesn't regularly correspond (as far as I know) to any morphemes in other Germanic languages.

Let's not refer to phonology as an indicator of language relatedness. For example, compare phonological aspects of Spanish and French—neighboring Romance languages—with regard to stress patterns, intonation, number of vowel phonemes and feature combinations utilized, the role of nasality in vowels, vowel/consonant ratio and tolerance for consonant clusters, apical vs. uvular rhotics and the Spanish r/rr contrast, the s/z distinction; the roles in Spanish (but not French) of "beta", "delta", "gamma", "theta" (not on my keyboard), and [x]; etc. Phonology will lead you off the track in the search for shared ancestry.

My question was, simply put, why can't a language belong to two or more families at once? I may be missing something, but I don't see where in this thread the question was answered.

I understand your question better now. As stated earlier, I am not an expert on historical linguistics but I am guessing the answer is that a language is considered Germanic if the linguistic intergenerational transfer has been continuous (successful) throughout its history, from proto-Germanic to today’s modern languages (i.e. no abrupt changes in the linguistic input). A truly mixed language is perhaps possible but it seems to require very specific circumstances.

I don't see the basis for calling the enormous number of Latin/Greek terms used throughout Europe (with the exception of Iceland) "random loanwords".

Also, I wasn't proposing a classification based on these Lat./Greek terms as a replacement for other (Swadesh-based) classifications of European languages: the two classifications can co-exist without difficulty.

Loanwords are accidents of history and can be explained by chance (hence "random"), true cognates cannot. You can of course make any linguistic classification you want, but a classification based on Latin/Greek terms would reflect cultural influence more than anything else.

If I heard that a language was classified as Germanic, and I knew nothing else about it, all that this would tell me is that it probably shares most of its Swadesh vocabulary and some of its inflectional morphemes with English/German/etc. (exactly which vocabulary and morphemes, I wouldn't know), that it possibly has initial-syllable stress, and perhaps a few other possibilities (e.g., maybe it has no synthetic form for the past imperfect). Again, this is only a tiny fraction of what there is to know about any given language.

In the case of languages (e.g. English) that have more loanwords than inherited words, knowing the major sources of the loans may well provide more information about the language than knowing what ancestral group it has (traditionally) been classified in.

Maybe I am missing something, but I am not sure what kind of linguistic conclusions I would draw on the basis of loanwords (except for loanword phonology).

Let's not refer to phonology as an indicator of language relatedness. For example, compare phonological aspects of Spanish and French—neighboring Romance languages—with regard to stress patterns, intonation, number of vowel phonemes and feature combinations utilized, the role of nasality in vowels, vowel/consonant ratio and tolerance for consonant clusters, apical vs. uvular rhotics and the Spanish r/rr contrast, the s/z distinction; the roles in Spanish (but not French) of "beta", "delta", "gamma", "theta" (not on my keyboard), and [x]; etc. Phonology will lead you off the track in the search for shared ancestry.

Related languages have drifted from each other in various ways and to various extents. Just because phonology may turn out to be fruitless in some comparisons doesn't mean it is fruitless in general.

I understand your question better now. As stated earlier, I am not an expert on historical linguistics but I am guessing the answer is that a language is considered Germanic if the linguistic intergenerational transfer has been continuous (successful) throughout its history, from proto-Germanic to today’s modern languages (i.e. no abrupt changes in the linguistic input). A truly mixed language is perhaps possible but it seems to require very specific circumstances.

If the abruptness of a change is what makes the difference between genetic inheritance and loaning, we still have to define what abruptness means. Does abruptness have to do with how quickly the language becomes unintelligible to previous generations, or does it have to do with which features of the language change?

(If it's the latter, we seem to be back at the question of what is "essential" to a language and what isn't.)

Loanwords are accidents of history and can be explained by chance (hence "random"), true cognates cannot.

I'm not sure I understand what you mean. In what sense can you use chance to explain the presence of the word sign (< French/Romance) in English, but not the presence of the word token (< Proto-Germ.)?

Maybe I am missing something, but I am not sure what kind of linguistic conclusions I would draw on the basis of loanwords (except for loanword phonology).

When you say loanword phonology, are you thinking of patterns such as,

If so, aren't we begging the original question of this discussion? In other words, why aren't changes such as the above (and the words that enter a language via such changes) considered essential parts of a language's history, such that they would be taken into consideration when classifying a language? (Even if this would mean that the language would be classified in two or more families at once?)

Also, in response to Alxmrphi: I wasn't saying that English 3sg. pres. ind. -s is unrelated to any other Germanic morpheme, just that it doesn't seem explicable without recourse to borrowing (whether from dialects or related languages).

Another thing: loanwords usually have to adjust (in how they are pronounced, or even in their declensional endings) to the languages that borrow them. They have to become part of the cohesive linguistic system. This adjustment is a natural process, based to a large extent on phonological factors.

If the abruptness of a change is what makes the difference between genetic inheritance and loaning, we still have to define what abruptness means. Does abruptness have to do with how quickly the language becomes unintelligible to previous generations, or does it have to do with which features of the language change?

(If it's the latter, we seem to be back at the question of what is "essential" to a language and what isn't.)

I didn’t mean abruptness in terms of change. I was thinking more about external factors that might disturb first language acquisition (pidgin situations). That being said, I am willing to accept languages with mixed affiliation if:
i) they are not pidgins/creoles;
ii) they have vocabulary and grammar from (all) the parent languages.

I'm not sure I understand what you mean. In what sense can you use chance to explain the presence of the word sign (< French/Romance) in English, but not the presence of the word token (< Proto-Germ.)?

Sorry, I was being unclear. I didn’t mean to draw a parallel between these two cases based on whether they are in a given language by chance or not. Loanwords are random in that you cannot predict if a word will be borrowed or not. Besides, as mentioned earlier in the thread, they are useless in establishing language families.

When you say loanword phonology, are you thinking of patterns such as,

Sorry, I was being unclear. I didn’t mean to draw a parallel between these two cases based on whether they are in a given language by chance or not. Loanwords are random in that you cannot predict if a word will be borrowed or not.

You can't predict if a word will be inherited or not, either (with the possible exception of Swadesh-type vocabulary -- which sometimes contains loans).

You are right.
If these words were taken into account when establishing language families, how would you go about it?

I would use words like Chwefror to establish that part (not all) of Welsh derives from Latin, words like hihna to establish that part of the modern Finnic languages comes from a Baltic source, and so on.

In general, if I was studying the history of a language, I'd go through the language word by word and morpheme by morpheme, try to find out what the source of each word/morpheme was and what changes they had undergone, and group them accordingly. I wouldn't attempt to classify the language as a whole.
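The word-by-word grouping described above can be sketched roughly as follows. This is only an illustration: the `ENTRIES` table, the function names `group_by_source` and `source_shares`, and the coarse source labels are assumptions made for the example, and the handful of words is taken from this thread rather than from any real etymological database.

```python
from collections import defaultdict

# Hypothetical toy annotations: (language, item, assumed source label).
# The words are examples mentioned in this thread; a real study would need
# a full etymological dictionary and much finer-grained source labels.
ENTRIES = [
    ("English", "sign", "Romance"),
    ("English", "count", "Romance"),
    ("English", "token", "Germanic"),
    ("English", "man", "Germanic"),
    ("Welsh", "Chwefror", "Latin"),
    ("Finnic", "hihna", "Baltic"),
]


def group_by_source(entries):
    """Group items by language, then by the source each item is traced to."""
    groups = defaultdict(lambda: defaultdict(list))
    for language, item, source in entries:
        groups[language][source].append(item)
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {lang: dict(by_source) for lang, by_source in groups.items()}


def source_shares(entries, language):
    """Return each source's share of the annotated items for one language."""
    by_source = group_by_source(entries).get(language, {})
    total = sum(len(items) for items in by_source.values())
    if total == 0:
        return {}
    return {source: len(items) / total for source, items in by_source.items()}


if __name__ == "__main__":
    for language, by_source in group_by_source(ENTRIES).items():
        print(language, by_source)
```

Note that the output is a per-language breakdown by source, not a single family label, which is exactly the point under dispute here; the shares also depend entirely on which items are annotated, so they should not be read as real proportions.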

That's etymology. How would you go from there to establishing language families?

I wouldn't. Once I'd classified the parts of a language (the words, grammatical morphemes, and the phonological/syntactic patterns) according to their respective sources, what other groupings would I need to make?

(In other words, I would group the individual components of languages into families, but I wouldn't label entire languages as "Germanic", "Indo-European", "Uralic", etc., because I don't know of any language that belongs exclusively to one group.)

I think that every categorization (including the related terminology) serves practical purposes rather than describing the "truth"; thus every categorization is a priori imprecise and focuses on certain features or properties. So we can say that English is a Germanic language, for practical reasons, in a well-known context. We could also say that English is x% Germanic, y% Romance, etc. in another context, but this might be rather complicated and not "practical" enough for other purposes we want to discuss.

Such "imprecisions" occur, for practical reasons, not only in linguistics but everywhere. E.g. we say that "the Earth turns around the Sun" instead of "Both the Earth and the Sun turn around a common centre of gravity given by the mass of the Sun and the mass and distance of all the planets from the Sun ...." (or something like this). Or we speak about a "glass of water" and not about a "glass of water and bubbles of air and millions of bacteria" that are present in the same glass.

... Once I'd classified the parts of a language (the words, grammatical morphemes, and the phonological/syntactic patterns) according to their respective sources, what other groupings would I need to make?

Maybe you would not need any other groupings, but someone else would, for practical reasons.

I wouldn't. Once I'd classified the parts of a language (the words, grammatical morphemes, and the phonological/syntactic patterns) according to their respective sources, what other groupings would I need to make?

(In other words, I would group the individual components of languages into families, but I wouldn't label entire languages as "Germanic", "Indo-European", "Uralic", etc., because I don't know of any language that belongs exclusively to one group.)

I think your approach would entail that languages per se don't have a history, only lexical items have. I wonder what the term English would refer to given that approach.

I think your approach would entail that languages per se don't have a history, only lexical items have.

More or less, yes (though bound morphemes and phonological/syntactic patterns can also have histories, not just lexical items).

In general, I don't think languages are comparable to (for example) biological organisms, which can be clearly defined as (self-contained) systems, and can be clearly distinguished both from other organisms and from their environment. Neither of these things applies, as far as I can tell, to what we normally call "languages", and therefore I don't see a reason to treat languages as more than the sum of their components.

I wonder what the term English would refer to given that approach.

One possible definition: "the language shared within the English speech community". Once a group of people identify themselves as English-speaking, anyone else who belongs to their speech community can also be defined as English-speaking.

By this definition, the "border" of the English language is based on what the community of English speakers uses and understands, not on any inherently linguistic properties. Thus, if a new word/morpheme is created or introduced into the English speech community, it becomes a part of English as long as it's used and understood within this group of speakers.

One possible definition: "the language shared within the English speech community". Once a group of people identify themselves as English-speaking, anyone else who belongs to their speech community can also be defined as English-speaking.

This definition is a bit tautological. Isn't it?
"English is the language shared within the English speech community".

As if we said:
"English is the language spoken by people who speak English"

Maybe the definition won't seem so tautological if I recast it like this:

"English is the language spoken within a certain speech community [or group of speech communities], whose members are the majority population in Britain and former British colonies, and most of whom refer to their primary language as 'English'."

I think your approach would entail that languages per se don't have a history, only lexical items have. I wonder what the term English would refer to given that approach.

I agree. I don't think this would be a very beneficial approach to language classification. It might turn out almost like classifying all the countries with a certain percentage of red-haired population as partly Irish, or like classifying India as an Asian-European country (due to a high number of British citizens). Words alone have never been the basis for language classification (that is, for establishing a language's membership in a language group based only on its lexical content).

... Maybe the definition won't seem so tautological if I recast it like this:
"English is the language spoken within a certain speech community [or group of speech communities], whose members are the majority population in Britain and former British colonies, and most of whom refer to their primary language as 'English'."

I think I understand (to a certain degree) what you want to say, but still I cannot agree with you. If a given community (or group of communities) refers to its primary language, whatever its name may be, such a language has to be definable independently of the community; otherwise, what do they refer to?

For example, I can also refer to my primary language (Hungarian); nevertheless, I can clearly distinguish English from Hungarian, i.e. I can recognize English per se, whoever speaks it and wherever, be it his primary or secondary language, even if I don't know anything about the supposed "English speech communities", the history/origin of English, England and the colonies, etc.

In other words, I think that a language "defines itself" by its own features or characteristics, regardless of anything else.

In general, I don't think languages are comparable to (for example) biological organisms, which can be clearly defined as (self-contained) systems, and can be clearly distinguished both from other organisms and from their environment. Neither of these things applies, as far as I can tell, to what we normally call "languages", and therefore I don't see a reason to treat languages as more than the sum of their components.

It may be constructive to consider the system of classifying organisms. You use the phrase "clearly distinguished". Two points may be made. The first is that, whilst the members of a given taxon will have characteristics in common, those characteristics may not necessarily be immediately obvious. The second is that sharing characteristics does not necessarily mean that any two species belong in the same taxon. The plain fact is that the generally accepted (with variations) taxonomy of organisms takes evolution into account, and that taxonomists are keenly aware of the risk of making mistakes arising from convergent evolution. To take an obvious example, there is no scientific classification of animals based on whether they can fly or not. Equally, not everything which lives in the sea is classed as a fish. Narrowing it down a bit more, not everything with a fin is classed as a fish. Children are taught very early on that whales are not fish. Less obviously perhaps, seals and dolphins, both sleek aquatic mammals, do not belong to the same order. Any system of classification which fails to take account of the way that organisms are related genetically, however useful, will inevitably be unscientific.

Just as when classifying organisms you cannot home in on some aspects and ignore others, so it is with languages. All of lexicon, phonology and morpho-syntax need to be accounted for. Just as with organisms genetic classification turns out to be the most scientific (or, to put it another way, when you take everything into account you end up with a genetic classification), so it is with languages.

The tree model is predicated on the assumption that a language subdivides and that the languages into which it subdivides in turn subdivide and so on. In practice it is not that simple. Similarities may arise because languages have at some level a common ancestor, because one has influenced the other or because they just happened to have converged. The tree model is therefore not perfect and can be complemented by the wave model. I think in fact that the wave model is doing more or less what you are interested in, except that it does not concentrate on lexicon. However, the tree model has to come before the wave model because otherwise you do not have the components necessary to set up your wave model.

Maybe the definition won't seem so tautological if I recast it like this:

"English is the language spoken within a certain speech community [or group of speech communities], whose members are the majority population in Britain and former British colonies, and most of whom refer to their primary language as 'English'."

The details about where the speakers live and how they refer to their language are not essential to this definition: they're just "shortcuts" that can be used to identify the relevant speech community. In earlier historical periods, different shortcuts might have been necessary.

I know this is a language forum but now that you mention Britain/British, I am curious to know how your approach relates to other similar disciplines, say history. Could the same procedure be applied when talking about the history of a nation or ethnic group, meaning that you reject the notion of a "national history" and replace it with a history of individuals? Or should it only be used in (historical) linguistics?

I think I understand (to a certain degree) what you want to say, but still I cannot agree with you. If a given community (or group of communities) refers to its primary language, whatever its name may be, such a language has to be definable independently of the community; otherwise, what do they refer to?

They refer to the set of morphemes (affixes and independent words) and phonetic/morphological/syntactic patterns that they use to communicate with one another.

This set of linguistic items can change drastically over time (through loaning, sound changes etc.), but as long as the community(/communities) of speakers continues to refer to it as "English", there is still a basis for calling the speech of this community "English".

(Also, "primary language" may have been a bad choice of words on my part -- for the time being, you can ignore the word "primary" in what I wrote.)

For example, I can also refer to my primary language (Hungarian); nevertheless, I can clearly distinguish English from Hungarian, i.e. I can recognize English per se, whoever speaks it and wherever, be it his primary or secondary language, even if I don't know anything about the supposed "English speech communities", the history/origin of English, England and the colonies, etc.

I think once you know that "English" exists, i.e., a set of words/affixes/etc. used for interpersonal communication, you implicitly understand that there is a group (or multiple groups) of speakers that use (or used) it for communication, regardless of whether you know any further details about this group.

For example, if you learned about English through a textbook, you learned it from a member (or members) of the present-day English speech community: namely, the textbook author(s), or whichever English speaker(s) they used as a basis for their research.

In other words, I think that a language "defines itself" by its own features or characteristics, regardless of anything else.

How do you define "its own" in this case? In other words, how do you define the border between what is and isn't English, unless you can point to a group of people who use the English language, and decide what to include or exclude in it?

----------------

Hualessar:

When I wrote that biological organisms are clearly distinguished, I wasn't talking about the historical classification of organisms (perhaps "clearly distinguished" was a bad choice of words on my part). Instead, I was talking about the existence of organisms as independent, self-contained entities.

In other words, for any living organism, there is (generally speaking) a clear definition of where that organism "ends" and the organism's surroundings "begin". For example, you can look at a bird and clearly distinguish it from its environment (the sky, a tree, etc.), from other individual birds, and from other surrounding lifeforms. You can also point to the bird's internal vital systems and clearly distinguish them from the ecosystem and social system the bird is a part of.

In the case of what we call "languages", no equivalent demarcation can (as far as I know) be made. External linguistic behavior consists of the utterance of words and the making of signs / gestures, but the system that the words, signs and gestures fit into (i.e., the system that causes them to be generated, and the system that governs the effect they have once generated) is heavily bound up with other aspects of human behavior and cognition (emotion, social systems, individual learning patterns, etc.). No one, to the best of my knowledge, has succeeded in demonstrating that language constitutes a separate, self-contained system (whether from a behavioral, neurological or sociological point of view).

There is also no proof (as far as I know) that different languages occupy correspondingly different parts of a person's brain/cognition. If a person knows two or more languages, these languages can become mixed through lexical borrowing, code switching, syntactic/phonological influence and so on. The only clear factors keeping two or more languages "separate" (in a multilingual person's speech and thought) are social and pragmatic constraints, and these constraints are not equally strong for all people or in all social contexts.

If you accept that the above is true, I don't think it makes any sense to treat languages as entities with a single origin or history: instead, each word, morpheme and other feature of a person's speech can be analyzed and classified separately. Insofar as I understand the difference between the wave model and the tree model, I think that this approach would eliminate the dichotomy between the two and simplify (in some degree) the task of classification.

Myšlenka: FYI, after reading over Francisgranada's comment, I changed my mind about one of the things you quoted from my post (I deleted the relevant paragraph just as you were making your post above). I'm no longer sure about the part I crossed out below:

The details about where the speakers live and how they refer to their language are not essential to this definition: they're just "shortcuts" that can be used to identify the relevant speech community. In earlier historical periods, different shortcuts might have been necessary.

I know this is a language forum but now that you mention Britain/British, I am curious to know how your approach relates to other similar disciplines, say history. Could the same procedure be applied when talking about the history of a nation or ethnic group, meaning that you reject the notion of a "national history" and replace it with a history of individuals? Or should it only be used in (historical) linguistics?

I don't know; without thinking it over further, I'm not sure which fields this approach would apply to outside historical linguistics.

...They refer to the set of morphemes (affixes and independent words) and phonetic/morphological/syntactic patterns that they use to communicate with one another ...

Yes, but these morphemes and patterns interact with one another, and the result is a "macrosystem" with observable general characteristics. Such a "macrosystem" is what we call e.g. "English" or "French". For example, a person who doesn't speak English or French at all may still be able to recognize these two languages, even if he cannot separate the individual morphemes from each other in spoken English or French and doesn't know anything about the existence of any such patterns.

The same is true of other "systems". E.g. the molecules of water consist of hydrogen and oxygen atoms, but it still makes sense to speak about "water" and not only about the individual atoms because, due to the interaction of the atoms, water has its own recognizable properties that are not identical to those of its components.

Of course, exact "borders" between languages do not exist, but this is also true of H2O molecules at the subatomic level.

Yes, but these morphemes and patterns interact with one another, and the result is a "macrosystem" with observable general characteristics.

Again, I don't think the interaction between elements of a language is a self-contained process: it's closely linked to other factors (e.g., social and pragmatic constraints) that are not purely linguistic.

Such a "macrosystem" is what we call e.g. "English" or "French". For example, a person who doesn't speak English or French at all may still be able to recognize these two languages, even if he cannot separate the individual morphemes from each other in spoken English or French and doesn't know anything about the existence of any such patterns.

The same is true of other "systems". E.g. the molecules of water consist of hydrogen and oxygen atoms, but it still makes sense to speak about "water" and not only about the individual atoms because, due to the interaction of the atoms, water has its own recognizable properties that are not identical to those of its components.

The interaction between atoms in an H2O molecule is based on the properties of the atoms themselves. By contrast, the interaction between elements of a language is based on the choices and experiences of the humans who speak it.

Of course, exact "borders" between languages do not exist, but this is also true of H2O molecules at the subatomic level.

I'm not a physicist, but I have some doubts about this analogy. It may not be possible for us to observe the boundary of an H2O molecule, but that doesn't mean that there is no boundary to be observed. By contrast, I don't think there is anything inherent in a given language (English, Hungarian etc.) that separates it from other languages or from the non-linguistic environment: it all depends on the choices of the people speaking the language.

Speakers can add anything they want to their speech, or remove anything they want from it, as long as these changes are accepted by the rest of the speech community and communication is still possible.

In general, I don't think languages are comparable to (for example) biological organisms, which can be clearly defined as (self-contained) systems, and can be clearly distinguished both from other organisms and from their environment. Neither of these things applies, as far as I can tell, to what we normally call "languages", and therefore I don't see a reason to treat languages as more than the sum of their components.

This thread has developed into a debate about reductionism versus holism in linguistics. You have earlier questioned the predictive power of classifying languages in language families, and that makes me curious to know what kind of predictions your approach and your classification of various linguistic properties would make.

In discussions about a given language, I feel that people sometimes "essentialize" the language they're discussing in an unproductive way. I don't know how common this tendency is, but I think that exclusive classification (= languages can only belong to one family tree) contributes to this tendency.

For example, some time back, I was discussing the Finnish plural suffix -t with a linguist who (I believe) specializes in Finno-Ugric etymology. If you add this suffix to any given noun in Finnish, (e.g., mies "man" -> miehet "men") the normal implication is either

1) that the item which appears in the plural has been mentioned before in the current context (thus miehet = "the men")
or
2) that you're making a general statement about the item (thus, miehet = "men (in general)")

There are other possible interpretations of the -t suffix, but I would say that these two are the most common.

However, in the context of this discussion, the Finno-Ugric specialist seemed to deny that #1 was a valid interpretation of the plural suffix -t, and he specifically said that this suffix shouldn't be compared to the definite articles seen in Germanic languages. He tried to give an alternative explanation of this suffix as relating to the implied quantity of an object, an explanation that is certainly valid for *some* uses of the -t suffix, but not generally valid as far as I can see.

Implicit in this person's statement about the meaning of the -t suffix seemed to be the idea that definite/indefinite marking was not a "trait" of Finnish and related languages, since these languages don't have definite/indefinite articles comparable to those seen in some IE (and other) languages.

On the other hand, the interpretation of -t as implying definiteness is supported by statements I have seen in Finnish learning courses and by my own experience in reading, hearing and speaking the language. Since Finnish has been spoken for a long period of time alongside North Germanic languages that have regular definite articles, I don't think it would be surprising if this resulted in Finnish using certain affixes to signify definiteness, though it's also possible that Finnish developed the construction in question independently.

The idea that a given feature in one language "shouldn't be compared" to a feature in another language simply because the two languages have been (historically) classified into different families, results (I think) from the idea that languages "essentially belong" to only one family, and can only be accidentally influenced by other families. I'm not saying that one idea necessarily leads to the other, but if we were to adopt the other approach that I've been talking about, I think that it would be much harder to dismiss similarities between two languages as "not comparable" to one another, especially when there is a plausible historical connection between the two.

Sorry if this was a long-winded explanation, but I hope the general point was clear.

I am not quite sure what your problem is. The classification system you criticize expressly classifies languages genetically, not by traits. There are different classification systems by traits. Especially in GG, classifications by certain "parameter" settings are frequently used. These classifications serve different purposes. You can criticize the prominence of genetic classification in linguistic discourse, but you cannot criticize a genetic classification for being genetic.

I am not quite sure what your problem is. The classification system you criticize expressly classifies languages genetically, not by traits.

You're right that the assignment of traits to various language groupings isn't a necessary part of genetic classification. However, this assignment of traits does happen (and sometimes professional linguists are the ones who do it, as possibly in my example above), and I suspect that it's encouraged to some degree by the classification of entire languages as units (as opposed to the classification of individual words, morphemes etc.).

Inasmuch as there's no clear answer to questions such as "What makes English essentially Germanic (rather than Romance, etc.)?", people will sometimes try to supply an answer based on various features that they've seen across the languages in question (for Germanic, they might choose ablaut, voiceless aspirated stops, etc.). This is where the assignment of traits comes in.

I don't speak Finnish, but I googled your example and Wikipedia lists the -t suffix as a "definite, divisible, telic plural". The Finnish phrases I found suggest that it doesn't work in the singular, so I have to agree with the linguist: the Finnish -t suffix is different from definite marking in Germanic languages. You accuse the (exclusive) classification of languages in language families of making linguists "blind". I have a feeling, though, that your Germanic (or English) speaking background makes you "blind" with respect to grammatical features in other languages by imposing grammatical categories from your own native language on other languages, i.e. it looks like Germanic definiteness because an approximate translation suggests so.

I could provide similar examples based on Slavic case alternations between the genitive and the accusative case (I think) that would give the impression that Slavic languages express definiteness.

And you still didn't mention any predictions that your classification would make.

You're right that the assignment of traits to various language groupings isn't a necessary part of genetic classification. However, this assignment of traits does happen (and sometimes professional linguists are the ones who do it, as possibly in my example above), and I suspect that it's encouraged to some degree by the classification of entire languages as units (as opposed to the classification of individual words, morphemes etc.).

True. This is just an instance of the danger that lies in letting diachronic consideration influence synchronic analysis of a language. On the other hand, analyzing the history of language traits is often a very productive tool for arriving at a better understanding of those traits. We will therefore probably never get rid of the danger of overusing this tool. I agree with you that it is important to be aware of this tool's dangers and limitations.

Inasmuch as there's no clear answer to questions such as "What makes English essentially Germanic (rather than Romance, etc.)?", people will sometimes try to supply an answer based on various features that they've seen across the languages in question (for Germanic, they might choose ablaut, voiceless aspirated stops, etc.). This is where the assignment of traits comes in.

There is. English is genetically Germanic, i.e. it developed as a Germanic language incorporating Romance influence and not the other way round. That's what the classification says. Not more and not less. This is an important piece of information and it has its consequences.

I don't speak Finnish, but I googled your example and Wikipedia lists the -t suffix as a "definite, divisible, telic plural". The Finnish phrases I found suggest that it doesn't work in the singular, so I have to agree with the linguist: the Finnish -t suffix is different from definite marking in Germanic languages.

I never claimed that the -t suffix was exclusively a marker of definiteness, or that it functioned outside the plural. (The -t is only used in the nominative and accusative. Most case-forms of the plural are also ambiguous with regard to definiteness: e.g., miehistä can mean "from the men" or "from (some) men", depending on the context.)

What the other linguist in this exchange was denying (as far as I could tell), was that definiteness had *anything* to do with the meaning of the -t plural, which is contradicted by the source you quoted (albeit that source is Wikipedia).

You accuse the (exclusive) classification of languages in language families of making linguists "blind". I have a feeling, though, that your Germanic (or English) speaking background makes you "blind" with respect to grammatical features in other languages by imposing grammatical categories from your own native language on other languages, i.e. it looks like Germanic definiteness because an approximate translation suggests so.

I disagree that the translation of a -t plural as definite is an approximation: I think that definiteness (along with plurality and sometimes other features) is precisely what it expresses in the majority of contexts. E.g., ask any native Finnish speaker which of the highlighted words they would choose in the following context, and I think most if not all of them will say "miehet":

I could provide similar examples based on Slavic case alternations between the genitive and the accusative case (I think) that would give the impression that Slavic languages express definiteness.

I'm not familiar with the alternations that you mention, but if definiteness is a consistent part of the meaning that results from these alternations, I don't see anything wrong with saying that they express definiteness.

And you still didn't mention any predictions that your classification would make.

I'm not sure what you mean by "prediction", then. I provided an example of a change in practice that I think "my" classification could bring about -- what other kind of prediction did you have in mind?

---------

Berndf:

There is. English is genetically Germanic, i.e. it developed as a Germanic language incorporating Romance influence and not the other way round.

I don't agree with the "i.e." here. Older stages of English had a greater percentage of vocabulary and morphemes that would be called "Germanic" than more recent stages of English do, but there have been plenty of non-Germanic lexemes, morphemes and (I think) phonetic/syntactic patterns in the "DNA" of English for hundreds of years.

That's what the classification says. Not more and not less. This is an important piece of information and it has its consequences.

What the other linguist in this exchange was denying (as far as I could tell), was that definiteness had *anything* to do with the meaning of the -t plural, which is contradicted by the source you quoted (albeit that source is Wikipedia).

We can only guess why he thought so. I think it makes no sense to dive deeper into this discussion if we have nothing more than your suspicion that the person's conviction might have something to do with historical language traits.