Much of recent research in linguistics has involved using experiments to test hypotheses directly, comparing real-world data against laboratory results and computer simulations. In a previous post I looked at how humans, non-human primates, and even non-primate animals are all capable of high-fidelity cultural transmission. Yet to apply this framework to human language, another set of experimental literature needs to be considered, namely artificial language learning and constructed communication systems.

Artificial Language Learning

First devised by Esper (1925), and later expanded to study social transmission (Esper, 1966), Artificial Language Learning (ALL) involves exposing participants to an artificially created miniature language on which they are trained and subsequently tested, with the goal of investigating the learning capabilities of individuals. The ALL paradigm is widely used in linguistics, especially for investigating the language acquisition and statistical learning abilities of humans (Saffran et al., 1996) and non-humans (Fitch & Hauser, 2004).

For instance, Wonnacott and colleagues (2007) use ALL to explore two related debates surrounding verb generalisation: 1) why some verb-argument structures tend to generalise to new verbs, whilst other verbs are highly resistant to change; and 2) how verb-specific and more generalised constraints interact in sentence processing, with emphasis on the role of semantics. Across three experiments they found that learners were quite competent at tracking both verb-specific (the likelihood of a particular verb co-occurring with a particular argument structure) and verb-general (the likelihood of a particular argument structure occurring across the verbs of the language) statistical patterns:

Importantly, however, how learners utilized these competing sources of information depended upon additional statistical factors: verb frequency (learners were more likely to ignore verb-specific statistics with low frequency verbs) and the distribution of verb types across the language (learners were more likely to ignore verb-specific statistics in languages with a large alternating verb class).
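One way to picture this interplay is as a learner blending two conditional probabilities, with verb frequency deciding how much weight each gets. The sketch below is purely illustrative: the verbs, structure labels, and the pseudocount blend are invented for this post, not Wonnacott et al.'s actual items or model.

```python
from collections import Counter

# Hypothetical toy corpus of (verb, argument_structure) observations.
corpus = [
    ("blick", "VOS"), ("blick", "VOS"), ("blick", "VOS"), ("blick", "VOS"),
    ("dax", "VSO"), ("dax", "VOS"),
    ("wug", "VSO"),  # low-frequency verb: only one observation
]

verb_counts = Counter(v for v, s in corpus)
pair_counts = Counter(corpus)
struct_counts = Counter(s for v, s in corpus)
total = len(corpus)

def p_struct_given_verb(verb, struct):
    """Verb-specific statistic: how often this verb occurs with this structure."""
    return pair_counts[(verb, struct)] / verb_counts[verb]

def p_struct(struct):
    """Verb-general statistic: how often the structure occurs across the language."""
    return struct_counts[struct] / total

def blended_estimate(verb, struct, prior_weight=2.0):
    """A pseudocount blend: rare verbs lean on the verb-general distribution,
    frequent verbs on their own verb-specific statistics."""
    n = verb_counts[verb]
    return (n * p_struct_given_verb(verb, struct)
            + prior_weight * p_struct(struct)) / (n + prior_weight)
```

Running `blended_estimate` on the high-frequency verb "blick" stays close to its verb-specific pattern, while the single-observation verb "wug" is pulled toward the language-wide distribution, mirroring the frequency effect described above.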

ALL is also useful for expanding upon computational modelling studies, as demonstrated by Christiansen (2000), who, building on a previous connectionist study of word order universals (Christiansen & Devlin, 1997), created two head-ordered languages: one containing consistently head-last sentences, the other inconsistent in its head ordering. In tandem with the modelling results, the ALL study confirmed that head-order inconsistency is harder to learn, which suggests that the underlying processing mechanisms are not necessarily innately constrained by a head-ordering rule, but are rather the result of “non-linguistic constraints on sequential learning and processing” (Christiansen, 2000).

Another area of investigation using ALL is the emergence and formation of creoles from pidgin languages: a creole is a hybrid language that evolves from its parent pidgin, except that it contains a grammar mirroring the complexity of natural languages. Although we can see creoles emerge within a few generations, such as the development of a new type of sign language in a deaf community of Nicaraguan children and a similar situation in the development of the Al-Sayyid Bedouin Sign Language, the central flaw of these studies is that they lack the necessary experimental controls to test specific predictions.

Hudson-Kam & Newport (2005) attempt to address creole formation by exposing both adults and children to two artificial languages, specifically focusing on the role of regularization: the process of making irregular forms regular. Importantly, these initial languages contained linguistic features present in pidgins and the early stages of creole formation, such as inconsistent grammatical morphemes, and differed in the presence or absence of a determiner within noun phrases. In the first language (the inconsistent condition) the determiner was present only 60% of the time, whilst in the other language (the consistent condition) it was present 100% of the time. They found that exposure to consistent grammatical patterns resulted in consistent output for both adults and children. The major finding, however, is that when adults are exposed to inconsistent input, they tend to reproduce these inconsistencies in their output; children, on the other hand, tend to regularize the language, generating patterns that differ from the initial input.
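A toy simulation can make the adult/child contrast concrete. This is a sketch, not the authors' actual design: it assumes probability matching as the adult strategy and majority-form maximization as the child strategy, using the 60% determiner rate from the inconsistent condition.

```python
import random

random.seed(0)

def exposure(n=100, det_rate=0.6):
    """Noun phrases where the determiner appears det_rate of the time
    (the inconsistent condition); det_rate=1.0 gives the consistent condition."""
    return [random.random() < det_rate for _ in range(n)]

def adult_production(inp, n=100):
    """Probability matching: reproduce the input rate, inconsistencies and all."""
    rate = sum(inp) / len(inp)
    return sum(random.random() < rate for _ in range(n)) / n

def child_production(inp, n=100):
    """Regularization (maximizing): always produce the majority form."""
    majority = sum(inp) / len(inp) >= 0.5
    return sum(majority for _ in range(n)) / n

inconsistent = exposure(det_rate=0.6)
print("adult determiner rate:", adult_production(inconsistent))  # hovers near the input rate
print("child determiner rate:", child_production(inconsistent))  # regularized to 1.0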

From these results, Hudson-Kam & Newport claim that, through the regularisation of grammatical patterns, children play a vital role in creole formation. Furthermore, as children and adults do not learn inconsistent input in the same manner, with adults applying a strategy that merely reproduces the consistency or inconsistency of the input, children act so as to influence an emerging language by regularising and stabilising its grammar. As the authors note:

First, it may be the case that children bring to the task of language learning some special expectation and rule-learning processes, utilized specifically in the case of language acquisition… Alternatively, the regularization seen in creolization may result from constraints on more general probability learning mechanisms interacting with a particular kind of complex input in such a way as to lead to very different learning outcomes in young and mature learners.

A more recent study on word learning (Vouloumanos, 2008) investigated the interactions between learning biases and input inconsistency. Specifically, Vouloumanos trained participants on novel word-object pairs of varying frequencies: “some objects were paired with one word, other objects with multiple words with differing frequencies (ranging from 10% to 80%)”. She tested participants by presenting them with two objects while playing a single word, and asking which of the two objects was best associated with the word. By introducing multiple-referent relations during word learning, Vouloumanos found participants tended to adopt a selection heuristic based on the frequency of the word-object pairing, rather than regularising the inconsistent input. In her conclusion, she argues that this sensitivity to the statistical co-occurrence between words and objects suggests, “[…] learners could entertain overlapping hypotheses about the referents of a word, and assign different likelihoods to each of these candidate mappings”.
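That frequency-based selection heuristic can be sketched as a simple co-occurrence counter. The words, objects, and exact frequency splits below are invented for illustration, loosely mimicking the inconsistent mappings in the study rather than reproducing its stimuli.

```python
from collections import Counter

# Hypothetical training data: word-object co-occurrences with differing frequencies.
training = (
    [("moop", "ball")] * 8 + [("moop", "cup")] * 2 +   # 80% / 20% split
    [("tiv", "cup")] * 6 + [("tiv", "shoe")] * 4       # 60% / 40% split
)

counts = Counter(training)

def choose(word, object_a, object_b):
    """Frequency-based selection: pick whichever candidate object co-occurred
    with the word more often, rather than committing to one regularized mapping."""
    return object_a if counts[(word, object_a)] >= counts[(word, object_b)] else object_b

print(choose("moop", "ball", "cup"))  # "ball" (8 vs 2 co-occurrences)
print(choose("tiv", "cup", "shoe"))   # "cup" (6 vs 4)
```

Because the learner keeps the full count table rather than a single mapping, it effectively "assigns different likelihoods to each candidate mapping", as the quoted conclusion puts it.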

Intentional Communication

Sharing similarities with ALL experiments is a comparatively small body of literature on experiments into the construction of communication systems (Galantucci, 2005; Selten & Warglien, 2007). A common theme running through these experiments is how a novel communication system can emerge over a short period of time, whether to solve a particular task (Galantucci, 2005) or simply through repeated interaction (Selten & Warglien, 2007). Galantucci (2005), for instance, took pairs of participants and placed them in a computer game scenario requiring communication. To make sure they didn’t use language, or subtler forms of communication such as body language, members of each pair were physically separated. And although participants could draw with a magnetic stylus on a small digitizing pad, the set-up prevented them from producing conventional orthographic symbols, e.g. alphabetic letters (as you can see below). As Galantucci notes:

Communication systems emerged and developed rapidly during the games, integrating the use of explicit signs with information implicitly available to players and silent behaviour-coordinating procedures. The systems that emerged suggest 3 conclusions: (a) signs originate from different mappings; (b) sign systems develop parsimoniously; (c) sign forms are perceptually distinct, easy to produce, and tolerant to variations.

More specific to language per se is Selten & Warglien’s (2007) study. Here, the authors use a series of laboratory experiments designed to investigate the inherent costs and benefits of linguistic communication, and how these impact upon the emergence of basic languages in a coordination task. There is no common language available to the participants; instead, they need to create their own communication system to refer to varying lists of geometrical figures composed of up to three features. Importantly, the communication system is limited, as the use of letters incurs a cost. By varying the geometrical figures and the number of letters available, the researchers are able to compare different environments, with stable ones resulting in arbitrary codes, whilst “in an environment with novelty, compositional grammars offer considerable coordination advantages and therefore are more likely to arise.”
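The advantage of compositional codes under novelty can be shown with a minimal sketch. The features, letters, and figures here are hypothetical stand-ins, not Selten & Warglien's actual stimuli or cost scheme: the point is only that a compositional code can name an untrained figure from known parts, while an arbitrary code cannot.

```python
import itertools

# Hypothetical figure features (shape x fill).
shapes = ["circle", "square", "triangle"]
fills = ["solid", "hollow"]
figures = list(itertools.product(shapes, fills))

# Arbitrary code: one opaque symbol memorized per whole figure.
arbitrary = {fig: chr(ord("a") + i) for i, fig in enumerate(figures)}

# Compositional code: one letter per feature value, concatenated.
feature_letter = {"circle": "c", "square": "s", "triangle": "t",
                  "solid": "o", "hollow": "h"}

def compositional(fig):
    return "".join(feature_letter[f] for f in fig)

novel = ("triangle", "hollow")
# Pretend 'novel' was never trained: the arbitrary code has no entry for it,
# while the compositional code composes a name from existing letters.
trained_arbitrary = {f: s for f, s in arbitrary.items() if f != novel}
print(novel in trained_arbitrary)  # False: arbitrary code cannot name it
print(compositional(novel))        # "th": composed from known feature letters
```

This is the "extendable to broader environmental demands" property from finding iii below: the compositional lexicon covers new figures for free, whereas the arbitrary one needs a fresh convention for every novel item.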

One feature of this study, and of all the others for that matter, is how their results extend the conclusions drawn from real-world data (rather than just providing confirmatory evidence). In the case of Selten & Warglien’s study, their conclusions are broadly divided into four key findings:

(i) The size of the repertoire of elementary linguistic symbols seems to be important in facilitating linguistic coordination[…] We also find successful coordination (although at somewhat lower rates), but our experiment I shows that a too small size of the repertoire may be a serious obstacle for the attainment of a common code.

(ii) In an environment in which the same messages occur many times, cost efficiency and role asymmetry are factors enhancing communicative success, whereas grammars do not offer particular advantages under such circumstances. Role asymmetry between a leader and an imitator avoids mismatches by simultaneous adjustments to the code of the other. In dialogue theory, the role of imitation is also emphasized as conducive to the conversational alignment of interlocutors. Our results throw additional light on this phenomenon and suggest looking more closely at how role asymmetry might facilitate alignment processes.

(iii) In stable environments such as those considered in point ii, grammar does not matter much, and efficient arbitrary codes often do better. However, compositional grammars have the advantage of being more easily extendable to broader environmental demands. Noncompositional grammars are more fragile and are easily lost if new conditions have to be met.

(iv) In an environment with novelty, in the sense that often the need arises to express something that never has been expressed before, compositional grammars offer considerable coordination advantages. Therefore, under such circumstances, compositional grammars are more likely to arise. In this respect, our findings parallel and complement hypotheses proposed in the literature on language evolution. In our experiments, all subjects have grammatical competence but they make relatively little use of it unless pressure of novelty gives them an incentive to do so.

A vital component of both Selten & Warglien’s and Galantucci’s experiments is that the participants create a system of communication. Thus, the resulting systems are the product of intentional design, and as such may not tell us much about the actual processes of language, which appear to be guided by non-intentional processes resulting from human action and interaction — an invisible hand process. This is what I’ll look at in part two, namely studies of Human Iterated Learning.

Selten, R., & Warglien, M. (2007). The emergence of simple languages in an experimental coordination game. Proceedings of the National Academy of Sciences, 104(18), 7361–7366. DOI: 10.1073/pnas.0702077104