Training agents to communicate with one another given task-based supervision only has attracted considerable attention recently, due to the growing interest in developing models for human-agent interaction. Prior work on the topic focused on simple environments, where training using policy gradient was feasible despite the non-stationarity of the agents during training. In this paper, we present a more challenging environment for testing the emergence of communication from raw pixels, where training using policy gradient fails. We propose a new model and training algorithm that utilizes the structure of a learned representation space to produce more consistent speakers at the initial phases of training, which stabilizes learning. We empirically show that our algorithm substantially improves performance compared to policy gradient. We also propose a new alignment-based metric for measuring context-independence in emerged communication and find our method increases context-independence compared to policy gradient and other competitive baselines.

Automatic phylogenetic inference plays an increasingly important role in computational historical linguistics. Most pertinent work is currently based on expert cognate judgments. This limits the scope of this approach to a small number of well-studied language families. We used machine learning techniques to compile data suitable for phylogenetic inference from the ASJP database, a collection of almost 7,000 phonetically transcribed word lists over 40 concepts, covering two thirds of the extant world-wide linguistic diversity. First, we estimated Pointwise Mutual Information scores between sound classes using weighted sequence alignment and general-purpose optimization. From this we computed a dissimilarity matrix over all ASJP word lists. This matrix is suitable for distance-based phylogenetic inference. Second, we applied cognate clustering to the ASJP data, using supervised training of an SVM classifier on expert cognacy judgments. Third, we defined two types of binary characters, based on automatically inferred cognate classes and on sound-class occurrences. Several tests are reported demonstrating the suitability of these characters for character-based phylogenetic inference.
arXiv:1802.06079v1 [cs.CL] 17 Feb 2018
Background & Summary
The cultural transmission of natural languages, with its patterns of near-faithful replication from generation to generation and the diversification resulting from population splits, is known to display striking similarities to biological evolution [1, 2]. The mathematical tools to recover evolutionary history developed in computational biology (phylogenetic inference) play an increasingly important role in the study of the diversity and history of human languages [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. The main bottleneck for this research program is the still limited availability of suitable data. Most extant studies rely on manually curated collections of expert judgments pertaining to the cognacy of core vocabulary items or the grammatical classification of languages. Collecting such data is highly labor intensive. Therefore sizeable collections currently exist only for a relatively small number of well-studied language families [8, 11, 15, 16, 17, 18].
Basing phylogenetic inference on expert judgments, especially judgments regarding the cognacy between words, also raises methodological concerns. The experts making those judgments are necessarily historical linguists with some prior information about the genetic relationships between the languages involved. In fact, it is virtually impossible to pass a judgment about cognacy without forming a hypothesis about such relations. In this way, data are enriched with prior assumptions of human experts in a way that is hard to control or to precisely replicate.
Modern machine learning techniques provide a way to greatly expand the empirical base of phylogenetic linguistics while avoiding the above-mentioned methodological problem. The Automated Similarity Judgment Program (ASJP) [19] database contains 40-item core vocabulary lists from more than 7,000 languages and dialects across the globe, covering about 75% of the extant linguistic diversity. All data are in phonetic transcription with little additional annotation (footnote 1). It is, at the current time, the most comprehensive collection of word lists available.
Phylogenetic inference techniques come in two flavors: distance-based and character-based methods. Distance-based methods require as input a matrix of pairwise distances between taxa. Character-based methods operate on a character matrix, i.e. a classification of the taxa under consideration according to a list of discrete, finite-valued characters. While some distance-based methods are computationally highly efficient, character-based methods usually provide more precise results and afford more fine-grained analyses.
The literature contains proposals to extract both pairwise distance matrices and character data from phonetically transcribed word lists [20, 21, 22]. In this paper we apply those methods to the ASJP data and make both a distance matrix and a character matrix for 6,892 languages and dialects (footnote 2), derived this way, available to the community. Also, we demonstrate the suitability of the results for phylogenetic inference. While both the raw data and the algorithmic methods used in this study are freely and publicly available, the computational effort required was considerable (about ten days of computing time on a 160-core parallel server). Therefore the resulting resource is worth publishing in its own right.
Footnote 1: The only expert judgments contained in the ASJP data are rather unsystematic manual identifications of loan words. This information is ignored in the present study.
Footnote 2: These are all languages in ASJP v. 17 except reconstructed, artificial, pidgin and creole languages.
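As a rough illustration of the first step in the entry above, Pointwise Mutual Information between two sound classes a and b can be written as log(p(a, b) / (p(a) p(b))). The sketch below computes it from a list of already aligned sound-class pairs. The function name `pmi_scores` and the toy input are assumptions made for illustration; the paper itself obtains its scores through weighted sequence alignment and optimization, which is not reproduced here.

```python
import math
from collections import Counter


def pmi_scores(aligned_pairs):
    """Return PMI(a, b) = log(p(a, b) / (p(a) * p(b))) for every observed pair.

    `aligned_pairs` is a hypothetical list of (sound_class, sound_class)
    tuples taken from pairwise word alignments.
    """
    # Count each pair in both orders so the joint distribution is symmetric.
    joint = Counter()
    for a, b in aligned_pairs:
        joint[(a, b)] += 1
        joint[(b, a)] += 1
    total = sum(joint.values())

    # Marginal counts are obtained by summing the joint counts over the partner.
    marginal = Counter()
    for (a, _), count in joint.items():
        marginal[a] += count

    return {
        (a, b): math.log((count / total) / ((marginal[a] / total) * (marginal[b] / total)))
        for (a, b), count in joint.items()
    }


# Toy data only: "t" always aligns with itself, "p" sometimes aligns with "b".
scores = pmi_scores([("p", "p"), ("p", "b"), ("t", "t"), ("t", "t")])
```

Positive scores mark pairs that align more often than chance would predict, which is why such scores can serve as substitution weights in a later alignment pass.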

Different structural features of human language change at different rates and thus exhibit different temporal stabilities. Existing methods of linguistic stability estimation depend upon the prior genealogical classification of the world's languages into language families; these methods result in unreliable stability estimates for features which are sensitive to horizontal transfer between families and whenever data are aggregated from families of divergent time depths. To overcome these problems, we describe a method of stability estimation without family classifications, based on mathematical modelling and the analysis of contemporary geospatial distributions of linguistic features. Regressing the estimates produced by our model against those of a genealogical method, we report broad agreement but also important differences. In particular, we show that our approach is not liable to some of the false positives and false negatives incurred by the genealogical method. Our results suggest that the historical evolution of a linguistic feature leaves a footprint in its global geospatial distribution, and that rates of evolution can be recovered from these distributions by treating language dynamics as a spatially extended stochastic process.

The availability of large diachronic corpora has provided the impetus for a growing body of quantitative research on language evolution and meaning change. The central quantities in this research are token frequencies of linguistic elements in the texts, with changes in frequency taken to reflect the popularity or selective fitness of an element. However, corpus frequencies may change for a wide variety of reasons, including purely random sampling effects, or because corpora are composed of contemporary media and fiction texts within which the underlying topics ebb and flow with cultural and socio-political trends. In this work, we introduce a computationally simple model for controlling for topical fluctuations in corpora, the topical-cultural advection model, and demonstrate how it provides a robust baseline of variability in word frequency changes over time. We validate the model on a diachronic corpus spanning two centuries, and a carefully-controlled artificial language change scenario, and then use it to correct for topical fluctuations in historical time series. Finally, we use the model to show that the emergence of new words typically corresponds with the rise of a trending topic. This suggests that some lexical innovations occur due to growing communicative need in a subspace of the lexicon, and that the topical-cultural advection model can be used to quantify this.

Han Chinese experienced substantial population migrations and admixture in history, yet little is known about the evolutionary process of Chinese dialects. Here, we used phylogenetic approaches and admixture inference to explicitly decompose the underlying structure of the diversity of Chinese dialects, based on the total phoneme inventories of 140 dialect samples from seven traditional dialect groups: Mandarin, Wu, Xiang, Gan, Hakka, Min and Yue. We found a north-south gradient of phonemic differences in Chinese dialects induced from historical population migrations. We also quantified extensive horizontal language transfers among these dialects, corresponding to the complicated socio-genetic history in China. Finally, we found that the middle-latitude dialects of Xiang, Gan and Hakka were formed by admixture with the other four dialects. Accordingly, the middle-latitude areas in China were a linguistic melting pot of northern and southern Han populations. Our study provides a detailed phylogenetic and historical context against the family-tree model in China.

There is growing interest in the language developed by agents interacting in emergent-communication settings. Earlier studies have focused on the agents' symbol usage, rather than on their representation of visual input. In this paper, we consider the referential games of Lazaridou et al. (2017), and investigate the representations the agents develop during their evolving interaction. We find that the agents establish successful communication by inducing visual representations that almost perfectly align with each other, but, surprisingly, do not capture the conceptual properties of the objects depicted in the input images. We conclude that, if we are interested in developing language-like communication systems, we must pay more attention to the visual semantics agents associate to the symbols they use.

One of the distinguishing aspects of human language is its compositionality, which allows us to describe complex environments with limited vocabulary. Previously, it has been shown that neural network agents can learn to communicate in a highly structured, possibly compositional language based on disentangled input (e.g. hand-engineered features). Humans, however, do not learn to communicate based on well-summarized features. In this work, we train neural agents to simultaneously develop visual perception from raw image pixels, and learn to communicate with a sequence of discrete symbols. The agents play an image description game where the image contains factors such as colors and shapes. We train the agents using the obverter technique where an agent introspects to generate messages that maximize its own understanding. Through qualitative analysis, visualization and a zero-shot test, we show that the agents can develop, out of raw image pixels, a language with compositional properties, given a proper pressure from the environment.

Multi-agent reinforcement learning offers a way to study how communication could emerge in communities of agents needing to solve specific problems. In this paper, we study the emergence of communication in the negotiation environment, a semi-cooperative model of agent interaction. We introduce two communication protocols: one grounded in the semantics of the game, and one which is a priori ungrounded and is a form of cheap talk. We show that self-interested agents can use the pre-grounded communication channel to negotiate fairly, but are unable to effectively use the ungrounded channel. However, prosocial agents do learn to use cheap talk to find an optimal negotiating strategy, suggesting that cooperation is necessary for language to emerge. We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.

The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks. Here we scale up this research by using contemporary deep learning methods and by training reinforcement-learning neural network agents on referential communication games. We extend previous work, in which agents were trained in symbolic environments, by developing agents which are able to learn from raw pixel data, a more challenging and realistic input representation. We find that the degree of structure found in the input data affects the nature of the emerged protocols, and thereby corroborate the hypothesis that structured compositional language is most likely to emerge when agents perceive the world as being structured.

We study the stability of two coexisting languages (Catalan and Spanish) in Catalonia (North-Eastern Spain), a key European region in political and economic terms. Our analysis relies on recent, abundant empirical data that is studied within an analytic model of population dynamics. This model contemplates the possibilities of long-term language coexistence or extinction. We establish that the most likely scenario is a sustained coexistence. The data needs to be interpreted under different circumstances, some of them leading to the asymptotic extinction of one of the languages involved. We delimit the cases in which this can happen. Asymptotic behavior is often unrealistic as a predictor for complex social systems, hence we make an attempt at forecasting trends of speakers towards 2030. These also suggest sustained coexistence between both tongues, but some counterintuitive dynamics are unveiled for extreme cases in which Catalan would be likely to lose an important fraction of speakers. As an intermediate step, model parameters are obtained that convey relevant information about the prestige and interlinguistic similarity of the tongues as perceived by the population. This is the first time that these parameters are quantified rigorously for this pair of languages. Remarkably, Spanish is found to have a larger prestige, especially in areas which historically had larger communities of Catalan monolingual speakers. Limited, spatially-segregated data allows us to examine more fine-grained dynamics, thus better addressing the likely coexistence or extinction. Variation of the model parameters across regions is informative about how the two languages are perceived in more urban or rural environments.

The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) six years of published books scanned by Google (2003–2008), and (2) a decade of social media messages posted to Twitter (2008–2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books. Regularization is also greater for tweets geotagged in the United States relative to American English books, but the opposite is true for tweets geotagged in the United Kingdom relative to British English books. We also find interesting regional variations in regularization across counties in the United States. However, once differences in population are accounted for, we do not identify strong correlations with socio-demographic variables such as education or income.
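One simple way to put a number on "the extent of verb regularization" is the share of past-tense tokens of a verb that use the regular (-ed) form. The sketch below assumes that measure; the function name `regularization_fraction` and the token counts are hypothetical and are not taken from the study.

```python
def regularization_fraction(regular_count, irregular_count):
    """Share of past-tense tokens that use the regular (-ed) form.

    Returns None when the verb was not observed at all, so that an
    unobserved verb is not mistaken for a fully irregular one.
    """
    total = regular_count + irregular_count
    if total == 0:
        return None
    return regular_count / total


# Made-up counts for illustration: 30 tokens of "burned", 10 tokens of "burnt".
burn_fraction = regularization_fraction(30, 10)
```

Comparing this fraction for the same verb across two corpora (for example books and tweets) gives a per-verb view of where the regular form is more common.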

Building intelligent agents that can communicate with and learn from humans in natural language is of great value. Supervised language learning is limited by the ability of capturing mainly the statistics of training data, and is hardly adaptive to new scenarios or flexible for acquiring new knowledge without inefficient retraining or catastrophic forgetting. We highlight the perspective that conversational interaction serves as a natural interface both for language learning and for novel knowledge acquisition and propose a joint imitation and reinforcement approach for grounded language learning through an interactive conversational game. The agent trained with this approach is able to actively acquire information by asking questions about novel objects and use the just-learned knowledge in subsequent conversations in a one-shot fashion. Results compared with other methods verified the effectiveness of the proposed approach.

Inspired by previous work on emergent language in referential games, we propose a novel multi-modal, multi-step referential game, where the sender and receiver have access to distinct modalities of an object, and their information exchange is bidirectional and of arbitrary duration. The multi-modal multi-step setting allows agents to develop an internal language significantly closer to natural language, in that they share a single set of messages, and that the length of the conversation may vary according to the difficulty of the task. We examine these properties empirically using a dataset consisting of images and textual descriptions of mammals, where the agents are tasked with identifying the correct object. Our experiments indicate that a robust and efficient communication protocol emerges, where gradual information exchange informs better predictions and higher communication bandwidth improves generalization.

Human beings are talkative. What advantage did their ancestors find in communicating so much? Numerous authors consider this advantage to be obvious and enormous. If so, the problem of the evolutionary emergence of language amounts to explaining why none of the other primate species evolved anything even remotely similar to language. What I propose here is to reverse the picture. On closer examination, language resembles a losing strategy. Competing for providing other individuals with information, sometimes striving to be heard, makes apparently no sense within a Darwinian framework. At face value, language as we can observe it should never have existed or should have been counter-selected. In other words, the selection pressure that led to language is still missing. The solution I propose consists in regarding language as a social signaling device that developed in a context of generalized insecurity that is unique to our species. By talking, individuals advertise their alertness and their ability to get informed. This hypothesis is shown to be compatible with many characteristics of language that otherwise are left unexplained.

A distinguishing property of human intelligence is the ability to flexibly use language in order to communicate complex ideas with other humans in a variety of contexts. Research in natural language dialogue should focus on designing communicative agents which can integrate themselves into these contexts and productively collaborate with humans. In this abstract, we propose a general situated language learning paradigm which is designed to bring about robust language agents able to cooperate productively with humans. This dialogue paradigm is built on a utilitarian definition of language understanding. Language is one of multiple tools which an agent may use to accomplish goals in its environment. We say an agent understands language only when it is able to use language productively to accomplish these goals. Under this definition, an agent's communication success reduces to its success on tasks within its environment. This setup contrasts with many conventional natural language tasks, which maximize linguistic objectives derived from static datasets. Such applications often make the mistake of reifying language as an end in itself. The tasks prioritize an isolated measure of linguistic intelligence (often one of linguistic competence, in the sense of Chomsky (1965)), rather than measuring a model's effectiveness in real-world scenarios. Our utilitarian definition is motivated by recent successes in reinforcement learning methods. In a reinforcement learning setting, agents maximize success metrics on real-world tasks, without requiring direct supervision of linguistic behavior.

The current mainstream approach to train natural language systems is to expose them to large amounts of text. This passive learning is problematic if we are interested in developing interactive machines, such as conversational agents. We propose a framework for language learning that relies on multi-agent communication. We study this learning in the context of referential games. In these games, a sender and a receiver see a pair of images. The sender is told one of them is the target and is allowed to send a message from a fixed, arbitrary vocabulary to the receiver. The receiver must rely on this message to identify the target. Thus, the agents develop their own language interactively out of the need to communicate. We show that two networks with simple configurations are able to learn to coordinate in the referential game. We further explore how to make changes to the game environment to cause the word meanings induced in the game to better reflect intuitive semantic properties of the images. In addition, we present a simple strategy for grounding the agents' code into natural language. Both of these are necessary steps towards developing machines that are able to communicate with humans productively.
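The game protocol described in this entry can be sketched as a single round: the sender sees both images and the target, emits one symbol from a fixed vocabulary, and the receiver picks an image from the pair using only that symbol. Everything named below (`VOCAB`, `play_round`, `toy_sender`, `toy_receiver`) is a hypothetical stand-in, images are reduced to plain numbers assumed to be distinct, and the hand-written agents replace the trained networks of the paper.

```python
VOCAB = ("a", "b")  # hypothetical fixed, arbitrary vocabulary


def play_round(sender, receiver, images, target_index):
    """Run one round of the referential game and report whether it succeeded."""
    message = sender(images, target_index)   # sender knows which image is the target
    if message not in VOCAB:
        raise ValueError("sender must use the fixed vocabulary")
    guess = receiver(images, message)        # receiver sees only the images and the message
    return guess == target_index


def toy_sender(images, target_index):
    # Convention: "a" means "the target is the larger value", "b" the smaller.
    other = images[1 - target_index]
    return "a" if images[target_index] > other else "b"


def toy_receiver(images, message):
    larger = 0 if images[0] > images[1] else 1
    return larger if message == "a" else 1 - larger


# Both agents share the same convention, so either target is identified.
results = [play_round(toy_sender, toy_receiver, (3, 7), t) for t in (0, 1)]
```

In the learning setting the convention is not given in advance; the success signal returned by a round like this is what the agents would be rewarded on.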

An important reason to investigate dolphins is that they exhibit striking similarities with humans. Like us, they use tools: dolphins break off sponges and wear them over their rostrum while foraging on the seafloor (Smolker et al 1997). Dolphins are also capable of recognizing their body in front of a mirror (Reiss & Marino 2001). Closely related with their capacity to see through sound is their capacity to form abstract representations that are independent from modality (Herman et al 1998). Dolphins share with us other traits that are appealing from the perspective of language theory. First, they exhibit spontaneous vocal mimicry (Reiss & McCowan 1993), which suggests a predisposition to learn a vocal communication system. Second, they live, in general, in fission-fusion societies and display complex social behaviours (Lusseau et al 2003, Connor & Krützen 2015), while converging research supports that the complexity of a society and the complexity of communication are correlated (Freeberg et al 2012). Third, they can learn a signal to innovate, namely to show a behavior not seen in the current interaction session (Foer 2015). This tells us something about the limits on memory and creativity in dolphins and is challenging from a theoretical perspective: many researchers believe that a crucial difference between humans and other species is our unbounded capacity to generate sequences, e.g., by embedding sentences into other sentences (e.g., Gregg 2013, Hauser et al 2002), or a capacity for large lexicons (Hurford 2004). In short, bottlenose dolphins share many traits we regard as prerequisites for our complex linguistic abilities.
Although possessing such an infinite capacity makes a qualitative difference compared to a species with a finite capacity, the fact is that (a) a species being able to generate a huge number of sentences may not be distinguishable from a species that has infinite capacity (supposing that the latter is really true) and (b) humans have problems with parsing sentences with just a few levels of embedding (Christiansen & Chater 2015). The point is that the problem of infinite vs finite capacity does seem to be well poised and that dolphin capacity to innovate is being overlooked. We humans are fascinated by infinity (perhaps for purely aesthetic reasons) and may have rushed to steal the flag of infinity to keep it in some anthropocentric fortress where other species are not allowed to get in. In a recent book, the parallel in cognitive abilities between