Selective Attention and Discourse: Re-framing the Problem of Coordination

The area of discourse processing involves the study of the processes and events that comprise naturalistic communication. The aim of this area is to understand how individuals build meaning from the extended linguistic events and psychological processes that causally enable naturalistic communication (Sparks & Rapp, 2010). A central concern in discourse processing is to explain how speakers (interlocutors) are able to coordinate their language use to achieve mutual understanding. This is often referred to in the literature as the coordination[1] problem (Garrod & Anderson, 1987). Solutions to the coordination problem are a source of contention among models of discourse processing (Clark, 1996; Horton & Keysar, 1996; Pickering & Garrod, 2004). In many ways, the preoccupation with this contention is part of the reason models of discourse have lagged behind the rest of cognitive science in their degree of formalism.

The collaborative model of communication (Clark & Wilkes-Gibbs, 1986; Schober & Clark, 1989; Clark, 1996) frames the problem of coordination as a matter of the explicit representations needed for collaboration. The interactive alignment model frames the problem of coordination as a matter of determining the automatic mechanisms and stages of processing specific to conversation that facilitate coordination (Pickering & Garrod, 2004). These models are able to capture some aspects of discourse, but the way they frame the problem of coordination produces serious difficulties: they rely on problematic levels of abstraction and on assumptions about implausible mechanisms. A shift in perspective is needed. The current proposal is that the problem of coordination is essentially a problem of selective attention. The cognitive principles that enable interlocutors to exercise their natural capacity for coordination are supported by the cognitive mechanisms that give rise to selective attention. In other words, explanations of selective attention should be brought into the fold of explanations of coordination.

Here are a few preliminary reasons to consider the need for such a change in perspective. Reframing the coordination problem as a matter of selective attention opens the possibility of more formal expressions of the principles of coordination in natural language use. One of the considerations for framing the problem of coordination as a matter of selective attention is to offer an explanation of coordination using non-specialized cognitive mechanisms as opposed to ones that are restricted only to discourse. The benefit of this would be integration between theories of discourse processing and other areas of cognitive science. A similar position is taken by the minimalist account of joint action (Shintel & Keysar, 2009). However, their proposal is that the coordination problem is a matter of memory cues, and they refer to formal expressions of memory cues only as an aside.

Another preliminary reason is that the way coordination is currently framed by models of discourse eludes formal expression. The call for more formal expressions of models of discourse processing is echoed throughout the discipline, yet it remains unclear how to proceed in this endeavor (Brown-Schmidt & Tanenhaus, 2004; Pickering & Garrod, 2004). Selective attention has been expressed formally in a number of different cognitive models and used to explain a variety of learning phenomena (Kruschke, 2010). Reframing the problem of coordination as a matter of selective attention welcomes this level of formal expression into the study of discourse processing, allowing for the possibility of coordination mechanisms that are well specified.

With regard to being well specified, it may be useful at this point to be selective about what is meant by selective attention. Selective attention refers to the cognitive capacity for targeted processing. In other words, it is the mechanism by which the multitude of possible information in our environment is constrained to the relevant information. To frame the problem of coordination as a matter of selective attention is to place selective attention in its rightful place within discourse processing, as the mechanism that gives rise to speakers’ natural ability to coordinate their language use.

The basis for reframing the coordination problem relies on a critique of the current status of modeling in discourse processing. This critique is organized into four sections. In the first section, the coordination problem is described in more detail along with recent findings that enable the coordination problem to still be called a problem. In the second section, an appeal is made to return to the basics. Early theories and principles from the philosophy of language and pragmatics are revisited along with recent experimental evidence that gives reason to consider the role of their foundational assumptions in the proposed framing of the coordination problem as a matter of selective attention. In the third section, the current status of modeling in discourse processing is considered by reviewing the collaborative model (Clark, 1996) and the interactive alignment model (Pickering & Garrod, 2004). In the final section, the position to reframe the problem of coordination is described along with considerations to rethink certain approaches to modeling.

The consequences and constraints of the current proposal are addressed in the discussion section. The coordination problem in conversation does not need to be approached by appealing to specific discourse mechanisms that are divorced from the rest of cognition. On the contrary, selective attention as a general cognitive faculty allows for a more direct and formal approach.

The Coordination Problem

Conversation is easy. In most natural settings, people engage in conversations with great ease. Exercising our ability for conversation is a far simpler task than explaining why it is so easy. Garrod and Anderson (1987) characterized the basic problem that is put before participants in a conversation as the need for speakers to select expressions to convey what is intended while listeners have to select interpretations for those expressions in the hope that they capture the intended meaning. This was regarded as the coordination problem. The coordination problem does present a challenge for dialogue, but not in the way this characterization suggests. The challenge is not faced by the participants who are coordinating their conversation in a lab during an experiment. It is faced by the experimenter who is trying to explain how they are coordinating.

Contrasting effects have been observed concerning two issues: partner-specific knowledge (e.g. Brennan & Clark, 1996; Keysar, Barr, Balin, & Paek, 1998) and the disparity between monologues and dialogues (e.g. Fox Tree, 1999; Barr & Keysar, 2004). These issues are at the forefront of contention between views of discourse. In this section, some of the problems involved in these two areas are discussed, as well as some of the evidence supporting contrasting views regarding these issues.

Partner-Specific Knowledge

The issue of partner-specific knowledge concerns the extent to which people rely on information about their partner to obtain mutual understanding. Early evidence that was taken to point to partner-specific knowledge comes from observations of what Garrod and Anderson (1987) referred to as lexical entrainment. In their study, participants had to cooperate in a maze game. They observed that as participants began to use repeated references to objects, their references became shorter as they coordinated their perspectives. This grounding of references, as Clark (1996) calls it, has been observed in a number of studies (Clark & Wilkes-Gibbs, 1986; Schober & Clark, 1989; Wilkes-Gibbs & Clark, 1992).

An issue with lexical entrainment is to determine whether or not the coordinated references observed are indicative of partner-specific processing in conversation. The overhearer effect is often taken as evidence for the role of partner-specific processing. Schober and Clark (1989) reported an overhearer effect in a study using a referential communication task. The task was for a director to describe the array of pictures before them so that a matcher could arrange their pictures to match the director’s. Directors and matchers (addressees) were able to converse about the tangram pictures while overhearers listened and could not interact. Schober and Clark found a deficit in performance for overhearers compared to addressees. They took this as evidence for the strategic, partner-specific processing involved in conversation.

Common ground, the shared mutual knowledge between interlocutors, is often thought to be an important aspect of reference resolution in conversation. Schober and Clark rely on this notion in their explanation of the overhearer effect, claiming that overhearers are not necessarily working towards building their common ground with either the director or the matcher in this task. Interaction is the key notion here. Other explanations of the overhearer effect highlight the role of repair in conversation. Barr and Keysar (2002) note that overhearers do not have the same opportunity for feedback, and since the ambiguous utterances they will need to repair are likely to be different from the ones the matchers repair, they will have problems by virtue of not exercising strategic, effortful feedback. They found that even when such feedback is removed, listeners who believed themselves to be overhearers automatically aligned their semantic representations with the speaker’s to the same degree as listeners who believed themselves to be addressees. In this view, the partner-specific knowledge of common ground is not the reason for the overhearer effect; the lack of opportunity for repair is.

The use of conceptual pacts is another area of contention with regard to partner-specific knowledge. Brennan and Clark (1996) looked at the role of partner-specific knowledge in how references are produced between partners. They found that when speakers refer to an object, they are proposing a conceptualization of it for future use. Participants in their study maintained these conceptual pacts even when there were more efficient ways to refer to objects. Another interpretation of this finding is that conceptual pacts are more indicative of expectations of consistency than of interactive conceptualization (Shintel & Keysar, 2007).

Resolving the issue of partner-specific knowledge will involve accounting for these contrasting observations. Neither of the two ways of regarding partner-specific knowledge denies that it is possible for people to use this knowledge; they disagree on if and when it is used. Seriously considering the role of selective attention in coordination involves making strict predictions about the factors that will affect whether or not partner-specific knowledge is used in comprehension. This would be a way to draw out assumptions from these different views and investigate them.

Disparity Between Monologues and Dialogues

Another area of contention in views of discourse regards disparity between monologues and dialogues. Conversation is considered a special instance of natural language use (Grice, 1975; Clark, 1996). What is most central to the problem here is the extent to which conversation is different from other kinds of natural language use. If it is a difference in degree, then conversation involves the same processes as other areas of natural language but to a different extent. If it is a difference of kind, then conversation involves different processes than other areas of natural language use. How this is viewed greatly affects how the coordination problem is viewed.

In the study of discourse processing, the issue described previously is framed as an issue regarding disparity between monologues and dialogues. There is evidence to take seriously the difference between monologues and dialogues. Fox Tree (1999) used a referential communication task and observed differences in performance between when directors provided descriptions while matchers could offer feedback and when they could not. Overhearers’ performance in this task was much better when listening to dialogues than when listening to monologues. Fox Tree (1999) attributes this difference to the great number of discourse markers that are present in a dialogue as a result of the interactive processes involved in it. Other views take this evidence to suggest that the mechanisms used for comprehension in dialogue (conversation) are drastically different from those used in other areas of natural language (Pickering & Garrod, 2004; Garrod & Pickering, 2004).

What is important to consider from both of the issues discussed in this section is the relation of conversation to other areas of natural language use. To try and clarify what is involved in this problem, a historical perspective will be taken in the next section regarding the development of how these issues were regarded from the philosophy of language and pragmatics. The coordination problem is one of how meaning is achieved between interlocutors. The next section will describe some of the ways of talking about meaning along with recent evidence to take seriously some of these claims.

Early Contributions to the Study of Discourse

The relation of conversation and natural language can be explored through the problem of how meaning is achieved in natural language use. Early work in the philosophy of language and pragmatics has dealt extensively with this problem. This early work has been instrumental in how experimental approaches to discourse regard the relation of conversation to natural language use. Moreover, conceptual issues regarding the kinds of representations needed to achieve mutual understanding emerged from this early work and are still a point of contention among researchers today. To understand the relation of conversation to natural language, issues regarding the nature of meaning and the different kinds of meanings in language need to be addressed. Only then is it possible to consider clearly what is at stake when the problem of coordination is regarded as a matter of stages as opposed to a matter of selective attention.

Contributions from analytic philosophy of language

The problem of coordination was originally a philosophical issue. The issue was regarded as a matter of untangling possible meanings during comprehension though there was no explicit claim of disparity between production and comprehension initially. Three accounts of meaning from the philosophy of language that will be considered here include theories of reference (Russell, 1905; Frege, 1952; Donnellan, 1966), speech acts (Searle, 1969), and the cooperative principle (Grice, 1975). An important issue concerning how to frame the problem of coordination that is developed in these accounts regards the role of monitoring a speaker’s intentions to coordinate mutual comprehension. The ways of talking about meaning and conversational interaction in this section are instrumental to the way researchers in discourse processing approach and talk about the naturalistic study of conversation.

Reference. The first account of meaning that will be treated here actually includes three distinct treatments of meaning. They are treated together because of the common distinction they each make between two kinds of meaning. Frege (1952) argued for the necessity of distinguishing between an expression’s reference and its sense. The reference of an expression is what the expression points to while its sense is what is pointed to by the grammar. Deviating from his classic example, an example that foreshadows a problem to come in this essay is used here. When Pickering and Garrod (2004) argue that a priming mechanism is the cognitive explanation for the problem of coordination, the sense is what is pointed to by the grammar, a priming mechanism. The referent, that which the expression is really pointing to, is selective attention.

Russell (1905) highlights a different dichotomy of meaning. He argues that the only expressions that have definite reference are logically proper names like I, Harold, and now. For other expressions, he describes meaning as emergent through definite descriptions that are denoting phrases. Their meaning comes from quantified logical constructions. As opposed to Frege’s view, expressions are not regarded as being directly referential. They are only referential by means of the propositions they are situated within. Russell’s account was put forth after Frege’s (the 1952 date refers to the English translation of Frege’s earlier work), and it is easier to raise the issue of intentionality in Russell’s view of denoting in light of Frege’s sense and reference distinction. Applying these two distinctions to the problem of coordination motivates how it is treated by later theorists. The notions of sense and reference do not address intentionality in and of themselves, but when applied to coordination in conversation, drawing on the speaker’s intention can be seen to be a useful way to account for interpreting an utterance’s sense or reference. A speaker’s intention does not have a role in Russell’s view of denoting.

Donnellan (1966) made explicit use of speaker’s intention to distinguish one kind of view over another. In his view, a definite description can mean something in two distinct ways. He referred to these two ways in which to talk about something as referential use and attributive use. Definite descriptions can be used referentially through the mutual knowledge shared by the speaker and the addressee. For example, a mutual acquaintance of two interlocutors may walk into the room wearing a silly hat and they both notice it. A speaker can exploit this mutual knowledge to use a phrase like “the silly hat just walked in” to referentially mean their mutual acquaintance. What is referred to with referential use depends on explicit mutual knowledge. The attributive use of a definite description refers to something by means of the description itself without requiring mutual knowledge.

The philosophical approaches to meaning that have been described so far address the relationship between meaning and reference. The interest in explicating these views here is to provide a background to the issues that concern models of discourse, such as the role of mutual knowledge in coordinating meaning. An explicit interest towards motivating the role of selective attention in coordination regards the relation of conversation to natural language use. The views described so far provide general accounts of the relationship between meaning and reference as it applies to all forms of natural language use, making no specific assumptions about conversation requiring specific principles to govern meaning. The following philosophical accounts of meaning do make specific assumptions about conversational interaction regarding the nature of speech acts (Searle, 1969) and the appeal to a cooperative principle (Grice, 1975) to constrain meaning in conversation. An important thing to consider as they are being described is the extent to which they rely on conversation to explain meaning and the possibility that they may also explain meaning outside of conversation.

Speech acts. Searle’s (1969) early work on speech acts tried to explain speech acts and illocutionary force as constituted by the rules of language. Speech acts refer to the collection of acts that take place when an utterance is produced. Searle described a number of different kinds of speech acts. An utterance act is an act of saying something while an illocutionary act is something accomplished by means of an utterance act. For example, a speaker performs an illocutionary act when they say an addressee’s name to signal the start of a conversation. The utterance act is the speaker saying the name of the addressee, while the illocutionary act is the signal to start the conversation.

Searle also draws a distinction between the illocutionary force and the propositional content of an utterance. Take the signaling example previously described. If the speaker were to yell the addressee’s name, this would alter the illocutionary force but the propositional content would be the same. Another important kind of speech act that Searle described is the propositional act. Appealing to common ground as developed later by Clark (1996) clarifies what Searle meant by propositional acts. Common ground reflects the shared mutual knowledge between speakers. Propositional acts exploit this common ground. Essentially, they are speech acts that make use of presupposition in their meaning.

While Searle’s early work sought to explain speech acts by means of the rules of language, some of his later work applied his account of illocutionary force to investigations of intentionality (Searle, 1983). Much of this work involves reliance on a construct he refers to as Background. Background refers to a number of abilities, tendencies, and dispositions in people that are primarily unintentional. Searle’s work on speech acts has been extremely influential in how discourse researchers are able to talk analytically about the number of things that take place in a conversation. Speech acts have also been used in the experimental study of discourse to investigate the role of intentionality in conversation (Holtgraves, 2008).

Cooperative principle. Grice’s (1975) cooperative principle has been influential in understanding and talking about meaning and conversational interaction. The cooperative principle was introduced as a way to explain implicature. Implicature refers to the meaning of an utterance that is communicated beyond what has been said. Like Frege, Grice distinguished between two kinds of meaning, in his case sentence meaning and speaker meaning. He made use of the term conversational implicature to refer to speaker meaning in conversation.

Grice’s explanation for conversational implicature took the form of his cooperative principle and the four maxims he developed in concordance with it. The cooperative principle, roughly put, is the assumption participants in a conversation make that utterances are produced with the goal of contributing to the conversation at the right time and in the right way. The four maxims that accompany the cooperative principle are quantity, quality, relation, and manner. These refer to additional assumptions that participants make in a conversation: that contributions will contain the appropriate amount of information, information that is true, and information that is relevant, and that contributions will be put forth in an expected way.

As simple as they are, the cooperative principle and these four maxims are able to offer interesting explanations of implicature. Whenever the assumptions involved in the cooperative principle and the maxims are violated, implicatures arise. Grice refers to this as flouting a maxim. Consider an example of flouting the maxim of relation. Tim asks John how his date last night went and John responds with a question, “Did you get a chance to watch the news last night?” In this scenario, Tim does not take that question seriously but does take seriously the meaning that ensues from the flouting of the maxim of relation. John’s irrelevant response gave rise to the meaning that he did not want to talk about it. This conversational implicature arises as a result of both participants being aware of the intentional flouting of the maxim. Recent work in experimental pragmatics has investigated a Gricean view of conversation as suggested here, providing evidence to take these principles more seriously in how conversation is studied (Noveck & Reboul, 2008).

To recap up to this point, the contributions from the philosophy of language that have been reviewed here include theories of reference (Russell, 1905; Frege, 1952; Donnellan, 1966), speech acts (Searle, 1969), and the cooperative principle (Grice, 1975). There are three main considerations that are motivated by what has been discussed in this section. The first is that this early work has produced distinctions of meaning and developed ways of clearly talking about what happens in a conversation, influencing approaches to discourse processing. The second consideration is the contention over the role of intention and mutual knowledge in the theories of meaning developed here. This contention is still present among theorists today (e.g. Nadig & Sedivy, 2002; Horton & Gerrig, 2005; Holtgraves, 2008; Shintel & Keysar, 2009; Galati & Brennan, 2010).

A third consideration is an issue raised at the beginning of this section: what is the relation of conversation to natural language use? Among the views reviewed in this section, Grice (1975) is the only philosopher considered here who has made an explicit claim about the relation of conversation to natural language. Grice views conversation as a special instance of natural language use. Is a special instance enough to warrant a difference of degree?

Contributions from Pragmatics

Contributions from philosophy of language have influenced considerations about the nature of meaning and have provided a clear way to talk about meaning. The contributions made by pragmatics have influenced considerations about the nature of language use and how to regard the relation of conversation to the rest of language use. Pragmatics involves studying the mechanisms and processes that allow for more to be communicated than what is said (Green, 1996; Sperber & Noveck, 2004). Relevance theory (Sperber & Wilson, 1982) has had a large impact on how researchers in discourse processing have thought about what it is speakers are doing when they communicate (e.g. sharing mutual knowledge: Clark & Wilkes-Gibbs, 1986; Schober & Clark, 1989; Clark, 1996; aligning autonomous representations: Horton & Keysar, 1996; Keysar et al., 1998; Pickering & Garrod, 2004; Shintel & Keysar, 2007).

Relevance theory. Sperber and Wilson (1982) distinguish two ways in which a thought can be communicated from one person to another. One way is through explicit coding and decoding; the other is through interpretive inference. Relevance theory is a proposed explanation for communication that involves implicit inferences. Under this view, people arrive at what is called the presumption of relevance when they engage in inferential communication. The presumption of relevance is that implicit messages are relevant enough to be worth bothering to process and that speakers will be as economical as they possibly can be in communicating them. An overarching principle in this theory is the Cognitive Principle of Relevance. It represents the claim that human cognition tends to be geared to the maximization of relevance.

Relevance theory has been given much attention and has been successfully applied to a number of pragmatic problems (Sperber & Wilson, 2002). One assumption that relevance theory makes is that of a particularly Gricean view of pragmatic interpretation, in which inferences about speakers’ intentions, which they refer to as mind reading, are necessary for comprehension. This has been an influential theory for discourse processing and can still be thought to present a challenge to models of discourse that make opposite claims. A recent study by Holtgraves (2008) has demonstrated that the intention recognition suggested by relevance theory can occur automatically in conversation. This supports the notion of implicit inference suggested by relevance theory and can be seen to influence the selective attention that occurs in conversation.

At this point, the overall approach of this essay will shift from the philosophical and pragmatic issues that have influenced the study of discourse processing to return to the empirical work that underlies models of discourse. What is at stake in this effort is the claim that the problem of coordination is a matter of selective attention. One of the reasons previously mentioned for framing the problem of coordination as a matter of selective attention is to offer an explanation of coordination using non-specialized cognitive mechanisms as opposed to ones that are restricted only to conversation. The validity of this reason rests on the relation of conversation to natural language use. What has been shown so far is that work on referring (Russell, 1905; Frege, 1952; Donnellan, 1966) and speech acts (Searle, 1969) in natural language is not specific to conversation. The conversational interaction that Grice (1975) sought to explain does give reason to think that conversation is a special instance of natural language use. Relevance theory (Sperber & Wilson, 1982) also suggests that there is something unique to conversation by assuming that people draw inferences about intentional states when they engage in conversation. These issues have motivated the way the coordination problem was explored in the studies described in the first section. In the next section, the way models of discourse have applied these issues is discussed along with experimental evidence.

Models of Discourse

So far, we have seen that the coordination problem involves the processes that enable participants in a conversation to obtain mutual understanding. Two particular models are discussed that make drastically different assumptions about how to frame the coordination problem and even how to phrase it. Both of these models can be seen to rely on the role of selective attention in discourse processing. Recent evidence supporting a grounded view of cognition as well as the role of prediction in discourse motivates an emphasis on the role of selective attention in discourse processing.

The Collaborative Model

The collaborative model is advocated by Clark (1996). It emphasizes the role of interaction in discourse. Conversation is seen to unfold as a joint action whereby participants ground their utterances to conform to their common ground. Common ground serves to constrain meaning. Under this view, audience design (Isaacs & Clark, 1987) and perspective taking (Wilkes-Gibbs & Clark, 1992; Schober, 1993) play important roles in the partner-specific knowledge that is used to disambiguate references.

The collaborative model frames the coordination problem as a matter of the explicit representations used for mutual understanding. It can be seen to phrase the coordination problem as the need to explain what speakers are doing when they obtain mutual understanding. This is a call for explanation at a high level of abstraction. It explains discourse in light of what speakers are doing when they interact. Given this, the collaborative model is seen as an explanation of how speakers are explicitly building common ground in conversation and how they rely on this common ground to both produce and comprehend utterances.

Criticisms of the collaborative model have typically involved the issue of processing constraints. It has been argued that the representational processing of something like common ground is too costly (Horton & Keysar, 1996; Keysar, Barr, Balin, & Paek, 1998; Keysar, Barr, Balin, & Brauner, 2000; Bard, Anderson, Chen, Nicholson, Havard, & Dalzel-Job, 2007). This view notes that the level of representation in common ground could be used to constrain meaning, but this type of inferential process is costly and is only relied on when there is evidence of miscommunication (Horton & Keysar, 1996). It is argued that the joint action that is involved in conversation can be constrained by cues involved in the nature of interaction without reliance on such inferential processes (Shintel & Keysar, 2009).

The Interactive Alignment Model

In light of the processing constraints involved in conversation, Pickering and Garrod (2004) offered a model of discourse that describes the interactive alignment of representation. They define the problem of coordination as a matter of alignment of representation. Explaining how speakers effectively communicate is pursued by explaining the mechanism that allows them to obtain mutual understanding as opposed to what it is they do when they obtain mutual understanding.

Pickering and Garrod (2004) describe a priming mechanism that allows speakers to align their representations at multiple levels. Priming occurs at the level of phonetic representation, lexical representation, semantic representation, and the situational models that speakers employ in conversation. Evidence of semantic priming (Nicol & Pickering, 1993), syntactic priming (Pickering & Branigan, 1999), and structural priming (Pickering & Ferreira, 2008) is taken to support a priming mechanism as the principal means by which mutual understanding is obtained.

Two assumptions of this model are worth noting: that the mechanisms of conversation are distinct from the mechanisms of other natural language use, and that interlocutors make use of implicit common ground. The first assumption comes from how Pickering and Garrod define the priming mechanism. They argue that this mechanism evolved primarily for use in conversation and is therefore best suited to dialogue. Monologue is taken to rely on other means of inference since priming between interlocutors, by definition, cannot occur.

The second assumption concerns their distinction between implicit and explicit common ground. The automatic alignment that occurs at each level is described as producing implicit common ground between interlocutors. When utterances are misaligned, that is, when the implicit common ground fails, the priming mechanism is set aside and replaced by an interactive repair mechanism. They argue that this repair mechanism involves an explicit check of the ambiguous utterance against one’s own representation; if that check does not produce alignment, speakers iteratively rephrase utterances to restore their implicit common ground.

These two assumptions are the major source of criticism of the interactive alignment model. Barr and Keysar (2004) question whether language processing is different in dialogue. They argue that the difference between monologue and dialogue may be more a difference in degree than the categorical distinction that can be seen in the interactive alignment model. Although Pickering and Garrod (2004) deny that their proposal makes a categorical distinction, the claim that monologue and dialogue differ in degree in the ease of coordination is an empirical one that warrants direct testing. They offer evolutionary reasons why this difference should occur (Garrod & Pickering, 2004), but the explanation should ultimately rest on experimental studies like the ones described earlier in this essay.

Regarding the second assumption, is it reasonable to propose a priming mechanism at all? The interactive alignment model proposes a priming mechanism to explain the problem of coordination. Treating priming as a mechanism in its own right is quite different from the term's initial use. In its early use in psychology, priming referred to a reduction in response time observed in a controlled experimental setting. It was a descriptive term for a certain kind of observed effect. Priming was understood as a consequence of the mechanisms or processes underlying word recognition, not as the mechanism itself.

Moreover, by regarding priming as a mechanism itself, Pickering and Garrod (2004) take a step away from their own call for more formal approaches to the mechanisms underlying discourse processing. A computational expression of a priming mechanism like the one assumed by the interactive alignment model would be difficult, if not impossible, to implement within a formal model of experimental data. This is because priming is assumed to be the central mechanism of dialogue but not of monologue. It plays a central role at every level of representation, yet it is specific to representations activated during dialogue. Constraining such a mechanism so that it can be built into a cognitive model that makes formal predictions about behavioral data is intractable. A more tractable assumption is that priming is an emergent property of the processes underlying dialogue (Brown-Schmidt & Tanenhaus, 2004).

Role of Selective Attention in Discourse

The two models make different predictions about what is involved when speakers reach mutual understanding, whether through common ground or through a priming mechanism. Both, however, can be taken to involve the way interlocutors selectively attend to certain elements in discourse. In a sense, selective attention is the common ground between the two models: both put forth explanations of the principles that govern selective attention in discourse. Studying the role of selective attention in discourse may therefore be the way to pursue a more direct evaluation of these models.

What makes it difficult to compare these two models directly is that they offer explanations at different levels of abstraction. The collaborative model explains the coordination problem through the explicit processes speakers engage in. The interactive alignment model explains the coordination problem through the implicit priming mechanism speakers use. Modeling at mismatched levels in this way hinders the kind of explanation that cognitive science aims for. What interlocutors are doing is evident in the problem of coordination itself: they are exercising their natural ability to coordinate. An approach more suitable to cognitive science is to focus on the causally enabling conditions that give rise to people's natural capacity to coordinate their language use. Instead of trying to offer explanations that define what language users do, cognitive explanations should aim to discover what enables language users to do what they do.

The interactive alignment model is a step towards formalism in approaches to discourse processing. However, it posits underspecified mechanisms that make it difficult to relate what speakers are doing when they collaborate to what enables them to do so. The collaborative model is a good instance of an attempt to explain what speakers are doing when they coordinate, but it likewise does not posit well-specified mechanisms to account for how they do it. What both models share is an appeal to how speakers selectively attend to certain aspects of discourse. The next section makes the case for focusing on selective attention in discourse modeling. Recent evidence for the role of prediction in discourse, along with grounded views of cognition, supports an approach to studying the coordination problem as a matter of selective attention.

Reframing the Problem of Coordination

The main assumption in this essay is that the problem of coordination is a problem of attention. Coordination in discourse involves constraining the set of possible meanings to achieve mutual understanding. This constraining of possible meanings reflects people's natural ability to selectively attend to certain things. An approach to coordination as a problem of attention seeks to apply the cognitive formalisms used to explain the processes and mechanisms underlying attention to model the coordination observed in discourse processing.

An assumption of this view is that the same general principles that govern attention in abilities studied across cognitive science, such as perceptual learning, categorization, and causal learning, also govern attention in discourse processing. This is by no means a novel premise (e.g., Shintel & Keysar, 2009). The novelty is in the conclusion that the main approach used in the areas of cognitive science just mentioned can be applied to the study of coordination in discourse processing by focusing on attention. The formalisms that express the unifying principles of attention in these areas can be applied to conversational coordination by reconciling experimentation with modeling. This does not require the establishment of a new experimental paradigm. It requires the collection of data that are better suited for modeling.

There are two sources of evidence to support this shift in approach. The first is a number of recent studies that indicate a prominent role for prediction in discourse. The second is a number of recent studies that support a grounded view of cognition. The interest in studying selective attention in discourse lies in describing the underlying principles of prediction in accordance with perceptual experience.

Prediction in Discourse

Recently, evidence has been accumulating that the coordination observed in discourse is coupled with how speakers predict and anticipate one another’s responses (e.g., Shintel & Keysar, 2007; Pickering & Garrod, 2009). A few of these studies are discussed here with emphasis on what they suggest about the role of selective attention in discourse processing. In light of this consideration, the issue of partner-specific knowledge will be returned to as an influencing factor in selective attention.

Shintel and Keysar (2007) take up the issue of partner-specific knowledge in observations of lexical entrainment and conceptual pacts. They explain these phenomena as an aspect of the expectations interlocutors bring to conversation. Specifically, they explain that expectations of simplicity guide interlocutors to minimize their references and to keep those references consistent. These expectations can be seen as influencing interlocutors to selectively attend to certain references, which in turn influences their predictions about likely references.

Grounded Cognition and Prediction

A different emphasis on the role of prediction in comprehension comes from views that attempt to incorporate action systems into perceptual systems. Pickering and Garrod (2009; 2010) propose that the production systems of language facilitate comprehension through what they describe as emulators. These production-based emulators serve to simulate meanings that are evoked in conversation in order to make predictions. This “thinking for speaking” notion has been seen before as an idea about how production and comprehension are coupled (Slobin, 1996). This view attempts to demonstrate the integration of multiple cognitive systems in comprehension.

Further support for the role of prediction in comprehension comes from a growing amount of evidence that suggests a grounded view of cognition. Barsalou (2008) reviews some of this evidence, pointing out that there is little reason to suspect that amodal representation plays as large a role in cognition as previously thought. More and more evidence from perceptual learning tasks suggests that modal representations and simulation play a large role in the processes involved in those tasks. These grounded processes can be taken to further suggest that simulation may be a large part of what guides prediction in discourse. This also suggests that cues present in conversation are used to simulate possible meanings, resulting in early disambiguation. Studies that have investigated the time course of disambiguation demonstrate that this kind of selective attention is fueled by cues from partner-specific information and that this occurs early in processing (Hanna & Brennan, 2007; Galati & Brennan, 2010).

Overall, what has been suggested here is that selective attention in discourse involves predictions made during discourse. While the evidence described does not directly suggest selective attention as the primary mechanism of coordination, it allows the coordination problem to be framed in a way that lets formal models enter the picture. Framing the coordination problem as a matter of selective attention allows formal expressions of the principles underlying coordination to be used to investigate discourse.

Bayesian models of language comprehension have emerged that already attempt to look at part of this problem. Griffiths, Steyvers, and Tenenbaum (2007) have taken up the topic of semantic representation from a Bayesian perspective, showing how adjusting beliefs about predicted lexical items on the basis of previous experience with lexical items can capture certain aspects of language comprehension. Selective attention is a topic that has been well studied and implemented in formal cognitive models (Kruschke, 2010). While several aspects of moving in this direction in discourse processing warrant further investigation, the role of prediction in comprehension is suggestive of Bayesian inference being involved in the cognitive processes underlying comprehension in discourse.
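
To make the idea of belief adjustment concrete, the following is a minimal sketch of Bayesian updating over candidate referents as cues arrive in conversation. It is an illustration only and is not the model of Griffiths, Steyvers, and Tenenbaum (2007) or of any other cited work. The referent names, the two cues, and every probability value are hypothetical assumptions chosen for the example.

```python
from typing import Dict


def update_beliefs(prior: Dict[str, float],
                   likelihood: Dict[str, float]) -> Dict[str, float]:
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {ref: prior[ref] * likelihood.get(ref, 0.0) for ref in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("The cue rules out every candidate referent.")
    return {ref: weight / total for ref, weight in unnormalized.items()}


# Hypothetical candidate referents in a shared display, with a uniform prior.
prior = {"candle": 1 / 3, "candy": 1 / 3, "cup": 1 / 3}

# Hypothetical P(cue | referent) for hearing the word onset "can...".
onset_cue = {"candle": 0.9, "candy": 0.9, "cup": 0.05}

# Hypothetical P(cue | referent) for partner-specific information:
# the partner has already referred to the candle earlier in the conversation.
partner_cue = {"candle": 0.8, "candy": 0.2, "cup": 0.2}

# Each cue reweights the candidates; the posterior after one cue
# serves as the prior for the next.
beliefs = update_beliefs(prior, onset_cue)
beliefs = update_beliefs(beliefs, partner_cue)

for referent, probability in sorted(beliefs.items(), key=lambda item: -item[1]):
    print(f"{referent}: {probability:.3f}")
```

In this sketch the posterior plays the role of an attention distribution: each cue shifts probability toward some referents and away from others, which is one simple way to express early, partner-specific disambiguation in formal terms. Whether attention in discourse is best modeled this way is an open empirical question.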

Discussion

Reframing the coordination problem as a problem of selective attention opens the possibility of more formal expressions of the principles of coordination in natural language use. These formal expressions could then be instantiated as cognitive models used to generate predictions about the coordinated comprehension observed in experimental studies of discourse. This level of explicit theorizing would bring about the synergistic feedback loop between modeling and experimentation currently called for in the field (e.g., Pickering & Garrod, 2004; Brown-Schmidt & Tanenhaus, 2004) and already enjoyed by most areas of cognitive science today (McClelland, 2009). Moreover, an emphasis on the role of selective attention in discourse processing would allow for more integration between theories of discourse processing and theories of other cognitive capacities such as perceptual learning, categorization, and causal learning, to name a few. This sort of integration is at the heart of the multidisciplinary nature of cognitive science and is in keeping with its aim to explain what gives rise to intelligent behavior.

[1] Throughout this review, the word coordination is used in its conventional sense. To avoid ambiguity, all uses of the word should be read as referring specifically to the coordination that is observed in natural language use unless otherwise noted. This should not be taken as a theoretical move to claim that the underlying principles that govern coordination in discourse are divorced from coordination that occurs outside of language use. In fact, the opposite is suggested. However, it is beyond the scope of this work to provide a synthesis between experimental data from coordination in language use and behavioral data from coordination outside of language use, though it is an interesting and relevant direction that I will take in future work.