
Tuesday, January 31, 2017

The first two concern a power play by Elsevier that looks serious. The publisher appears to be heading into a big fight with several countries over access to its journals. Nature reports that Germany, Taiwan and Peru will soon have an Elsevier embargo placed on them, the journals that it publishes no longer available to scientists in these countries. This seems to me a big deal, and I suspect that it will be a turning point in open access publishing. However big Elsevier is, were I their consigliere, I would counsel against getting into fights with countries, especially ones with big scientific establishments.

There is more, in fact. It seems that Elsevier is also developing its own impact numbers, ones that make its journals look better than the other metrics do (see here). Here's one great quote from the link: "seeing a publisher developing its own metrics sounds about as appropriate as Breitbart news starting an ethical index for fake news."

Embargoing countries and setting up one's own impact metric: seems like fun times.

Here is a second link that I point to just for laughs and because I am a terrible technophobe. Many of my colleagues are LaTeX fans. A recent paper suggests that, whatever its other virtues, LaTeX is a bit of a time sink. Take a look. Here's the synopsis:

To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors. On most measures, expert LaTeX users performed even worse than novice Word users. LaTeX users, however, more often report enjoying using their respective software.

Here’s a small addition to the previous post prompted by a
discussion with Paul Pietroski. I am always going on about how focusing on
recursion in general as the defining
property of FL is misleading. The interesting feature of FL is not that it
produces (or can produce) recursive Gs but that it produces the kinds of recursive Gs that it does. So
the minimalist project is not to explain how recursion arises in humans but how
a specific kind of recursion arises
in the species. What kind? Well the kind we find in the Gs we find. What kind
are these? Well not FSGs nor CFGs, or at least this is what Syntactic Structures (SS) argues.

Let me put this another way: GG has spent the last 60 years
establishing that human Gs have a certain kind of recursive structure. SS
argued for a transformational grammar on the grounds that FSGs (which were
recursive) were inherently too weak and that PSGs (also recursive) were
empirically inadequate. Transformational Gs, SS argued, are the right fit.
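To make "recursive G" a bit more concrete, here is a toy sketch (my own illustration, in Python; the grammar, vocabulary, and function names are invented and not drawn from SS): a finite set of rewrite rules in which a category reappears inside its own expansion, so the finite rule set generates an unbounded set of sentences.

```python
import random

# A toy context-free grammar with a recursive rule (S reappears inside VP),
# illustrating the abstract point: a *finite* rule set generating an
# *unbounded* set of sentences. Invented for illustration only.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["John"], ["Mary"]],
    "VP": [["sleeps"], ["thinks", "that", "S"]],  # recursion: VP -> ... S
}

def generate(symbol="S", depth=0, max_depth=3):
    """Expand a symbol; cap recursion depth so any single output is finite."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        # past the cap, prefer non-recursive expansions
        options = [o for o in options if "S" not in o] or options
    words = []
    for sym in random.choice(options):
        words.extend(generate(sym, depth + 1, max_depth))
    return words

print(" ".join(generate()))
# e.g. "Mary thinks that John sleeps"
```

Raising `max_depth` yields ever longer embeddings ("John thinks that Mary thinks that..."), which is all that "recursive" is doing in the argument above; the substantive question SS presses is what *kind* of rules beyond this minimum human Gs require.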

So, when people claim that the minimalist problem is to
explain the sources of recursion or observe that there may be/is recursion in
other parts of cognition thereby claiming to “falsify” the project, it seems to
me that they are barking up a red herring (I love the smell of mixed metaphors
in the morning!). From where I sit, the problem is explaining how an FL that
delivers TGish recursive Gs arose, as this is the kind of FL that we have and the
kinds of Gs that it delivers. SS makes clear that, even in the earliest days of GG,
not all recursive Gs were created equal and that the FL and Gs of interest have
specific properties. It’s the sources for this
kind of recursion we want to explain. This is worth bearing in mind when
issues of recursion (and its place in minimalist theory) make it to the
spotlight.

Friday, January 27, 2017

I am in the process of co-editing a volume on Syntactic Structures (SS) that is due
out in 2017 to celebrate (aka, exploit) the 60th anniversary of the
publication of this seminal work. I am part of a gang of four (the other culprits
being Howard Lasnik, Pritti Patel, Charles Yang supervised/inspired by Norbert
Corver). We have solicited about 15 shortish pieces on various themes. The
tentative title is something like The
continuing relevance of Syntactic Structures to GG. Look for it on your
newsstands sometime late next year. It should arrive just in time for the 2017
holidays and is sure to be a great Xmas/Hanukkah/Kwanzaa gift. As preparation
for this editorial escapade, I have re-read SS several times and have tried to
figure out for myself what its lasting contribution is. Clearly it is an
important historical piece as it
sparked the Generative Enterprise. The question remains: What SS ideas have current relevance? Let me mention five.

The first and most important idea centers on the aims of
linguistic theory (ch 6). SS contrasts the study of grammatical form and the particular internal
(empirically to be determined) “simplicity” principles that inform it with
discovery procedures that are “practical and mechanical” (56) methods that “an investigator (my emph, NH) might
actually use, if he had the time, to construct a grammar of the language
directly from the raw data” (52). SS argues that a commitment to discovery
procedures leads to strictures on grammatical
analysis (e.g. bans on level mixing) that are methodologically and
empirically dubious.

The discussion in SS is reminiscent of the well-known
distinction in the philo of science between the context of discovery and the
context of justification. How one finds one’s theory can be idiosyncratic and
serendipitous; justifying one’s “choice” is another matter entirely. SS makes
the same point.[1]
It proposes a methodology of research in which grammatical argumentation is
more or less the standard of reasoning in the sciences more generally: data plus
general considerations of simplicity are deployed to argue for the superiority
of one analysis over another. SS contrasts this with the far stronger strictures
Structuralists endorsed, principles which if seriously practiced would sink
most any serious science. In practice then, what SS is calling for is that
linguists act like regular scientists (in modern parlance, reject
methodological dualism).

Let me be a bit more specific. The structuralism that SS was
arguing against took as a methodological dictum that the aim of analysis was to
classify a corpus into a hierarchy of categories conditioned by substitution
criteria. So understood, grammatical categories are classes of words, which are
definable as classes of morphemes, which are definable as classes of phonemes,
which are definable as classes of phones. The higher levels are, in effect,
simple generalizations over lower
level entities. The thought was that higher level categories were entirely
reducible to lower level distributional patterns. In this sort of analysis,
there are no (and can be no) theoretical entities, in the sense of real
abstract constructs that have empirical consequences but are not reducible or
definable in purely observational terms. By arguing against discovery
procedures and in favor of evaluation metrics SS is in effect arguing for the
legitimacy of theoretical
linguistics. Or, more accurately, for the legitimacy of normal scientific inquiry into language without methodological
constrictions that would cripple physics were they applied to it.

Let me put this another way: Structuralism adopted a strong
Empiricist methodology in which theory was effectively a summary of
observables. SS argues for the Rationalist conception of inquiry in which
theory must make contact with observables, but is not (and cannot be) reduced
to them. Given that the Rationalist
stance simply reflects common scientific practice, SS is a call for linguists to
start treating language scientifically and not hamstring inquiry by adopting
unrealistic, indeed non-scientific, dicta. This is why SS (and GG) is
reasonably seen as the start of the modern science
of linguistics.

Note that the discussion here in SS differs substantially
from that in chapter 1 of Aspects,
though there are important points of contact.[2]
SS is Rationalist as concerns the research methodology of linguists. Aspects is Rationalist as concerns the
structure of human mind/brains. The former concerns research methodology. The
latter concerns substantive claims about human neuro-psychology.

That said, there are obvious points of contact. For example,
if discovery procedures fail methodologically, then this strongly suggests that
they will also fail as theories of linguistic mental structures. Syntax, for
example, is not reducible to
properties of sound and/or meaning despite its having observable consequences
for both. In other words, the Autonomy of Syntax thesis is just a step away
from the rejection of discovery procedures. It amounts to the claim that syntax
constitutes a viable G level that is not reducible to the primitives and
operations of any other G level.

To beat this horse good and dead: Gs contain distinct levels
that interact with empirically evaluable consequences, but they are not
organized so that higher levels are definable in terms of generalizations over
lower level entities. Syntax is real. Phonology is real. Semantics is real.
Phonetics is real. These levels have their own primitives and principles of
operation. The levels interact, but are ontologically autonomous. Given the
modern obsession with deep learning and its implicit endorsement of discovery
procedures, this point is worth reiterating and keeping in mind. The idea that
Gs are just generalizations over generalizations over generalizations, which
seems to be the working hypothesis of Deep Learners and others,[3]
has a wide following nowadays, so it is worth recalling the SS lesson that
discovery procedures both don’t work and are fundamentally anti-theoretical. It
is Empiricism run statistically amok!

Let me add one more point and then move on. How should we
understand the SS discussion of discovery procedures from an Aspects perspective given that they are
not making the same point? Put more pointedly, don’t we want to understand how
a LAD (aka, kid) goes from PLD (a corpus) to a G? Isn’t this the aim of GG
research? And wouldn’t such a function be
a discovery procedure?

Here’s what I think: Yes and no. What I mean is that SS makes
a distinction that is important to still keep in mind. Principles of FL/UG are
not themselves sufficient to explain how LADs acquire Gs. More is required. Here’s
a quote from SS (56):

Our ultimate aim is to provide an
objective, non-intuitive way to evaluate a grammar once presented, and to
compare it with other proposed grammars. We are thus interested in describing
the form of grammars (equivalently, the nature of linguistic structure) and
investigating the empirical consequences of adopting a certain model for
linguistic structure, rather than showing how, in principle, one might have
arrived at the grammar of a language.

Put in slightly more modern terms: finding FL/UG does not by
itself provide a theory of how the LAD actually acquires a G. More is needed.
Among other things, we need accounts of how we find phonemes, and morphemes and
many of the other units of analysis the levels require. The full theory will be
very complex, with lots of interacting parts. Many mental modules will no doubt
be involved. Recognizing that there is a peculiarly linguistic component to this story does not require forgetting that
it is not the whole story. SS makes this very clear. However, focusing on the
larger problem often leads to ignoring the fundamental linguistic aspects of
the problem, what SS calls the internal conditions on adequacy, many/some of
which will be linguistically proprietary.[4]

So, the most important contribution of SS is that it
launched the modern science of linguistics by arguing against discovery
procedures (i.e. methodological dualism). And sadly, the ground that SS should
have cleared is once again infested. Hence, the continuing relevance of the SS
message.

Here are four more ideas of continuing relevance.

First, SS shows that speaker intuitions are a legitimate
source of linguistic data. The discussions of G adequacy in the first several
chapters are all framed in terms of what speakers know about sentences. Indeed,
that Gs are models of human linguistic behavior over an unbounded domain is quite explicit (15):

…a grammar mirrors the
behavior of speakers who, on the basis of a finite and accidental experience
with language, can produce or understand an indefinite number of new sentences.
Indeed, any explication of “grammatical in L” …can be thought of as offering an
explanation for this fundamental aspect of linguistic behavior.

Most of the data presented for choosing one form of G over
another involves plumbing a native speaker’s sense of what is and isn’t natural
for his/her language. SS has an elaborate discussion of this in chapter 8 where
the virtues of “constructional homonymity” (86) as probes of grammatical
adequacy are elaborated. Natural languages are replete with sentences that have
the same phonological form but differ thematically (flying planes can be dangerous) or that have different phonological
forms but are thematically quite similar (John
hugged Mary, Mary was hugged by John).
As SS notes (83): “It is reasonable to expect grammars to provide explanations
for some of these facts” and for theories of grammar to be evaluated in terms
of their ability to handle them.

It is worth noting that the relevance of constructional
homonymity to “debates” about structure dependence has been recently
highlighted once again in a paper by Berwick, Pietroski, Yankama and Chomsky (see here
and here
for discussion). It appears that too many forget that linguistic facts go
beyond the observation that “such and such a string…is or is not a sentence”
(85). SS warns against forgetting this, and the world would be a better place
(or at least dumb critiques of GG would be less thick on the ground) if this
warning 60 years ago had been heeded.

Second, SS identifies the central problem of linguistics as
how to relate sound and meaning (the latter being more specifically thematic
roles (though this term is not used)). This places Gs and their structure at
the center of the enterprise. Indeed, this is what makes constructional
homonymity such an interesting probe into the structure of Gs. There is an
unbounded number of these pairings and the rules that pair them (i.e. Gs) are
not “visible.” This means the central problem in linguistics is determining the
structure of these abstract Gs by examining their products. Most of SS exhibits
how to do this and the central arguments in favor of adding transformations to
the inventory of syntactic operations involve noting how transformational
grammars accommodate such data in simple and natural ways.

This brings us to the third lasting contribution of SS. It
makes a particular proposal concerning the kind
of G natural languages embody. The right G involves Transformations (T). Finite
State Gs don’t cut it, nor can simple context free PSGs. T-grammars are
required. The argument against PSGs
is particularly important. It is not that they cannot generate the right
structures but that they cannot do so in
the right way, capturing the evident generalizations that Gs embodying Ts
can do.
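The finite-state half of this argument has a standard formal stand-in: the nested, mirror-image dependencies (if...then, either...or) that SS appeals to cannot be tracked by a device with fixed finite memory, whereas a single recursive rule handles them directly. A minimal sketch (my own illustration, not from SS) recognizes the schematic language aⁿbⁿ, the textbook proxy for such nesting:

```python
# Recursive-descent recognizer for {a^n b^n : n >= 0}, the schematic
# stand-in for nested dependencies of the if...then / either...or sort.
# A finite-state device has bounded memory and so cannot match
# unboundedly many a's with equally many b's; the one recursive rule
# S -> 'a' S 'b' | '' does it directly.
def accepts(s: str) -> bool:
    """True iff s is a^n b^n, mirroring the rule S -> 'a' S 'b' | ''."""
    if s == "":
        return True  # base case: S -> ''
    # recursive case: peel a matched a...b pair off the edges
    return s.startswith("a") and s.endswith("b") and accepts(s[1:-1])

print(accepts("aaabbb"))  # True
print(accepts("aabbb"))   # False
```

The point of the sketch is only the shape of the rule: the recognizer's memory demand grows with the nesting depth, which is exactly what no fixed finite-state machine can supply.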

Isolating Ts as grammatically central operations sets the
stage for the next 50 years of inquiry: specifying the kinds of Ts required and
figuring out how to limit them so that they don’t wildly overgenerate.

SS also proposes the model that until very recently was at
the core of every GG account. Gs contained a PSG component that generated
kernel sentences (which effectively specified thematic dependencies) and a T
component that created further structures from these inputs. Minimalism has
partially stuck to this conception. Though it has (or some versions have)
collapsed PSG kinds of rules and T rules treating both as instances of Merge,
minimalist theories have largely retained the distinction between operations
that build thematic structure and those that do everything else. So, even
though Ts and PSG rules are formally the same, thematic information (roughly
the info carried by kernel sentences in SS) is the province of E-merge
applications and everything else the province of I-merge applications. The
divide between thematic information and all other kinds of semantic information
(aka the duality of interpretation) has thus been preserved in most modern accounts.[5]

Last, SS identifies two different linguistic problems:
finding a G for a particular L and finding a theory of Gs for arbitrary L. This
can also be seen as explicating the notions “grammatical in L” for a given
language L vs the notion of “grammatical” tout court. This important
distinction survives to the present as the difference between Gs and FL/UG. SS
makes it clear (at least to me) that the study of the notion grammatical in L is interesting to the
degree that it serves to illuminate the more general notion grammatical for arbitrary L (i.e. Gs are
interesting to the degree that they illuminate the structure of FL/UG). As a practical
matter, the best route into the more general notion proceeds (at least
initially) via the study of the properties of individual Gs. However, SS warns
against thinking that a proper study of the more general notion must await the
development of fully adequate accounts of the more specific.

Indeed, I would go further. The idea that investigations of
the more general notion (e.g. of FL/UG) are parasitic on (and secondary to)
establishing solid language particular Gs is to treat the more general notion
(UG) as the summary (or generalization) of the properties of individual Gs. In
other words, it is to treat UG as if it were a kind of Structuralist level,
reducible to the properties of individual Gs. But if one rejects this
conception, as the SS discussion of levels and discovery procedures suggests we
should, then prioritizing G facts and investigation over UG considerations is a
bad way to go.

I suspect that the above conclusion is widely appreciated in
the GG community with only those committed to a Greenbergian conception of
Universals dissenting. However, the logic carries over to modern minimalist
investigation as well. The animus against minimalist theorizing can, IMO, be
understood as reflecting the view that such airy speculation must play second
fiddle to real linguistic (i.e. G
based) investigations. SS reminds us that the hard problem is the abstract one
and that this is the prize we need to focus on, and that it will not just solve itself if we concentrate on
the “lower” level issues. This would hold true if the world were fundamentally Structuralist,
with higher levels of analysis just being generalizations of lower levels. But
SS argues repeatedly that this is not
right. It is a message that we should continue to rehearse.

Ok, that’s it for now. SS is chock full of other great bits
and the collection we are editing will, I am confident, bring them out. Till
then, let me urge you to (re)read SS and report back on your favorite parts. It is excellent holiday
reading, especially if read responsively accompanied by some good wine.

[1]
What follows uses the very helpful and clear discussion of these matters by
John Collins (here):
26-7.

[2]
Indeed, the view in Aspects is
clearly prefigured in SS, though it is not as highlighted there as it is later on
(see discussion p. 15).

…a grammar mirrors the behavior
of speakers who, on the basis of a finite and accidental experience with
language, can produce or understand an indefinite number of new sentences.
Indeed, any explication of “grammatical in L” …can be thought of as offering an
explanation for this fundamental aspect of linguistic behavior.

[3]
Elissa Newport’s work seems to be in much the same vein in treating everything
as probability distributions over lower level entities bottoming out in
something like syllables or phones.

[4]
Of course, the ever hopeful minimalist will hope that not very much will be
such.

[5]
I would be remiss if I did not point out that this is precisely the assumption
that the movement theory of control rejects.

Wednesday, January 25, 2017

I was reading the latest (February 9, 2017) issue of the NYR, whose lead article is a review of The Revenge of the Analog (by David Sax). The reviewer is Bill McKibben, and the piece discusses a mini revolt against the digital world that is apparently taking place. I have no personal experience of this, but the article is a fun read (here). There is one point it makes that I found interesting and that comports with my own impression: MOOCs have disappeared. There was a time when that's all we heard about, at least in academia: how MOOCs would soon displace all standard teaching, and how everyone had to find a way of translating their courses into web-accessible/MOOC form or be left far behind, the detritus of the forward march of pedagogy. This coincided with the view that education would be saved if only every student could be equipped with a tablet serving as a conduit to the endless world of information out there. MOOCs were the leading edge of a technological revolution that would revamp our conception of education at all levels beyond recognition.

I was very skeptical of this at the time (see, e.g. here and here). Sax tries to explain what went wrong. Here's McKibben on Sax.

The notion of imagination and human connection as analog virtues comes across most powerfully in Sax’s discussion of education. Nothing has appealed to digital zealots as much as the idea of “transforming” our education systems with all manner of gadgetry. The “ed tech” market swells constantly, as more school systems hand out iPads or virtual-reality goggles; one of the earliest noble causes of the digerati was the One Laptop Per Child global initiative, led by MIT’s Nicholas Negroponte, a Garibaldi of the Internet age. The OLPC crew raised stupendous amounts of money and created machines that could run on solar power or could be cranked by hand, and they distributed them to poor children around the developing world, but alas, according to Sax, “academic studies demonstrated no gain in academic achievement.” Last year, in fact, the OECD reported that “students who use computers very frequently at school do a lot worse in most learning outcomes.” At the other end of the educational spectrum from African villages, the most prestigious universities on earth have been busy putting courses on the Web and building MOOCs, “massive open online courses.” Sax misses the scattered successes of these ventures, often courses in computer programming or other technical subjects that aren’t otherwise available in much of the developing world. But he’s right that many of these classes have failed to engage the students who sign up, most of whom drop out. Even those who stay the course “perform worse, and learn less, than [their] peers who are sitting in a school listening to a teacher talking in front of a blackboard.” Why this is so is relatively easy to figure out: technologists think of teaching as a delivery system for information, one that can and should be profitably streamlined. But actual teaching isn’t about information delivery—it’s a relationship.
As one Stanford professor who watched the MOOCs expensively tank puts it, “A teacher has a relationship with a group of students. It is those independent relationships that is the basis of learning. Period.”

The diagnosis fits with my perceptions as well. One of the problems MOOCs would always face, IMO, is that they left out how social an activity teaching and learning is. This is not so for everything, but it is so for many things, especially non-technical things. The problem is not the efficient transfer of information, but figuring out how to awaken the imaginative and critical sensibilities of students. (Note: these are harder to "test" than is info transfer.) The MOOC conception treated students as "consumers" rather than "initiates." Ideas are not techniques. They need a different kind of exploration. Indeed, half of teaching is getting students to appreciate why something is important, and that comes from the personal relations established between teacher and student, and among the students themselves. This, at any rate, is Sax's view, and if he is right then the failure of MOOCs as a general strategy for education makes sense. Or, at least, this would be a good reason for their failure.

BTW, there is a nice version of the relevant distinction in the movie The Prime of Miss Jean Brodie, where Brodie notes that 'education' comes from the Latin 'ex ducare' (to lead out), whereas much of what goes on is better seen as 'intrusion,' as in 'thrust in.' Information can be crammed, thrust, delivered. Education must be more carefully curated.

Like I said, I wish this were the reason for the end of MOOCs, but I doubt it. What probably killed them (if they are indeed dead) was likely their cost. They did not really save any money, though they shifted cash from one set of pockets to another. In other words, they were scam-like and the fad has run its day. Will it return? Almost certainly. We just need a new technological twist to allow for repackaging of the stuff. Maybe when virtual reality is more prevalent it will serve as the new MOOC platform and we will see the hysteria/hype rise again. There is no end to technological utopias because of their sales value. So expect a rise of MOOCs or MOOC equivalents some time soon at a university near you. But don't expect much more the next time around.

BTW, the stuff on board games was fascinating. Is this really a thing?

Friday, January 20, 2017

Dan Everett (DE) has written once again on his views about
Piraha, recursion, and the implications for Universal Grammar (here).
I was strongly tempted to avoid posting on it, for it adds nothing new of
substance to the discussion beyond a healthy dose of self-pity and
self-aggrandizement (and it will almost certainly serve to keep the silliness alive).
It makes the same mistakes, in almost the same way, and adds a few more
irrelevancies to the mix. If history surfaces first as tragedy and the second
time as farce (see here)
then pseudo-debates in their moth-eaten nth iteration are just pathetic. The
Piraha “debate” has long since passed its sell-by date. As I’ve said all that I
am about to say before, I would urge you not
to expend time or energy reading this. But if you are the kind of person who
slows down to rubberneck a wreck on the road and can’t help but find the ghoulish
fascinating, this post is for you.

The DE piece makes several points.

First, that there is a debate. As you all know this is
wrong. There can be no debate if the controversy hinges on an equivocation. And
it does, for what the DE piece claims about the G of Piraha, even if completely accurate (which I
doubt, but the facts are beyond my expertise) has no bearing on Chomsky’s
proposal, viz. that recursion is the only distinctively linguistic feature of FL.
This is a logical point, not an
empirical one. More exactly, the controversy rests on an equivocation
concerning the notion “universal.” The equivocation has been a consistent
feature of DE’s discussions and this piece is no different. Let me once again
explain the logic.

Chomsky’s proposal rests on a few observations. First, that
humans display linguistic creativity. Second, that humans are only accidentally
native speakers of their native languages.

The first observation is manifest in the fact that, for
example, a native speaker of English can effortlessly use and understand an
unbounded number of linguistic expressions never before encountered. The second
is manifest in the observation that a child deposited in any linguistic
community will grow up to be a linguistically competent native speaker of that
language with linguistic capacities indistinguishable from any of the other
native speakers (e.g. wrt his/her linguistic creativity).

These two observations prompt some questions.

First, what underlying mental architecture is required to
allow for the linguistic creativity we find in humans? Answer 1: a mind that has recursive rules able
to generate ever more sophisticated expressions from simple building blocks
(aka, a G). Question 2: what kind of mental architecture must such a
G-competent being have? Answer 2: a mind that can acquire recursive rules (i.e. a
G) from products of those rules (i.e. generated examples of the G). Why
recursive rules? Because linguistic productivity just names the fact that human
speakers are competent with respect to an unbounded number of different
linguistic expressions.

Second, why assume that the capacity to acquire recursive Gs
is a feature of human minds in general
rather than simply a feature of those human minds that have actually acquired
recursive Gs? Answer: Because any
human can acquire any G that
generates any language. So the capacity to acquire language in general requires the meta-capacity to
acquire recursive rule systems (aka, Gs). As this meta-capacity seems to be restricted to humans (i.e. so far as
we know only humans display the kind
of recursive capacity manifested in linguistic creativity) and as this capacity
is most clearly manifest in language then Chomsky’s conjecture is that if there is anything linguistically
specific about the human capacity to acquire language the linguistic
specificity resides in this recursive meta-capacity.[1]
Or to put this another way: there may be more to the human capacity to acquire
language than the recursive meta-capacity but at least this meta capacity is part of the story.[2]
Or, to put this another way, absent the human given (i.e. innate) meta-capacity to acquire (certain specifiable
kinds of) recursive Gs, humans would not be able to acquire the kinds of Gs that
we know that they in fact do acquire (e.g. Gs like those English, French,
Spanish, Tagalog, Arabic, Inuit, Chinese … speakers have in fact acquired).
Hence, humans must come equipped with this recursive meta-capacity as part of
FL.

Ok, some observations: recursion in this story is principally a predicate of FL, the meta-capacity. The
meta-capacity is to acquire recursive Gs (with specific properties that GG has
been in the business of identifying for the last 50 years or so). The
conjecture is that humans have this meta-capacity (aka FL) because they do in fact display linguistic creativity
(and, as the DE paper concedes, native speakers of languages other than Piraha regularly
display linguistic creativity, implicating the internalization of recursive
language-specific Gs) and because the linguistic creativity a native speaker of
(e.g.) English displays could have been
displayed by any person raised in an
English linguistic milieu. In sum, FL is recursive in the sense that it has the
capacity to acquire recursive Gs and speakers of any language have such FLs.

Observe that FL must have the capacity to acquire recursive
Gs even if not all human Gs are recursive. FL must have this capacity because all
agree that many/most (e.g.) non-Piraha Gs are recursive in the sense that
Piraha is claimed not to be. So, the following two claims are consistent: (1)
some languages have non-recursive Gs but (2) native speakers of those languages
have recursive FLs. This DE piece (like all the other DE papers on this topic)
fails, once again, to recognize this. A discontinuous quote (4):

If there were a language that chose not to use
recursion, it would at the very least be curious and at most would mean that
Chomsky’s entire conception of language/grammar is wrong….

Chomsky made a clear claim – recursion is fundamental to having a language. And my paper did in fact
–recursion is fundamental to having a language. And my paper did in fact
present a counterexample. Recursion cannot be fundamental to language if there
are languages without it, even just one language.

First an aside: I tend to agree that it would indeed be
curious if we found a language with a non-recursive G given that virtually all
of the Gs that have been studied are recursive. Thus finding one that is not
would be odd for the same reason that finding a single counter example to any
generalization is always curious (and which is why I tend not to believe DE’s
claims and tend to find the critique by Nevins, Pesetsky and Rodrigues
compelling).[3]
But, and this is the main take home
message, whether curious or not, it is at right angles to Chomsky’s claim
concerning FL for the reasons outlined above. The capacity to acquire recursive
Gs is not falsified by the acquisition of a non-recursive one. Thus, logically
speaking, the observation that Piraha does not have embedded clauses (i.e. does
not display one of the standard diagnostics of a recursive G) does not
imply that Piraha speakers do not have recursive FLs. Thus, DE’s claims are
completely irrelevant to Chomsky’s even if correct. That point has been made
repeatedly and, sadly, it has still not sunk in. I doubt that for some it ever
will.

Ok, let’s now consider some other questions. Here’s one: is
this linguistic meta-capacity permanent or evanescent? In other words, one can
imagine that FL has the capacity to acquire recursive Gs but that once it has
acquired a non-recursive G it can no longer acquire a recursive one. DE’s
article suggests that this is so for Piraha speakers (p. 7). Again, I have no
idea if this is indeed the case (if true, it would constitute evidence for a strong
version of the Sapir-Whorf hypothesis) but this claim even if correct is at
right angles to Chomsky’s claim about FL. Species specific dedicated capacities
need not remain intact after use. It could be true that FL is only available
for first language acquisition and this would mean that second languages are
acquired in different ways (maybe by piggybacking on the first G acquired).[4]
However so far as I know, neither Chomsky nor GG has ever committed hostages to
this issue. Again, I am personally skeptical that having a Piraha G precludes
you from the recursive parts of a Portuguese G, but I have nothing but
prejudicial hunches to sustain the skepticism. At any rate, it doesn’t bear on
Chomsky’s thesis concerning FL. The upshot: DE’s remarks once again are at
right angles to Chomsky’s claims: interesting as the possibility it raises might be for issues relating to second language acquisition, it is not relevant to Chomsky’s claims about the recursive nature of FL.

A third question: is the meta-capacity culturally relative?
DE’s piece suggests that it is because the actual acquisition of recursive Gs
might be subject to cultural influences. The point seems to be that if culture influences whether an acquired G is recursive or not, then culture determines whether the meta-capacity is recursive as well. But this does not follow. Let me explain.

All agree that the details of an actual G are influenced by
all sorts of factors, including culture.[5]
This must be so and has been insisted
upon since the earliest days of GG. After all, the G one acquires is a function
of FL and the PLD used to construct
that G. But the PLD is itself a function of what the child actually gets, and there is
no doubt that what utterances are performed is influenced by the culture of the
utterers.[6]
So, that culture has an effect on the
shape of specific Gs is (or should be) uncontroversial. However, none of this
implies that the meta-capacity to build recursive Gs is itself culturally
dependent, nor does DE’s piece explain how it could be. In fact, it has always
been unclear how external factors could affect this meta-capacity. You either
have a recursive meta-capacity or you don’t. As Dawkins put it (see here for
discussion and references):

… Just as you can’t have half a segment, there are no
intermediates between a recursive and a non-recursive subroutine. Computer
languages either allow recursion or they don’t. There’s no such thing as half-recursion. It’s an all or nothing
software trick… (383)
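Dawkins’s point is easy to see in code. A minimal sketch (my example, not his):

```python
# Dawkins's "all or nothing" point in miniature: a subroutine either
# calls itself or it doesn't; there is no half-recursion. One short
# recursive definition handles arbitrary depths of nesting.

def depth(tree):
    """Nesting depth of a list-of-lists 'tree'."""
    if not isinstance(tree, list):
        return 0  # a leaf: no structure left
    return 1 + max((depth(branch) for branch in tree), default=0)

print(depth("leaf"))               # 0
print(depth(["a", ["b", ["c"]]]))  # 3
```

Nothing about the routine changes as the nesting gets deeper; the recursive trick is either available in the system or it is not.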

Given this “all or nothing” quality, what would it mean to
say that the capacity (i.e. the innately
provided “computer language” of FL) was dependent on “culture”? Of course, if
what you mean is that the exercise of the capacity is culture dependent and
what you mean by this is that it depends on the nature of the PLD (and other
factors) that might themselves be influenced by “culture” then duh! But, if
this is what DE’s piece intends, then once again it fails to make contact with
Chomsky’s claim concerning the recursive nature of FL. The capacity is what it
is though of course the exercise of
the capacity to produce a G will be influenced by all sorts of factors, some of
which we can call “culture.”[7]

Two more points and we are done.

First, there is a source for the confusion in DE’s papers
(and it is the same one I have pointed to before). DE’s discussion treats all universals
as if Greenbergian. Here’s a quote from the current piece that shows this (I
leave it as an exercise to the reader to uncover the Greenbergian premise):

The real lesson is that if
recursion is the narrow faculty of
language, but doesn’t actually have to be manifested in a given language, then
likely more languages than Piraha…could lack recursion. And by this reasoning we derive the astonishing claim that, although recursion would be the characteristic that makes human language possible, it need not actually be found in any given language. (8)

Note the premise: unless every G is recursive then recursion
cannot be “that which makes human languages possible.” But this only makes
sense if you understand things as Greenberg does. If you understand the claim
as being about the capacity to
acquire recursive Gs then none of this follows.

Nor are we led to absurdity. Let me froth here. Of course,
nobody would think that we had a capacity for constructing recursive Gs unless
we had reason to think that some Gs were so. But we have endless evidence that
this is the case. So, given that there is at least one such G (indeed endlessly many), humans clearly must have the
capacity to construct such Gs. So, though we might have had such a capacity and
never exercised it (this is logically possible), we are not really in that part
of the counterfactual space. All we need to get the argument going for a
recursive meta-capacity is mastery of
at least one recursive G and there is no dispute that there exists such a G and
that humans have acquired it. Given this, the only coherent reason for thinking
a counterexample (like Piraha) could be a problem is if one understood the
claim to universality as implying that a universal property of FL (i.e. a
feature of FL) must manifest itself in every
G. And this is to understand ‘universal’ a la Greenberg and not as Chomsky
does. Thus we are back to original sin in DE’s oeuvre; the insistence on a
Greenberg conception of universal.

Second, the piece makes another point. It suggests that DE’s
dispute with Chomsky is actually over whether recursion is part of FL or part
of cognition more generally. Here’s the quote (10):

…the question is not whether humans can think
recursively. The question is whether this ability is linked specifically to
language or instead to human cognitive accomplishments more generally…

If I understand this correctly, it is agreed that recursion
is an innate part of human mental machinery. What’s at issue is whether there
is anything linguistically proprietary about it. Thus, Chomsky could be right
to think that human linguistic capacity manifests recursion but that this is
not a specifically linguistic fact
about us as we manifest recursion in our mental life quite generally.[8]

Maybe. But frankly it is hard to see how DE’s writings bear
on these very recondite issues. Here’s what I mean: Human Gs are not merely recursive but exhibit a
particular kind of recursion. Work in GG over the last 60 years has been in
service of trying to specify what kind of recursive Gs humans entertain. Now,
the claim here is that we find the kind
of structure we find in human Gs in cognition more generally. This is empirically
possible. Show me! Show me that other kinds of cognition have the same structures as those GGers have
found occur in Gs. Nothing in DE’s
arguments about Piraha have any obvious bearing on this claim for there is no demonstration that other parts of
cognition have anything like the recursive structure we find in human Gs.

But let’s say that we establish such a parallelism. There is
still more to do. Here is a second question: is FL recursive because our mental life in general is, or is our mental life in general recursive because we have FL?[9]
This is the old species specificity question all over again. Chomsky’s claim is
that if there is anything species
special about human linguistic facility it rests in the kind of recursion we
find in language. To rebut this species specificity requires showing that this
kind of recursion is not the exclusive preserve of linguistically capable
beings. But, once again, nothing in DE’s work addresses this question. No
evidence is presented trying to establish the parallel between the kind of
recursion we find in human Gs and any animal cognitive structures.

Suffice it to say that the kind of recursion we find in
language is not cognitively ubiquitous (so far as we can tell) and that if it
occurs in other parts of cognition it does not appear to be rampant in
non-human animal cognition. And, for me at least, that is linguistically specific
enough. Moreover, and this is the important point as regards DE’s claims, it is
quite unclear how anything about Piraha will bear on this question. Whether or not Piraha has a recursive G will tell us
nothing about whether other animals have recursive minds like ours.

Conclusion? The same as before. There is no there there. We
find arguments based on equivocation and assertions without support. The whole
discussion is irrelevant to Chomsky’s claims about the recursive structure of
FL and whether that is the sole UGish feature of FL.[10]

That’s it. As you can see, I got carried away. I didn’t mean
to write so much. Sorry. Last time? Let’s all hope so.

[1]
Here you can whistle some appropriate Minimalist tune if you would like. I
personally think that there is something linguistically specific about FL given
that we are the only animals that appear to manifest anything like the
recursive structures we find in language. But, this is an empirical question.
See here
for discussion.

[2]
Chomsky’s minimalist conjecture is that this is the sole linguistically special
capacity required.

[3]
Indeed such odd counterexamples place a very strong burden of proof on the
individual arguing for it. Sometimes this burden of proof can be met. But
singular counterexamples that float in a sea of regularity are indeed curious
and worthy of considerable skepticism. However, that’s not my point here. It is a different one: the Piraha facts, whatever they turn out to be, are irrelevant to the claim that FL has the capacity to acquire recursive Gs, as this is what Chomsky has been proposing. Thus, the facts regarding Piraha, whatever they turn out to be, are logically irrelevant to Chomsky’s proposal.

[4]
This seems to be the way that Sakel conceives of the process (see here).
Sakel is the person the DE piece cites regarding how Piraha speakers behave when acquiring Portuguese as a second language. That speakers build their
second G on the scaffolding provided by a first G is quite plausible a priori (though whether it is true is
another matter entirely). And if this is so, then features of one’s first G
should have significant impact on properties of one’s second G. Sakel, btw, is
far less categorical in her views than what DE’s piece suggests. Last point: a
nice “experiment” if this interests you is to see what happens if a speaker is
acquiring Portuguese and Piraha simultaneously; both as first Gs. What should
we expect? I dunno, but my hunch is that both would be acquired swimmingly.

[5]
So, for example, dialects of English differ wrt the acceptability of
Topicalization. My community used it freely and I find such sentences great. My students
at UMD were not that comfortable with this kind of displacement. I am willing
to bet that Topicalization’s alias (i.e. Yiddish Movement) betrays a certain
cultural influence.

[6]
Again, see note 4 and Sakel’s useful discussion of the complexity of Portuguese
input to the Piraha second language acquirer.

[7]
BTW, so far as I can tell, invoking “culture” is nothing but a rhetorical
flourish most of the time. It usually means nothing more than “not biology.” However,
how culture affects matters and which
bits do what is often (always?) left unsettled. It often seems to me that the
word is brandished a bit like garlic against vampires, mainly there to ward off
evil biological spirits.

[8]
On this view, DE agrees that there is FLB but no FLN, i.e. a UGish part of FL.

[9]
In Minimalist terms, is recursion a UGish part of FL or is there no UG at all in FL?

[10]
There is also some truly silly stuff in which DE speculates as to why the push
back against his views has been so vigorous. Curiously, DE does not countenance
the possibility that it is because his arguments though severely wanting have
been very widely covered. There is some dumb stuff on Chomsky’s politics, Wolfe
junk, and general BS about how to do science. This is garbage and not worth
your time, except for psycho-sociological speculation.

Friday, January 13, 2017

It's file reading season so my time is limited right now. That and the hangover from the holidays have made me slower. Nonetheless, I thought that I would post this very short piece that I read on computational models in chemistry (thx to Bill Idsardi for sending it my way). The report is interesting to linguists, IMO, for several reasons.

First, it shows how a mature science deals with complexity. It appears to be virtually impossible to do a full theoretically rigorous computation given the complexity of the problem, "but for the simplest atoms." Consequently, chemists have developed approximation techniques (algorithms/functions) to figure out how electrons arrange themselves in "bulk materials." These techniques are an artful combination of theory and empirical parameter estimation and they are important precisely because one cannot compute exact results for these kinds of complex cases. This is so even though the relevant theory is quite well known.

I suspect that if and when linguists understand more and more about the relevant computations involved in language we will face a similar problem. Even were we to know everything about the underlying competence and the relevant mental/brain machinery that uses this knowledge, I would expect the interaction effects among the various (many) interacting components to be very complex. This will lead to apparent empirical failure. But, and this is the important point, this is to be expected even if the theory is completely correct. This is well understood in the real sciences. My impression is that this bit of wisdom is still considered way out there in the mental sciences.

Second, the discussed paper warns against thinking that rich empirics can substitute for our lack of understanding. What the reported paper does is test algorithms by seeing how they perform in the simple cases where exact solutions can be computed. How well do those techniques we apply in sophisticated cases work in the simple ones? The question: how well "different...algorithms approximated the relatively exact solutions."

The discovery was surprising: After a certain point, more sophisticated algorithms started doing worse at estimating the geometry of electrons (this is what one needs to figure out a material's chemical properties). More interesting still, the problem was most acute for "algorithms based on empirical data." Here's the relevant quote:

Rather than calculating everything based on physical principles, algorithms can replace some of the calculations with values or simple functions based on measurements of real systems (an approach called parameterization). The reliance on this approach, however, seems to do bad things to the electron density values it produces. "Functionals constructed with little or no empiricism," the authors write, "tend to produce more accurate electron densities than highly empirical ones."

It seems that when "we have no idea what the function is," throwing data at the problem can make things worse. This should not be surprising. Data cannot substitute for theoretical insight. Sadly, this trivial observation is worth mentioning given the spirit of the age.
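The moral can be made concrete with a toy curve-fitting example (my own illustration, not the chemistry paper's actual setup): a model constrained by the "physical principle" with a single fitted parameter versus a fully parameterized model that passes exactly through every noisy measurement.

```python
# Toy analogue of theory-constrained vs. highly empirical functionals.
# The "physical principle" here is that y = a * x**2 for some constant
# a. The theory-based model fits only that one parameter; the maximally
# "empirical" model (Lagrange interpolation through every data point)
# fits the noise too, and does worse on a held-out point.

data = [(0, 0.5), (1, 0.8), (2, 4.3), (3, 9.1), (4, 15.6)]  # noisy y = x**2

# Theory-based: closed-form least-squares estimate of the parameter a.
a = sum(x * x * y for x, y in data) / sum(x ** 4 for x, y in data)

def theory_pred(x):
    return a * x ** 2

def lagrange_pred(x):
    # Degree-4 polynomial through all five noisy points: pure
    # parameterization, zero physics.
    total = 0.0
    for i, (xi, yi) in enumerate(data):
        weight = 1.0
        for j, (xj, _) in enumerate(data):
            if i != j:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total

true_val = 2.5 ** 2  # the held-out "exact solution": 6.25
err_theory = abs(theory_pred(2.5) - true_val)
err_empirical = abs(lagrange_pred(2.5) - true_val)
print(err_theory, err_empirical)  # the theory-constrained model wins
```

With these (made-up) numbers the one-parameter model misses the held-out value by about 0.08 while the fully empirical fit misses by about 0.32: more parameters, worse off-sample behavior.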

Here is one possible implication for theoretical work in linguistics: we often believe that one tests a theory best by seeing how it generalizes beyond the simple cases that motivate it. But in testing a theory in a complex case (where we know less) we necessarily must make assumptions based less on theory and more on the empirical details of the case at hand. This is not a bad thing to do. But it carries its own risks, as this case illustrates. The problem with complex cases is that they likely provoke interaction effects. To domesticate these effects we make useful ad hoc assumptions. But doing this makes the fundamental principles more opaque in the particular circumstance. Not always, but often.

Friday, January 6, 2017

I recently co-wrote a comment (see here) on a piece by Berlinski and Uriagereka on Vergnaud's theory of case (see here and here), of which I am a fan, much to one of my colleagues' continual dismay. The replies were interesting, especially the one by Berlinski. He excoriated the Jacobian position I tried to defend. His rebuttal should warm the hearts of many who took me to task for my thinking that it was legit to study FL/UG without doing much cross G inquiry. He argues (and he knows more about this than I do) that the Jacob idea that the same fundamental bio mechanisms extend from bacteria to butterflies is little more than myth. The rest of his comment is worth reading too for it rightly points out that the bio-ling perspective is, to date, more perspective and less biology.

Reaction? Well, I really like the reply's vigorous pushback. However, I don't agree. In fact, I think that what he admires about Vergnaud's effort is precisely what makes the study of UG in a single language so productive. Here's what I mean.

Vergnaud's theory aimed to rationalize facts about the distribution of nominals in a very case weak language (viz. English). It did this elegantly and without much surface morphology to back it up. Interestingly, as readers of FoL have noted, the cross G morpho evidence is actually quite messy and would not obviously support Vergnaud's thesis (though I am still skeptical that it fails here). So, the argument style that Vergnaud used and that Berlinski really admires supports the idea that intensive study of the properties of a single G is a legit way to study the general properties of FL/UG. In fact, it suggests that precisely because this method of study brackets overt surface features it cannot be driven by the distributions of these features which is what much cross G inquiry studies. Given this logic, intensive study of a single G, especially if it sets aside surface reflexes, should generalize. And GG has, IMO, demonstrated that this logic is largely correct. So, many of the basic features of UG, though discovered studying a small number of languages have generalized quite well cross linguistically. There have, of course, been modifications and changes. But, overall, I think that the basic story has remained the same. And I suspect that not a few biologists would make the same point about their inquiries. If results from model systems did not largely generalize then they would be of little use. Again, maybe they aren't, but this is not my impression.

Ok, take a look at Berlinski's comments. And if you like this, take a look at his, IMO, most readable book Black Mischief.