Comments

Wednesday, October 29, 2014

God there’s a lot of junk out there. And by ‘junk’ I don’t
just mean poorly done and pointless, for even shoddy aimless research can
enlighten inadvertently. What I mean is stuff that looks serious and is taken
seriously by the scientific punditry but is so completely ignorant and off base
that it makes you dumber if you read it. I say this to warn you not to read this
review of a book by one Vyvyan Evans called The Language Myth. The review is by one Alun Anderson (what’s with
these weird spellings? Vyvyan? Alun? Though maybe someone called ‘Norbert’ is in
no position to asperse). As I have not read the book, I will just comment on
the review. For Mr. Evans’s sake, I hope that Mr. Anderson has failed to
understand what Mr. Evans actually wrote. If he got Evans right, then the book
is likely a waste of time. But, as I said, I have not read it (and I am not
sure that I will), so I will concentrate on the review.

The review has that breathless “excited” quality that all
tracts heralding an impending conceptual “revolution” trade in:[1]

After reading The Language Myth, it
certainly looks as if a major shift is in progress, one that will open people’s
minds to liberating new ways of thinking about language.

What’s the novelty? Well, you can guess: the
Chomsky-Fodor-Pinker (Anderson’s triad, presumably following Evans) view that
language is an “instinct” is just plain wrong, a “myth” that has swept the
popular imagination (were it so!).[2]
The myth, it is claimed, is based on the “way that children effortlessly learn
language just by listening to the adults around them, without being aware
explicitly of the governing grammatical rules.” This observation led Chomsky
(and Pinker (his publicist (boy I’m sure that will go down great with Steve!))
and Fodor, another dupe) to argue that there is “a module in the mind… waiting
to be activated…when an infant encounters language. The rules behind language
being built into the genes.” This is not any particular grammar, but a
“universal grammar, capable of generating the rules of any of the 7000 or so
languages that a child might be exposed to, however different they might
appear.”[3]

That’s the myth. As I assume that Anderson does not
challenge the above description concerning the ease with which kids acquire a
natural language (though I cannot be sure, maybe he does), I assume that the
“myth” that needs exposure concerns the existence of an innate capacity, highly
developed in (indeed likely unique to) humans, that underlies their linguistic
proclivities.

Now, read a certain way, this is not necessarily a bad précis of the modern Chomsky version of GG. Human children bring to the task of acquiring
and deploying a language special species-specific skills that enable them to do
what other clever animals cannot, i.e. acquire and use a natural language.
There are arguments to the effect that some (or much) of this capacity is
linguistically specific (though how much is currently a topic of debate), but given
the obvious difference between human capacities wrt language and those of
anything else we know of, the supposition that there is something special about
us when it comes to language is hardly one of those going-out-on-a-long-thin-limb
sorts of assumptions. Indeed, I would go further: everyone assumes something analogous, for the obvious fact remains
that nothing does language like humans do, and on the pretty standard
assumption that what we are mentally capable of has something to do with our
mental capacities, the fact that we can do language easily and nothing else
does it at all implies that we have some mental capacities that other animals
(and plants and rocks) do not. This is not
an exciting view. And to say that humans have a language acquisition device is,
at least minimally, to observe that this fact is obvious and needs explaining.

There are, of course, more interesting conclusions that one
can draw and that have been drawn: e.g. that the mental capacities are in part sui generis, both wrt humans and to
language, that the special capacity we have is qualitative, not merely
quantitative, that this capacity is dissociable from other cognitive
capacities, etc. Each of these further
claims comes with substantial discussion and evidence, none of which Mr.
Anderson seems aware of. Or at least he
doesn’t mention or address it. Why? Because he thinks that Mr. Evans has
provided simple straightforward evidence demonstrating how nutty this nativist
viewpoint is. What’s that evidence?

A key criticism is that the more
languages are studied, the more their diversity becomes apparent and an
underlying universal grammar less probable.

Spot the flaw (duh!). Once again the absence of Greenberg
universals is used to dismiss Chomsky universals.[4]
We’ve seen this
before (right Mr. Everett?). And we will see it again (and, if past is
prologue, again and again and again, sadly). But, to repeat (and repeat and
repeat), these are very (very very) different things. A rich innate language
specific mental module is consistent with a great deal of variation in the
surface properties of individual languages. UG does not imply the existence of
universal manifest patterns in every language. It does not even imply that all
Gs must contain (some of) the same rules (e.g. there is no requirement that
every language have focus movement or aux inversion or any other rule). Chomsky
universals are about types of Gs (the kinds
of rules/principles that they have), and to overthrow them requires showing more
than that some Gs have rules, or some surface patterns, that others
don’t.

Truth be told, however, the distinction between the two
conceptions is probably not one that would concern Mr. Anderson. Why not?
Because he seems to know nothing at all about any work in the tradition that he
sees as ready to collapse. He seems to believe that the mere existence of “free
word order languages,” languages that “build sentences out of prefixes and
suffixes to create giant words,” or languages that “appear not to have nouns
and verbs” would be news to linguists very snugly in the Chomsky GG tradition
(as if languages like Warlpiri, Mohawk, Navajo and Salish were never studied and
analyzed within the Chomsky GG tradition). I’m pretty sure that the existence of the work of Ken Hale, Mark Baker,
Lisa Matthewson, Julie Legate, Henry Davis, Masha Polinsky, Ben Bruening (among many
many others) and the many GG-inspired analyses of the Gs of these languages
they have offered would be news to Mr. Anderson, something that the editors of New Scientist might have thought of
before asking him to review Mr. Evans’s book.

But this is not all. Mr. Anderson also appears to think that
genes require invariant expression, so that change and variation are
inconsistent with information being genetically coded. In addition, it seems
that he believes that language change is incompatible with the claim that
“grammar is laid out in the genes.” For Mr. Anderson the simple existence of
language change and creolization is sufficient reason for denying that humans
have a special biologically rooted affinity for language. If only we in the
Chomsky tradition had realized that languages differed and changed, we would
never have gone down the ill-conceived nativist path we’ve taken. We would not
have been seduced by the Chomsky-Fodor-Pinker myths. As I write, scales are falling
from my eyes! Revolution indeed!

And last but not least: nativists cannot say how languages
carry meaning. Here Mr. Anderson is finally onto something. The problem is that
he appears not to realize that nobody
understands how this is done. Where meaning comes from, how symbols come to
have significance, is a real pisser of a problem, but, as they say, it is, at
least for now, everyone’s problem. What we can say is that embedding the
problem into a larger Chomsky-like set of considerations has allowed some
progress on some features of it (e.g. antecedence and scope have been
illuminated, though what ‘dog’ and ‘know’ and ‘give’ and ‘London’ and ‘house’ mean
is still pretty murky (a sign of this is that in your favorite semantic theory
the meaning of a word like ‘life’ is ‘life′’)).[5]

Mr. Anderson does give us a hint of what the new age that Mr.
Evans is heralding will look like. It will be firmly anchored in “embodied”
cognition and empiricism (“arising directly in and from experience”) and mirror
neurons (“the same bit of the brain lights up when we see or do hammering”). Yup,
that’s the brave new world out there. If you ever thought that Greg Hickok’s
evisceration of this mental detritus was overkill, read this junk and then send
Greg a thank-you note (I’m sure he will also accept Starbucks or Amazon gift
cards). Sadly, we will need many Greg-like efforts again and again, for this
kind of junk seems to be both very attractive and impervious to criticism.

Let me make one more point and end. What we see in this
review is the resurgence of Empiricism (E). Yes, I know you were expecting me
to say this, and I didn’t want to disappoint. But it’s true. It lies behind the
apparent inability of so many to understand the distinction between Chomsky and
Greenberg universals. Confusing the two lies at the heart of the review (and of
the similarly pundit-popular work by Everett), and it is easily explained once
one appreciates its E roots. What are these?

Es are comfortable with generalizations based on patterns in
the data. I’ve discussed this before (here).
If you believe that all generalizations are based on patterns manifest in the
data, then the idea that something can be a universal (e.g. structure
dependence) but leave no imprint in the (positive) linguistic data makes
little sense. So, universals like ‘if L is OV then it will be OP,’ or ‘all
languages distinguish Ns from Vs,’ will leave induction-friendly footprints in
the linguistic data and are ok for Es (recall, everyone, including Es, accepts
the idea that one needs to generalize). But other generalizations, like ‘all
languages obey islands,’ or ‘rules must be structure dependent,’ or ‘anaphors
must be c-commanded by their antecedents,’ may leave little direct evidence
in the positive data of a particular language (especially if one restricts data
to what is expressed in naturalistic linguistic environments (viz. assumes that
negative data is not relevant)). If you are an E, the only legit generalizations
(aka universals) are of the first kind. That’s why an E will find the Chomsky
conception of universals close to incomprehensible and, not surprisingly, will
tend to confuse Chomsky and Greenberg universals (more accurately, will not be
able to distinguish them), as in fact happens again and again and again. So if
you are an E, stop it! It’s mentally stifling, and it leads to the kind of junk
thinking this review embodies.
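To make the structure-dependence point concrete, here is a toy sketch of my own (nothing from the review or from Evans): Chomsky’s classic aux-fronting case, where a linear rule and a structure-dependent rule agree on simple clauses but diverge on sentences a child may never hear. The sentence and the hand-supplied “parse” (the index of the main-clause auxiliary) are illustrative assumptions, not output of any real parser.

```python
# Toy illustration: two rules for forming an English yes/no question from
# "The man who is tall is happy". Simple clauses ("The man is happy") never
# distinguish them; only the complex case, rare in child-directed speech, does.

words = ["the", "man", "who", "is", "tall", "is", "happy"]

def front_first_aux(ws):
    """Linear (structure-blind) rule: front the linearly first auxiliary."""
    i = ws.index("is")                    # first 'is' sits inside the relative clause
    return [ws[i]] + ws[:i] + ws[i + 1:]

def front_main_aux(ws, main_aux_index):
    """Structure-dependent rule: front the main-clause auxiliary.
    The index comes from a hand-supplied parse (here, the second 'is')."""
    return [ws[main_aux_index]] + ws[:main_aux_index] + ws[main_aux_index + 1:]

print(" ".join(front_first_aux(words)))    # *"is the man who tall is happy"
print(" ".join(front_main_aux(words, 5)))  # "is the man who is tall happy"
```

The linear rule yields word salad on exactly the structures that are largely absent from the positive data, which is why the generalization, reliably respected by speakers, leaves no induction-friendly footprint.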

I will end here. To repeat, I have no idea if Mr. Anderson
has correctly conveyed the content of Mr. Evans’s book (and I really do hope
that he got it all wrong and that Mr. Evans is about to sue for libel, as one
can do in the UK with its unbelievably lax freedom-of-speech provisions). I do know, however, that Mr. Anderson doesn’t
know anything about Chomsky or Pinker or Fodor or GG or any work that has been
done in the last 65 years. Asking him to
review a book on linguistics is like asking Sampson (my deceased ex-pet Porty)
to review a book on animal cognition (actually, Sampson would have made fewer
obviously clueless remarks).

What’s sad is that this kind of junk finds its way into
things like New Scientist. This is a
venue that the scientifically interested look at to find out what’s happening
in other scientific domains. I have in fact done so myself in the past. But from
now on I will be much more wary, for if this is what I find when an area I know
something about is discussed, it leads me to think that New Scientist is quite untrustworthy. And that’s too bad. I really
respect good popularization. It’s really hard to do well (thx Mr. Pinker for
The Language Instinct) and it is important.
Sadly, it can also be done very badly. If you are looking for an excellent case
study in just how bad very bad can be, I know of no better example than Mr.
Anderson’s review of Mr. Evans’s book.

[1]
The excitement seems to be spreading. CUP sent me this message from David
Crystal: "...Evans builds a compelling case that will
be difficult to refute." Sounds like Anderson, which, if correct, does not
bode well for Evans, though, to repeat, I have not yet read the book.

[2]
The idea that language is an instinct is most clearly Pinker’s conceit. I don’t
actually recall Chomsky or Fodor ever using this term in describing FL/UG or
LoT. And I think I know why. It has misleading connotations. Here’s the lexical
entry for the word via Google:

This definition does not really cover what GGers of
the Chomsky stripe or LoTers of the Fodor variety have meant. First, no “fixed
pattern of behavior” follows from their discussions. Remember the competence/performance
distinction. Well, the innate structures of FoL primarily relate to competence,
not performance (i.e. what you know, not what you do). Second, I’m not sure how
to understand “fixed.” FL/UG constrains the class of possible Gs (some rules are ok,
others not), but it does not require that any given G have any particular shape
(there is no requirement that rule X be included in every G).

What’s
right about ‘instinct’ is that it is not learned, need not be conscious and is
triggered by input. However, the shades of meaning the word carries can be very
misleading and though I can see good advertising reasons for why Pinker used
the word in his title, I can also see reasons for avoiding it.

[3]
This 7k number keeps coming up. But there is no reason to think that there are
7k languages, at least if one counts them via the Gs that generate them. As
Richie Kayne once said, and I completely agree, there is either one language or
at least 7.125 billion (as of 2013), one for each person. Moreover, there is no reason to believe that
the set of possible languages, again
if individuated by Gs, is not several orders of magnitude greater.

[4]
I should have said “possible” absence of Greenberg universals. As you out there
know, there appear to be some interesting Greenberg universals worth
explaining, and many Chomsky-inspired linguists like Cinque, Roberts, Kayne,
Baker (and many others) are in the business of trying to account for them. I
would be surprised if Mr. Anderson knows anything about this. The blooming
buzzing confusion that is the 7000 languages is enough for him. I wonder if the
diversity of life or the various kinds of stuff in the world would lead Anderson to
conclude that there is no universal genetic code or that the periodic table is
ripe for dissolution. Consistency would suggest that he would, but consistency
is only the hobgoblin of petty minds, no doubt leaving Mr. Anderson an exit
strategy.

[5]
John Burgess, who is visiting UMD today and giving a talk, and whose wonderful
papers I’ve just started reading, puts it succinctly: “There is nothing that
could be called a body of accepted scientific conclusions about meaning… that
workers… can draw upon and apply to their concerns” (in his “Quine,
analyticity, and philosophy of mathematics”).

Monday, October 27, 2014

Take a look at this. A Prof is suspended from university duties for sarcasm and bad body language. Of course, this is not the real reason. That's provided in this paragraph:

Professor Docherty is a prominent critic of the marketization of education who has described the Russell Group - of which the University of Warwick is a member – as "a self-declared elite…even exerting a negative influence over others".

So, a critic of the university sighs and "undermines the authority of the chair." For this he is suspended for nine months. A tribunal ruled on the "charges" and he was cleared. This suggests that they concluded that he did not sigh, use negative body language or use undue sarcasm (this sounds like a Monty Python charge: what, undue sarcasm! Off with his head!!). As a college union rep put it:

"It beggars belief that an academic can be suspended with no contact with students or colleagues for almost a year while charges are finalised.”

Read that carefully. What the union found unconscionable was that the prof was suspended without charges being finalized. This suggests that, so far as the union was concerned, the charges themselves were fine. It now seems that undue sarcasm and negative vibes are unacademic. If there is a French dept job, Voltaire need not apply, I suppose.

Part of what makes this funny, of course, is that we all think that this is a one-off thing. And most likely it is. But I wish I more strongly believed that this was so. Our fearless leaders don't like being made fun of (i.e. undermined). They don't really like to be laughed at. If this is true of politicians and leaders of industry, why not of the bureaucrats who are heading up more and more institutions of higher learning? This would actually be funny, again Monty Python-wise, were it not so pathetic.

Lila Gleitman sent me this note that I am posting on her behalf. It relates to an earlier post that reacted to a piece in the NYT on vocab acquisition in kids. The relevant links are provided in Lila's comment. So heeeere's Lila!

******

Here is more about the “quality of input” and vocabulary
acquisition matter (see here). Our group has studied this topic (see our
paper published in PNAS, 2013, not connected to the recent Hirsh-Pasek study;
the link for this is here). So I
want to correct the mangled NYTimes article that has generated so much interest
and perhaps will even influence educational policy in future. First, some
facts: Hirsh-Pasek, whose work/speech at
the White House this newspaper article reports, did not specifically look at
possible differences in “quality” (of which, more below) that might vary as a
function of class, wealth, etc. All her
subject learners were of lower-class SES, and so could shed no light on whether,
or the extent to which, “quality input” is unequally distributed across SES
classes, because she had no comparison group (i.e., learners from other than
low SES strata). Yet the implication was left hanging in the NYT air, just by
mentioning that these children were all lower class, that wealthier people
provide classier “communicative foundations” to their offspring. Our own study does in fact make this SES comparison
directly, and the bottom line is that there is no measurable difference in
“quality of input,” if we can define such a thing at all, as a function of
social class. So either ignore what you
read in the NYT, or go read our article and see what you think in the light of
the evidence we presented. There is an
SES-linked difference in the quantity of speech (sheer number of words) infants
hear before the 5th birthday, however.

But now back to what “quality of input” could be, in any
sense relevant to facilitative environments for language learning. The Hirsh-Pasek study isn’t published as yet,
as far as I know, but her “White Paper” summary suggests that she “coded”
maternal speech and behavior for the extent to which it is “symbol infused,” and
related categories that are perhaps themselves hard to understand or apply
generally. Despite the real
difficulties of such hand coding over highly variable naturalistic
interactions, there are some facts about
nonlinguistic environmental variance in relevant regards that are strong enough
to shine through. Specifically, as our
studies showed, there is a very powerful influence of referential transparency (that’s what “quality” largely comes down
to, when you peel away abstract labels like “foundations of communication” that
appear in this literature). Referential
transparency is simply your good old commonsensical notion: there are times, during conversational
interaction, when a listener is attentionally focused on a particular thing,
action, etc., and the speaker, via gesture, manipulation of the object, or
gazing/pointing at it, also mentions it. A blatant case (pace Quine) is saying “This is
a squirrel” while pointing to and gazing at a squirrel in the presence of the child’s
close attention to a squirrel. Turns
out that there are stable, measurable, and strikingly large familial differences
in the proportion of time that such informative extralinguistic contexts are
provided to infants (as I said, their presence/frequency is unrelated to the
wealth or class of the family), and these differences (already observed in our
sample populations at child age 14 months) predict vocabulary size when
measured 3 years later as these children enter kindergarten. Colin in an earlier blog ably described why
this very large early vocabulary-size difference matters, over the longer run,
in the Real World of school and future job, so I won’t belabor that point. But a few words now on the lexicon and
whether it has any linguistic interest.

Of course it does. As has been evident since the seminal
work of Carol Chomsky on ask/tell/promise/easy/eager, the business of acquiring
the language-specific grammar is inextricably (I hate that word, but it’s right
here) tied up with acquiring the meanings of terms whose interpretation is not so
transparent to referential observation as is, say, “cat.” For instance, imagine a blind child
acquiring “look” and “see.” Or anyone
trying to acquire “think.” This can’t be
done if the input is solely referentially consistent cases, i.e., everyday scenarios
in which thinkers are thinking, but requires in addition (or even instead) access
to predicate-specific licensed syntactic structures. The issues here are not only relevant but
pretty central to linguistic theorizing, in my opinion (see the huge literature
on “syntactic bootstrapping”). But how
about “cat”? After all, learning must
begin with such homely cases, for which referential information provides the
bulk of the basis for identification. They begin to be understood as early as the 6th
month of life. These words provide the
scaffolding for at least rudimentary acquisition of the grammar (mainly: where, structurally speaking, is the
sentential subject in the exposure language), so their acquisition function is
of some preliminary interest for linguistic-developmental theorizing. This first lexical stock forms crucial grounding
information, the first step to enable all the later fancy footwork, i.e., the
gateway to the linguistic-computational achievements that can then – and
therefore – proceed. I hastily remind you
that how the concepts themselves (not the words for them) are acquired is an
abiding mystery, but one that necessarily is engaged outside Linguistics,
notably by people who study perceptual development, theory of mind, and the
like (see, e.g., important lines of research from Spelke, Csibra, and many others).

Saturday, October 25, 2014

In a previous post (here)
I discussed two possible PoS arguments. I am going to write about this again,
mainly to clarify my own thinking. Maybe others will find it useful. Here goes.
Oh yes, as this post got away from me lengthwise, I have decided to break it
into two parts. Here’s the first.

The first PoS argument (PoS1) aims to explain why some Gs
are never attested and the other (PoS2) aims to examine how Gs are acquired
despite the degraded and noisy data that the LAD exploits in getting to its G. PoS1
is based on what we might call the “Non-Existing Data Problem” (NEDP), PoS2 on
the “Crappy Data Problem” (CDP). What I now believe and did not believe before
(or at least not articulately) is that these are two different problems each
raising their own PoS concerns. In other words, I have come to believe (or at
least think I have), that I was wrong, or had been thinking too crudely before
(this is a slow fat ball down the middle of the plate for the unkind; take a
hard whack!). On the remote possibility that my mistakes were not entirely
idiosyncratic, I’d like to ruminate on this theme a little and in service of
this let me wax autobiographical for a moment.

Long long ago in a galaxy far far away, I co-wrote (with
David Lightfoot) a piece outlining the logic of the PoS argument (here,
see Introduction). In that piece we described the PoS problem as resting on
three salient facts (9):

(1) The speech the child hears is not “completely grammatical” but is filled with various
kinds of debris, including slips of the tongue, pauses, incomplete thoughts,
etc.

(2) The inference is from a finite number of G products (uttered expressions) to the G
operations that generated these products. In other words, the problem is an
induction problem where Gs (sets of rules) are projected from a finite number
of examples that are the products of these rules.

(3) The LAD attains knowledge of structures in its language for which there is no evidence in the PLD.
We summarized the PoS problem as follows:

… we see a rich system of knowledge
emerging despite a poverty of the linguistic stimulus and despite being
underdetermined by the data available to the child. (10)

We further went on to argue that of these three data-underdetermination
problems the third is the most important, for it logically highlights the need
for innate structure in the LAD. Or, more correctly, if there are consistent generalizations native speakers make that
are only empirically manifested in
complex structures that are unavailable to
the LAD, then these generalizations must
reflect the structure of the LAD rather than that of the PLD. In other words,
cases where the NEDP applies can be used as direct probes into the structure of
the LAD and, as there are many cases where the PLD is mute concerning the
properties of complex constructions (again, think ECP effects, CED effects,
island effects, binding effects, etc.), these provide excellent (indeed optimal)
windows into the structure of FL (i.e. that component of the LAD concerned with
acquiring Gs).

I still take this argument form to be impeccable. However, the chapter went on to say (this is
likely my co-author’s fault, of course! (yes, this is tongue in cheek!!!)) the
following:

If such a priori knowledge must be
attributed to the organism in order to circumvent [(3)], it will also provide a
way to circumvent [(1)] and [(2)]. Linguists need not concern themselves with
the real extent of deficiencies [(1)] and [(2)]; the degenerateness and
finiteness of the data are not real problems for the child because of the fact
that he is not totally dependent on his linguistic experience, and he knows
certain things a priori; in many areas, exposure to a very limited range of
data will enable a child to attain the correct grammar, which in turn will
enable him to utter and understand a complex range of sentence types. (12-13).

And this is what I no longer believe. More specifically, I
had thought that solving the PoS based solely on the NEDP would also suffice to
solve the acquisition problem that the LAD faces due to the CDP. I very much doubt that this is true. Again, let me say why. As background, let’s
consider again the idealizations that bring PoS1 into the clearest focus.

The standard PoS makes the following very idealized
assumptions:

(4) a. The LAD is an ideal speaker-hearer.

b. The PLD is perfect: from a single G, presented “all at once.”

c. The PLD is “simple”: simple clauses, more or less.

What’s this mean? (4a) abstracts away from reception
problems. The LAD does not “mishear” the
input, its attention never wavers, its articulations are always pristine,
etc. In other words, the LAD can extract
whatever information the PLD contains. (4b) assumes that the PLD on offer to the LAD is flawless. Recall that
the LAD is exposed to linguistic utterances in which it must look for
grammatical structure. The utterances may be better or worse vehicles for these
structures. For example, utterances can be muddy (mispronunciation), imperfect
(spoonerisms, slips of the tongue), incomplete (hmming and hawing and
incomplete thoughts). Moreover, in the typical acquisition environment, the
ambient PLD consists of utterances of linguistic expressions (not all of them
sentences) generated by a myriad of Gs. In fact, as no two Gs are identical
(and even one speaker typically has several registers), it is very unlikely that
any single G can cover all of the actual PLD. (4b) abstracts away from this. It assumes that utterances have no
performance blemishes and that all
the PLD is the product of a single G.

These assumptions are heroic, but they are also very useful.
Why? Because together with (4c) they serve to focus attention on PoS1, which
recall is an excellent window (when available) into the native structure of FL.
(4c) restricts the PLD to “simple” input. As noted (here)
a good proxy for “simple” is un-embedded main clauses (plus a little bit,
Degree 0+).[1]
In effect, assumptions (4a,b) abstract away from the CDP and (4c) focuses
attention on the NEDP and what it implies for the structure of LADs.

As indicated, this is an idealization. Its virtue is that it
allows one to cleanly focus on a simple problem with big payoffs if one’s
interest is in the structure of FL.[2]
The real acquisition situation however is known
to be very different. In fact, it’s much more like (5):

(5) a. Noisy data

b. Non-homogeneous PLD

Thus, the actual PLD
is problematic for the LAD in two important ways in addition to its being
deficient in NEDP terms. First, there is lots of noise in the input, as there is
often a large distance between pristine sentences and muddy utterances. On the
input side, then, the PLD is hardly uniform (different speakers, registers); it
contains unclear speech, interjections, slips of the tongue, incomplete and
wayward utterances, etc. On the intake side, the actual LAD (aka: baby) can be
inattentive, mishear, have limited intake capacity (memory), etc. Thus, in contrast to the idealized data
assumed for PoS1, the actual PLD can be very much less than perfect.

Second, the PLD consists of expressions from different Gs.
In the extreme, as no two people have the exact same G, every acquisition situation
is “multi-lingual.” In effect, standard acquisition is more similar to cases of
creolization (i.e. multiple “languages” being melded into one) than to the
ideal assumed in PoS1 investigations.[3]
Thus there is unlikely to be a single G that fits all the actual PLD. Moreover,
the noisy data is presented incrementally, not all at once. Therefore, the
increments are not only noisy but, across LADs, quite variable: it is very
likely that no two actual LADs get the same sequence of input PLD.

It is reasonable to believe that these two features can raise
their own PoS problems. In fact, Dresher and Fodor/Sakas have shown that
relaxing the all-at-once assumption makes parameter setting very challenging if
the parameters are not independent (which there is every reason to believe is
the case). Dresher, for example, demonstrated that even a relatively simple
stress LAD has serious problems incrementally setting its parameters. I can
only imagine the problems that might accrue were the PLD not only presented
incrementally, but drawn from different stress Gs, 10% of which were
misleading.
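To see the flavor of the problem, here is a deliberately tiny sketch of my own devising. The four “languages” and the learner below are invented for illustration; the learner is only loosely in the spirit of error-driven, single-flip learners like Gibson and Wexler’s Triggering Learning Algorithm (made deterministic for clarity), not a reconstruction of Dresher’s or Fodor and Sakas’s systems. The point it illustrates is just the one in the text: when parameters interact, an incremental learner can get permanently stuck.

```python
# Hypothetical toy languages indexed by two binary parameters (p1, p2).
# The string sets are arbitrary; all that matters is that they interact.
LANGS = {
    (0, 0): {"a"},
    (0, 1): {"a", "b"},
    (1, 0): {"c"},
    (1, 1): {"b", "c", "d"},   # take (1, 1) as the target grammar
}

def greedy_learner(start, data):
    """Error-driven incremental learner: on a datum it cannot parse, try
    flipping one parameter at a time, keeping a flip only if the current
    datum then parses (greediness + a single-value constraint)."""
    hyp = start
    for datum in data:
        if datum in LANGS[hyp]:
            continue                      # datum parses: change nothing
        for i in (0, 1):                  # try each single-parameter flip
            new = tuple(1 - v if j == i else v for j, v in enumerate(hyp))
            if datum in LANGS[new]:
                hyp = new
                break                     # no helpful flip => stay put
    return hyp

# From (0, 0), the datum "d" parses under no single flip: both (0, 1) and
# (1, 0) lack "d", so the learner can never move toward the target (1, 1).
print(greedy_learner((0, 0), ["d"] * 100))   # (0, 0): a local maximum
print(greedy_learner((0, 1), ["c", "d"]))    # (1, 1): "c" licenses the flip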

And that’s the point I tried to take away from the Gigerenzer
& Brighton (G&B) paper: it is unlikely that the biases required to get
over the PoS1 hurdle will suffice to get actual LADs over PoS2. What G&B
suggest is that getting through the noise and the variance of the actual PLD
favors a very selective use of the input data. Indeed, given what we suspect,
if you can match the data too well,
you will likely not be tracking a real G, given that the PLD is not homogeneous, noise free, and closely
clustered around a single G. And this is due both to performance considerations
(sore throats, blocked noses, “thinkos,” inarticulateness, inattention, etc.)
and to non-homogeneity (many Gs producing the ambient PLD). In the PoS2 context things like the
bias-variance dilemma might loom large. In the PoS1 context they don’t, because our
idealizations abstract away from the kinds of circumstances that can lead to
them.[4]
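The bias-variance point can be given a standard toy illustration (mine, not G&B’s): sample noisy data from several slightly different underlying “Gs” (here, slightly shifted curves), then compare a heavily biased model with a flexible one on fresh data. The curves, shifts, and noise levels below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Noisy data from a mix of slightly different underlying 'Gs':
    each datum comes from a slightly shifted curve plus observation noise.
    (All the numbers here are arbitrary, for illustration only.)"""
    x = rng.uniform(-1, 1, n)
    shift = rng.choice([-0.1, 0.0, 0.1], n)   # which 'speaker G' produced it
    y = np.sin(3 * x) + shift + rng.normal(0, 0.2, n)
    return x, y

x_train, y_train = sample(30)     # the learner's small, messy sample
x_test, y_test = sample(1000)     # fresh data for judging generalization

for degree in (3, 12):            # strongly biased model vs. flexible model
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

The flexible (degree-12) model always matches the training sample at least as closely, but on runs like this it typically generalizes worse: matching heterogeneous, noisy input “too well” is a liability, which is exactly the G&B moral.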

So, I was wrong to run together PoS1 problems and PoS2
problems. The two kinds of investigations are related, I still believe, but
when the PoS1 idealizations are relaxed new PoS problems arise. I will talk
about some of this next time.

[1]
In modern terms this would be something like the top two phases of a clause (C
and v*).

[2]
This kind of idealization functions similarly to what we do when we create
vacuum chambers within which to drop balls to find out about gravity. In such
cases we can physically abstract away from interfering causal factors (e.g.
friction). Linguists are not so lucky. Idealization, when it works, serves the
same function: to focus on some causal powers to the exclusion of others.

[3]
In cases of creolization, if the input is from pidgins then the ambient PLD
might not reflect underlying Gs at all, as pidgins may not be G based (though
I’m not sure here). At any rate, the idea that actual PLD samples from products
of a single G is incorrect. Thus every case of real life acquisition is a problem in which PLD springs from
multiple different Gs.

[4]
In fact, Dresher and Fodor&Sakas present ways of ignoring some of the data
to enforce independence on the parameters thus allowing them to incrementally
set parameters.Ignoring data and having
a bias seem (impressionistically, I admit) related.