Comments

Monday, September 30, 2013

As I have noted in the past, Andrew Gelman, a statistician
at Columbia (and someone whose work I both follow and have learned from), has a
bee in his bonnet about Marc Hauser (and, it seems, Chomsky) (see here). In many of his posts he has asserted that
Hauser fraudulently published material that he knew to be false, and this is
why he takes such a negative view of Hauser (and those like Chomsky who have
been circumspect in their criticisms).
Well, Hauser has a new book out Evilicious
(see here
for a short review). Interestingly, the book has a blurb from Nicholas Wade,
the NYT's science writer who covered the original case. More interestingly, the
post provides links to Wade's coverage for the NYT of the original "case," and
because I have nothing better to do with my time I decided to go back and read
some of that coverage. It makes for very
interesting reading. Here are several
points that come out pretty clearly:

1. The
case against Hauser was always quite tenuous (see here
and here). Of the papers for which he was accused of
fabrication, two were replicated very quickly to the satisfaction of Science’s referees. The problem for these
was not fabrication, but not having the original tapes readily available. Sloppy? Perhaps. Fraud? No evidence here.

2. Of
the eight charges of misconduct, five involved unpublished material. This
is a very high standard. I would be curious to know how many other scientists
would like to be held responsible for what they did not publish. A Dr. Galef
from McMaster University (here) notes
incredulously in the NYT (rightly in my view): “How in the world can they get
in trouble for data they didn’t publish?”

3. L’Affaire
Hauser then devolves to one
experiment published in Cognition
that Galef notes “was very deeply flawed.” However, as I noted (here)
the results have since been replicated. That, like the Science replications, suggests that the original paper was not as
flawed as supposed. Sloppy? Yes. Flawed? Perhaps (the replication suggests that
what was screwed up did not matter much). Fraud? Possible, but the evidence is
largely speculative.

4. The
NYT’s pieces provide a pretty good case that at least one outside scientist,
who reviewed the case against Hauser, thought that “the accusations were unfounded.”
Maybe Galef is a dupe, but his creds look ok to me. At any rate, someone with
expertise in Hauser’s field reviewed the evidence against him and concluded
that there was not enough evidence to conclude fraud, or anything close to it.

5. Galef
further noted that the Harvard investigating committee did not include people
familiar with “the culture of an animal behavior laboratory,” which has “a
different approach to research and data-keeping” than what one finds in other
domains, especially the physical sciences from which the members of the Harvard
investigating committee appeared to come. I’m pretty sure that few
behavioral or social scientists would like to be subject to the standards of
experimental hygiene characteristic of the work in the physical sciences. Is it
possible that in the investigation that
set the tone for what followed, Hauser was not judged by scientists with
the right background knowledge?

6. The
final ORI report concentrates on the retracted (subsequently replicated) Cognition article (here).
The claim is that “half the data in the graph were fabricated.” Maybe. It would
be nice to know what the ORI based this judgment on. All involved admit that
the experimental controls were screwed up and that some of the graphs, as
reported, did not reflect the experiments conducted. I have no trouble
believing that there was considerable sloppiness (though to repeat, it seems
not to have been fatal given the subsequent replication), but this would not
support ORI’s assertion of fabrication, a term that carries the connotation of
intentional deceit. I suspect that the ORI
judgment rests on the prior Harvard findings. This leaves me thinking: why did
this particular set of data get through the otherwise pretty reliable vetting process
in Hauser’s lab, one that nixed earlier questionable data? Recall, the data from five of the other investigated
papers were vetted before being sent out and as a result the reported data were
changed. What happened in this case? Why did this one slip through? I can
understand ORI’s conclusion if tons of published data had been fabricated. But this
is precisely what there is no evidence
for. Why then this one case? Is it so
unlikely that some goof-up caused the slip? As Altmann, one of Hauser’s early
accusers, notes in the NYT: “It is conceivable that the data were not
fabricated, but rather that the experiment was set up wrong, and that nobody
realized this until after it was published.” In the end, does the whole case
against Hauser really devolve to what happened in this one case? With
effectively one set of controls? Really?

I suspect that the real judgment against Hauser rests on the
fact that he resigned from Harvard and that the Harvard committee initially set
up to investigate him (but whose report, so far as I know, has never been
publicly made available) decided he was guilty. Again, as Altmann notes in the
NYT: his earlier accusation was “heavily dependent on the knowledge that
Harvard found Professor Hauser guilty of misconduct.” This, coupled with the
thought that Hauser would not have quit were he innocent. But, it’s not a
stretch to think of many reasons why a non-guilty person might quit, being fed
up with one’s treatment perhaps topping the list. I am not saying that this is
why Hauser resigned. I don’t know. But to conclude that he must be guilty
because he left Harvard (who but a criminal would leave this august place,
right? Though on second thought…) is hardly an apodictic conclusion. In fact,
given all the evidence to date, it strikes me that the charges of fraud have
very little evidential support, and that there are decent alternative
explanations for what took place absent the intention to deceive. In fact, I
would suggest that given the gravity of the charge, we should set a pretty high
evidential bar before confidently declaring fraud. As far as I can tell from
reading what’s in the NYT, this bar has not been remotely approached, let alone
cleared.

I think that this will be the last post on this topic for
me. I have harped on it because,
frankly, I think Hauser got a raw deal, from what I can tell, and that the
condemnations he drew down on himself struck me as a tad too smug, uninformed,
and self-satisfied. As I’ve said before: fraud does not seem to me to be the
biggest source of data pollution, nor the hardest one to root out. However, such charges are serious and should
not be leveled carelessly. Careers are at stake. And from what I can tell, in
this particular case, Hauser was not given the benefit of the doubt, as he
should have been.

Thursday, September 26, 2013

Writing a paper is hard. Getting others to read it seriously
is often harder. Writers often make the
second task harder for readers by overwriting, and the review process often encourages
it, with authors trying to bury reviewers’ critical comments under pages of
caveats that serve mainly to obscure the main point of the paper. Here are a
couple of posts that I’ve found that deliver some useful advice to authors (here
and here).

The first post is on Elmore Leonard and his
ten rules for writing. If you have never heard of him or never read any of his
novels, let me recommend them to you wholeheartedly. He is a terrific “crime”
novelist whose books are one of life’s wonderful guilty pleasures. As you will
notice, not all the suggested rules will be all that applicable to linguistics
writing (though if the paper is on expletives (i.e. it’s raining) then maybe the first one should be ignored). However,
I agree with Taylor about 2 and 10, with one caveat. Throat clearing should be
avoided but a concise description of the problem of interest and why it is of interest is often very
helpful. This more or less is what
Krugman is highlighting in his deliciously nasty piece.

David Poeppel used to emphasize the importance of explaining
why anyone should care about the work
you are presenting. Why is the paper worth anyone’s time? Why is the problem
important? Why is the data worthy of note?
Why should anyone who has access to a good Elmore Leonard novel spend it
reading your paper instead? You’d be
surprised (or not) how often this simple question stumps, and if it stumps an
author, then chances are it will have baleful effects on a reader.

Tuesday, September 24, 2013

MP makes the working assumption that whatever happened to allow FL to emerge in its current state happened recently in evo time. This, in turn, relies on assuming that precursors of us were without our UG (though they may have had quite a bit of other stuff going on between the ears; in fact, they MUST have had quite a bit of stuff going on there). This assumption was recently challenged by Dediu and Levinson (D&L). Here's an evaluation of their paper by Berwick, Hauser and Tattersall (BHT) (here). BHT argue that there is no there there, a feature, it appears, of much of Levinson's current oeuvre (see here). They observe that the quick time frame is sorta/kinda supported by the archeological record, but that such evidence can hardly be dispositive, as it is not fine grained enough to address the properties of "the core linguistic competence" of our predecessors, which "does not fossilize." However, such evidence as exists does appear to (weakly) support the envisaged timeframe proposed (roughly 100,000 years). Indeed, as BHT note, D&L misrepresent an important source (Somel et al.) which concludes, contrary to D&L, that: "There is accumulating evidence that human brain development was fundamentally reshaped through several genetic events within the short time space between the human-Neanderthal split and the emergence of modern humans."

So take a look. The D&L paper got a lot of play, but if BHT are right (that's where my money is) then it's pretty much a time sink with little to add to the discussion. You surprised? I'm not. But read away.

Monday, September 23, 2013

Ewan (here)
provides me the necessary slap upside the head, thereby preventing a personality
shift from stiff-necked defender of the faith to agreeable, reasonable
interlocutor. Thanks, I needed that. My recognition that Alex C (and Thomas G)
had reasonable points to make in the context of our discussion, had stopped me
from thinking through what I take the responsibilities of formalizations to
consist in. I will try to remedy this a bit now.

Here’s the question: what makes for a good formalization? My
answer: a good formalization renders perspicuous the intended interpretation of the theory that it is formalizing. In other words, a good formalization (among
other things) clarifies vagaries that, though not (necessarily) particularly
relevant in theoretical practice, constitute areas where understanding is
incomplete. A good formalization, therefore, consults the theoretical practice
of interest and aims to rationalize it through formal exposition. Thus
formalizing theoretical practice can have several kinds of consequences. Here
are three (I’m sure there are others): it might reveal that a practice faces
serious problems of one kind or another due to implicit features of its
practice (see Berwick’s note here)
(or even possible inconsistency (think Russell on Frege’s program)), or it
might lay deeper foundations (and so set new questions) for a practice that is
vibrant and healthy (think Hilbert on Geometry), or it may attempt to clarify
the conceptual bases of a widespread practice (think Frege and
Russell/Whitehead on the foundations of arithmetic). At any rate, on this
conception, it is always legit to ask if the formalization has in fact captured the practice
accurately. Formalizations are judged against the accuracy of their depictions
of the theory of interest, not vice versa.

Now rendering the intended interpretation of a practice is
not at all easy. The reason is that most
practices (at least in the sciences) consist of a pretty well articulated body
of doctrine (a relatively explicit theoretical tradition) and an oral
tradition. This is as true for the Minimalist Program (MP) as for any other
empirical practice. The explicit tradition involves the rules (e.g. Merge) and
restrictions on them (e.g. Shortest Attract/Move, Subjacency). The oral
tradition includes (partially inchoate) assumptions about what the class of
admissible features are, what a lexical item is, how to draw the
functional/lexical distinction, how to understand thematic notions like
‘agent,’ ‘theme,’ etc. The written
tradition relies on the oral one to launch explanations: e.g. thematic roles
are used by UTAH to project vP structure, which in turn feeds into a
specification of the class of licit dependencies as described by the rules and
the conditions on them. Now in general, the inchoate assumptions of the oral
tradition are good enough to serve various explanatory ends, for there is wide
agreement on how they are to be applied in
practice. So for example, in the thread to A Cute Example (here)
what I (and, I believe David A) found hard to understand in Alex C’s remarks
revolved around how he was conceptualizing the class of possible features.
Thomas G came to the rescue and explained what kinds of features Alex C likely
had in mind:

"Is the idea that to get
round 'No Complex Values', you add an extra feature each time you want to
encode a non-local selectional relation? (so you'd encode a verb that selects
an N which selects a P with [V, +F] and a verb that selects an N which selects a
C with [V, +G], etc)?"

Yes, that's pretty much it.
Usually one just splits V into two categories V_F and V_G, but that's just a
notational variant of what you have in mind.

Now, this really does clarify things. How? Well, for people
like me, these kinds of features fall outside the pale of our oral tradition,
i.e. nobody would suggest using such contrived items/features to drive a
derivation. They are deeply unlike
the garden variety features we standardly invoke (e.g. +Wh, case, phi, etc.)
and, so far as I can tell, restricted to these kinds of features, the problem
Alex C notes does not seem to arise.
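To make the contrast concrete, here is a toy sketch of the diacritic-feature trick Thomas G describes: encoding a non-local selectional relation by splitting a category into one variant per downstream choice. This is my own illustrative construction, not anything from the thread; the grammar and the function name `split_for_lookahead` are invented for the example.

```python
# Toy illustration (my own construction) of encoding non-local selection
# with diacritic features, as in the V_F / V_G example above.

# A base grammar where selection is strictly local:
# V selects an N; an N selects either a P or a C.
base = {"V": ["N"], "N": ["P", "C"]}

def split_for_lookahead(selector, mediator, grammar):
    """Split `selector` into one diacritic variant per option that its
    complement `mediator` can itself select, so each variant "remembers"
    a non-local fact (what the N's own complement will be)."""
    return {f"{selector}_{opt}": [f"{mediator}_{opt}"]
            for opt in grammar[mediator]}

variants = split_for_lookahead("V", "N", base)
print(variants)  # {'V_P': ['N_P'], 'V_C': ['N_C']}
```

The point is that nothing in the bare formalism blocks this move: each added level of look-ahead just multiplies the category inventory, and it is exactly this sort of contrived feature that the oral tradition rules out informally rather than by explicit definition.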

Does this mean that all is well in the MP world? Yes and No.

Yes, in the sense that Alex C’s worries, though logically on
point, are weak in a standard MP (or GB) context for nobody supposes the kinds
of features he is using to generate the problem exist. This is why I still find
the adverb fronting argument convincing and dispositive with regard to the
learnability concerns it was deployed to address. Though I may not know how to
completely identify the feature malefactors that Thomas G describes, I am
pretty sure that nothing like them is part of a standard MPish account of
anything.[1]
For the problem Alex C identifies to be actually worrisome (rather than just
possibly so) would require showing that the run-of-the-mill, everyday,
garden-variety features that MPers use daily could generate trouble, not that
features practitioners would reject as “nutty” could.[2]

No, because it would be nice to know how to define these possible “nutty” features
out of existence and not merely rely on inchoate aspects of the oral tradition
to block them. If we could provide an explicit definition for what counts as a
legit feature (and what not) then we will have learned something of theoretical
interest even if it fails to have much of an impact on the practice, as a result of
the latter’s phobia for such kinds of features to begin with. Let me be clear
here: this is definitely a worthwhile thing to do and I hope that someone
figures out a way to do this. However, I doubt that it will (or should)
significantly alter the conclusion concerning FL/UG like those animated by
Chomsky’s “cute example.”[3]

[1]
I completely agree with you that if we reject P&P,
the evaluation measure ought to receive a lot more attention. However, in the
case of trivial POS argument such as subject/aux inversion, I think the
argument can be profitably run without having a precise theory of the
evaluation measure.

[2]
Bob Berwick’s comment (here)
shows how to do this in the context of GPSG and HPSG style grammars. These
systems rely on complex features to do what transformations do in GB style
theories. What Bob notes is that in the context of features endowed with these
capacities, serious problems arise. The take home message: don’t allow such
features.

[3]
I am convinced by what I think is a slightly different
argument, which is that you can use another technology to get the dependency to
work (passing up features), but that just means our theory should be ruling out
that technology in these cases, as the core explanandum (i.e. the
generalization) still needs to be explained. I think that makes sense…

Thursday, September 19, 2013

Recall the idea: mental computation involves DNA/RNA manipulations. The idea is that these complex molecules have the structure to support classical computations, as opposed to neural nets, and are ideal candidates for memory storage, again as opposed to nets. The Gallistel-King conjecture is that brain computations exploit this structure (e.g. here). One place quite ripe for this kind of process is long term memory. There is progressively more evidence that the genome plays an important role in this area (see here). Here are two more reports (here and here) on recent memory research highlighting the role of genes in fixing and extinguishing memories.

Andrew Gelman, a statistician at Columbia (and one whose
opinions I generally respect (I read his blog regularly) and whose work, to the
degree that I understand it, I really like), has a thing about Hauser (here).[1]
What offends him are Hauser’s (alleged) data faking (yes, I use ‘alleged’
because I have personally not seen the evidence, only heard allegations, and given how easy these are to make, well, let's just try not to jump to comfortable conclusions). Here
he explains why the “faking” is so bad, not rape, murder or torture bad, but science
wise bad. Why? Because what fake data does is “waste people’s time (and some lab
animals’ lives) and slow down the progress of science.” Color me skeptical.

Here’s what I mean: is this a generic claim or one specific
to Hauser’s work? If the latter, then I
would like to see the evidence that his
alleged improprieties had any such effect. Let me remind you again (see here and here) that the results of all of Hauser’s papers that
were questioned have since been
replicated. Thus, the conclusions of these papers stand. Anyone who relied
on them to do their research did just fine.
Was there a huge amount of time and effort wasted? Did lab animals get
used in vain? Maybe. What’s the evidence? And maybe not. They all replicated. Moreover, if the measure of his crime is
wasted time and effort, did Hauser’s papers really lead down more blind alleys
and wild goose chases than your average unreplicable psych or neuro paper (here)?

As for the generic claim, I would like to see more evidence
for this as well. Among the “time
wasters” out there, is faked data really the biggest problem, or even a very big problem? Or is this sort of like Republican "worries" about fake voters inundating the polls and voting for Democrats? My impression is that the misapplication of
standard statistical techniques to get BS results that fail to replicate is
far more problematic (see
here and here).
If this is so, then Gelman’s fake data worries may, by misdirection, divert
attention from the real time sinks, viz. the production of non-replicable “results,”
which, so far as I can tell is closely tied to the use of BS statistical
techniques to coax significance in one out of every 20 or so experiments. We should be
so lucky that the main problem is fakery!
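The "one in 20" arithmetic is easy to check by simulation. The sketch below is my own illustration, not anything from the post: it runs many null experiments, comparing two samples drawn from the same distribution, and treats a t statistic beyond roughly 2 as "significant" (an approximation of the two-sided p < .05 cutoff at these sample sizes). About 5% come out significant by chance alone.

```python
import random
import statistics

random.seed(1)

def significant(n=30, crit=2.0):
    """One null experiment: two groups drawn from the SAME distribution,
    so any 'significant' difference is a false positive.  crit ~ 2.0
    approximates the two-sided t cutoff for p < .05 at this sample size."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return abs(t) > crit

trials = 2000
false_pos = sum(significant() for _ in range(trials)) / trials
print(f"false-positive rate: {false_pos:.3f}")  # hovers around 0.05
```

Run enough labs, each running enough experiments, and the 1-in-20 "hits" pile up without anyone faking anything, which is the sense in which the non-replicable-results problem dwarfs outright fraud.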

So that I am not misunderstood, let me add that nobody I know condones faking data. But
this is not because it in some large measure retards the forward march of
science (this claim may be true, but it is not a truism), but because faking is
quite generally bad. Period (again, not unlike voter fraud). And it
should not be practiced or condoned for the same reason that lying, bullying,
and plagiarism should not be practiced or condoned. These are all lousy ways to
behave. However, that said, I have real
doubts that fake data is the main problem holding back so many of the “sciences,”
and claiming otherwise without evidence can misdirect attention from where it
belongs. The main problem with many of the “sciences” is absence of even a
modicum of theory, i.e. lack of insight (i.e. absence of any idea about what’s
going on), and all the data mining in the world cannot substitute for one or
two really good ideas. The problem I have with Gelman’s obsession is that in
the end it suggests a view of science that I find wrongheaded: that data mining
is what science is all about. As the posts noted above indicate, I could not
disagree more.

[1]
This is just the latest of many posts on this topic. Gelman, for some reason I
cannot fathom, also has a thing about Chomsky, as flipping through his blog
will demonstrate (e.g. here,
and here).