
Monday, August 31, 2015

I ran across this nice little discussion today (here) that others might find interesting. It identifies a plausible cost behind getting too methodologically stringent. There is a tradeoff between generating false positives and false negatives. What we want are methods that get to the truth when it is there (sensitivity) and tell us when what we think is so is not (specificity). There is, not surprisingly, a tension between the two, and tightening our "standards" can have unfortunate effects. This is especially true in domains where we know little. When you know a lot, then maybe generating exciting new ideas is less important than not being misled experimentally. But when we know little, then maybe a little laxity is just what the scientists ordered. At any rate, I found the framing of the issues useful so I pass the piece on to you.
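For those who like to see the tradeoff in numbers, here is a minimal sketch. Everything in it is an assumption made for illustration: the "evidence scores" are invented, and the rule "declare an effect if the score clears a threshold" is just a stand-in for whatever standard a field adopts. It is not taken from the linked discussion.

```python
def rates(scores_true, scores_null, threshold):
    """Return (sensitivity, specificity) for the rule
    'declare an effect when score >= threshold'."""
    true_pos = sum(s >= threshold for s in scores_true)
    true_neg = sum(s < threshold for s in scores_null)
    return true_pos / len(scores_true), true_neg / len(scores_null)

# Hypothetical evidence scores: made-up numbers, not data from any study.
real_effects = [1.2, 1.8, 2.1, 2.6, 3.0]   # cases where an effect really exists
no_effects = [0.2, 0.7, 1.1, 1.5, 2.2]     # cases where nothing is going on

lax = rates(real_effects, no_effects, threshold=1.0)
strict = rates(real_effects, no_effects, threshold=2.5)

print("lax standard:    sensitivity=%.1f specificity=%.1f" % lax)
print("strict standard: sensitivity=%.1f specificity=%.1f" % strict)
```

With these toy numbers the lax standard catches every real effect but lets most of the duds through, while the strict one lets no duds through but misses most of the real effects. Which error is worse depends on how much you already know.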

Friday, August 28, 2015

As you no doubt all know, there is a report today in the NYT about a study in Science that appears to question the reliability of many reported psych experiments. The hyperventilated money quote from the article is the following:

...a painstaking yearslong effort to reproduce 100 studies published in three leading psychology journals has found that more than half of the findings did not hold up when retested.

The clear suggestion is that this is a problem as over half the reported "results" are not. Are not what? Well, not reliable, which means to say that they may or may not replicate. This, importantly, does not mean that these results are "false" or that the people who reported them did something shady, or that we learned nothing from these papers. All it means is that they did not replicate. Is this a big deal?

Here are some random thoughts, but I leave it to others who know more about these things than I do to weigh in.

First, it is not clear to me what should be made of the fact that "only" 39% of the studies could be replicated (the number comes from here). Is that a big number or a small one? What's the baseline? If I told you that over 1/3 of my guesses concerning the future value of stocks were reliable then you would be nuts not to use this information to lay some very big bets and make lots of money. If I were able to hit 40% of the time I came up to bat I would be a shoo-in inductee at Cooperstown. So is this success rate good or bad? Clearly the headline makes it look bad, but who knows.

Second, is this surprising? Well, some of it is not. The studies looked at articles from the best journals. But these venues probably publish the cleanest work in the field. Thus, by simple regression to the mean, one would expect replications to not be as clean. In fact, one of the main findings is that even among studies that did replicate, the effect sizes shrank. Well, we should expect this given the biased sample chosen from.
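The regression-to-the-mean point can be sketched with a toy simulation. All the numbers here are assumptions chosen for illustration (a fixed true effect, Gaussian noise, and a "publication" cutoff on the original estimate); nothing is drawn from the Science study itself.

```python
import random

def simulate(n_studies=20000, true_effect=0.2, noise_sd=0.3, cutoff=0.5, seed=0):
    """Toy model: every study measures the same true effect plus noise,
    but only originals whose estimate clears `cutoff` get 'published'.
    Each published study is then replicated once with fresh noise."""
    rng = random.Random(seed)
    published, replications = [], []
    for _ in range(n_studies):
        original = true_effect + rng.gauss(0, noise_sd)
        if original >= cutoff:  # only the "cleanest" results make it into print
            published.append(original)
            replications.append(true_effect + rng.gauss(0, noise_sd))
    mean = lambda xs: sum(xs) / len(xs)
    return mean(published), mean(replications)

published_mean, replication_mean = simulate()
print("mean published effect:   %.2f" % published_mean)
print("mean replication effect: %.2f" % replication_mean)
```

In this sketch nobody does anything shady and the effect is perfectly real, yet the replications come out smaller than the published estimates simply because the published ones were selected for being large.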

Third, to my mind it's amazing that any results at all replicated given some of the questions that the NYT reports being asked. Experiments on "free will" and emotional closeness? These are very general kinds of questions to be investigating and I am pretty sure that these phenomena are the results of the combined effects of very many different kinds of causes that are hard to pin down and likely subject to tremendous contextual variation due to unknown factors. One gets clean results in the real sciences when causes can be relatively isolated and interaction effects controlled for. It looks like many of the experiments reported were problematic not because of their replicability but because they were not looking for the right sorts of things to begin with. It's the questions, stupid!

Fourth, in my schadenfreudeness I cannot help but delight in the fact that the core data in linguistics, gathered in the very informal ways that it is, is a lot more reliable (see Sprouse and Almeida and Schutze stuff on this). Ha!!! This is not because of our methodological cleverness, but because what we are looking for, grammatical effects, is pretty easy to spot much of the time. This does not mean, of course, that there aren't cases where things can get hairy. But over a large domain, we can and do construct very reliable data sets using very informal methods (e.g. can anyone really think that it's up for grabs whether 'John hugged Mary' can mean 'Mary hugged John'?). The implication of this is clear, at least to me: frame the question correctly and finding effects becomes easier. IMO, many psych papers act as if all you need to do is mind your p-values and keep your methodological snot clean and out will pop interesting results no matter what data you throw in. The limiting case of this is the Big Data craze. This assumption is false, as anyone with half a brain knows. One can go further: what much work in linguistics shows is that if you get the basic question right, damn methodology. It really doesn't much matter. This is not to say that methodological considerations are NEVER important. Only that they are important in a given context of inquiry and cannot stand on their own.

Fifth, these sorts of results can be politically dangerous even though our data are not particularly flighty. Why? Well, too many will conclude that this is a problem with psychological or cognitive work in general and that nothing there is scientifically grounded. This would be a terrible conclusion and would affect linguistic support adversely.

There are certainly more conclusions/thoughts this report prompts. Let me reiterate what I take to be an important conclusion. What these studies show is that stats is a tool and that method is useful in context. Stats don't substitute for thought. They are neither necessary nor sufficient for insight, though on some occasions they can usefully bring into focus things that are obscure. It should not be surprising that this process often fails. In fact, it should be surprising that it succeeds on occasion and that some areas (e.g. linguistics) have found pretty reliable methods for unearthing causal structure. We should expect this to be hard. The NYT piece makes it sound like we should be surprised that reported data are often wrong and it suggests that it is possible to do something about this by being yet more careful and methodologically astute, doing our stats more diligently. This, I believe, is precisely wrong. There is always room for improvement in one's methods. But methods are not what drive science. There is no method. There are occasional insights and when we gain some it provides traction for further investigation. Careful stats and methods are not science, though the reporting suggests that this is what many think it is, including otherwise thoughtful scientists.

Wednesday, August 26, 2015

As many of you know, when Neil Armstrong landed on the moon he forgot to bring along enough indefinite articles. On landing he made the following very famous statement:

That's one small step for man, one giant leap for mankind

The semanticists in the crowd will appreciate that what he meant to say was not this contradiction but "that's one small step for A man…"

For a while I thought that I had found this lost indefinite. It found its way into Kennedy's famous Berlin speech, where he identified with the city's residents by uttering the famous sentence "Ich bin Ein Berliner." However, my German friends assured me that though Kennedy's shout out was well received and was very moving, it did not quite mean what it said. Apparently what he wanted to say was "Ich bin Berliner"; the added Ein, the indefinite article, leads to the following translation: "I am a jelly donut," a Berliner being a scrumptious city-specific treat. At any rate, move the indefinite from Kennedy to Armstrong and historical linguistics is set right.

Ok, this is all just preamble for the real point of this post. I heard a great trivia story today that I thought I would relay to you. I know that I don't usually do this sort of stuff, but hey, WTH. I got this from a friend and it puts Armstrong into a very positive light. Here is what I got verbatim.

IN CASE YOU DIDN'T ALREADY KNOW THIS LITTLE TIDBIT OF WONDERFUL TRIVIA..............

ON JULY 20, 1969, AS COMMANDER OF THE APOLLO 11 LUNAR MODULE, NEIL ARMSTRONG WAS THE FIRST PERSON TO SET FOOT ON THE MOON.

HIS FIRST WORDS AFTER STEPPING ON THE MOON,

"THAT'S ONE SMALL STEP FOR MAN, ONE GIANT LEAP FOR MANKIND," WERE TELEVISED TO EARTH AND HEARD BY MILLIONS.

BUT, JUST BEFORE HE RE-ENTERED THE LANDER, HE MADE THE ENIGMATIC REMARK "GOOD LUCK, MR. GORSKY."

MANY PEOPLE AT NASA THOUGHT IT WAS A CASUAL REMARK CONCERNING SOME RIVAL SOVIET COSMONAUT.

HOWEVER, UPON CHECKING, THERE WAS NO GORSKY IN EITHER THE RUSSIAN OR AMERICAN SPACE PROGRAMS ..

OVER THE YEARS, MANY PEOPLE QUESTIONED ARMSTRONG AS TO WHAT THE 'GOOD LUCK, MR. GORSKY' STATEMENT MEANT, BUT ARMSTRONG ALWAYS JUST SMILED.

ON JULY 5, 1995, IN TAMPA BAY, FLORIDA, WHILE ANSWERING QUESTIONS FOLLOWING A SPEECH, A REPORTER BROUGHT UP THE 26-YEAR-OLD QUESTION ABOUT MR. GORSKY TO ARMSTRONG.

THIS TIME HE FINALLY RESPONDED BECAUSE HIS MR. GORSKY HAD JUST DIED, SO NEIL ARMSTRONG FELT HE COULD NOW ANSWER THE QUESTION. HERE IS THE ANSWER TO

"WHO WAS MR. GORSKY":

IN 1938, WHEN HE WAS A KID IN A SMALL MID-WESTERN TOWN, HE WAS PLAYING BASEBALL WITH A FRIEND IN THE BACKYARD.

HIS FRIEND HIT THE BALL, WHICH LANDED IN HIS NEIGHBOR'S YARD BY THEIR BEDROOM WINDOW. HIS NEIGHBORS WERE MR. AND MRS. GORSKY.

AS HE LEANED DOWN TO PICK UP THE BALL, YOUNG ARMSTRONG HEARD MRS. GORSKY SHOUTING AT MR. GORSKY,

"SEX! YOU WANT SEX?! YOU'LL GET SEX WHEN THE KID NEXT DOOR WALKS ON THE MOON!"

It broke the place up. NEIL ARMSTRONG'S FAMILY CONFIRMED THAT THIS IS A TRUE STORY.

2. Linguistic creativity: “a mature native speaker can produce a new sentence on the
appropriate occasion, and other speakers can understand it immediately, though
it is equally new to them” (Chomsky, Current Issues: 7). In other words, a
native speaker of a given L has command over a discrete infinity (for all
practical and theoretical purposes) of differently interpreted linguistic
expressions.

3. Plato’s Problem: Any human child can acquire any language with native proficiency if
placed in the appropriate speech community.

These four facts have two salient properties. First, they
are more or less obviously true. That’s why nobody will win a Nobel prize for “discovering”
any of them. It is obvious that nothing does language remotely like humans do,
and that any kid can learn any language, and that there is for all practical
and theoretical purposes an infinity of sentences a native speaker can use, and
that the kind of linguistic facility we find in humans is a recentish
innovation in biological terms (ok, the last is slightly more tendentious, but
still pretty obviously correct). Second, these facts can usefully serve as
boundary conditions on any adequate theory of language. Let’s consider them in
a bit more detail.

(1) implies that there is something special about humans
that allows them to be linguistically proficient in the unique way that they
are. We can name the source of that proficiency: humans (and most likely only humans) have a
faculty of language (FL) that is linguistically dedicated and “designed” to meet the
computational exigencies peculiar to language.

(2) implies that native speakers acquire Gs (recursive procedures
or rules) able to generate an unbounded number of distinct linguistic objects
that native speakers can use to express their thoughts and to understand the
expressions that other native speakers utter. In other words, a key part of
human linguistic proficiency consists in having an internalized grammar of a
particular language able to generate an unbounded number of different
linguistic expressions. Combined with (1), we get to the conclusion that part
of what makes humans biologically unique is a capacity to acquire Gs of the
kind that we do.
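To make "recursive procedure" concrete, here is a toy sketch. The single embedding rule and the names in it are assumptions made up for illustration; it is in no way an analysis of English, just a demonstration that one finite rule that can call itself yields a distinct expression at every depth.

```python
def sentence(depth):
    """Build a sentence with `depth` levels of clausal embedding.
    One finite rule (embed a sentence under 'Sue thinks that') applied
    recursively gives a new, distinct sentence for every depth."""
    if depth == 0:
        return "John hugged Mary"
    return "Sue thinks that " + sentence(depth - 1)

for d in range(3):
    print(sentence(d))
```

There is no longest output here: for any sentence the procedure produces, it can produce a longer one, which is all that "unbounded" is meant to convey.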

(3) implies that all human Gs have something in common; they
are all acquirable by humans. This strongly suggests that there are some
properties P that all humans have that allow them to acquire Gs in the effortless
reflexive way that they do.

Indeed, cursory inspection of the obvious facts allows us to
say a bit more: (i) we know that the data available to the child vastly
underdetermine the kinds of Gs that we know humans can acquire; thus (ii) it
must be the case that some of the limits on the acquirable Gs reflect “the
general character of his [the acquirer’s, NH] learning capacity rather than the
particular course of his experience” (Chomsky, Current Issues: 112). (1), (2) and (3) imply that FL consists in
part of language specific capacities that enable humans to acquire some kinds
of Gs more easily than others (and, perhaps, some not at all).

Here’s another way of saying this. Let’s call the
linguo-centric aspect of FL, UG. More specifically, UG is those features of FL
that are linguistically specific, in contrast to those features of FL that are
part of human or biological cognition more generally. Note that this allows for
FL to consist of features that are not
parts of UG. All that it implies is that there are some features of FL that are linguistically proprietary. That some
such features exist is a nearly apodictic conclusion given facts (1)-(3). Indeed,
it is a pretty sure bet (one that I would be happy to give long odds on) that
human cognition involves some biologically
given computational capacities unique to
humans that underlie our linguistic facility. In other words, the UG part of FL
is not null.[1]

The fourth fact implies a bit more about the “size” of UG: (4)
implies that the UG part of FL is rather restricted.[2]
In other words, though there are some
cognitively unique features of FL (i.e. UG is not empty), FL consists of many
operations that FL shares with other cognitive components and that are likely
shared across species. In other words, though UG has content, most of FL consists of operations and
conditions not unique to FL.

Now, the argument outlined above is often taken to be very
controversial and highly speculative. It isn’t. That humans have an FL with some
unique UGish features is a trivial conclusion from very obvious facts. In
short, the conclusion is a no-brainer, a virtual truism! What is controversial,
and rightly so, is what UG consists in.
This is quite definitely not obvious and this is what linguists (and others
interested in language and its cognitive and biological underpinnings) are (or
should be) trying to figure out. IMO, linguists have a pretty good working
(i.e. effective) theory of FL/UG and have promising leads on its fundamental
properties. But, and I really want to emphasize this, even if many/most of the
details are wrong the basic conclusion that humans have an FL with UGish
touches must be right. To repeat, that FL/UG is a human biological
endowment is (or should be) uncontroversial, even if what FL/UG consists in
isn’t.[3]

Let me put this another way, with a small nod to 17th
and 18th century discussions about skepticism. Thinkers of this era
distinguished logical certainty from moral certainty. Something is logically
certain iff its negation is logically false (i.e. only logical truths can be
logically certain). Given this criterion, not surprisingly, virtually nothing is
certain. Nonetheless, we can and do judge many things to be more or less certain that
are neither tautologies nor contradictions. Those things that enjoy a high
degree of certainty but are not logically certain are morally certain. In other words, they are worth a sizable bet with
long odds given. My claim is the following: that UG exists is morally certain.
That there is a species-specific dedicated capacity based on some intrinsic linguistically specific
computational capacities is as close to a sure thing as we can have. Of course,
it might be wrong, but only in the
way that our bet that birds are built to fly might be wrong, or fish are built
to swim might be. Maybe there is nothing special about birds that allows them to
fly (maybe as Chomsky once wryly suggested, eagles are just very good jumpers).
Maybe fish swim like I do only more so. Maybe. And maybe you are interested in
this beautiful bridge in NYC that I have to sell you. That FL/UG exists is a
moral certainty. The interesting question is what’s in it, not if it’s there.

Why do I mention this? Because in my experience, discussions
in and about linguistics often tend to run the whether/that and the what/how
questions together. This is quite obvious in discussions of the Poverty of
Stimulus (PoS). It is pretty easy to establish that/whether a given phenomenon is subject to PoS, i.e. that there
is not enough data in the PLD to fix a given mature capacity. But this does not
mean that any given solution for that problem is correct. Nonetheless, many
regularly conclude that because a
proposed solution is imperfect (or worse), there is no PoS problem at all
and that FL/UG is unnecessary. But this is a non sequitur. Whether something
has a PoS profile is independent of whether any of the extant proposed
solutions are viable.

Similarly with evolutionary qualms regarding rich UGs: that
something like island effects falls under the purview of FL/UG is, IMO, virtually
incontestable. What the relevant mechanisms are and how they got into FL/UG is
a related but separable issue. Ok, I want to walk this back a bit: that some
proposal runs afoul of Darwin’s Problem (or Plato’s) is a good reason for
re-thinking it. But, this is a reason for rethinking the proposed specific
mechanism, it is not a reason to reject the claim that FL has internal
structure of a partially UGish nature. Confusing questions leads to
baby/bathwater problems, so don’t do it!

So what’s the take-home message? We can know that something
is so without knowing how it is so. We know that FL has a UGish component by
considering very simple evident facts. These simple evident facts do not
suffice to reveal the fine structure of FL/UG but not knowing what the latter
is does not undermine the former conclusion that it exists. Different
questions, different data, different arguments. Keeping this in mind will help
us avoid taking three steps backwards for every two steps forward.

Monday, August 24, 2015

I just returned from an excellent two weeks eating too much, drinking excessively, getting tanned and sweaty, reading novels and listening to weird opera. In the little time I had not so productively dedicated, I ran across four papers that readers might find interesting as they relate to themes that have been discussed more than glancingly in FoL. Here they are.

The
first deals with publishing costs and what can be done about it. Here’s a
quote that will give you an excellent taste of the content:

“The
whole academic publishing industry is a gigantic web of avarice and
selfishness, and the academic community has not engaged to the extent it
perhaps should have to stop it,” Leeder said. “Scholarly publishing is a bit
like the Trans-Pacific
Partnership Agreement. It’s not totally
clear what the hell is going on, but you can be sure someone is making a hell
of a lot of money from it.”

The
second and third discuss how successful pre-registration
has been in regulating fishing for significant results. It seems that the
results have been striking. Here’s a taste again:

The launch of the clinicaltrials.gov registry
in 2000 seems to have had a striking impact on reported trial results,
according to a PLoS ONE study that many
researchers have been talking about online in the past week.

A 1997
US law mandated the registry’s creation, requiring researchers from 2000 to
record their trial methods and outcome measures before collecting data. The
study found that in a sample of 55 large trials testing heart-disease treatments,
57% of those published before 2000 reported positive effects from the
treatments. But that figure plunged to just 8% in studies that were conducted
after 2000.

The
third is a piece on one problem with peer review. It seems that some like
to review themselves and are often very impressed with their own work. I
completely understand this. Most interesting to me is that this problem
arose in Springer journals. Springer is one of the larger and more expensive
publishers. It seems that the gate-keeping function of the prestige journals is
not all that it is advertised to be.

The self-review issue, I suspect, though dramatic and fun
(Hume provided an anonymous favorable review of the Treatise, if memory serves) is probably the tip of a bigger
iceberg. In a small field like linguistics, like-minded people often review one
another’s papers (like the old joke concerning the New York Review of Each Other’s Books) and this can make it more
difficult for unconventional views (those that fall outside the consensus) to
get an airing. I believe that this partially explains the dearth of purely
theoretical papers in our major journals. There is, as I’ve noted many times,
an antipathy for theoretical speculation, an attitude reflected in the standard
review process.

The
last article is about where “novel” genes come from. Interestingly, they seem to be able to just
“spring into existence.” Moreover, it is claimed that this process might well
be very common and “very important.” So, the idea that change might just pop up
and make important contributions seems to be gaining biological respectability.
I assume that I don’t need to mention why this possibility might be of interest
to linguists.

Sunday, August 9, 2015

I am on vacation and had hoped not to post anything for two weeks. But just as I was settling into a nice comfortable relaxing sojourn I got this in my inbox. It reports on a universal that Ted Gibson discovered, which states that words that semantically go together are more often than not linearly close to one another. The explanation, it seems, is that this makes it easier to understand what is being said and easier to express what you have in mind, or so the scientists tell us. My daughter and sister-in-law read this and were surprised to find that this kind of thing passes for insight, but it seems that it does. More interesting still is the desire to see this surprising new finding as bearing on Chomsky's claims about universal grammar. It seems that we finally have evidence that languages have something in common, namely the tendency to put things that semantically go together linearly close to one another. Finally a real universal.

The astute reader will note that once again universal here is understood in Greenberg's terms, not Chomsky's. Happily, Ted Gibson notes as much (to what I take to be the disappointment of the interviewer) when asked how this stuff bears on Chomsky's claims (here). As he notes, it is largely at right angles to the relevant issues concerning Chomsky universals.

I now think that there is really no way to flush this misconception out of the intellectual cogno-sphere. For reasons that completely elude me, people refuse to think about Universals in anything but Greenbergian terms. I suspect that this is partly due to a strong desire to find that Chomsky is wrong. And given that his big claim concerns universal grammar and given that it is not that hard to find that languages robustly differ in their surface properties, it seems pretty easy to show that Chomsky's claims were not only wrong, but obviously so. This fits well into a certain kind of political agenda: not only is Chomsky a political naif, but he is also a scientific one. If only he had bothered looking at the facts rather than remain blinkered by his many preconceptions. Of course, if Chomsky universals are not Greenberg universals then this simple-minded dismissal collapses. That's one reason why, IMO, the confusion will be with us for a long time. It's a confusion that pays dividends.

There are several others as well. You all already know that I believe that Empiricism cannot help confusing the two kinds of universals, as Greenberg universals are the only kinds that the Eishly inclined can feel comfortable with. But this may be too fancy an account for why this stuff gets such widespread coverage in the popular press. For the latter, I think the simple desire to sink Chomsky suffices.

At any rate, for those that care, it's hard to believe that anyone will find this "discovery" surprising. Who would have thought that keeping semantically significant units close to one another could be useful? Sure surprised me.