I think that depends immensely on which framework you are talking about. HPSG seems to be used a fair bit in CL and practitioners seem fairly sensitive to the computational tractability of their grammars. If you talk to a big HPSG person it is only a matter of time before they pull out their laptop and show you the latest HPSG parser they've been working on. Ok, maybe that's an exaggeration, but you get the idea. If you're interested more in the minimalist side, then I think there's much less interest in issues of computability, or in integrating insights from CL into the theory. – Alan H. – 2011-09-14T23:54:11.823

I don't think @johanbev was talking about any particular framework but about the field(s) as a whole. – hippietrail – 2011-09-15T08:20:06.497

@hippietrail: Well, then, I think the question is much too broad for a site like this. – Alan H. – 2011-09-17T00:08:45.470

Do you have in mind a particular area? Like phonology, syntax, morphology? The answer will differ somewhat based on what level you are interested in. And, of course, if you have in mind a particular theoretical framework, that would help as well. – Alan H. – 2011-09-17T00:16:08.300

@Alan: Perhaps we could narrow the question to make it more like a "yes/no with examples" question rather than a "give me a full list" question. Something like: Have NLP/CL brought anything to the table of pencil-and-paper linguistics? – hippietrail – 2011-09-18T08:46:20.503

Answers

10

I'd say the very idea of using a corpus, instead of picking examples out of thin air and pronouncing them sound or unsound, has affected all parts of linguistics.

What was the status of corpus linguistics before NLP? Surely it existed. I'm assuming computers were used for linguistics research before and besides NLP, which involves much harder problems than corpora, which are comparatively easy. – hippietrail – 2011-09-15T08:18:51.803

2

@hippietrail It existed but it was... in some circles not quite proper to be seen mucking about with performance instead of concentrating on competence. It's all part of the linguistics wars.

Sounds like a great source of questions! – hippietrail – 2011-09-15T08:46:27.937

7

If we are speaking really generally: Chomsky spoke at my school years ago, and I thought he had an interesting insight into CL. He said few understand its potential to empirically and computationally verify the language models that paper-and-pencil linguists, like himself, were creating. UG, for example, if it is a valid way to describe syntax, should be usable in a computer application to emulate human processing of language, and such an application should be good enough to show that the model works. A lot of the community (in my view, when I studied CL at a school where paper and pencil ruled the land) saw CL as something way out there, a field unto itself. He believed really solid language models had to be reconstructed and tested, and that CL was where the rubber hit the road, not a place where people use heuristics to cheat at language cognition, as much of it has become in terms of economic viability.

...which is to emphasize the lack: if Chomsky notes that it would be a good idea to do it, that means it ain't being done. – Mitch – 2011-09-24T23:21:09.690

6

An important — if often overlooked or ignored — contribution of CL/NLP to theoretical linguistics is the implemented demonstration of insufficiency. More specifically, implementing "pencil and paper" theoretical models derived from a small set of examples, and testing the implementations on suitably large quantities of sentences, can reveal failures (and has).

A particular example — not widely quoted in the theoretical community — is a paper available on the Rutgers Optimality Archive by Lauri Karttunen, entitled "The Insufficiency of Paper-and-Pencil Linguistics: the Case of Finnish Prosody", which demonstrates via computational implementation that published Optimality Theoretic analyses of the Finnish prosodic system fail unexpectedly on a certain class of inputs.

It might be argued that this isn't a contribution of CL/NLP per se, but the implementation of OT is only made feasible by heavy use of finite-state machines, a core tool of modern CL/NLP.
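To make the flavor of such an implementation concrete, here is a toy sketch of OT's evaluation scheme in Python (deliberately much simpler than Karttunen's finite-state implementation): GEN proposes candidates, ranked constraints assign violation counts, and the candidate with the lexicographically smallest violation profile wins. The constraints ONSET and NOCODA and the "."-delimited syllabifications are made up for illustration.

```python
# Toy Optimality-Theoretic evaluation with hypothetical constraints;
# NOT Karttunen's finite-state implementation, just the core idea.

VOWELS = set("aeiou")

def onset(cand):
    """ONSET: one violation per syllable lacking a consonantal onset."""
    return sum(1 for syll in cand.split(".") if syll and syll[0] in VOWELS)

def nocoda(cand):
    """NOCODA: one violation per syllable ending in a consonant."""
    return sum(1 for syll in cand.split(".") if syll and syll[-1] not in VOWELS)

def evaluate(candidates, constraints):
    """Optimal candidate under a strict ranking (highest-ranked first).

    Comparing violation tuples lexicographically mirrors strict
    constraint domination in OT.
    """
    return min(candidates, key=lambda c: tuple(con(c) for con in constraints))

# Three candidate syllabifications of the same string:
print(evaluate(["at.ka", "a.tka", "atk.a"], [onset, nocoda]))  # a.tka
```

Scaling this from a handful of hand-picked candidates to the full candidate sets a real analysis must survive is exactly where finite-state methods, and the surprises Karttunen reports, come in.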

When I spoke to Prof. Pollard, who, I suppose, can be considered a paper-and-pencil linguist, I believe he said that the reason he did HPSG, rather than the more logic-driven approach he is pursuing now (CVG), was simply that he did not know logic at the time. The slides for his course on that formalism are at http://www.coli.uni-saarland.de/courses/logical-grammar/page.php?id=materials. In any case, doing derivations for sentences in such a grammar is exactly like doing derivations in mathematical logic.

Coming to my point: from what I've seen (which isn't much), it's not NLP/CL that's at the cutting edge of linguistics; it's mathematics. But this does address the aspects you raised about computability and formal specification.

A note about Pollard's logical grammar: it addresses three aspects of sentences (phonology, syntax, and semantics), all with exactly the same logic-based methods.

Most theoretical linguists understand the value of the computer in modelling language as a means of verification. However, most computational linguists focus on statistical models, which are productive for real-world applications but run counter to the aims of theoretical linguistics. Statistics simply plays no role in human language.
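For readers unfamiliar with what "statistical models" means here, the simplest example is an n-gram model, which estimates each word's probability from the words preceding it. The corpus below and the function names are made up purely for illustration:

```python
# Minimal maximum-likelihood bigram model over a toy corpus
# (illustrative only; real systems use smoothing and far more data).
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count adjacent word pairs, with <s>/</s> marking sentence boundaries."""
    counts = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

model = train_bigrams(["the cat sat", "the cat ran", "the dog sat"])
print(prob(model, "the", "cat"))  # 2/3: "the" is followed by "cat" in two of its three uses
```

Whatever one thinks of its relevance to competence, this is the kind of model that dominates applied CL/NLP work.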

The mismatch in approaches results in both fields largely ignoring one another.

Be that as it may, many theoretical models, such as LFG and HPSG, incorporate discoveries and approaches from computational linguistics. The fields are not entirely mutually exclusive, just for the most part.

I think that "statistics simply plays no role in human language" runs against the way many descriptive linguists consider some constructions grammatical or ungrammatical based solely on how many people say/write things in that particular way. Consider the expressions "different from", "different than", "different to", "different of", "different by", etc. – prash – 2011-09-18T16:05:30.700

One way in which statistics plays a role in human language comes when you consider on-line processing of language. A lot of the evidence we have about how the lexicon is structured and accessed comes from examining correlations between frequency of occurrence of an expression (e.g., in a corpus) and patterns in neural or behavioral measures. – Alexis Wellwood – 2011-09-19T23:24:05.317

In CL, "statistics" usually means n-gram models, which don't seem to play a role in the derivation of a sentence. Granted, there is some mystery as to which words the speaker selects to build a sentence (i.e. the numeration), which perhaps is statistical (or can be modelled as such). Of course, statistics are a useful tool to learn about language, the lexicon, derivation, etc. – Ethan – 2011-09-20T19:06:09.863