January 09, 2007

Have lexicon, will travel

At the recently-concluded LSA 2007 conference, Mark Liberman gave a plenary speech titled "The Future of Linguistics". SC hopes that Prof. Liberman will make his slides available online for the perusal of those who didn't attend -- aside from being a bracing challenge to the field to do a better job of publicizing itself and staying relevant, it featured a hysterical Photoshopped cover of the third issue of the best popular linguistics magazine never published.

Humor value aside, Prof. L. raised a number of points that should be of concern to anyone with a linguistics degree and sans tenured position. To sum up the key points: basic linguistic skills like the technical analysis of sentence structure, formerly key components of education, are taught late (if at all) to most people, and not by linguists. Despite the centrality of linguistics to much of 20th century science (particularly analytic philosophy and anthropology), linguistics departments are not present at many (most?) universities, and training in the key findings of the field is far from systematically available. In particular, he made an interesting comparison between the 20th century histories of linguistics and psychology, and questioned why it was that linguists are outnumbered almost 40:1 by psychologists (as measured by your choice of professional society affiliations or introductory course enrollments). Along the way, he suggested that perhaps part of the problem facing the field was the relatively late development of its modern incarnation, with much of the work taking place in the 1940s and '50s, as opposed to the decades-earlier establishment of modern psychology.

This immediately struck SC as the one false note in an otherwise outstanding analysis, and during the Q&A session that followed, your host raised an alternative comparison, to the development of computer science. In the 1940s and '50s, computer science was an activity largely carried out in the heads of John von Neumann and the ENIAC researchers at Penn. Yet today, you can walk into any Barnes & Noble ([wait, you walk into bookstores? what about Amazon? -- ed.]) and pick from a dozen shelves' worth of books on how to do various things with computers, no small number of which carry the economically beneficial effect of helping you earn a living. Indeed, just in the past year, your host has acquired books on: PostgreSQL (his database of choice), JavaServer Faces (the web interface tool of other people's choice on his project), JavaServer Pages (related to Faces), JDBC (Java connectivity to databases), multithreaded programming, and statistical analysis using databases. All of these are directly related to your host's ability to earn a paycheck, a statement lamentably untrue of most of his recent linguistics-related purchases. So the question put to Prof. Liberman was: what is the linguistic equivalent to the professionalization of computer science that will make it attractive to more people? Prof. L.'s reasonable response was that this was a complicated question which could spark a lot of further dialogue, and so later in the day, SC promised him that said dialogue would be coming. Here goes:

Let's start out by defining our terms a little more carefully. Following numbered instructions from a book is not quite the same as doing basic research on difficult theoretical questions, and so we should distinguish the jobs we are talking about. If a researcher is a computer scientist, and the guy who picks up a how-to book is a programmer, then maybe a similar distinction is in order here. Since someone seeking to do professional linguistic work probably will have some programming tools on their resume, we'll need a different word. SC likes "mercenary linguist", or maybe "linguist of fortune", but since the PR value of those names is low, how about "linguistic technician"? Your host is unsure where this places master's degree holders like himself -- more knowledgeable than someone doing paint-by-numbers, but not a full Ph.D. -- but for the moment, we're trying to imagine the career path of someone whose knowledge is limited to an as-yet hypothetical LSA/ACL certification guide.

In order to make the analogy work, a linguistic technician's tasks ought to mirror those of a generically qualified programmer's in some abstract respects. Without pretending to an exhaustive consideration of the desiderata, it seems to SC that a linguistic technician's tasks need the following properties:

They must be repeatable -- a dozen equally trained technicians should all come up with the same answer, broadly construed, to the same problem

They must need to be done frequently -- the problem of connecting a database to a web server is long-since solved, but individual programmers write code to implement specific solutions for it every day

They must be subject to statistical quality control -- while language defects might be more subjective than software defects, money is only going to be made available if defects can be corrected efficiently with finite resources

This is not merely the declaration of linguistics to be a specialization of computer science. Practically every field that isn't only academic comes in both researcher and technician flavors; one can be a full medical doctor, or break into the field much more quickly (albeit with a much lower ceiling) as a nursing assistant or licensed vocational nurse. One can be a Ph.D. chemist or biologist and run a lab -- or one can be a lab technician and know how to carry out the various procedures without necessarily being qualified to direct original research. One can be a lawyer -- or a paralegal.

Objections might be raised that various classes of programmer are more skilled than this; web designers need to be graphic artists of at least better talent than the average randomly-selected person, but insofar as this is the case, that only demonstrates that technicians come in varying degrees of certification and competence. Another objection might be that almost any task that passes the above test is likely to be something involving computer programming, and thus that all I'm about to demonstrate is that a little extra linguistics coursework should be part of programmer certifications. Perhaps so, but if that's really the case, then we might have to face up to the possibility that Prof. Liberman's speech is grounded in a mistaken premise, and that the best we should hope for is getting more would-be lawyers and journalists to take courses in grammar so as to employ a few more Ph.D.s than is possible at present. Your host would not be writing this if he believed such to be the case.

We return again to the question -- what does a linguistic technician do? If genuine standards for representing grammars or ontologies existed, perhaps full-time grammar analysts would spend their time reviewing the results of low-scoring parses in a company's internal document management system, and determining company-specific improvements to be made. Maybe future theory checkers could be employed to manually translate legal arguments into predicate calculus, then run them against automatic theorem provers to make sure that cases are constructed soundly (yes, SC realizes this is precisely the sort of thing the research community would like to automate, but constructing a suitably broad-coverage application might be more expensive than having a few individuals responsible for this within a firm). Perhaps some future senior translation engineer, not actually capable of speaking the languages they are overseeing, will be responsible for monitoring metrics of bulk translation efforts, and farming out specific high error-rate translation pairs for correction to specialists (perhaps using services like Amazon's Mechanical Turk to do it). These are all ideas that meet the criteria above: there are demonstrably better and worse answers, if not always uniquely correct ones; they're all tasks that would need to be done daily in any enterprise; and they're all subject to quality controls using metrics that either already exist or can easily be conceived of with no great leap from the present state of the art.

Of course, in order for jobs like these to exist, there need to be applications. Of all the software applications that SC has seen to date which give end-users or administrators access to any kind of linguistic data tuning (names will not be forthcoming, to protect the guilty), the manuals usually come with the suggestion that all you need is some generic software engineer/system administrator to do the necessary fiddling. Aside from the fact that many of these systems are not yet at the point where much linguistic sophistication would do a lot of good, this represents a failure of marketing on the part of linguistics as a profession. Why shouldn't an application specializing in language technologies do better when managed by a specialist in language?

As noted above, though, conceiving of applications purely in terms of software reduces the notion of the language technician to a software engineer with language skills. This is a mistake. As Prof. Liberman noted in his speech, English departments have by and large abdicated the responsibility of teaching grammar and composition. Why shouldn't technical writers be certified as competent drafters of language by linguists? It might produce better results than Phil. Why not train a new breed of paralegals to be the "theory checkers" described above? Speaking of legal applications, why not train someone to specialize in analyzing the sort of terrible ambiguities that come up in courts over grammatical issues as trivial as comma placements? Couldn't the cost savings in litigation fees alone support paying to have a few such people on staff in every legislative chamber in the country? We already pay for "legislative analysts" to outline the expected legal and economic effects of bills before they become laws; there's no reason that language analysis shouldn't be part of the package.

SC does not pretend to have come up with anything like an exhaustive list of potential jobs for language technicians, but the thrust should be clear. In an increasingly professionalized society, obsessed with credentialing, one of the great failures of the linguistics field is that it has not managed to lay claim to accrediting those purporting to be expert in it. As Prof. Liberman astutely observed, other fields have been allowed to take over the intellectual domain of linguistics, and the quality of the training that results is disturbingly subpar. Taking some responsibility for fixing the problem would be enormously beneficial -- pure research, fieldwork, endangered language preservation, and other non-commercializable work are luxuries that need to be paid for, and the perception that linguistics needs to be funded will only come when administrators and industry see money in it. In order to do this, we need to figure out what linguistic technicians can do, and how to make the non-academic world want to hire them.

Comments

Must read this in depth when I'm not at the department waaay too late. I had some thoughts of my own, but the one you raised in the question period was one that hadn't escaped me.

Or perhaps, I thought of it from a different angle, a within-the-institution one: suppose that so many thousands take psych classes with the hope or intent of becoming psychologists, and that so many more thousands take chemistry with the hope or intent of becoming chemists (that is, employed by Dow or the like), or more likely a lot are pre-med...is it any surprise that people don't tend to take linguistics classes? What does it even mean to be a "professional linguist", other than working for an NLP or other software company?

Now, not everyone takes psych so they can become a psychologist; some, perhaps, take it because they're going into advertising or the like and they think psych can help those chosen careers. And in that case, linguistics can as well, and we need more of those people getting basic linguistic training. But as they say, "Follow the money."