26 September 2012

I'll warn in advance that this is probably one of the more controversial posts I've written, but realize that my goal is really to raise questions, not necessarily give answers. It's just more fun to write strong rhetoric :).

The thing I especially like about this paper is that it's not a complaint (like this blog post!) but rather a thoughtful critique of how one could do this sort of research right. This includes actually looking at what has been done before (political scientists have been studying this issue for a long time, and perhaps we should see what they have to say), asking what we can do to make our results more reproducible, and so on.

For me, personally, this goes back to my main reviewing criterion: "what did I learn from this paper?" The problem is that in the extreme, cartoon version of a data porn paper (my 1-4 list above), the answer is that I learned that machine learning works pretty well, even when asked to do arbitrary tasks. Well, actually I already knew that. So I didn't really learn anything.

Now, of course, many data porn-esque papers aren't actually that bad. There are many things one can do (and people often do do) that make these results interesting:

Picking a real problem -- i.e., one that someone else might actually care about. There's a temptation (that I suffer from, too) of saying "well, people are interested in X, and X' is kind of like X, so let's try to predict X' and call it a day." For example, in the context of looking at scientific articles, it's a joy in many communities to predict future citation counts because we think that might be indicative of something else. I've certainly been guilty of this. But where this work can get interesting is if you're able to say "yes, I can collect data for X' and train a model there, but I'll actually evaluate it in terms of X, which is the thing that is actually interesting."

Once you pick a real problem, there's an additional challenge: other people (perhaps social scientists, political scientists, humanities researchers, etc.) have probably looked at this in lots of different lights before. That's great! Teach me what they've learned! How, qualitatively, do your results compare to their hypotheses? If they agree, then great. If they disagree, then explain to me why this would happen: is there something your model can see that they cannot? What's going on?

On the other hand, once you pick a real problem, there's a huge advantage: other people have looked at this and can help you design your model! Whether you're doing something straightforward like linear classification/regression (with feature engineering) or something more in vogue, like building some complex Bayesian model, you need information sources (preferably beyond bag of words!) and all this past work can give you insights here. Teach me how to think about the relationship between the input and the output, not just the fact that one exists.

In some sense, these things are obvious. And of course I'm not saying that it's not okay to define new problems: that's part of what makes the world fun. But I think it's prudent to be careful.

One attitude is "eh, such papers will die a natural death after people realize what's going on, they won't garner citations, no harm done." I don't think this is altogether wrong. Yes, maybe they push out better papers, but there's always going to be that effect, and it's very hard to evaluate "better."

The thing I'm more worried about is the impression that such work gives others of our community. For instance, I'm sure we've all seen papers published in other venues that do NLP-ish things poorly (Joshua Goodman has his famous example in physics, but there are tons more). What I worry about is that we're doing ourselves a disservice as a community by claiming that we're doing something interesting in other people's spaces, without trying to understand and acknowledge what they're doing.

NLP obviously has a lot of potential impact on the world, especially in the social and humanities space, but really anywhere that we want to deal with text. I'd like to see ourselves set up to succeed there, by working on real problems and making actual scientific contributions, in terms of new knowledge gathered and related to what was previously known.

Since my job as a blogger is to express my opinion about things you don't want to hear my opinion about, I wish they'd select fewer workshops. I've always felt that NIPS workshops are significantly better than *ACL workshops because they tend to be workshops and not mini-conferences (where "mini" is a euphemism for non-selective :P). At NIPS workshops people go and really talk about problems, and it's really the best people and the best work in the area. And yes, it's nice to be supportive of lots of areas, but what ends up happening is that people jump between workshops because there are too many that interest them, and then you lose this community feeling. This is especially troubling when workshops are already competing with skiing :).

Anyway, with that behind me, there are a number that NLP folks might find interesting:

Personalizing Education With Machine Learning: I don't think there's much in the way of NLP in personalizing education right now, but I see this as a potentially big avenue for large breakthroughs in NLP in the coming 5 years, especially as we start talking about automated grading in MOOCs and whatnot. ETS folks, I hear you!

With the deadlines so close, I don't imagine anyone's going to be submitting stuff that they just started, but if you have things that already exist, NIPS is fun and it would be fun to see more NLPers there!