Sunday, January 24, 2010

On Formatting

Matt Welsh's recent amusing-but-also-sad post on having two recent conference submissions rejected for violating format requirements reminded me how much I hate wasting time dealing with formatting. Matt calls for a standardized template -- as he writes:

But isn't it time that we define a single standard for conference paper formatting that everyone uses?

(With such a system, conferences could still vary the length of the paper they accepted -- but the style file would be the same.)

I think that would be a great step. In the same spirit, I'd also like to see less focus on page limits -- both for submissions and for the final version. (Ostensibly, the final version should be closely related to the actual submission -- so if you're going to relax page limits for the final version, it seems best to do so for the submission version as well.) I think the recent change for SODA -- allowing up to 20 page conference papers -- is a great idea.

I understand that, in some cases (where printing is involved), some sort of page limit may be needed -- although with less and less printing of full proceedings, it's less clear that this is an important consideration. I also believe that page limits can force people to improve their writing -- bad writing and excessively wordy writing generally go together. (If we remove page limits, we'll have to tell people in reviews more frequently when they need to cut things down or even out in their papers.)

But the effort spent on meeting arbitrary page limits -- just the time spent on formatting -- seems silly at this point. And often it forces you to cut content. I can't count the number of times I've had a review say, "You should have included this...", where my response would be, "We did include this, but had to cut it to make it fit into the page limit..." (Yes, you can create multiple versions, and post the full versions online -- except for double-blind conferences, like SIGCOMM, of course -- but that involves yet further overhead, when you have to update to deal with reviewer comments, and...) And if your paper is rejected from one conference, to submit to another, you have to re-format and create another version to meet their arbitrary format and page limit standards.

In some ways, I'm glad that Matt is out there, talking about the papers being rejected for formatting. And I hope it starts happening to more people, more often. Because I worry that the only way for change to happen is for bad things to start happening to good papers so that people will start to realize that the current system is just bizarre and broken, and a revolution needs to occur. Perhaps I'm wrong, and slow evolutionary change -- with 20 page papers at SODA being an example -- will lead us the right way. If so, I encourage PC chairs to experiment with flexibility in their formatting requirements.

19 comments:

Anonymous said...

I don't agree. Think of it from the point of view of the paper audience. They are not going to spend the time carefully reading every word of your 20-page paper; ACM's 9-point font already makes papers appear deceptively short. Part of the contribution of a paper is explaining something novel and complex in a concise and easy-to-understand fashion.

Authors sometimes have an unfortunate tendency to strive for completeness at the cost of clarity. If all papers could be 20 pages long, I can easily imagine long appendices listing out all possible details that might matter, giving the author the chance to complain that, for any detail a reviewer might have mentioned, it was of course covered in the appendix.

And, selfishly, from the point of view of the PC, having a page limit makes reviewing more practical.

However, I have to agree that not having a page limit might finally put a premium on sharp, concise, well-written papers. Today it's not surprising to read a review that says, you had an extra 2 pages left below the limit, why didn't you explain X Y and Z irrelevant details? And shorter papers do, sadly, get treated with prejudice, as if the authors didn't have enough ideas to talk about.

I just got done submitting a four page SIGGRAPH paper with some people. Why four pages? Because SIGGRAPH has the wonderful (sarcasm) policy that a paper's merit should be weighed relative to its length, making us weigh the relative benefit of an easier review vs. a more complete paper.

I don't think this is a problem in STOC/FOCS. I pay a little attention to the formatting requirements, but don't care about the letter of the law. I have never heard a complaint. I really do *not* want a standardized latex template. In my experience, these templates are often quite poorly thought out, for example lacking even proof environments and incompatible with amsmath.

I don't understand why you think creating multiple versions is a serious problem. You have to create another version anyway, for the final journal paper. Post a full version on the arXiv and pare down the conference version. If the final conference version allows more pages than the submitted version, just ignore it. Since the full version is freely available, nobody will ever read the conference version anyway.

In my experience, these templates are often quite poorly thought out, for example lacking even proof environments and incompatible with amsmath.

This is not a reason to be against a standard template. If anything, it's a reason to be for it. If we are forced to adopt a single standard, human attention on improving the templates we use will be focused on improving one template, as opposed to whatever large number are out there now.

OK. So the PC and referees should suffer and waste hours refereeing a paper, because the authors are too busy to spend 1-2 hours organizing their paper in a way that it can be refereed most easily and fairly. This is selfish, inconsiderate, and not necessarily in the authors' best interest. Well-formatted papers have a better chance of being accepted to a conference. Forcing the authors to figure out what is important and what is not in their paper is a good thing.

There are enough people that have obsessive fixations on things like spacing, fonts, math displays, etc, that forcing them to use a single style would be more irritating to them than productive.

If one insists on a uniform style file, I hope it would use white for the foreground and white for the background. This would really facilitate double-blind refereeing...

Is it really so difficult for, say, IEEE, ACM or AMS to pay a (La)TeX expert to create a sufficiently flexible but standardized template?

(I think that a large upper bound on the cost for this is $5000... or they can just start a contest for geek graduate students ;))

Maybe it can be "tested" by the different communities to make sure it "fits all" before putting it "into production", thus fixing all the problems (i.e., missing environments, conflicts with widespread packages, etc.) that arise with the current publisher-based templates like the LNCS one.

I suggest not confusing the typographical aspect of the problem (which can be solved by "imposing" a common template on all the venues) with the "political" aspect (i.e., which paper length is most suitable for each conference).

Is it really so difficult for, say, IEEE, ACM or AMS to pay a (La)TeX expert to create a sufficiently flexible but standardized template?

I think that Michael's goal and your impression that flexibility is the answer are both misguided. What Michael and you really want is, in some sense, the opposite of flexibility. These organizations have already made very flexible templates, often by quite expert people.

Most of us write papers using pretty bare-bones latex to start with. Bare bones latex is not very flexible - it is often very complicated to do things that are a little different. (It is not that one can't get it to do other things - it is just very difficult to do.)

These alternative systems allow much more convenient variation but they do this at the cost of having to read documentation and extra command names, which is exactly the sort of thing that authors like to avoid.

I got some experience with the current IEEE package when I ended up having to modify the .cls files that IEEE publishing had set for FOCS 2009. The basic package was quite well constructed but it happened that the default tuning of the package that was sent to authors had some serious differences from the usual STOC/FOCS styles. The package is flexible enough also to conveniently cover the IEEE Information Theory conferences and journals where the standard is radically different from STOC/FOCS. (Section numbers set in capital Roman numerals, section names in small caps, etc.)

Michael starts with the question of formatting from conference submission but then suggests that the formatting of accepted conference papers and submissions should be as similar as possible but in some standard style.

With a requirement that submitted papers use anything beyond a fairly bare bones formatting, all authors are forced to spend time on formatting - isn't this a bigger waste of time for everyone? For submissions, one should be flexible and impose the minimum of requirements on the formatting.

However, there can be big differences between what PCs are capable of doing with a conference submission and what is best for readers of the final papers, especially for conferences with a large volume of submissions. The niceties of formatting are pretty much irrelevant to the PC's job (which makes rejecting a paper for formatting problems unfortunate). However, the PC does not have the time to read very long papers and, in fairness, the amount of attention the PC gives a paper should not be a function of its length.

I find it strange that the committees that rejected Matt's papers were not more flexible and really rejected a paper because it was not quite up to their exacting requirements. I would understand rejection if someone is really trying to seriously game the system, say by setting the margins to render the paper very hard to read. But if the paper was formatted according to another conference's requirements, it probably was readable. If instead of 14 pages, the reviewer thought the style file difference gave the author 15 pages worth of stuff, it would be ok to judge the first 13 pages of the paper. I think following the letter of the law, instead of its spirit, does not serve the advancement of science.

I think there's a reasonable argument for page limits for submissions (to make the reviewing process more efficient), but I'd like to see more leeway for the final camera-ready paper and for appendices to the submission.

My ideal system:

1) submissions have a strict page limit (say, 10-12 pages or so), but authors are allowed to include appendices not included in this limit (say, up to 20 pages including appendices); reviewers are not required to read the appendices, but authors can include additional detail there (e.g., proofs)

2) camera-ready papers have looser page limits (say, 15-20 pages).

I can't count the number of times that I've gotten comments from the reviewers that say "I would have liked more detail on X" without also explaining what piece they think should have been cut. While I can understand where that is coming from, and it can be helpful feedback, at the same time it does feel a little bit unreasonable to ding a paper on that basis when there are strict page limits on submissions.

I agree with anonymous who said there are two issues here: standardized templates, and page limits, and we shouldn't necessarily confuse and conflate the two.

Paul, I just don't understand your argument (perhaps you didn't read Matt's post). Many/most conferences already have a standardized template and detailed formatting guidelines; they're just different across conferences -- even in the same subarea -- because they're run by different publishers (or by some historical accident). This doesn't make sense; it has all the negatives of dealing with formatting you seem to agree is a waste of time, with the added complication that you have to check what specific rules you're dealing with for each conference (which is what got the papers in question in trouble in the first place).

With respect to page limits, I'm with AnonProf. I can understand needing to keep submissions limited in length to help reviewers. On the other hand, I disagree with people like the commenter who say things like "Authors sometimes have an unfortunate tendency to strive for completeness at the cost of clarity." Some people are just bad writers; they'll be unclear whether you give them 10 or 20 pages, but bad writers will tend to write 20 when they could write 10 just as well, and I think this is the effect you're seeing. In many cases, page limits hurt clarity, precisely because one has to cut out useful details or helpful (but possibly slightly redundant) guides or hints to the readers to fit in the (arbitrary) page limit. And I don't get the negative about "completeness". To me, completeness is a big virtue, and should be what we aim for in publications. We may not know in advance exactly what details will be important to spur future work; and as scientists we should be aiming for reproducibility, which generally means giving attention to fine details. That may take some pages.

Indeed, I'm surprised none of the arguments given so far recognize the issue that we may be hurting the science with page limits.

The more flexible system suggested by AnonProf is a good example of what I would think would be a better system. It might take some work by reviewers to help authors figure out what things (appendices) can be safely left out of the camera ready versions, but that seems a small downside for a potentially big upside.

Michael, maybe you misunderstood me. My argument was that standardized templates for conference submissions are a waste of people's time (and this is why Matt's papers ran afoul of things). I know that they are prevalent in some areas - I have had the annoyance of dealing with AAAI and IJCAI's - but that doesn't mean that we should have any of them.

In some part I was agreeing with you. A natural form for submission should be N pages in some minimally readable typeface plus an unlimited # of pages of appendices in which one can include a full version of the paper. The only conference-dependent part should be N. The N ensures that the PC will not be expected to devote differential amounts of time based solely on a paper's length. Even if a paper happens to deviate and have its main body extend longer than N pages, then all that should happen is that the PC should be expected to stop reading at N pages.

Note that this is not a standardized template or style file for conference submission, which is what I was objecting to. Who cares if someone 'tweaks' the format in a submission to get in an extra line or two? I just don't think that this level of difference matters.

I was also arguing against the philosophy that conference submissions should only be judged based on a proposal of what will appear in the final conference version. (This is the kind of assumption that leads to the standardized templates in the first place.)

As for "hurting the science" with conference page limits, I am of two minds about them. Without the limits of print I don't see anything sacrosanct about any particular page limit; it would be nice to have papers be complete. However, without proper refereeing I don't see that this is better than having each paper give a link to a posted full version on the ArXiv.

I agree that there is a balance between forcing authors to make their points more directly and forcing them to eliminate intuitive descriptions. The good thing is that the page limits in the submission process, as a side effect, are likely to push authors towards greater clarity in those limited pages of the final conference version.

We run into problems when we think of the final conference version as the final version and it hurts the science even more when we do so. Unlimited pages push things further in that direction. The conference process gives a certain added value to a paper versus a preprint version. However, the PC typically has not checked a lot of the details in a long version of a paper. Only if these were fully refereed would there be a strong argument for including full versions rather than leaving them to their publicly available preprint form.

Hi Paul. I see now, and I think I did misunderstand you. If I understand right, you're arguing for a lack of formal requirements -- essentially "Look, the reviewer is really only going to read 10 pages, so turn in 10 readable pages -- if you like, put stuff in appendices, maybe it will get read -- and that's it." I think that's how paper submissions used to be when I was in grad school. (Nobody cared if your paper was -- gasp -- 11 pages because you had a particularly long bibliography.) I think that's a perfectly reasonable way to go -- but I'm not sure if others would go for it.

I'm on the PC for one of the conferences where Matt's paper was rejected for the formatting. I'll say that we saw a much larger set of papers with minor violations, and we only rejected a small fraction (a couple of papers) that significantly violated the formatting standard, where we believed this gave an unfair advantage to the authors of that paper.

I agree. In these research areas most conference organizers are much more worried about size and formatting than about content and novel ideas. An interesting and innovative idea can be explained and defended in 1 or 2 pages. The page limit becomes a page threshold that you must reach or your paper will be considered weak. Therefore everybody fills the paper with unnecessary related work and pages and pages of results that don't prove anything to an aware reader. I have already reviewed tens of papers for INFOCOM and had several discussions with other reviewers who write "one line" reviews saying "don't like because so...and this is not useful" without any valid argument. And in other conferences, I have seen generic, identical reviews copied across multiple papers. Nowadays, scientific conferences are mostly pure business for the "friends of the friends". If you can, try to watch some of the IEEE's VIP conference receptions. Sometimes smaller conferences are much better; at least you can meet people with common and close scientific and research interests without any of the hassle... then publish your work in journals. To help, I host a small open site that lists all those small conferences without huge marketing budgets.