Which one should you choose? You don't have to; you can freely submit
to both ECCC and CoRR. But how do they compare? [Disclosure: I am on
the ECCC Scientific Board.]

ECCC focuses on computational complexity, though it often contains
papers across theoretical computer science. CoRR broadly covers all of
computer science (with tags for subfields) and is part of the arXiv project, which covers physics and math as
well.

An article has to be approved by an ECCC board member to
meet a minimum standard before it can appear. CoRR only checks for
relatedness to the topic area.

Both plan to have papers posted forever. ArXiv is currently run by
the Cornell Library, which gives stronger backing to this
promise. However, every paper on ECCC and CoRR should later appear
in a conference proceedings and/or journal.

ECCC has a (not-so-user-friendly) discussion system and email
announcements of new papers. CoRR has RSS feeds for each subject
class. Both systems plan to continually update
their interfaces and features.

12 comments:

An article has to be approved by an ECCC board member to meet a minimum standard before it can appear. CoRR only checks for relatedness to the topic area.

And here is where I would decide to only submit to ECCC. The main failing of arXiv is that there is ZERO check against the run-of-the-mill crackpot. I can see where arXiv serves a purpose: to have a place where papers can be made publicly available. But then, isn't that what our personal webpages can be used for? If I'm browsing for the latest results in an area, I'd like to have at least a little bit of a guarantee that I won't be wasting my time.

I agree with the previous Anon. arXiv is great, but somewhat useless (at least for Computational Complexity), since every few weeks new exciting proofs that P=NP and P\neq NP appear in it. These papers seem to be written without any care and in a haphazard manner, so the most basic filtering could spot them quickly. On the other hand, most complexity people do not put their work on arXiv (or am I wrong?). So ECCC is much better.

I'm not sure I see the point of supporting two repositories. It doubles the work for author and reader. Why not just come out in favor of one or the other? (And I agree with previous posters that the filtering used for ECCC makes it the preferred choice. Also it seems the de facto choice for complexity theory other than quantum stuff.)

The main failing of arXiv is that there is ZERO check against the run-of-the-mill crackpot. I can see where arXiv serves a purpose: to have a place where papers can be made publicly available. But then, isn't that what our personal webpages can be used for?

Well, one other purpose that arXiv serves is that it is an "announcement" of a paper. In fields where conferences aren't attended by a large portion of the community (like CS theory), the arXiv serves as an announcement of the paper. It also has the further advantage that you can post your result when it is ready, as opposed to this strange guarding of results until a conference that seems to occur in CS (which to me feels very unscientific, but understandable, of course, given the publish or perish world of academia.)

Finally, the convenience of the arXiv, at least in quantum computing, is incredible. The noise factor on the quant-ph arXiv is incredibly high, but it takes me exactly one minute every morning to scan through the listings and sort out the signal (if one exists!). This is balanced by the fact that 98 percent of quantum computing is on the arXiv. I'd also say that it isn't really the crackpots who are a problem on the arXiv (you can spot a crackpot in about half a second) but the sheer volume of low quality submissions.

It's interesting to me that the field that you would most expect to be open to online preprint archives, computer science, isn't so open to the idea. I think it is understandable, given the central role that conferences play in CS (as an announcement and noise filter.) If you're an expert in your area of CS, this is fine, but I suspect that it actually presents a significant barrier to entry into the field that has been effectively demolished in fields that are almost entirely on the arXiv.

arXiv may be good for areas like quantum information processing, where you can find many papers each day. It's easier to check the latest results, compared to going to many personal homepages. Plus, what if the result doesn't appear at his/her homepage, or he/she doesn't even have a homepage?

The backdrop of this discussion is that you shouldn't have to choose. It would advance research in computer science if ECCC contributed its papers to the arXiv. We have seen this situation in mathematics many times.

For example, the "signal-to-noise" ratio in the number theory category of the arXiv, math.NT, was not all that great until the competing Algebraic Number Theory archive was folded into it. Within months of that reform, math.NT had more submissions, and more good submissions, than the total of the ANT archive and what math.NT had before.

No one is questioning the usefulness of preprint archives; the only questions were with regard to the value of arXiv, and whether there is a need for two archives covering the same subject matter (or containing the same sets of papers).

The situation in theoretical CS with ECCC is simply crazy. I believe in the following principles:

1. All research papers should be archived permanently in the same collection. There are enormous economies of scale here (and creating a truly permanent archive is much harder than most people who've never tried think it would be).

Putting papers on your own web page, or a departmental page, is nice but it does not permanently archive them. ECCC is better but still not good. For example, accepting PostScript submissions is outright stupid. If you care about preserving papers forever in the most usable form, then you should collect all available information, including LaTeX source. You never know when you will want to offer downloads in new formats, for example (either widespread new formats or niche ones like spoken text for blind users). Converting PostScript may be possible, but it is almost never the best solution: too much information has been lost by the time the paper ends up in PostScript. Any archive that encourages PostScript submissions is run by people who are either incompetent or irresponsible.

2. Filtering should be layered on top of that. If you want to create a nicely organized web site to publicize carefully filtered papers, that's great. It substantially improves the usefulness of any archive; most users care more about the filtering than about the permanence of the archive. However, there's no logical reason whatsoever why the people doing the filtering should also undertake a half-hearted attempt at doing the archiving themselves. You can seamlessly link to files at the arXiv while keeping whatever organization and interface you like. You could even copy files from the arXiv and just leave the permanent archiving to them while keeping your web site completely independent (if you care, although I can't see why you would).

My impression is that most of these issues are complicated by interest in credit. There are a lot of small disciplinary archives whose creators take great (and often well-deserved) pride in serving their own research communities. Doing your own archiving sounds like a bigger contribution than simply filtering a larger archive, even though it's the intellectually trivial part. Most researchers won't know or really care whether you're doing a good job on the archiving side, so there's a strong incentive to keep doing it yourself rather than turning into an arXiv overlay. However, I think becoming an arXiv overlay is the only intellectually defensible route, given the current options.

For now, we should put pressure on the ECCC board members to focus on what they are good at (filtering the papers and publicizing the good ones) and let the papers be archived by experts.

I'm biased, of course, since I'm the administrator of CoRR, but it seems to me that it's possible to have the best of both worlds quite easily if ECCC would automatically post all its papers on CoRR, tagging them as being approved by ECCC. You can keep the ECCC home page as is. The only difference is that the papers themselves would be stored on CoRR. The advantages are (at least to me) clear: ECCC can focus on what it does best, namely filtering, and leave archiving to CoRR. I believe that bigger is better in the archiving world; as others have pointed out, there are economies of scale here. Larger archives are more likely to be preserved, and they have the resources to keep up faster with changes in technology. In addition, by having complexity papers on CoRR, they can be better linked to other papers in CS as well as papers in quantum complexity, which often appear in the physics section of the arXiv. At the same time, if you come in through the ECCC website, or just search for papers with the ECCC tag, you can have all the benefits of filtering.

For what it's worth, while it's true that CoRR has papers claiming to prove P=NP, it's actually quite easy to ignore them. Crackpot papers are typically easy to spot. The hard part is to filter out the low quality but essentially correct papers. In this regard, organizations like ECCC could really help.