Computational Complexity and other fun stuff in math and computer science from Lance Fortnow and Bill Gasarch

Friday, September 30, 2011

Bibliographies

Lots of buzz about Princeton's new policy that prevents faculty from giving away the right to publish papers on their own web pages. Never seen faculty so happy to have their rights restricted.

Princeton is behind the game for Computer Science. All the major publishers I use explicitly give the right to authors to publish their own versions of papers on their web pages including ACM, IEEE, Springer and Elsevier. Even before, publishers rarely went after authors' pages. I have posted FOCS and Complexity papers for years and IEEE just updated their policy last November. Also see this discussion about ACM giving authors links so their visitors can freely download the official ACM versions of their papers.

We have these rights so EXERCISE THEM. No computer scientist has any excuse not to maintain a page of their papers with downloadable versions.

It gets harder to maintain these pages. I have to keep an old bibtex-to-html system running and at some point I should redo the whole page. Sites like DBLP and Google Scholar let people find my papers easily without my intervention. But if I want anyone to see all of my papers I have little choice but to keep the page going.

28 comments:

This is particularly relevant for --ahem-- older researchers, who published before the internet era, or even the TeX era. I remember needing a paper from the 70s for my Computational Complexity class, which proved impossible to find. I sent an email to the author and he scanned it and uploaded it to his web page.

Nevertheless, even in this era of connectivity it has become difficult to get content unless one is willing to pay ridiculous amounts of money for a PDF. So, a piece of advice: when looking for a paper, try the author's home page first, which will work most of the time. If you're not lucky --and only then-- try the conference/journal/editorial web site. And if they try to charge you, just send an email to the author asking for a copy.

1) People from Comp Sci mostly post their papers to their website and/or Archives. Other fields are STILL catching up. Sometimes people in Math don't post. One of them emailed me a pdf of his paper, so he really did have it in E-form but did not post it.

2) We've all gone to websites where it says things like `you can buy this paper for $30.00'. Does anyone ever buy papers that way? I can picture it for medical journals with cutting edge cancer research if people are not allowed to post, but for a math paper from the 1840's? (I am not kidding.)

3) IF 20 years ago the journals had begun charging $2.00 for the paper, of which $1.00 went to the author, then the ethic may have set in that you should pay, and journals may have been able to serve the community. I want to say ``and wouldn't be in the crisis they are in now'' but, alas, somehow they seem to still be afloat and making money, though I don't quite understand why since many articles ARE online.

Just to clarify, I know that CompSci journals typically allow you to put your papers on your own website, but the Princeton policy is also asking for permission to post on institutional repositories and other free archives (such as arXiv). Do the CompSci journals typically allow this as well?

Also, it is not clear to me from the announcement whether Princeton is asking for permission to distribute the final version, or a preprint from prior to the journal editing and peer review process. Some journals allow the latter but not the former.

Finally, it is not clear to me whether there is any implied policy about embargoes (e.g. Science and Nature typically only allow papers to be posted elsewhere several weeks after they have appeared in print).

It gets harder to maintain these pages. I have to keep an old bibtex-to-html system running

There really is no good excuse for not keeping a decent list of posted pubs. I think that I use the same system you do - it doesn't seem to be getting harder to maintain, though the output isn't the prettiest around. It runs off the same data I use for my CV, activity reports, etc. For each paper, I have to add one bibtex cite to a file and periodically execute one command to get the pubs page updated and it's done.
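For readers who have not seen this kind of workflow, here is a minimal sketch of the idea in Python: one file of BibTeX entries in, one HTML list out. It is not the system either of us actually uses. The sample entries, the function names `parse_bibtex` and `render_html`, and the `url` field convention are all made up for illustration, and the regex parser only handles simple entries with unnested braces whose closing brace sits on its own line.

```python
import html
import re

# Hypothetical sample data; a real setup would read a .bib file instead.
SAMPLE_BIB = """
@article{doe2011example,
  author = {Jane Doe},
  title = {An Example Paper},
  journal = {Journal of Examples},
  year = {2011},
  url = {papers/example.pdf}
}
@inproceedings{roe2009other,
  author = {Richard Roe},
  title = {Another Example},
  booktitle = {Proceedings of Examples},
  year = {2009}
}
"""

# Assumption: each entry ends with a closing brace alone at the start of a line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Assumption: field values are brace-delimited and contain no nested braces.
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def parse_bibtex(text):
    """Return a list of dicts, one per BibTeX entry found in text."""
    entries = []
    for kind, key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value.strip()
                  for name, value in FIELD_RE.findall(body)}
        fields["type"] = kind.lower()
        fields["key"] = key
        entries.append(fields)
    return entries


def render_html(entries):
    """Render entries as an HTML list, newest year first."""
    items = []
    for e in sorted(entries, key=lambda e: e.get("year", ""), reverse=True):
        title = html.escape(e.get("title", "Untitled"))
        if "url" in e:
            # Link the title to the downloadable version when one is given.
            title = '<a href="{}">{}</a>'.format(
                html.escape(e["url"], quote=True), title)
        venue = e.get("journal") or e.get("booktitle") or ""
        parts = [html.escape(e.get("author", "")), title,
                 html.escape(venue), html.escape(e.get("year", ""))]
        items.append("<li>" + ", ".join(p for p in parts if p) + "</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(render_html(parse_bibtex(SAMPLE_BIB)))
```

Adding a paper then means appending one entry to the BibTeX file and rerunning the script. Anything beyond a toy page (nested braces, `@string` macros, LaTeX accents) would want a real BibTeX parser rather than these two regexes.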

Going through the "complete list of rights retained by authors under ACM's copyright transfer agreement" described in the October'11 issue of CACM, I do not see the right to publish the work on arxiv or similar open access repositories. But perhaps I am missing something.

I don't understand what all these links to Google are supposed to mean. One can more easily find manuscripts of papers on home pages, nothing more. Journal papers remain closed as before.

The possibility I pointed to is: publishers in Russia make their papers FREE (really free!) after a 1-2 year embargo. Why couldn't Elsevier, ACM, Springer, Wiley and others do the same? Libraries will buy (if they can) their journals as before (a 2 year delay is unacceptable for them). But after 2 years ALL fellows would have FREE access! We give up our rights (bad enough!), but at least these artificial (profit oriented) borders for free science would be down.

My impression is that the problem is not that we (scientists) cannot do anything against these robbers. The problem (apparently) is that most of us take the current silly situation as normal.

STACS moved away from publishers and sooner or later more conferences will follow. One day maybe even US based ones. As for journals, just submit your papers to Theory of Computing. We have alternatives. It is up to the authors to use them.

The counterargument is that our taxpayer dollars are being used to fund scientific research and it should be freely available to the public. Possibly we could do something to help researchers post papers to have them available to the public?

Making findings publicly available to others is definitely relevant! I do often check out other researchers' pages just to have a look at some papers. Even if the papers are not that relevant to my research, doing so allows me to be well-informed about efforts in other related areas. If I (or my University/Organization) had to pay for every single paper I read, I would be forced to read less than I do ...

Just a side comment. For all of you interested in making your papers available, use BibBase (http://bibbase.org/). Thanks to it, maintaining those pages is not as hard as Lance originally observed!

By the way, I follow you through Twitter Lance! Keep up the good work! :)

All this - making the manuscripts of your papers available on your home pages, google-like engines helping to find them, etc. - is fine. But the core problems remain unresolved:

(1) Why do publishers require us to give up all rights to our own intellectual property?

(2) Why do we not protest against this? I know of only a few individuals who did, and they lost.

(3) Why do publishers (in the West) not make ALL papers available for free at least after a 1-2 year embargo?

(4) Why do we (scientists) play this stupid game?

As Joshua Herman pointed out, "taxpayer dollars are being used to fund scientific research". So, these fat guys are robbing not just us, but also all tax-payers.

B.t.w. when making our papers available on our home pages, we enter a "dark zone" - the robbers (publishers) could even send us to jail for this: these papers are no longer our property! Putting the FINAL version of the manuscript (after the refereeing process) on your home page is illegal, according to the rules they established.

I think this "embargo-time" problem should be discussed in the whole scientific community very seriously.

It is great if authors start posting their papers on their own pages, but it's still not a perfect solution: I am worried about what happens when an author dies. Will the department preserve his page forever? If not, will his papers be "lost" down the black hole which is Springersevier?

Universities in most cases remove researchers' homepages when they leave. My last several homepages have been lost. I am not yet dead, but, for example, had I taken a non-research industry job the pages would still have been removed.

I used to find it hard to believe that serious computer scientists, accustomed to worst-case analysis, would even dream that author-maintained individual homepages are a serious way of distributing research into the future. Now I understand that it is mostly a generational issue.

Well, I am pretty sure that the way the professors involved look at the subject is not as "losing rights", but more as "getting an excuse to not give away a right that otherwise they would give away because they don't care enough about it to make the effort to go against the system". And it takes more effort to go against the system in most fields that are not CS, I believe.

http://publicationslist.org/ seems to be a decent manager for publication lists.

For the more tech-savvy, consider generating your publication list dynamically and directly from BibTeX (if you have it, and you maybe should), e.g. with bib2tpl (shameless self-plug: http://lmazy.verrech.net/bib2tpl -- new version and Wordpress plugin upcoming)

As for the long term, Google seems to index publicly available material just fine. Do they cache? My own work seems to be linked directly.

It seems we are speaking about different things: access to publications THEMSELVES (important!), and access to their bibliographic details (important, but much less). The question here is: WHO is the owner of his own intellectual property - the author or these "fat guys"? The problem is how to make ALL papers available to ALL fellows, not just those who (happily) have rich bosses (institutions).

The very strange "give up all your copyright" rule of publishers is the first thing to attack.

And why can't they make papers freely available at least after a 1-2 year embargo?

Well, this cannot be done on this blog alone. But the more publishers hear about our dissatisfaction, the better. Thanks, Lance, for touching such hot questions.