Recently I encountered a new phenomenon when I tried to submit a paper to arXiv. The paper was an erratum to another, already published, paper and will be published separately. I got a message from arXiv saying that I needed to join the erratum with the original file. I was a little surprised to receive a reply from what was obviously a human being. Although I thought the request was a bit silly, I did what was requested, submitted the joint paper (the original union the erratum), and forgot about it. But today I got a call from another mathematician. She tried to submit a paper with the title "... II". The paper "... I" was already in the arXiv and had been submitted to a (very good) journal. Both papers solve similar but different problems. One of these problems is at least 40 years old. Her submission was denied: she got a request from the arXiv to submit a union of the new paper and the old paper instead. This is quite silly. Is there now a special person at the arXiv who is making these decisions? It looks like there has been a change in how arXiv is managed. I understand that this is not a research question, and I have made it community wiki. I post it here because several frequent MO users are affiliated with arXiv.

I vote that this question remain open. Certainly arXiv policy is directly related to the work of research mathematicians.
–
Theo Johnson-Freyd May 13 '12 at 2:44

@Felipe: arXiv is undoubtedly the most important tool for mathematicians after a computer, and any changes in its management are important too. If arXiv is slowly turning into a mega-journal, we, the mathematicians, need to know about it, and that change cannot occur without an open discussion.
–
Mark Sapir May 13 '12 at 2:47

@Mark Umm... I feel pretty strongly on this point: I'm not "in charge of the arXiv". I'm a math moderator, and chair of the math advisory committee. That is very far from being in charge.
–
Greg Kuperberg May 14 '12 at 3:36

It is not appropriate to use the word "bitching" in a professional setting.
–
Noah Snyder May 14 '12 at 5:16

@Mark: I believe the text overlap is checked (at least at the first stage) by machine. Paul Ginsparg (the original person behind arXiv) and his collaborators had a paper about it, see arxiv.org/abs/cs.DB/0702012 .
–
Yuji Tachikawa May 13 '12 at 2:18

I took a quick glance at the two papers and the overlap appears to be that both discuss the same result due to "E. Study." This discussion is exactly the same (word for word), and goes a little beyond a formal statement of the theorem. But I do wonder how sensitive their software is - will statements of standard theorems be flagged if the theorem has been stated the same way in another arXiv paper?
–
Dan Ramras May 13 '12 at 2:29

@Will: Thank you, it is interesting! So I guess the process is this: the software finds out that papers "... I" and "... II" overlap, and sends this to a human being. The human being then decides that this is not good and asks the author to join the two papers. In that case I would like to know who that human being is (because (s)he makes important decisions concerning important papers).
–
Mark Sapir May 13 '12 at 3:21

It also just occurred to me that if arXiv decides that there is an overlap when there isn't one, it may ruin the careers of the authors, and (at least if the authors are in the US) result in a large lawsuit against the arXiv. So the admin who makes these decisions should be quite good at it and probably well paid.
–
Mark Sapir May 13 '12 at 3:32

I'm still the chair of the math arXiv advisory committee, which admittedly hasn't done a whole lot lately, and one of the global math moderators. No, there has not been any dramatic change in the management of the arXiv at Cornell. If anything, I wish that by now more might have changed. The arXiv has always had the bare minimum funding, sometimes less than the bare minimum. They have never had polished public relations to properly explain small changes in policy. (Actually even wealthy Internet companies sometimes stir up confusion when they make changes.)

At some informal level, they/we have always worried about duplicate submissions, and near duplicates, and errata posted as new papers. And yes there is a new text overlap tool to detect both plagiarism and self-plagiarism. There is no good, rigorous way to draw the line for any of these issues. (Just as there isn't at MathOverflow --- what exactly is an "exact duplicate" of a previous question?) Regardless, if your submission is rejected, you do have the right to "file" an appeal with the Cornell staff. If it is a plausibly sane appeal, then they should show it to the math moderators and/or the math advisory committee, more likely the former these days.

One perfectly valid consideration is to have the arXiv correspond to what is published in journals, although there are cases where strict adherence to that rule is untenable. For instance, my mother and I have a joint paper in the Annals of Mathematics that appeared twice just because the first time, the paper had TeX symbol encoding errors.

Also, I personally think that this posting is reasonable for MathOverflow. However, it would have been better with a less suspecting tone. The arXiv doesn't always make the best impression, but long-time users know that actually it has gotten better over the years. For a long time it had a reputation as a "user belligerent" web site. Even then, it was still a force for good, obviously.

You could claim that your paper with your mother was so good the Annals published it twice.
–
Chandan Singh Dalawat May 13 '12 at 7:06

@Greg Kuperberg: Will an author be notified and given time to appeal before an accusation of plagiarism is attached to their posting? I've seen the accusations in the daily mailings, and usually when I check the two papers in question they don't look (on a quick glance) to be instances of plagiarism. An accusation in a public forum like that could be a real disaster for someone's career. And certainly there is sometimes a bit of shared text in papers of mine (e.g., how many different ways can I discuss the Birman exact sequence in the "preliminaries" section?).
–
Susan May 13 '12 at 13:41

@Greg: It turns out that the author did write to the arXiv staff and received a truly remarkable answer. Basically they say that she posted "too many" articles recently "with similar ideas" (!!), and that a moderator (anonymous!) suggests joining the papers into one. The author, by the way, is a Distinguished Professor in Mathematics and does not have as many papers in the arXiv as, say, Saharon Shelah (or even myself). About plagiarism (or "overlaps without references" as arXiv puts it), I agree with Susan. That is a really dangerous thing, both for the authors and for arXiv.
–
Mark Sapir May 13 '12 at 15:51

@Greg: Thanks for the answer. If the software is going to leave a comment about there being shared text, is the author notified in time to withdraw the paper? Frankly, I'd rather not use the arXiv if I don't have control over things like this or don't have an opportunity to protest before the (non)accusation goes out on the daily mailings.
–
Andy Putman May 13 '12 at 18:13

@Henry and Greg: what I was thinking above is that people may write several papers that each review some of the same definitions and key past results in a preliminary section, and I was concerned arXiv might now be picking that up as "substantial overlap".
–
Patricia Hersh May 13 '12 at 23:19

I talked to the arXiv staff about Olga Kharlampovich's submissions and I now have some answers. The letter that Olga posted here is a form letter that doesn't fit the facts. The text overlap tool reported that the new submission substantially overlapped with the old submission. After that, as far as I know, no moderator and no advisory committee was ever contacted. Instead, an arXiv employee sent this stock response just to keep things moving. After that, I was told, her case was added to the to-do list. I was assured that as of last week, before this question was posted to MathOverflow, her submission was already slated to be reverted in her favor on Monday.

Obviously this is not satisfactory. I am one of the moderators (and not the only one) who should have seen the appeal. The e-mail said that someone like me had seen it and rejected her appeal, but apparently no such thing happened. It seems that the submitted version (which I think is now version 3) had something like 75% text overlap with the previous version (version 2) of arXiv:1111.0577. It's not so unreasonable to flag such a submission. After that it wasn't handled properly. I do not want to name names and lead people to pour opprobrium on the overworked arXiv staff. (There are only two of them who handle daily submissions.) But I want to make this story sound accountable, so I can say that some of my information came directly from Paul Ginsparg.

To go back to the title question, no there has not been any great change in arXiv management. You could certainly argue that there is insufficient management, but that's not the same thing.

People are also asking about the policy by which papers are labelled as having text overlap with other papers. A clearer statement of that policy would be useful, but that is a separate question from Olga's case.

According to an e-mail that I just saw, this morning Olga was given the option of reverting the previous arXiv paper to Part I and submitting Part II separately. Her answer, according to what I saw, was that she elected to keep it as a replacement after all. I am mentioning this so that readers who see arXiv postings this week won't think that injustice continues.

I stand by my explanation that the stock e-mail that she was sent didn't fit the facts, and that her appeal should not have been stonewalled. (In fact her appeal was soon seriously considered internally, but that was not explained.) However, in the original posting, Olga's name was withheld supposedly to protect her interests. Although I understand that anonymity is sometimes vital even in a public accusation, in this case I don't see how it helped matters.

@Greg: I have explained to you why I did not disclose the name. I repeat the reason: I did not ask for permission to do so. There is nothing in it about "protecting interests". It is basic human behavior (taught in preschools, right after potty training). In your answers, you did not say what the formal procedure is for appealing an arXiv decision. The first step is clear: a message to the admins. What is the second step after an automatically generated reply is received? A message to you? Or is there an intermediate step?
–
Mark Sapir May 14 '12 at 18:30

... But here's the thing: most of the arXiv problems I hear about result from the staff/mods doing too much, not too little. I'm talking about things like the case under discussion, or admins making changes to metadata against the author's will, or moderators reclassifying submissions. I'd have thought that in a situation where everyone's overworked, the default would be "don't intervene unless there's a very good reason", because intervention takes time - as does justifying it to authors, who may have valid objections.
–
Tom Leinster May 14 '12 at 23:26

@Tom Exactly as you say, the problems that you hear about. This impression is a result of selection bias. There is usually no public discussion when they intervene or don't intervene when there is a very good reason. But if they ever make a mistake and overreact, then specific people have an incentive to air their grievances in public. It's the same way with, for instance, flight attendants. (Or MathOverflow maintainers.)
–
Greg Kuperberg May 15 '12 at 3:51

... is it a new arXiv policy that material can no longer be moved (and not just copied) from one arXiv preprint to another (whether this involves division of one preprint into several or not)? If so, this is a huge change of policy in my eyes: I would then seriously consider refraining from any further submissions to the arXiv, and perhaps even "withdrawing" the existing ones in favor of homepage copies or some still surviving arXiv competitors, and hopefully I'm not the only one concerned with this. Could you please address these issues more explicitly in your answers in this thread?
–
Sergey Melikhov May 15 '12 at 11:14

(1) Mark is absolutely right, I can and do advise the people who run the arXiv, but I can't really speak for them. (2) There isn't any "bug" in the text overlap program, which was written by Ginsparg himself. There was a "bug" in how humans made use of it. (3) Olga was asked on Monday what she preferred, by a human and not a robot, and she said that version 3 should be posted.
–
Greg Kuperberg May 16 '12 at 14:33

Hello,
I have paper 1 in the arXiv (it is submitted to a journal) and submitted paper 2 with completely new results (with similar formulations and referring to paper 1). I didn't want to change paper 1 because it is submitted, people refer to it, it makes a bad impression when new revisions keep being made, and the submission date is changed. The second paper was returned by the arXiv, I appealed, and this is their response:

Dear Olga Kharlampovich,

Our moderators have considered your appeal and maintain that your article is not appropriate as a new submission to arXiv. The new ideas should be incorporated into a replacement of your existing article.

In general the maintainers of arXiv choose to exercise very limited control over submissions; however, we do want arXiv to be as useful as possible for all of the various communities publishing here.

A moderator noticed that you have submitted several articles in a short period with similar ideas and content. After a discussion of your submissions among the other moderators and members of the advisory committee, we have decided to ask you to consolidate articles with similar content, or which are variations on the same theme into single articles.

This will be more efficient for the whole arXiv community, and may be beneficial to you as well. In consolidating your work you may find that you can more clearly elucidate the connections and expose the underlying principles so that your ideas will be more useful to others.

--
arXiv moderation

Let me add that the "several articles in a short period" were these Articles 1 and 2. The first one was submitted in the Fall, and the second in May. I incorporated them into the same article now, but I think this is silly. What is going to happen if we get new results on a similar topic?

I'll look into it. It would be helpful to know the date of this correspondence.
–
Greg Kuperberg May 13 '12 at 16:27

I find this decision by the arXiv, as set out in their response, somewhat alarming.
–
Yemon Choi May 13 '12 at 16:30

@Yemon Maybe so, but I would like to first find out what really happened.
–
Greg Kuperberg May 13 '12 at 16:33

If actual humans are involved, they should presumably be sufficiently subject-aware to know that, for example, @Olga Kharlampovich is a distinguished mathematician, and so give her the benefit of the doubt. This is particularly weird since in physics (which probably has two orders of magnitude more volume) there are plenty of papers which are identical to first order.
–
Igor Rivin May 13 '12 at 20:41

@Igor In fact, the math arXiv has had impressive growth and now receives more than 1/4 of total new submissions to the arXiv, so it is certainly not the case that physics has two orders of magnitude more volume. Also, as recounted elsewhere, Olga was quietly given the benefit of the doubt despite miscommunication with her. She is indeed distinguished, but that's not the reason that her case was reviewed last week. The ideal would be to give everyone the benefit of the doubt.
–
Greg Kuperberg May 15 '12 at 4:59

Let me add that the "several articles in a short period" were these Articles 1 and 2. The first one was submitted in the Fall, and the second in May. I incorporated them into the same article now, but I think this is silly. What is going to happen if we get new results on a similar topic?

Thank you for mentioning the date ranges, that's helpful. Otherwise, this "answer" in the thread should be combined with the other one.
–
Greg Kuperberg May 13 '12 at 16:35

Welcome to MathOverflow, Prof. Kharlampovich. Any question or answer you post (but not comments) should be editable by you. For matters like this, you are encouraged to edit your answer rather than submitting a new answer. In future (and because MathOverflow submissions are much shorter than arXiv submissions) I hope you will make appropriate use of the edit feature. Gerhard "I Know; Still, Please Edit" Paseman, 2012.05.13
–
Gerhard Paseman May 13 '12 at 16:38

Since this is CW, I appended this remark to the other answer. This one should be removed.
–
Greg Kuperberg May 13 '12 at 21:20

I hope Prof. Kharlampovich has noticed the respect with which the request for future edits on MathOverflow was made. I even put in a sympathetic signature. If she finds it humorous as well, so much the better for her. Gerhard "And Better For Us All" Paseman, 2012.05.13
–
Gerhard Paseman May 14 '12 at 2:24