Inspired by a discussion elsewhere, I’d like to start an open thread about the pros and cons of posting a paper to the arXiv before it’s accepted by a refereed journal. To get the convo going, here’s my summary of what came out of the previous discussion:
Pros to posting before acceptance:
- Problems and omissions get caught before publication.
- More people have a chance to “referee” and give feedback and the published paper might be better and richer as a result.
- Results get out to the community faster.
Cons to posting before acceptance:
- Wrong results get circulated and could possibly never be corrected or retracted.
- Could end up with several very different versions of the same paper in circulation, resulting in confusion.
- It’s possible, but not confirmed, that NASA HQ will not issue a press release about a paper that has been put on arXiv and later accepted by the journal. The reasoning is that since the paper is in the public domain, the story is already out there. This essentially results in an official policy that precludes one from posting before acceptance. (Can anyone confirm this?)
- Some people will not referee a paper if they see that it’s already been posted to the arXiv.
I personally would like to post to the arXiv at the same time as submitting in order to get the additional scrutiny and feedback from my colleagues. However, maybe an alternative is to simply circulate the draft to the ~10 people who are most likely to have something to say?
What do you think? When do you think we should post papers to the arXiv and why?
(As a sidenote, I have noticed a near universal switch to writing “arXiv”, but saying “astro-ph”. Neither here nor there, just something I noticed that might confuse the young’ns.)
Provided that you are also planning to submit your article to some of the refereed publications, you should post in astro-ph ONLY after the acceptance of it in the refereed journals. Otherwise it’s kind of cheating to increase your paper count. If you want to circulate your paper to some people just put it in some “dropbox” and send the link to your loved ones.
Ideally I would opt for abandoning all the journals and making astro-ph submission coupled with a refereeing process. In that way we submit to only one place, get evaluated, say, by a bunch of “nice” scientists and get published electronically.
Circulating to people you consider interested was the way “preprints” were done in the pre-internet days. All in all not a bad system. I am pretty certain that it is not done to post if you want any kind of media blitz to go with it since most journalists a) get your blurb in advance but with b) an embargo deadline and a disclaimer as to when they can publish this.
So if any kind of media attention is required, it is poor form to post. Otherwise I’m starting to lean towards posting before acceptance as hopefully, one will be able to incorporate more feedback from the community.
*still posting only after acceptance. Was told as a grad student it was bad form to do so before*
I agree completely with the ‘pros’ here – and I sometimes find it frustrating to spot (small) errors in people’s papers, but know that it’s futile to point it out to them since the paper’s already been accepted and/or published by the journal.
A lot of papers nowadays have large coauthor lists – and have most likely gone through an initial set of internal peer-reviewing – so it’s generally a lot less likely that problems will be raised with them in the journal’s peer review process. Single- or few-author papers are a bit different, in that they won’t have gone through this process, so might be a bit more suspect. That’s the sort of information you can take into account when looking at the paper and deciding whether it’s trustworthy, though.
It’s a shame that the arXiv doesn’t have an option so that you can choose whether to read papers only after acceptance, or when they are posted on the arXiv – or at least make it so that you can clearly mark the paper as pre-acceptance – such an option would make the whole argument null and void.
I don’t get Tigran’s point about “cheating to increase your paper count” – since people should either look at the number of peer-reviewed papers you have, or take into account “in prep” papers (as are so often listed on people’s CVs when applying for jobs..) analogously to those available on arXiv…
There are many reasons we read papers. Sometimes we want to learn about a new subject in which we are not experts, other times we want to place our work in context with a larger field, and often, we read papers directly relating to our area on which we can comment expertly*.
I have no problem with papers in the last category being posted to arXiv. I can read those papers with the most critical eye, spot mistakes, contact the authors and alert them that they have not cited me 😉 etc.
But for all the other categories, I want my papers nicely vetted, thank you very much.
Unfortunately, not everyone works in my specific area so my selfish desires cannot be applied to everyone.
That is why I firmly believe in the peer review process — however flawed. And it *is* flawed. Perhaps we can add to this discussion (or discuss in parallel) ways to improve the peer review process so that we do not feel the need to bypass it via wiki-referee on arXiv.
*Not an exhaustive list for why we read papers, but I think those are the main categories.
let’s hold off on a discussion of ways to improve the peer review process. I’ll put up a new post next week about it. Someone actually just sent me a nice article to help motivate that discussion…
I think the cons in this situation vastly outweigh the pros. I think it is very important that the paper that comes out on arXiv is free from incorrect data / plots / tables and free from biases that should be removed by the referee.
If you’re collaborating with several people you should have already gone through a fairly rigorous peer review process. Especially if the paper is part of a larger collaboration, that has to ‘sign off’ on it. Regardless, using arXiv as a means of peer review is an awful idea because the cons you listed are very detrimental in this day and age of google searching and ‘plot propagation’ across talks and presentations (by this I mean people re-using plots over and over without checking for updated versions).
Most people only read the arXiv mailing lists and such; and rarely look at updated versions. So let’s say I find an interesting paper, print it off, read it, and leave it sitting on my desk. 5 months later I need a paper on this topic; it’s up to me to check for an updated version, but is it necessarily true that everyone will do this?
I think the real issue is a lack of responsibility by authors, and editors and reviewers. Authors can simply ask for a different referee, and be granted one, without anything but the discretion of the editor. I know I personally wouldn’t want to post a potentially flawed version of my paper up for people to use against my work later. And it WILL happen.
Sure, we all understand the research process. But how long until a competitor says ‘So and So et al. now claim the same results, though a previous version of their paper claimed something else. So this essentially discredits their work.’
If we want better papers, we need to improve the official peer review process itself, not use arXiv as a substitute for professional responsibility.
Circulating to targeted individuals is better, particularly as it helps distinguish pre- from post-refereed versions.
Even there, one can run into problems. I once sent a paper to someone for pre-referee comments; the response I got was a preprint, six months later, criticizing the pre-referee version for content that wasn’t in the published version. Not good! That was in the days of paper pre-prints and perhaps the recipient didn’t realize this was a pre-referee copy. By handling pre- and post-referee versions differently you can perhaps avoid such confusion.
The older I get, the more the pros outweigh the cons (after having been preached the cons for many years).
One point made me laugh:
“Wrong results get circulated and could possibly never be corrected or retracted.”
This is almost a joke because the refereed literature is *full* of wrong results. One could spend an entire career rebutting wrong results in the literature, and do nothing more than be cleaning up after sloppy scientists. The refereeing process as is, relies on an overtaxed editor doing the best they can, often acting as editor for papers that they are not experts on, and relies on their judgement of who constitutes a proper referee (which works most of the time, but not always), and then of course relies on the quality of the referee, their judgement, and their ability to write a useful review. We’ve all had really good/useful referee reports and really bad/unhelpful referee reports. One argument for putting the papers on the arxiv *first* is to help open up the review process to the “cloud” of scientists – and those that are really interested in the paper might be able to send some feedback before the paper gets published (in principle, this is similar to emailing a pre-submission draft to ~10 of your colleagues and asking for input, which also is helpful). The author can obviously filter out what comments they deem helpful or not helpful. But overall, this process should lead to higher quality *final* publications, which is a good thing. There are so many *bad* papers out there (moreso in certain journals than others), where there has been a failure by the coauthors, the referee, and the editor to question the assumptions and analysis that went into the work. The classical view is that work can be easily rebutted by a response in the form of another refereed publication – but we don’t have infinite time for doing this for every instance of interstellar baloney that we detect (so usually only the most important papers have any serious challenges in the refereed literature).
[snipped off-topic bit about citation. That’s a discussion for another day, probably Friday. –kelle]
“[snipped off-topic bit about citation. That’s a discussion for another day, probably Friday. –kelle]”
As much as I’m a fan of this site, I do not support this site’s editors when they decide which comments are off-topic and selectively edit them.
“[snipped off-topic bit about citation. That’s a discussion for another day, probably Friday. –kelle]”
Ditto to what Ian said. This is a really surprising edit to see.
Many people post to the arXiv at submission because they’re afraid, if they wait, that a competitor will “scoop” them or claim precedence by earlier posting (then they have to cite the competitor in the final version and the competitor doesn’t, even if the refereed versions of the papers actually appear at the same time). Ultimately, scooping rarely matters unless it’s an extremely high profile result, because nothing we do is really so important that it matters whether one set of photons emitted millions or billions of years ago are published three months before another.
However, it’s a classic Prisoner’s Dilemma problem. You have to post before submission because your competitors might, and vice versa. This is one thing I don’t like about posting at submission. You’re making the choice not only for yourself, but forcing the same choice on the others in your field, even if they would prefer to post on acceptance.
If we have learned anything from open science, we have learned that the most helpful commenters and readers are invariably *not* among the 10 people you hand-pick. So send it to the whole community via arXiv and have them understand it is submitted not accepted! I think we are all adult enough to use that information responsibly.
And as Eric and others say, since when does the single referee’s eyes catch all the mistakes? If you want your published papers to be correct, you are *much* better off putting them on arXiv early and often! Like Eric, I laugh when I hear the idea that waiting until refereeing makes sure your results are “more correct”. Actually, it makes sure they are less correct, because fewer people have vetted them.
I think the cons here far, far outweigh the positives in most situations. While I agree with Eric and David that the literature is full of wrong results, and that refereeing is not a panacea, I think the chance of a wrong result coming out and/or being cited is much greater if papers are posted on arXiv before they are accepted. At least if your paper is accepted someone (sometimes several people) has been specifically charged with going through your paper with a fine tooth comb, which — if they’ve done their job right — is more than 98% of people reading a paper on arXiv will do. That’s not to say you don’t get helpful comments from people reading your paper on arXiv — you do. But I’ve found those happen much less often than helpful comments from a referee, and moreover, I feel strongly that once a journal article-to-be is posted on arXiv it should be good enough to cite.
Unfortunately, contrary to what certain people on this blog have said, one runs into seriously flawed papers on arXiv reasonably often — and much more often when they are posted before acceptance. Yet they get cited in the literature… is that right? And then they are used as priority claims when a discovery is made. That’s not right either. Yes, certain large collaborations have strict internal refereeing rules. But many of us do not work in large collaborations. And many of those same large collaborations have the rule that papers are not to be posted on arXiv before they are accepted, for exactly these reasons.
isn’t this about culture?
– the flawed papers garnering citations are more likely garnering self-cites because astronomy citation practices are pretty darn insular.
– you assert that you get useful feedback from a referee but not from arXiv readers because… that is what our current culture deems normative. if we valued feedback along the lines of a formal post-publication peer review model then I think you would be experiencing things differently.
as for your rather strong assertion in post #13 that:
i have to say that as some one very interested in the role arXiv plays in “scholarly communication” i would appreciate working with someone (yourself?) in quantifying these examples. even a few examples would probably give great insight into the cultural antics of these authors and help inform those of us who want to see arXiv used in other ways.
Mostly for the first two reasons in Kelle’s list, I believe that the cons outweigh the pros of submitting before acceptance. I’m proud and grateful that the physics/astrophysics community has a free, widely-used service like arXiv to circulate our work. Even in my (relatively) young career, though, I’ve had papers change by enormously significant amounts between the first submission and eventual publication (including feedback from referees, editors, co-authors, and my own revisions). For me, the benefits of getting a paper out a month early don’t outweigh the serious danger of incorrect results being circulated. I agree that the peer-review system isn’t perfect, but it’s far better than making our results completely open access.
Like the good scientists we all are, it would be very useful to have some additional data to support our various opinions. Is it possible to study the number of astro-ph papers that have been retracted/revised and see whether there’s any relation to posting before acceptance? If anyone has specific examples — eg, “I read an astro-ph paper and cited it, and later found their results changed” — it would be useful to know, even in a qualitative sense.
Given that anybody can submit absolutely anything they want to the arXiv, I would much rather rely on a flawed refereeing system than on no filter whatsoever. The only way I will cite an astro-ph paper is if it has been accepted and is waiting for a volume number. For me, if a paper has not been accepted, it can go from random gossip to great work, while accepted material tends to be more serious on average (with certain exceptions, of course).
In my personal experience, more than half of my papers have changed a lot (always improved) from submission to acceptance. I don’t want the worse version of my paper to be the first one to see the light and be remembered by the community. Also, the one paper that ended up as a press release, was chosen as such way after submission. I’m sure that if I had put it on the arXiv, there would be no press release.
I absolutely agree with this comment.
Most serious scientists would always try sending their work to journals that are reputed. These journals are reputed only because of their rigorous peer-review system and because they publish only high-quality outcomes. If I am submitting to a new journal, I always check its editorial board and review system to be sure that it has the good people in the field on board and a review system that ensures valuable feedback about my work. As in the cases mentioned here, even my papers have undergone significant changes (improvements) from submission to final acceptance. This is certainly because of the rigorous peer-review system of journals. And therefore, I would rely more on submitting to arXiv after acceptance rather than upon submission – I would not like the community to remember my work that has flaws and has not been peer-reviewed.
ArXiv might be an option to get feedback about your work prior to submission. But such feedback can be obtained by sending your work to ~10 of your friends working in the same field.
I post only after acceptance. I’ve had a paper go through a 1.5 year reviewing process, which was largely my fault but I still wouldn’t have wanted the community citing the version that had very different results and emphasis than the final.
An important reason I don’t like to see arXiv papers pre-acceptance is that this is an easy way to say “We have released the data…” when in fact the attached data will not be available from the publisher until (long after?) acceptance. This is currently an implementation problem more than a philosophical one – if there was a reliable place to post data concurrently with arXiv posting, it would be less of an issue. I know folks are working on such a solution, but until it’s available, I don’t want to see papers on arXiv claiming to make data available when it is not.
Also, Kelle, thanks for the arXiv – astro-ph comment. I would certainly have been confused a few years ago.
An easy way to solve that problem is to upload the data to your institution’s website (or the collaboration’s) – there’s no need for the journal to be the only ones hosting it (and it’s often easier to access on a non-journal website too, depending on the type of data).
re:Mike – I agree, to a limited degree. I don’t post data on my institution’s website because I know that my site will be deleted within a few years since I’m not tenured faculty. Putting that sort of a link in a paper is unreliable. I haven’t encountered any collaborations with permanent websites; the main collaboration I’m on has all of its data hosted at IPAC.
More important, though, is electronic-only data tables that really deserve to be on, say, Vizier or some other catalog service. I’ve seen many papers with large data tables saying “the full version is available in the electronic version….” which doesn’t become available until months later.
adam is definitely on to something here. this drives me crazy. do people know that you can post all kinds of ancillary “data” into your paper on arXiv? not that i’ve ever found a single paper example in astronomy where someone has (and i’ve trolled the sources looking).
there is no reason for the data to live on a soon to die home page via a soon to break link or for you to wait for the post print version to push it out there just because its a long table. unless there is a reason and it has nothing to do with access.
This is not that hard. When I make a machine readable table for a paper, I include it in the astro-ph tarball. I also have put the tables on my web page so that they are available (and findable with a Google search) after acceptance and astro-ph submission, but before the paper appears on the ApJ website. Yes, my webpage will die eventually, but by that time the table will long since be on the ApJ site and ingested into Vizier. If you really need a webpage that will be around longer, there are ways to do that (sites.google.com?)
Data is released when it is freely downloadable. If you want to make it available, there is always SOME way to do it. Most of the roadblocks are in people’s heads.
re: August – I think I encountered the arXiv ancillary data thing once, but was off-put by the size limitations (I wanted to upload a data cube… not a chance). Still, I doubt many folks go searching through arXiv’s capabilities when posting.
re: Ben – I agree, it’s not hard to find sites to host data, and I like the idea of finding a holdover storage site while the journals are busy formatting the data. If we could make early-posting common practice, it would dampen my objection.
As for sites.google.com, I don’t think google is supporting it any more? And that’s the problem with nearly anything on the web… not much lasts as long as academic careers. The most impressive example of longevity I’ve encountered so far is Bruce Draine’s website, which is cited in books at least a decade old (older?) but every piece of code and data is still available at the same url. I would like to see that kind of longevity available to all astronomers.
these are good points; i think that we were responding to observation that often, when a paper contains an extended table, the extension is not in the arXiv version.
i guess your point is that it might well be there but only if you download the original source tar ball and unpack it.
and my ancillary data in arXiv example was that one can force arXiv to expose your unlisted .tbl file in a unique way. Like on this page: http://arxiv.org/abs/0905.2326v1 on the right hand side under “ancillary files”. these do not count against the PDF size limit people complain about.
just like you said though — the arguments are often in people’s heads. the examples adam and I are thinking of might be cases where the authors excluded the tables because they didn’t know arXiv had this option or that they thought it would count against their upload limit.
My solution in that case is to leave the text in the paper as is (which mentions that the full data are available in the electronic version), but in the Comment section of the arXiv submission to say “The full data tables are available at http://blahlblah.edu/mywebsite”.
I don’t understand the issue – ultimately, arXiv papers almost always have an indication of whether the paper is accepted or just submitted, and whether it is to a refereed journal, or to a conference proceedings, so readers can just read papers with the appropriate pinch of salt, right? I choose not to upload papers to arXiv until they have been accepted, but I have no issues with reading papers I know have not been accepted yet.
In any case, as several people have posted above, there are many terrible ‘accepted’ papers out there!
this is pretty awesome — so the summary is that posting a paper to arXiv at submission is bad idea for you because
~~a half anonymous read over by a so called referee~~ the peer review process is necessary to catch your mistakes, bad plots, tables, data etc.
because what you should do wrt arXiv was the original question, not one about everyone else’s retching of papers on to arXiv or in to the journals or the ratio of refuse between them.
personally, the only “con” i consider to be serious is the one about versions. posting to arXiv at all might be not a good idea given that some journals prevent you (sic) from syncing their version with the open version; not that I’ve responsibly tried to keep them in sync. ever.
Do you take the referee’s role this lightly? I assure you that when I referee, I do not. The referee’s job is not to read over the paper, it’s to go over it with a fine tooth comb. That role is essential to astronomy, and in my view it is important that this process is complete *before* a paper (as distinct from a proceedings article) is posted to arXiv.
The cons have nothing to do with ‘retching’ papers to arXiv, what they have to do with is the fact that once a paper is posted to arXiv most regard it as being published. A citation to arXiv is perfectly legitimate in our field, as it should be — but we both know that a citation is far stronger when the paper is accepted than it is when the paper is merely submitted.
Let me ask you a hypothetical… paper X makes a claim to a discovery, and is posted to arXiv on submission by its authors. It is found to have serious problems by the referee. Meanwhile two months later a much better paper is submitted by a different team that makes the same claim, but this time with much better support and it sails through. Which paper should be credited with the discovery? I think the only defensible answer is that the second paper gets priority — they did the job right, had the better data and their paper got ACCEPTED first. But far too many times I have seen the first paper’s authors claim priority and then that gets spread throughout the field.
i do not take my role as a referee this lightly; i assure you that i do not remember commenting on my own actions as a referee. yet my characterization of the peer review process is the only valid prior i can think of to explain the quality control that published materials receive.
in fact i am squarely on your side about our job as referees. it troubles me that i take days upon days fine tooth combing a paper for bad plots, tables and simple errors because it wasn’t treated to the level of respect it should have been before it was submitted to whichever publisher is not named arXiv (bc I don’t review arXiv papers. is there an editor for that?).
this discussion is really about culture — whether or not we would or wouldn’t cite arXiv at all, about how well (or not) we do our jobs as peer reviewers sans any meaningful measure of peer pressure or attribution/blame. if our culture accounts for ‘publishing to arXiv’ and if we really can value load citation with something more than a counter increment or an unclicked URL, then we can deal responsibly with arXiv publication in advance of or maybe in place of acceptance.
as for your hypothetical, I repeat my earlier admonition that the question is about what you or I do as authors with our papers. your hypothetical treatment of bad papers written by bad authors caught by a good, responsible referee foiled by arXiv is precisely that — a hypothetical. i understand the problem you are trying to illustrate — the circumstance of misattribution of value to whoever published whatever first — but your hypothetical forgoes the slightly more probable reality of the first paper’s referee who is also the second paper’s first author.
what is worse, much worse, is that we actually do not need an arXiv proxy for the circumstance of misattribution of value to whoever published whatever first; it happens all the time… in the journals… precipitated by a substandard refereeing process.
One benefit of the traditional peer review process is that authors actually have to respond to the criticism of their readers, the referee(s), one way or another instead of just ignoring them. I would be much more in favor of papers posted on arxiv without prior refereeing if there were some mechanism where, say, readers could post comments or criticism of the paper and then the authors *had* to update the paper in response. That would prevent people from just posting half-baked work and then leaving it sit. And likewise, there might need to be some incentive for readers to do the really nitty-gritty detailed reading and commenting that a good referee can. (“you must post detailed critiques of at least two other papers before you can submit your next paper”?)
Like them or not, deadlines are a tool for focusing the mind, and likewise having formal structures around the refereeing process can be a motivating factor to make your paper better (rather than your just going on to the next project sooner, say).
On a semi-related note, I wonder what the community would think about making the referee process much more open. In this day of Wikipedia and open-source software development, why not approach academic writing with the same level of openness? Give every paper on arxiv a ‘comments’ box, and have the comments provided appear right there with the paper, and the authors’ responses along with ’em… If done right, this might also help provide checks and balances on overzealous (or conversely lazy and/or just plain bad) referees, too.
I have pretty strong opinions on this topic: I think that posting your paper to the arXiv at the same time you submit it to a journal is an _excellent_ idea. In fact, I see almost no downsides to it.
Note that I’m pretty specific about exactly what I’m advocating: posting only when you and your team are confident enough in your results so that you submit to a journal at the same time. I agree that the arXiv should not be where crappy draft manuscripts are incrementally improved version by version before submission to a journal or (worse) otherwise abandoned.
If you do this, I guarantee that your papers will be better for it.
Here is an example that happened to me that sealed my opinion.
When I was a postdoc, I was working on a very hot topic that had just exploded on the scene: the Double Pulsar. I managed to take some excellent very high quality data on the system and I found a result which was shocking. My collaborators included some of the foremost pulsar people in the world and we could find nothing wrong with the data or the analysis. So I wrote a quick ApJ Letter and posted it to the arXiv at the same time I submitted it to ApJ. Within 24 hours, I received two emails from other pulsar experts who suggested that what I was seeing was a particular data analysis issue that you would only see with such a relativistic system. They were correct. I immediately posted a letter to the arXiv (v2 of the paper!) explaining what had happened and the true reason for what we saw (which no one has seen in any other pulsar before or since), I thanked those who pointed out the issue to me, and then I retracted my submission from ApJ.
This was, of course, very embarrassing to me at the time: heck, lots of members of the pulsar community still jokingly call the data analysis issue the “Ransom Effect”. However, it was _highly_ valuable both to me and the community. Given that almost no one in the world knew about that potential issue, a random referee would almost certainly have accepted the paper (especially given the expert co-authors I had), and I would have had a major mistake in the literature. In addition, the whole pulsar community became instantly aware of the result and learned about the potential issues involved. So while my pride was hurt, things turned out much better this way in the end. You can still see both versions on the arXiv (it never forgets!).
This anecdote (and yes, it is only anecdote) exemplifies, though, everything that I think is good about early scientific discourse. We correct each other’s mistakes quickly and science improves as a result.
I don’t see why we should be worried about the arXiv clogging up with papers because of this behavior. Since most journal-submitted papers end up getting accepted eventually, as long as you submit to both arXiv and a journal and then update your paper to the accepted version when it happens, we shouldn’t get extra bogus papers because of this practice. In addition, if we hold ourselves and our collaborators to high standards for our papers in the first place, there would be even fewer “bogus” papers to contend with.
As for the argument that doing this can cause you not to get a NASA press release, that could be true in some circumstances (I’m not positive). However, I know for a fact that I’ve had a NASA (Fermi) press conference on pre-accepted or even pre-submitted results before. In addition, some journals have embargoes that can sometimes apply to posting to arXiv (but note that they often do _not_ apply to communications to colleagues but only to communications with the press). So you do need to take these things into consideration carefully. (We should remember that we scientists hold the power here in the long term, though: we are the ones on the NASA committees who can change these policies if they don’t make sense.)
Finally, I completely agree that real, careful refereeing is a crucial thing for science, but it’s not a panacea. I’m not arguing against it at all; if anything, I’m arguing for more of it from the arXiv. Have you ever sat in an Ohio State-style astro-ph coffee and heard how critical they are towards papers??
Bottom line is that I argue that for _your_ papers, the world will get a better product if you post to arXiv early. People _will_ read it. They _will_ be critical and let you know. And you _will_ make improvements because of it, including those that a single referee would miss. Finally, you’ll get more citations for doing it.
I believe this is a double-edged sword: people have good intentions, but it can easily slide into wanting to game the system so one can move to the top more quickly. I mean, just look at the effect of “the importance of being first” and the number of articles that are posted each day within the first 5 minutes of 20:00 GMT.
Yes, in the end the good work will shine through, but in the short term, “the importance of being first” (publishing an article a few months before acceptance) might just get you that grant, that observing proposal, that prof who comes to talk to you at a conference to offer you a job. You are posting about submitting to arXiv after you have submitted to the journal, but there’s an even more worrying trend of people posting “draft versions”. What’s next? Post up a successful observing proposal: the data is coming in, the results are mine?
I was listening to a podcast with Gary Taubes, a science journalist, the other day, and it is scary how he summarized the position of scientists nowadays from his point of view. I do not claim I agree with his view, but it was interesting to hear it. Here is an excerpt:
“Now there’s room for hundreds because of the blogsphere, because of the explosion in information. And the competition is to be loud and sure of yourself. That’s how you get to the top. And of course, you could still be right. I don’t want to suggest that being loud and sure of yourself proves you are wrong. It doesn’t. But it does change the culture of the profession. The very first book I wrote about physics, Nobel Dreams, the physicist I was writing about who ran this experiment was a very Machiavellian Italian physicist who–he taught at Harvard and he worked in Geneva and he commuted between the two weekly; and he had been on on virtually everything he had done. But he’d just about demonstrated how he could move up to the very top of his field by claiming a discovery and then kind of moving on to the next experiment and leaving better scientists to clean up the mess after him. And this has been a common theme in everything I’ve written about. In making declarative pronouncements based on preliminary data, and fighting viciously to get people to believe in you, you can do very well for yourself in these careers. Nobody moves forward by spending their career checking other people’s work to see if it was right or not. And actually one of my favorite lines from physics was from this Nobel Prize winner Sam Ting at MIT, who said to me, if I can get this right: To be first and right is good; to be first and wrong is not so good; and to be second and right is meaningless. That’s deep.”
I admit, I don’t go back to a so-so article months later to see how it was improved. I can hardly keep up with the daily mailing as it is. I just hope that all this rush to put out papers and results “first” does not devalue the quality of the data.
I should also point out that I have received good comments from people when I put my papers on astro-ph after they have been submitted, while they can still be revised before the final journal version.
An interesting discussion with some legs, since this is at least the third venue it’s jumped to this week with many dozens of thoughtful posts. Some observations. Whatever the wisdom of waiting until acceptance to post a preprint, unless arXiv itself were to enforce such a policy, some will jump the gun. At that point it is pure natural selection. If an early posting strategy preferentially pays off, then more authors will do it. Second, this is clearly an effect that varies between disciplines and likely between neighboring sub-disciplines. Astronomy as a whole appears to be very much in the vanguard here, but different topical communities will either randomly or on purpose make very different choices about the wisdom of early posting of preprints. And third, preprints of the refereed literature are not the only content appropriate for arXiv. “Priority” is not the only game in town. And researchers who aspire to tenure are not the only stakeholders.
Well, my experience has been that even the worst referee helps me improve a paper, even if it’s only forcing me to explain things in such a way that they are clearer. And, that is the version of the paper I want people to be reading. As Lee Anne notes, you want people to be using and thinking about the “final” version of the paper, not just something they found lying around on the internet. It would feel like cheating to me to post to astro-ph before having the paper formally accepted by the journal. I do pick out several people from the reference list whose work I cite, especially if I criticize it in any way, so that they can get in their two cents’ worth before the paper is published. That’s only fair. Should they read it on astro-ph instead? I’d feel like a jerk. I also find that many of my colleagues feel strongly enough about this subject to have adopted the same rule of thumb I have, namely that I will never referee a paper that has already been thrown up on astro-ph. If they have that little respect for the refereeing process, fine, let them “publish” on the web. But it doesn’t (and shouldn’t) count. IMNSHO.
Phil, the funny thing is that I like to post early on the arXiv because I have tremendous respect for the value of the refereeing process, not because I lack respect for it. And I don’t mind updating the version of my papers on the arXiv once official refereeing has been completed; I have no problem if people see the improvements or changes that were made along the way. That’s how science works.
So the list of papers here: Recent Astrostatistics Papers should mostly be ignored? I see.
while we are tilting at windmills, what’s up with this papers as “final” premise?
There is that fundamental difference: arXiv has revisions (and encourages us to make use of them), while a journal typically does not. I think that being able to correct a flaw or an error on arXiv is important and useful, since we all know accepted papers in journals that should have been revised.
I highly recommend Dava Sobel’s “A More Perfect Heaven”. On his deathbed, Copernicus was handed the proofs for “De Revolutionibus”, thus marking a rather final copy. Remarkably the manuscript itself still exists – basically a nearly 500 year-old preprint. As with Galileo’s work, the Church provided referees almost as challenging as Phil has had to deal with. Will astro-ph’s seven-journals-you-can’t-say-on-TV continue to fill the same role in the astronomical community of 500 years hence? Or will complete online chaos/nirvana rule? One suspects that the reality will fall somewhere between – but that Sobel’s numinous books will still be “in-print” in some format.
every time i hear this print/pre print anecdote i have to respond with this:
The Art of Computer Programming (TAOCP) by Donald E. Knuth.
yeah, the web is a ridiculous place to publish. the rights and responsibility are overwhelming. 😉
I’m strongly in favor of submitting to arXiv and to a journal at the same time. Doing it this way implies that my co-authors and I are confident that our work will survive the critical assessment of the referee AND the community. In turn, as a reader, I don’t mind reading papers that have the “submitted to MNRAS/A&A/ApJ…” tag line; if I think their results make sense, I would also not shy away from citing such a paper.
As several people have said before, there is the chance to get feedback on your work before it gets accepted by a journal/referee. If a paper is changed between its submission to arXiv and its final publication, I don’t see this as a problem per se; on the contrary, to me it shows a healthy scientific process. That doesn’t mean you are free to submit any rubbish to arXiv, because you’ll ruin your reputation if you do this too often.
A few thoughts…
First, I think there is a very serious generational divide here. In my experience, older scientists often view arXiv as a kind of global version of the rack of re-prints in the department library, and therefore consider that what goes on it ought to be accepted papers, while younger scientists see it kind of as a blog, which you expect to attract comments and corrections that act as part of the refereeing process. And that explains why there are such strong feelings: there are fundamentally different views of what it is.
Personally, I don’t have a hard and fast rule – it varies from paper to paper. Sometimes there are external reasons for immediately needing to have a globally-accessible document that you can refer to in another document. Sometimes you feel hesitant about a result and need to hear what an anonymous referee has to say before you’re willing to have the wider community see it. Sometimes you think the wider community will act as a much more useful source of comments than one random referee. Sometimes you are submitting to a journal that forbids early posting. The pro/con balance is not the same for every paper.
I can’t say that I (as a younger scientist) have ever viewed the arXiv as a blog – most of what I know about astronomy originated on the arXiv. And on the contrary, in my experience graduate students are more hesitant to post to astro-ph pre-acceptance.
On a similar note, I do wish there was a certain amount of vetting on astro-ph. As a relatively new graduate student, I often find myself exploring very unfamiliar subjects on the arXiv and would appreciate outside verification (i.e. journal acceptance) that the paper I randomly picked up isn’t complete garbage. Obviously, I’m developing my own filters as I gain experience, but when I first started research as an undergrad I had very little on which to base my opinions.
Ellie: I’ve talked with quite a few graduate students about this and what I’ve heard agrees with what you say: grad students tend to be more hesitant to post early to the arXiv. Is it possible that that’s because advisors typically walk students through the process of producing their first couple papers, and when it comes to arXiv submission they recommend or even direct the student not to submit early? And perhaps give them the “cons” listed above? I know that I’ve seen that to some extent. I’m just wondering if it helps reconcile what you and Jeremy have suggested.
Jeremy: I disagree with the generational divide, and also with the notion of the younger generation using arXiv as some sort of blog. In my experience, it’s mainly the approach of your advisor that shapes your own. In the group where I did my PhD it was common practice (but not a general rule) to submit to arXiv early, so I followed this example and was never disappointed. I also know students whose advisors were reluctant to submit early (as Scott mentioned). But the age of the advisor was not the driving factor; I’ve seen examples of either approach from younger and older researchers alike.
Interesting! My perception was based on postdocs and younger faculty, rather than grad students, so it may be that there are actually 3 different generations: the two I was thinking of, followed by grad students who are more conservative in this regard because they are not yet confident in their abilities. It would be interesting to do a poll and find out people’s perceptions as a function of their career status.
I also disagree that this is a generational thing. Peter’s hypothesis seems more likely, and is certainly true for me. My advisor was very cautious and only submitted after acceptance, which is what I’ve done ever since. I’m still relatively young (first postdoc), and definitely do not view arXiv as a blog. For one, it’s acceptable practice to cite arXiv papers in ApJ, but I think they’d frown on citing random-blog-on-the-internets, or, you know, Wikipedia 🙂
There are plenty of examples of both, but I think the question is whether or not the younger folks *on average* submit before/after acceptance. I think Peter is probably right that graduate students almost always follow their advisors, but that doesn’t mean they continue to do so after they go on to somewhere else. I don’t know if it’s confidence in their ability so much as just that there’s no one looking over their shoulder anymore, so they’re free to do whatever seems right to them.
These are mostly testable hypotheses, though, so data would be very interesting… the question is: will the paper on the topic be posted on the arXiv before, or after acceptance? 🙂
Clearly there are both benefits and drawbacks, but one drawback I haven’t seen clearly articulated is that non-experts will read (and maybe cite) your arXiv paper.
Sure, there will be a few experts who can offer you great feedback on your astro-ph version (I’ve offered such feedback myself), but the vast majority of people who read your paper won’t. A lot of them will be observationalists/theorists reading your theory/observation paper, and will just take what you say at face value if it seems reasonable. They don’t have the expertise to correct you on your code/data analysis, the way a referee certainly should.
And when they cite you later on, they probably will look up your paper to see if there’s now a journal version to cite, but will they read the whole paper again? Unlikely.
Ignoring the fact that the OP was about what you do with your paper on arXiv, I have to say that I really appreciate the problem expressed repeatedly already about having zero domain knowledge when trying to understand unfamiliar fields using arXiv. Nightly I try to understand the papers in cs.CY or cs.DL and have no idea which papers to trust and which to ignore. In astronomy we have ADS to help us figure out who to trust and to delve into people’s track records but I don’t know what tool to use in digital humanities or CS.
Nevertheless, the descriptions in some of the previous posts of how scientists use any document, whether published, arXived or blogged, are downright disturbing to me, and I’m hoping people can clarify their approach a bit. Does the little gold star on a Nature paper really change our belief prior by so much that we quote its result at face value? Is this really how we use the literature? Are we being lazy or trapped by TMI or are we buying into a trust model for journal orchestrated peer review that maybe we shouldn’t? It gives me an earworm:
And You May Ask Yourself
How Do I Work This? …
And You May Ask Yourself
Where Does That Highway Go?
And You May Ask Yourself
Am I Right?…Am I Wrong?
And You May Tell Yourself
MY GOD!…WHAT HAVE I DONE?
If I really make this about non-experts — about the general public for example — then I appreciate the point a bit more. It would be awesome to go down that rabbit hole — the public understanding of science via arXiv versus blogging versus journal published results — but maybe someone should write a cover story first so we have some better parameter space to comment in. I am not volunteering.
I also look forward to the day when refereeing includes people’s actual code.
There’s a sub-point to the second pro that I think is very important: posting on astro-ph can bring to the submitter’s attention related work that they didn’t know about… While in some cases this is a sign that the submitter was sloppy about doing an in-depth lit search when writing the intro, there are also papers out there that have sort of fallen off the network, but still are of great relevance. So in this culture where “credit where credit is due” is (at least in principle) a central value, the extra time to have more people see the paper before it’s locked in stone is quite valuable.
On the other hand, sometimes this is a bad thing – e.g., the e-mail I’m sure many of us have gotten that says “My paper from 1986 that showed your main conclusion in footnote 12 of page 14 of the erratum, as long as you turn Figure 2 sidewise and squint just right.” But I think this is a worthwhile price to pay given that it might also catch some important past work you (and the rest of your collaborators/scientific clique) may have missed.
Just my two cents: I agree with comments like ‘if you are confident enough to submit to a journal, you should be confident enough to show the paper to the rest of the world’. Also the early and free refereeing you get from readers of the arXiv can be very valuable. I do not agree with comments like ‘wrong results get posted that will never get published, and people are using wrong information in their work’. First: people can and should update their arXiv posting to the accepted version and indicate what has changed. If the paper has not been published more than a year or so after posting, you should reconsider citing it (as there probably is a reason why acceptance takes so long), and you should make some effort to find the paper on ADS in order to see if the published version somehow did not get linked to the arXiv submission. In all those cases: check the paper to see if what you use is still valid. I think more care has to be taken by people who blindly cite papers, not by people who share their science at an early stage (other than that they have to make sure they update the info they put out).
A question and a comment.
For those who upload to arXiv when they submit papers, how many serious comments do you receive from unfamiliar authors? Following my supervisor’s example (I’m still a PhD student), I only upload after a paper is accepted, but I do circulate the submitted version to a few people outside the author list to get some extra input.
As for why grad students are timid about uploading on submission, I think that being accepted in a journal means there is one quasi-independent person who has declared that the work is of publishable quality. That makes it a bit more defensible when presented to the rest of the world.
Great discussion. I agree with many posters that there’s nothing wrong with posting to arXiv on submission, and that it can carry significant advantages. But that statement comes with important sidenotes. A submission should always be labelled with its publication status so that readers know if contents might change with time (and authors should preferably upload a replacement once changes have been made!). I think that peer review, while usually resulting in a better paper, does not ensure accuracy of the results or of their interpretation. That’s not saying the referee did a bad job, but (s)he is just one person with a limited view and their own personal bias, and that’s simply the reality.
Personally I’m actually more likely to contact an author about an Arxiv posting that is not yet accepted, as there’s the chance that my questions or comments might be addressed in the final paper. If it’s already labelled as “accepted”, well, what’s the point? (unless I really *need* the info, obviously, or I want to say something nice about the paper)
To those concerned about non-final plots or results propagating in the community, I think Marcel’s point is excellent that this is more a problem of blind citation. If you’re citing a result or referring to a plot without having checked on the paper’s status or made sure you understand its content, then you’ve not done a rigorous enough literature search. In any case, we cite unrefereed conference proceedings papers and as far as I know this has not resulted in a breakdown of the scientific process.
The issue of press releases and embargos is interesting, illustrated again this week with the posting to Arxiv of Murray-Clay & Loeb with an embargo label. Journals still insisting on embargos while allowing the paper to be posted to Arxiv are a little naive about the state of the media. As a blogger I’m allowed to write about it, but the press should keep shtum – why penalise the press in that way? And what if I blog on a mainstream media site, like The Guardian, is that allowed?
[NB Ivan Oransky’s Embargo Watch blog is a great read on the topic of embargos in scientific publishing.]
Making peer-reviewed papers freely available on the ArXiv is crucial as published journal costs are out of reach of many scientists. So if it was posted before peer-review, it is critical to update the paper on the ArXiv once it’s published in the journal. Hear, hear for Marcel.
The refereeing process is no guarantee of correctness. It just gives a false sense of security. In today’s world of huge information overload (including vast amounts of refereed papers), buyer beware. I used to advocate arXiv submission after acceptance. I have completely changed my mind. I think that posting should be done at submission and should clearly invite comments. I also wish that the comments on posted papers were public (threaded on the website) rather than e-mailed to the authors. The comments thread would act as a far better guarantee of quality than any refereeing. (I still would keep the refereeing process in place.)
I submitted to arXiv after acceptance by a peer-reviewed journal because, as an independent researcher, I can’t afford the journal’s publication fee.
No one seems to have mentioned the attitudes of the journals to arXiv. Are there journals that might not want to publish a paper if it has already been ‘published’ in arXiv? My own view (from the field of Quantitative Biology) is that in many cases neither editors nor referees are doing their jobs. The editors are probably overwhelmed and the referees are not paid. I could cite recent examples from Nature and PLoS Medicine but that would probably land me in trouble. My own view is that we should consider abandoning the formal journals, which are in any case proliferating at a quite alarming rate, and find a way to encourage feedback on papers published in arXiv. We all learn who is doing interesting work and who is not, and with modern search engines the fact that there may be many rather weak papers in arXiv is not an issue for me.
Do not put your files on Arxiv. I explain why below.
Disadvantages of Arxiv:
Along with the benefits mentioned above, Arxiv has major disadvantages as well:
1- Your article will be published promptly by Arxiv without any review of either the writing or the science. Therefore, it can be expected that most of the texts on Arxiv contain both scientific and writing mistakes.
2- At the same moment your article is published on Arxiv, all the authors of the articles you cite will be notified by email of the publication of your results, which is a problem if those results are later found to be inaccurate or incorrect.
3- One of the problems with posting content on Arxiv is that some journals have no interest in publishing an article you have previously posted there. Of course, the number of such journals is very low, and they usually state in their submission guidelines that they do not accept articles already posted on Arxiv.
4- Since articles are likely to have errors, you will need to correct them, and this is where your main problem with Arxiv starts. In the simplest case, suppose there are a few simple writing or conceptual mistakes in your article. Arxiv will allow you to correct them, but all previous incorrect versions of your article will still remain available to readers, errors and all.
5- Worst of all, suppose that you realize that your results are incorrect overall, so that you would prefer to withdraw your article from the Arxiv website. In this case, Arxiv again allows you to withdraw your article by placing a comment on the article’s page stating the reasons for the withdrawal. But surprisingly, all previous versions of the article that you have admitted to be false are still open to readers. In fact, even though you have acknowledged the mistakes in your results and explicitly stated this, others still have access to the same wrong results on the web.
6- People working in the same area as you may see your results, improve on them, and publish better ones. If they can publish their own results sooner than you, you will no longer be able to publish yours, and you will be left with results that no journal is eager to publish because better ones exist.
7- Putting an article on Arxiv may seem like a convenient way to make it available, but because your article has not been refereed, it is scientifically and academically worthless; an article presented at a conference is more credible and has more value than an article posted on Arxiv.
8- In general, the main issue with Arxiv is that by placing an article on it, you are no longer the owner of that article, and in practice there is no way to delete the erroneous previous versions or to delete the article altogether if you have made a fundamental mistake. This is where you have to say: nothing is free without a reason!
A few suggestions:
1- If you look at the pages of many of the world’s leading academics and experts in your area, you’ll probably find that most of them do not have even one file on Arxiv. They are more interested in publishing their results in valid conferences and strong journals; if Arxiv had a serious advantage, they would have chosen it too. Instead, it is usually novice students and professors who put files on Arxiv. So, putting a file on Arxiv is not worthwhile.
2- Do not post your manuscript to Arxiv until you are completely sure of the accuracy of your results.
3- If your university accepts conference articles, submit your results to valid conferences that do proper review, because presenting a paper at a conference is much more credible than sending it to Arxiv.
4- You should avoid putting your results on Arxiv, because the results that a student obtains in a short period are not so dramatic that he needs to fear for them and entangle himself with the Arxiv system. Also, someone may see your incomplete results, complete them, and publish before you. At the same time, putting your article on Arxiv is always an advantage to others: by posting your unpublished article, you increase the number of citations to the articles of the people whose work you used in your manuscript.
5- Avoid citing Arxiv articles. It is not logical to refer to an article that has not been officially reviewed and whose correctness is still unclear.
Arxiv may be good in some respects, but to reach a desirable structure it needs some changes to its rules and to how it is used, some of which I mention below.
1- It would be better to give users options for how their files are distributed when they upload them, for example, whether the user wants the file shared on other sites or only on Arxiv.
2- It should let users decide, when they modify a file and provide a new one, whether the old version is left behind.
3- If the user decides to withdraw the article, Arxiv should allow the file to be completely deleted from the Arxiv site and its related sites. It is not reasonable for a file that the author himself has declared incorrect to remain on the web and confuse people.
4- By allowing users to remove erroneous previous versions and withdrawn articles, less Arxiv storage would be wasted on invalid data.