New ADS Paper Discovery Tool

This is a guest post by Dr. Tod Lauer, an Astronomer at NOAO. Tod’s research is focused on understanding the evolution and structure of galaxies, near and far. In this post, he introduces us to an excellent new tool provided by the NASA Astrophysics Data System (ADS) for finding papers.

The NASA Astrophysics Data System (ADS) is offering a new search tool, Citation Helper, that finds papers closely connected with an input list of papers. This tool may help you identify “missing” papers that might be considered for citation, or more simply, find papers very closely related to a set of papers that you have selected.

The search works by aggregating all the references in the listed papers plus all the citations to them. The papers in the input list are then removed from the set, which is then sorted by number of hits. The top 10 papers are returned. The tool thus works as a sort of “friends of friends” search: you are finding the most common papers with one degree of separation from your input list. To make this more concrete, if 12 of your papers cite a paper not on your list, and that paper also references 8 other papers on your list, its total score is 20. The papers with the highest scores are returned to you. This tool works both forward and backward. By looking at the references in your list of papers, you will preferentially find older papers, while the use of citations to your list will identify newer papers. Poorly cited papers, or new papers that may have no citations yet, will still be identified if they reference many of the papers on your input list.
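The scoring described above can be sketched in a few lines of Python. This is only an illustration of the counting scheme, not the ADS implementation: the function name `citation_helper` and the two dictionaries `references` and `citations` are hypothetical stand-ins for whatever ADS uses internally, and the order of papers with equal scores is not specified in this post.

```python
from collections import Counter


def citation_helper(input_papers, references, citations, top_n=10):
    """Rank papers that are one link away from the input list.

    references[p] is the set of papers that p cites (hypothetical input).
    citations[p] is the set of papers that cite p (hypothetical input).
    """
    selected = set(input_papers)
    hits = Counter()
    for paper in selected:
        # Backward in time: everything this input paper references.
        hits.update(references.get(paper, ()))
        # Forward in time: everything that cites this input paper.
        hits.update(citations.get(paper, ()))
    # Papers already on the input list are not suggestions.
    for paper in selected:
        hits.pop(paper, None)
    return hits.most_common(top_n)


if __name__ == "__main__":
    # The example from the text: 12 input papers cite X, and X itself
    # references 8 other input papers, so X scores 12 + 8 = 20.
    papers = [f"P{i}" for i in range(1, 21)]
    references = {p: {"X"} for p in papers[:12]}
    citations = {p: {"X"} for p in papers[12:]}
    print(citation_helper(papers, references, citations))
```

A `Counter` keeps the sketch short because each reference or citation link adds exactly one hit to the paper at the other end.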

The tool grew out of a wish to give authors a way to cope with the perennial problem of missing important papers relevant to their field. Referees, for example, could run the reference list of a manuscript that they’ve been asked to review through the tool to see if the authors may have missed a relevant work. This is not to say that the tool is intended to identify works that should be cited. The goal is simply to provide information that can be processed with appropriate judgement. More generally, this tool may allow someone new to a topic to discover key papers, based on an initial handful of references.

I have been very happy with the results of the tool. For example, my search on the reference list of a recent paper that I wrote on the M31 nucleus returned the famed Stratoscope paper, which I had previously considered citing but in the end chose not to, among 9 other papers clearly related to the topic. (That decision inspired a lively discussion on citation etiquette.) In general, while the tool often returned papers that I knew about but chose not to use, it also returned some closely related papers that I did not know about.

Presently the tool is a bit bare-bones. One must input a list of ADS codes by hand. I was able to build a list from an ADS search with the simple use of grep and awk on the HTML returned. Please share better ideas for how to build ADS codes from a reference list in the comments. My hope is that ADS will eventually offer the citation helper as an option on the main query page, as well as offering utilities to build ADS codes from a LaTeX reference list. Try it out and share what you think in the comments!

I would like to thank Edwin Henneken of ADS for developing this functionality.

Comments
  • EB May 28, 2012 @ 13:24

    If you have a bibtex file of references, one way to get the ADS bibcodes is by parsing the “adsurl” field–it’s the identifier at the end of the url.
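EB's suggestion can be sketched in Python as follows. The function name `bibcodes_from_adsurl` is made up for this example, the bibcode in the usage line is a placeholder rather than a real paper, and the percent-decoding step assumes that characters such as “&” may appear URL-encoded in the field; check this against your own `.bib` file.

```python
import re
from urllib.parse import unquote

# Matches an adsurl field written with either braces or double quotes.
ADSURL_FIELD = re.compile(r'adsurl\s*=\s*[{"]([^}"]+)[}"]', re.IGNORECASE)


def bibcodes_from_adsurl(bibtex_text):
    """Return the identifier at the end of each adsurl, in file order."""
    codes = []
    for url in ADSURL_FIELD.findall(bibtex_text):
        # The identifier is the last path component of the URL.
        code = unquote(url.rstrip("/").rsplit("/", 1)[-1])
        if code not in codes:
            codes.append(code)
    return codes


if __name__ == "__main__":
    # Placeholder entry, not a real paper.
    sample = "adsurl = {http://adsabs.harvard.edu/abs/2012ApJ...999..123X},"
    print(bibcodes_from_adsurl(sample))
```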

  • Warrick May 29, 2012 @ 7:56

    For anyone who’s interested, a regular expression that matches Bibcodes is

    [0-9]{4}[A-Za-z].{12}[0-9][A-Z]

    That can be refined a bit but I haven’t had any false positives yet. If you copy your BibTeX entries from ADS, each entry should have the Bibcode at least once in the ADSURL field (as EB says). It might also appear somewhere else, in which case you can pipe through `sort` and `uniq` to get one hit per unique identifier. So, to generate a list of codes from my `.bib` file, I run

    egrep -o "[0-9]{4}[A-Za-z].{12}[0-9][A-Z]" bibliography.bib | sort | uniq

    at the command line.
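For readers who prefer a script to a shell pipeline, the same idea can be sketched in Python using Warrick's pattern unchanged. The function name `unique_bibcodes` is hypothetical, and the sample strings in the usage lines are placeholders rather than real papers.

```python
import re

# Warrick's pattern: a 4-digit year, a letter, any 12 characters,
# a digit, and a final capital letter (19 characters in total).
BIBCODE = re.compile(r"[0-9]{4}[A-Za-z].{12}[0-9][A-Z]")


def unique_bibcodes(text):
    """Return every distinct match of the pattern, sorted."""
    return sorted(set(BIBCODE.findall(text)))


if __name__ == "__main__":
    # Placeholder text; in practice, read it from your own .bib file with
    # open("bibliography.bib", encoding="utf-8").read()
    sample = (
        "@ARTICLE{2012ApJ...999..123X,\n"
        "  adsurl = {http://adsabs.harvard.edu/abs/2012ApJ...999..123X}\n"
        "}\n"
    )
    for code in unique_bibcodes(sample):
        print(code)
```

Like `egrep`, the pattern is applied within single lines, since `.` does not match a newline by default in Python's `re` module.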

  • Edwin Henneken May 29, 2012 @ 8:53

    It was Tod who suggested this tool in the first place; we just translated his idea into code. Giovanni Di Milia made the UI part possible. Speaking of the UI part: very soon you will be able to select records (e.g. in a private library) and press a button (just like the one to send the results to ADS Labs, for example) and the bibcodes will automatically be sent to this new tool.

  • Gleb Jun 2, 2012 @ 9:22

    Citation helper is accessible through the ADS Labs interface (http://labs.adsabs.harvard.edu/). To use it, select papers in the classic ADS interface and press “Export selected articles to ADS Labs”. Then in ADS Labs you can click the drop-down menu “More” and select “Analysis > Citation helper”.

  • Giovanni Di Milia Jun 5, 2012 @ 12:54

    Just a hint:
    to retrieve a list of bibcodes given a list of papers you can:
    (Short version)
    use “Custom format” with code “%R”

    (Long version)
    In a page with a list of bibcodes (any page result of a search or a private library) you can
    – select all the bibcodes you are interested in
    – then in the section “Retrieve the above records in other formats or sort order”, from the first dropdown menu select “Custom format” and then in the input form just below you insert “%R” (without quotes) and then click “Retrieve selected records”

    Anyway, like Edwin said, pretty soon a specific button at the bottom of the page will be available.
