Growing up by the mountains of Northern Greece, Hercules (aka Iraklis) Konstantopoulos developed a fascination with the night sky and all its intrigue. After a career as a researcher in astrophysics that spanned ten years and four continents, he became drawn to addressing a greater variety of data-related problems. Data science ensued with work on sustainability and energy management, and now a spot as Sr Data Scientist at Atlassian, Sydney’s most acclaimed home-grown tech shop. There he focuses on converting support tickets and behavioural data into product strategy and business direction, and on creating informative and accessible data visualisation. This post is cross-posted from Dr. Konstantopoulos’ website.

In a past life I was a researcher in Astrophysics. I spent a decade or so in academia, a pretty tough field ruled by the old adage of having to publish or perish.

Publications and citations are the primary quantitative success metrics for a doctoral student or a postdoctoral researcher. As in every system that relies so heavily on metrics, researchers (as a group) tend to game publication and citation rates in order to advance on this very bumpy playing field.

One stereotype surrounding young researchers is that they publish more articles at the cost of quality. Another noticeable effect is the lengthening of author lists in recent years. Opinions are mixed on whether these are inherently negative effects, but there is some consternation expressed, particularly by more established academics, about dilution of quality.

How warranted are these complaints? Can we quantify the change in author lists over time? And can we tease out any effects on the impact of a paper?

To find that out I used the API offered by the NASA Astrophysics Data System (ADS) to fetch the author list of every paper published in Astrophysics since 1900.

How many papers, you ask? Yeah, a bunch… Around one million papers, to be exact.
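
For reference, a query along these lines will pull the required fields. This is only a minimal sketch: it assumes a personal ADS API token, and pagination, rate limits, and error handling are stripped down for illustration.

```python
# Minimal sketch of harvesting author lists from the ADS search API.
# Assumes a personal API token; pagination and error handling are simplified.
import requests

ADS_URL = "https://api.adsabs.harvard.edu/v1/search/query"
TOKEN = "your-ads-api-token"

def fetch_page(start, rows=2000):
    params = {
        "q": "database:astronomy year:1900-2018",
        "fl": "bibcode,year,author,citation_count",
        "rows": rows,
        "start": start,
    }
    resp = requests.get(ADS_URL, params=params,
                        headers={"Authorization": f"Bearer {TOKEN}"})
    resp.raise_for_status()
    return resp.json()["response"]["docs"]

# Walk through the results one page at a time.
records, start = [], 0
while True:
    docs = fetch_page(start)
    if not docs:
        break
    records.extend(docs)
    start += len(docs)
```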

Just how much has traffic increased?

“I can’t keep up with all this literature” has been a commonplace complaint for a while now. Every field is expected to grow over time, but an individual can still only parse so many papers. This focuses researchers into sub-fields, which increases depth but reduces breadth. That isn’t a bad thing; it is just a different mode of operation.

Fundamental change can bring about some existential dread, and Astrophysics is experiencing a bit of that. It is typical for researchers today to follow the daily cadence of article releases in order to keep up only with their subfield. There is living memory, however, of a time when a researcher in a subfield of Physics could keep up with the entirety of Physics.

Let’s have a look at the total number of Astrophysics papers as an indication of the overall size of this field.

That’s quite a picture! We may be observing the Cold War pouring resources into Space Science in the ramp-up through the 1960s. The proliferation of articles in the 1970s is largely due to the advent of the charge-coupled device (CCD). This gadget revolutionised astronomy about 40 years before it transformed photography: it is the miniaturised detector that allows your smartphone to take beautiful, crisp, digital photos.

With this steady increase in publications after five decades of stability, I am curious to find out the centre of mass for this timeline: there will be a year that divides the overall number of articles roughly in two.
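
A minimal sketch of that calculation, assuming the records fetched above have been loaded into a pandas table (column handling is simplified, and rows with missing fields are dropped for brevity):

```python
import pandas as pd

# Turn the fetched ADS records into a table (continuing the sketch above)
df = pd.DataFrame(records).dropna(subset=["year", "author"])
df["year"] = df["year"].astype(int)
df["n_authors"] = df["author"].str.len()

# The "centre of mass": the first year at which the cumulative article
# count crosses half of the overall total.
papers_per_year = df.groupby("year").size().sort_index()
cumulative = papers_per_year.cumsum()
halfway_year = cumulative[cumulative >= cumulative.iloc[-1] / 2].index[0]
print(halfway_year)  # 1996 in the 1900-2018 sample described here
```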

In the timeline of 1900 to 2018 that year is 1996. That is, the last sixth of the timeline accounts for half the activity. I find that wild! It implies that recently minted researchers operate in fundamentally different terrain than those who got their PhDs before 1990, and an entirely different world to those who academically came of age by 1970—the esteemed emeriti of their institutions.

Have author lists appreciably inflated over time?

The short answer is yes.

The median number of authors remained at one for the first ~70 years of this timeline. It climbed to two and stayed there for only 30 years, then lasted at three for a mere 15 years. One might expect it to reach four by 2025. The mean has seen a huge inflation since the turn of the millennium, dragged upward by enormous collaborations in the style of Particle Physics.

Fewer and fewer papers are written by individuals or small groups.


The stretching author lists make it clear that we need to bundle the timeline into a few discrete eras before we can delve into details. In the following chart I aggregate these numbers into a cumulative histogram: each step (bin) includes all the data from the one before it.

Cumulative histograms show what proportion of the whole population we are covering as we sweep in one dimension. In this case it is the proportion of all papers (in each period) that were authored by one, two, three, and so on up to twenty people. The aggregate over all time stops shy of 100% as roughly 5% of papers have more than 20 authors.
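
As a sketch of that aggregation, again using the hypothetical `df` table from above (the era boundaries are illustrative, not necessarily those used in the chart):

```python
import pandas as pd

# Cumulative share of papers with at most k authors, per era
# (era boundaries are illustrative)
eras = pd.cut(df["year"], bins=[1900, 1970, 1990, 2005, 2018],
              labels=["1900-1970", "1970-1990", "1990-2005", "2005-2018"],
              include_lowest=True)

cum_share = pd.DataFrame({
    str(era): [(grp["n_authors"] <= k).mean() for k in range(1, 21)]
    for era, grp in df.groupby(eras)
}, index=pd.Index(range(1, 21), name="max_authors"))
```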

The breakdown over eras shows a staggering variation, and a monotonic evolution: fewer and fewer papers are written by individuals or small groups of authors.

We can slice this a different way to more specifically visualise the length of author lists as it changes over time. First we map the number of articles with N authors over time.

Read the above in horizontal bands: e.g., between 1970 and 1995 there were 5000 papers a year with a single author.

Second we have the proportion of articles with N authors—read this in vertical bands.
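
Both charts can be derived from one pivot of the same hypothetical table: raw counts per year for the first, and a per-year normalisation for the second. A minimal sketch:

```python
import pandas as pd

# Papers with N authors per year (horizontal bands), capped at 20+,
# and the same table normalised within each year (vertical bands)
counts = pd.crosstab(df["n_authors"].clip(upper=20), df["year"])
proportions = counts / counts.sum(axis=0)
```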

From 1900 all the way to 1970, single-author papers represented over 50% of the whole for the year. After 1970 there is no such dominant mode; papers are spread thinly across author-list lengths.

Impact factors

Articles are published in journals and they are cited by other researchers. Not all journals are considered equal: the prestige of a journal is determined by its impact factor, roughly the number of citations its articles receive on average. One such definition of impact is CiteScore. The next chart shows the lay of the land for journals where Astrophysics papers are routinely published.
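
CiteScore itself is computed by Scopus over a rolling window of recent years. A rough stand-in that can be derived from the ADS sample alone is the mean citation count of each journal's articles; this is only a proxy, and the `journal` column below is assumed to come from the ADS `pub` field (not fetched in the earlier sketch):

```python
# Rough impact proxy: mean citations per article by journal
# (not the official CiteScore, which uses a rolling multi-year window).
# 'journal' is assumed to hold the publication name from the ADS 'pub' field.
impact_proxy = (df.groupby("journal")["citation_count"]
                  .agg(n_articles="size", mean_citations="mean")
                  .sort_values("mean_citations", ascending=False))
```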

There are some high-profile journals and a bulk of lower-impact publications. Annual Reviews is the runaway champion here. I do not include Nature and Science, the world’s premier multidisciplinary science journals, because I want to compare Astronomy journals consistently among their peers.

Going back to the perception that quality and quantity are at odds, one immediate observation is that prestigious journals publish just as many articles as the rest. In fact, the three journals that publish the most are in the top quartile by CiteScore.

Of course we are covering a huge timeline here. Have astronomers’ attitudes toward citation changed? Yeah, leading question:

The slump after the turn of the millennium illustrates a delay effect: it takes a long time to rack up more than a couple dozen citations.

Contemporary astronomers publish more and cite others more often.


Again, if quantity dilutes quality, articles in these prestigious journals should have a smaller median author list and fewer papers with many authors. That is not the case:

Median values            N authors   CiteScore   Article count
Top quartile                     3        4.09            9967
Bottom three quartiles           2        1.23             422

Over all time, top-tier journals publish more papers with longer author lists.

Let’s look at trends over time. What happens to our previous plots if we restrict to high-impact publications?

While we do get a higher proportion of single-author papers in high-impact journals, the overall distribution extends to a longer tail of many authors.

In the breakdown over time we see stronger clustering than in the overall population, where we saw strong divergence. But the trend over time toward more authors is still strong.

Let’s summarise this part. Top-tier journals:

  • feature a higher proportion of single-author papers
  • publish more articles by an order of magnitude
  • feature significantly longer author lists (P≈0.0)
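
The post doesn't name the test behind that P-value. One plausible way to obtain such a number is a one-sided Mann-Whitney U comparison of author-list lengths between the two journal tiers, sketched here with a hypothetical `top_quartile` flag on the table:

```python
from scipy.stats import mannwhitneyu

# Hypothetical boolean column marking papers in top-quartile journals by CiteScore
top = df.loc[df["top_quartile"], "n_authors"]
rest = df.loc[~df["top_quartile"], "n_authors"]

# One-sided test: are author lists in top-quartile journals longer?
stat, p_value = mannwhitneyu(top, rest, alternative="greater")
```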

So, what have we learned? If we accept that impact factors track quality, we can reject the hypothesis that quantity dilutes quality.

Authorship as a decider for citation count

So far this is clear as mud… Up is down, good is bad, and your grandmother’s cooking is kinda meh.

What if we get rid of the middle man? We are trying to relate the length of an author list to the impact of the article. How do the two quantities plot against each other?

At first glance we seem to have an exponential drop-off in citations as the author lists become longer. Scaling both axes by their logarithms we can get a clearer picture, which is what we have on the right. Nothing jumps out of this picture, save perhaps a shallow decline of the citation rate with increasing author-list length.

Let’s just throw these into a regression model and see what happens. The simplest model, N(citations) ~ β * N(authors), should not be expected to perform very well given the exponential relation, but I’ll show it here for reference. The R-squared value suggests that we explain about 0.3% of the variance this way. We get a y-intercept at 20 citations and a coefficient β=0.2. This means the typical paper gets a very sporting start with 20 citations and gains roughly one citation for every five added authors.

A log-log model is more appropriate: log N(citations) ~ β * log N(authors). Yet this only explains 2% of the variance. We get a (log10) intercept of 0.91, that is 8 citations, with 1.8 citations added for each 10 authors. So the slope is shallower when we apply a more fitting relationship.
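
For concreteness, here is a sketch of both fits using statsmodels on the hypothetical table from the earlier sketches; the post doesn't specify the tooling, so treat this as one way to reproduce the numbers rather than the original code:

```python
import numpy as np
import statsmodels.api as sm

# Linear model: N(citations) ~ beta * N(authors)
X = sm.add_constant(df["n_authors"])
linear = sm.OLS(df["citation_count"], X).fit()

# Log-log model: log N(citations) ~ beta * log N(authors)
# (zero-citation papers are dropped before taking logs)
cited = df[df["citation_count"] > 0]
X_log = sm.add_constant(np.log10(cited["n_authors"]))
loglog = sm.OLS(np.log10(cited["citation_count"]), X_log).fit()

print(linear.rsquared, loglog.rsquared)  # ~0.003 and ~0.02 as quoted above
```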

The number of authors has no predictive power over citations.


While the trend is positive, the regression models state that the number of authors is of no use as a predictor of citation count. This persists if we restrict the fit to past timelines, e.g., papers published before 1950. We did learn above that the number of authors has been inflating recently, and it stands to reason that it takes a while for a paper to gain popularity and rake in those citations.

Alas, we have no timeline of citations for individual papers so we cannot test this. But we can caution against over-interpreting the apparent decline toward the right of the log-log plot: many massively multi-author papers are recent and have not had the time to get those cites.

Great, there are more papers. So what?

Publications do not cover the full gamut of research activity. As an example, I spent most of my last postdoc helping to build an instrument for the Anglo-Australian Telescope (AAT), a wonderful facility near Warrumbungle National Park in New South Wales, Australia. This instrument was used to conduct a large survey and hugely benefited the global science community.

That’s great, right? Well… that building activity dampened my publication rate from one or two first-author papers per year to about one in two years. Oops ¯\_(ツ)_/¯

How do we ensure that young researchers are not disadvantaged by doing work of great community value through these massive projects? We reward them with publications. Yay! How? We grant them inclusion on every paper the project produces. This is a reasonable way to restore balance to the CVs of these individuals, but it brings us back to the effect of publication list inflation.

The way I have perceived this disgruntlement over the years, it all hinges on the number of publications and the number of authors. The feeling is that papers are devalued by long author lists, and that liberal inclusion of coauthors is harmful to the research field of Astrophysics.

We have learned that there is no link between the length of the author list and the number of citations. Does this mean we can look past this now?


Susan Mullally is a Senior Archive Scientist for the MAST at STScI. She is an astronomer who likes working with time series data and improving the reliability of exoplanet catalogs.

The Mikulski Archive for Space Telescopes (MAST) at STScI introduces a new way to explore the archive’s exoplanet data: exo.MAST. For confirmed planets and planet candidates, the web interface brings together planetary parameters with a filterable list of the data products held by MAST and visualizations of those data.

The user starts with a single search bar that autocompletes to the confirmed exoplanet, Kepler Object of Interest, or TESS threshold crossing event (TCE) as the user types.

The result is a targeted cone search of MAST data at the coordinates of the planet’s host star, displayed at the bottom of the screen. Star, planet, and orbital properties are shown at the top-left and various data visualizations are shown on the top-right. For confirmed planets, users can choose to see this catalog data from either the exoplanets.org or the NASA exoplanet archive.

On the right, there is a choice of several visualizations of the data. By default you are shown the data coverage of the MAST-held observations, phase-folded at the orbital period of the planet, if known. Selecting observations on this plot will filter the MAST holdings in the bottom panel.

MAST is the archive for Kepler, K2, and the latest planet hunting mission, the Transiting Exoplanet Survey Satellite (TESS). For all transit signals found by the TESS or Kepler pipeline, the mission provides the detrended light curve and a fit to the transit signal. Exo.MAST takes these data and gives scalable and zoomable plots of the most recent detrended light curve and transit model. These plots can also be obtained via API so that they can be embedded into your web page. For example, here is the folded light curve for WASP-18 b, a Jupiter-sized planet observed by TESS.

The related links tab provides a link to ADS with a pre-populated search for the planet and the host star, to help users dig deeper into the scientific literature. For TESS and Kepler exoplanets, this tab will also link to the reports used by those missions to review the transit signal before deciding if it’s a candidate or confirmed planet.

For exoplanets that have atmospheric characterization measurements, exo.MAST gives you access to the Space Telescope Archive of Transiting Exoplanet Spectra (STATES), which is a database of published transmission and emission spectra started by Dr. Hannah Wakeford. The data are plotted and are available for direct download. The STATES measurements are linked to the source paper and to the observations hosted within the archive so users can directly download and analyze the data themselves.

In the near future, MAST will be hosting the exoplanet characterization data taken by the James Webb Space Telescope (JWST). exo.MAST plans to integrate tools from the exoplanet characterization toolkit (exoCTK) to help users plan their exoplanet observations with JWST more easily.

Have questions or features you would like to see? What data visualizations would be most valuable to you? Comment below or e-mail us at archive@stsci.edu.


The partial shutdown of the United States government is having far-reaching effects on US-based astronomers. The American Astronomical Society (AAS) sent an action alert to its members today.

“Many federal agencies, including NSF, NASA, and Smithsonian, have been shut down to all but essential operations for 33 days and counting… Hundreds of AAS members are missing paychecks, and the number is growing, including contractors and early-career postdoctoral researchers at NASA and NSF. University and grant-supported AAS members face longer-term uncertainty as NASA grant deadlines and NSF proposal reviews are postponed indefinitely. National observing facilities are preparing to cease operations entirely.”

To provide information and resources for astronomers during the shutdown, the AAS has created a “Shutdown Central” page. This page gives information about financial support, deadlines, and news coverage.

If you have further information to add to the AAS Shutdown Central site, please email shutdown2019@aas.org.


Our guest post today is by Dr. Sarah Gallagher and details important guidelines for hosting effective remote meetings. Dr. Gallagher (@scgQuasar) is the Science Advisor to the President of the Canadian Space Agency and an Associate Professor in the Department of Physics and Astronomy at the University of Western Ontario in Canada.

Research groups and committees with members from different institutions or offices are meeting via videoconference increasingly often. Remote meetings save time and money and have a smaller carbon footprint, but if done poorly, videoconferences are significantly less effective than face-to-face meetings. In particular, remote members who aren’t able to hear what’s happening or follow the visuals can become exhausted and frustrated, which inhibits engaged, informed participation. Below are some best practices to set up videoconferences as well as recommendations for interaction protocols to retain the benefits of the in-person experience as much as possible.

The recommendations below assume a common setup where an institution (the host) has an in situ group of several people (3 to 15) and additional members are connecting remotely via laptop and/or telephone.

1. Use good software and invest in the professional version

My favourite is Zoom. I’ve had good experiences with Bluejeans, and both good and bad experiences with Webex. Do not use Skype: it’s not reliable enough. The software should support screen sharing, computer audio and video, and telephone call-ins (preferably with a toll-free number). The host should have a wired connection and be aware that remote participants may not. All participants must first give their permission if the meeting will be recorded.

2. Build in communication redundancy

Assume that someone will encounter a problem accessing the video-conference system at some point and therefore will need a backup way of communicating, such as texting. To minimize delays, distribute a list of each participant’s email address and cell number in advance so there are alternate ways to communicate within the group. This is particularly useful for remote participants to check in with each other if one of them loses audio or screen sharing becomes too low resolution to read. Live, collaborative note-taking is also useful, particularly if a document needs to be produced following the meeting. My favourite is Google Docs because documents can be accessed directly by participants with a browser and therefore don’t require the videoconferencing software to share.

3. Set up the host conference room with the remote participants in mind

Ideally, a dedicated videoconferencing room should be set up with a camera that can see the whole room and a quality microphone that will clearly pick up the audio from the in situ participants. Quality external speakers are also required. (A large in situ group requires more sophisticated audio equipment such as distributed microphones in the conference room that can be individually muted.) If you have limited resources, invest in the audio equipment first. A camera located close to a large screen with the video feed and any visual presentations will create the most natural experience for the remote participants. A single laptop in front of one of the in situ participants or located in a corner of a conference table DOES NOT WORK, and creates a horrible experience for the remote participants. They generally can’t hear anyone except the person in front of the laptop, and they can’t see most of the in situ participants.

4. Consider time zones when scheduling the meeting

A meeting with participants in three to five adjacent time zones can be scheduled within typical business hours for all parties, between noon and 5:30 p.m. in the easternmost time zone. A meeting with participants in time zones further apart (e.g., Beijing, Toronto, and Paris) means someone will be calling in outside typical business hours. While many people don’t have too many conflicts at 2 a.m. (other than sleeping), insufficient sleep can ruin productivity the following day, and is an imposition on your participants. If this has to happen, then the pain should be distributed so that the people in a particular time zone are not the ones who are always calling in at awkward hours.

5. Schedule regular breaks and keep to the schedule

Remote participants are likely busy with other responsibilities and have only allocated the time scheduled for the meeting to participate in it. The host is not providing them with snack breaks or lunch, but should respect that they need both and stick to the schedule. It can be physically uncomfortable to sit in a chair and stare at a screen for hours. Breaks should be scheduled at most every 90 minutes to allow people to stretch their legs. The videoconferencing system should also be left on during breaks so remote participants can join in with casual conversations if they so choose.

6. Minimize presentations to maximize discussion time

The value of having everyone in one (virtual) place at the same time is the opportunity to develop relationships and discuss the issues at hand. I’ve participated in many advisory meetings where we were bombarded by hours of overly detailed and often redundant presentations because the presentations were not coordinated and the presenters all went over their allotted time because of ineffectual moderating. All visuals and supplementary material should be provided to the participants at least several days in advance. This allows remote participants to use the downloaded slides in case there is a problem with the screen sharing video feed (e.g., it cuts out or the resolution is too low), and also allows participants to review materials in advance so less time can be spent presenting them. Slides should all be numbered with the full name, title, and contact information of the presenter included. Twenty minutes should be considered the maximum time allowed for a presentation. Presentations can be shortened by using a reduced set of big-picture summary slides and including backup slides with further details that can be shown for questions and discussion as needed.

7. Be respectful of the newbies

Invariably some members of the committee will be new. This means they don’t know everyone already, can’t recognize people’s voices when they start talking, and they have no idea who “John” is (and there will likely be two Johns). These are important reasons why videoconferencing works better than teleconferencing for remote meetings. Start with a round of introductions and when each person speaks (at least for the first few times), they should state their name and affiliation. Refer to people not present in the meeting by both first and last names with a frame of reference (e.g., Janelle Monet, Director of Programs), and minimize the use of acronyms without definition. The goal is to bring everyone up to speed quickly with sufficient context to participate fully in the discussion.

8. Moderate the meeting and establish videoconference etiquette at the outset

In general, it works better if the Chair (who may be remote) is not also managing the logistics of the meeting. An in situ moderator should be appointed to set up the meeting and manage the connections, etc. Testing connections prior to the start of the meeting is a good idea, particularly for people using the system for the first time. At the start of the meeting, the Chair or moderator should state the expected etiquette, including the recommendations in point seven. In addition, remote participants should mute themselves when they are not talking to minimize background noise: even paper shuffling and typing can be distracting. Headphones help reduce echoes. The in situ moderator should monitor problems such as audio dropping out or poor camera sightlines. It’s useful to have private chats enabled so remote participants can inform the moderator when there are problems without disrupting the meeting. For larger meetings that require a more formal structure, participants can virtually “raise their hands” and the moderator can call on people in order. This gives everyone an opportunity to contribute without interrupting each other; it can be particularly difficult for remote participants to cut in and be heard.

These best practice guidelines were specifically inspired and informed by my experience on several advisory committees that evaluated the performance of and made recommendations to institutions that serve scientist user communities. Poor videoconferencing implementation has a negative impact on the experience of the participants and can make serving on such committees inefficient and frustrating. Institutions that respect the time of members (who are generally experts volunteering their effort) and optimize their experience will obtain the best advice.


Resources for the AAS Winter Meeting #AAS233

by Joanna Bridge January 4, 2019

It’s that time of year again: The 233rd Meeting of the American Astronomical Society is nearly upon us. To make the most of your time at the meeting, we at AstroBetter would like to remind you of some resources available on the blog and wiki. First, the post that everyone attending a winter AAS meeting […]


An American perspective of astro graduate school outside the US II: PhD

by Guest December 17, 2018

This is the second of two guest posts contributed by Dr. Abbie Stevens, who completed her Masters at the University of Alberta in Canada and her Doctorate at the Universiteit van Amsterdam in the Netherlands. She is now an NSF Astronomy & Astrophysics postdoctoral fellow at Michigan State University and the University of Michigan. In […]


A Workshop That’s All Coffee Breaks: Astro Hack Week

by Guest November 26, 2018

Daniela Huppenkothen is the Associate Director at the Institute for Data-Intensive Research in Astrophysics and Cosmology (DIRAC) at the University of Washington and a Data Science Fellow at the University of Washington’s eScience Institute, where she works on astrostatistics for astronomical time series, and is interested in everything from asteroids to black holes. She is […]


An American perspective of astro graduate school outside the US I: Masters

by Guest November 19, 2018

This post is the first of two guest posts contributed by Dr. Abbie Stevens, who completed her Masters at the University of Alberta in Canada and her Doctorate at the Universiteit van Amsterdam in the Netherlands. She is now an NSF Astronomy & Astrophysics postdoctoral fellow at Michigan State University and the University of Michigan. […]


Citations to Astronomy Journals 2: Ranking the Journals [Cross-post]

by Joanna Bridge October 22, 2018

This article is the second of a three-part series that is cross-posted from the ADS blog. In this series, the ADS team has performed an analysis of citations to astronomy journals over the last 20 years. This post is written by ADS project scientist Michael J. Kurtz and Edwin Henneken, who works on the ADS system […]


Predatory Publishers in Astronomy and How to Identify Them

by Guest October 13, 2018

Our guest post today is from Dr. Michael Brown of Monash University, Australia, where he studies the evolution of active galactic nuclei and the growth of galaxies over cosmic time. He has written several articles on the topic of predatory publishers and conferences. Have you checked your spam folder recently? A decade ago it may […]
