Abstract

To explore whether gender bias has an effect on publication times, we looked at the elapsed time from submission to acceptance, $\Delta$ t, for female and male first authors submitting to the Astrophysical Journal. For the years 1998 and 2018, around 4000 papers were collected and analyzed to determine first author gender and $\Delta$ t. On average, papers with women as first authors take two weeks longer to be accepted for publication than papers with men as first authors. Although we do not believe two weeks is a long enough elapsed time to put female authors at a disadvantage, the difference shows that there are gender discrepancies within astrophysics that need to be addressed. We hope that collecting voluntary demographic data from first authors publishing in astrophysical journals will help resolve biases such as these.

Introduction

Different fields within science, technology, engineering, and mathematics are actively investigating gender disparity in relation to systematic discrimination. Within the field of astrophysics, gender bias does exist and there are claims that it affects important research opportunities. Many studies and anecdotes have indicated that women astronomers are treated less favorably than their male colleagues with respect to education, pay scales, grants, telescope time, promotion, tenure, and receiving awards.

One study, conducted by Jeremy Berg, editor-in-chief of the Science journals, looked at the gender of first authors submitting to the publication Science. Berg found that between 2010 and 2017 only 25 $\pm$ 1% of first authors submitting to Science, across all disciplines, were female. Berg also looked at the elapsed time from submission to acceptance for female and male authors and found that the two were not significantly different for the years 2015 to 2017, but that male authors were favored between 2011 and 2015 [1]. Using this study as inspiration, we decided to look at the time between when a paper was submitted and when it was accepted for publication in astronomical journals for both male and female first authors. In our work, we explore whether women in astrophysics face a noticeable barrier to having their work published in a timely manner.

Methods

For this project, the first author looked at all literature published in the Astrophysical Journal (ApJ) in the years 1998 and 2018. All volumes were published electronically and were available for download from the publisher’s website. The two samples were chosen twenty years apart so that general trends in the field over the last two decades could be observed.

To collect the data for this project, around 4000 papers were individually downloaded from the ApJ website. Once the papers were collected, the first author, in collaboration with colleagues, wrote a computer program to sort through each paper and pull the first author's first name, last name, submission date, and acceptance date. Only first authors, not submitting authors, were used in our data collection, as submitting author data was not easily accessible. From there, a gender-determining application programming interface (API) claiming 94% accuracy was used to predict whether the first name pulled from the paper was male or female. The program, Gender API, follows gender norms related to the relevant country and predicts gender based on historical statistics drawn from populations over time. In the future, we hope to see a voluntary form filled out by publishing authors to offer insight into gender identity and race, so that we do not have to rely on programs to make assumptions. Even so, we felt confident in using this method because our objective was to look specifically at discrepancies between male and female authors, though we recognize that our assumptions may introduce error.
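The extraction step can be illustrated with a minimal sketch. It is not the program used for this project: the function names are hypothetical, and the header format it expects ("Received YYYY Month D; accepted YYYY Month D") is an assumption about how the dates appear in the extracted text. Given such a line, it computes $\Delta$ t in days.

```python
import re
from datetime import date

# Month names mapped to month numbers, e.g. "January" -> 1.
MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# ASSUMPTION: dates appear as "Received 1998 January 5; accepted 1998 March 12".
DATE_LINE = re.compile(
    r"Received\s+(\d{4})\s+([A-Za-z]+)\s+(\d{1,2})\s*;\s*"
    r"accepted\s+(\d{4})\s+([A-Za-z]+)\s+(\d{1,2})",
    re.IGNORECASE,
)


def parse_dates(text):
    """Return (submitted, accepted) as dates, or None if no valid date line is found."""
    match = DATE_LINE.search(text)
    if match is None:
        return None
    y1, m1, d1, y2, m2, d2 = match.groups()
    try:
        submitted = date(int(y1), MONTHS[m1.capitalize()], int(d1))
        accepted = date(int(y2), MONTHS[m2.capitalize()], int(d2))
    except (KeyError, ValueError):
        # Unknown month name or impossible calendar date.
        return None
    return submitted, accepted


def elapsed_days(text):
    """Return the elapsed time from submission to acceptance in days, or None."""
    dates = parse_dates(text)
    if dates is None:
        return None
    submitted, accepted = dates
    return (accepted - submitted).days
```

Papers for which `elapsed_days` returns `None` would need to be checked by hand rather than silently dropped.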

Of the roughly 4000 papers, there were around 1500 for which the program could not identify a gender. This occurred because many authors use initials in lieu of a first name. For those papers, we looked up each author by hand and determined their gender from their first name, website, or photograph. Even so, 74 papers remained for which gender could not be determined. Two factors contributed to the lack of data: either we could not find any other information on the author, or we could not determine the gender from the first name. A prominent example is determining gender from the first names of Chinese authors. It is extremely difficult to tell the gender of an individual from the first name alone when names are often gender neutral. Without personally knowing the author, it was impossible to determine.
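One way to flag the records that need a manual lookup is to test whether the extracted first name consists only of initials. The sketch below is an illustration rather than the procedure actually used; the function name and the pattern are assumptions, and the pattern only recognizes unaccented capital letters.

```python
import re

# One or more capital letters, each optionally followed by a period,
# separated by optional spaces or hyphens: "J.", "J. W.", "J.-P.", "JW".
INITIALS = re.compile(r"^(?:[A-Z]\.?[\s-]*)+$")


def is_initials_only(first_name):
    """Return True if the name carries no full given name, only initials."""
    return bool(INITIALS.match(first_name.strip()))
```

Names flagged this way would skip the automated gender prediction and go straight to the by-hand search.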

Results

Figure 1. Top left: Male and female histograms showing the number of papers binned by elapsed time, $\Delta$ t, in the year 1998. Top right: Normalized histograms used to compare overall shape of female and male histograms. Bottom left: Line plots depicting the histograms in the top left of the figure. Bottom right: Line plots depicting the normalized histograms in the top right of the figure.

Figure 2. Identical to Figure 1 except for the year 2018.

Figures 1 and 2 both show that more male than female first authors published. The percentage of papers with female first authors was 19 $\pm$ 2% in 1998 and 33 $\pm$ 2% in 2018. The increase in female authors could be due to the growing efforts to support and encourage women to pursue STEM-related fields over the last two decades. In 1998, 1183 papers were published, compared with 2823 in 2018. This drastic increase in the number of manuscripts published is likely related to the growing number of astronomers over the last two decades. The distributions for both 1998 and 2018 have tails, showing that some authors have a relatively long elapsed time from submission to acceptance, $\Delta$ t, for their papers. Both histograms begin sloping upward earlier for male authors than for female authors, suggesting that male authors publish faster than their female counterparts. In 1998, there seems to be a larger percentage of papers published after the peak $\Delta$ t. In 2018, this tail is curtailed, likely because electronic submission and acceptance expedite the process. This is also a probable reason why the mean and median $\Delta$ t are lower in 2018 than in 1998.
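The text does not state how the uncertainties on the female-author percentages were obtained. One common choice, shown here purely as an assumption, is the binomial standard error $\sqrt{p(1-p)/n}$, where $p$ is the observed fraction and $n$ the number of papers. The function name and the count of 225 below are hypothetical; the per-gender counts are not given in the text.

```python
import math


def fraction_with_error(k, n):
    """Return (fraction, binomial standard error) for k papers out of n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    return p, math.sqrt(p * (1 - p) / n)


# HYPOTHETICAL count: 225 female-first-author papers out of the 1183 from 1998.
p, se = fraction_with_error(225, 1183)
print(f"{100 * p:.0f}% +/- {100 * se:.1f}% (one standard error)")
```

Under this model the quoted uncertainty depends on how many standard errors are reported and on whether papers with undetermined gender are included in $n$.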

Further observations regarding the shape of the histograms prompted us to examine whether the distributions for males and females had the same shape. The normalized histograms from 1998 and 2018 are used to compare the shapes of the male and female distributions. In 1998, the male distribution seemed to be skewed toward lower $\Delta$ t. In general, there was a large concentration of male papers that were accepted early, in less than 100 days, consistent with the earlier observation that male authors publish faster. In 2018, female papers seemed to be skewed toward higher $\Delta$ t. There are higher peaks in the female histogram as the number of days increases, showing that, proportionally, females take longer to have their papers accepted for publication.
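Normalizing matters because the male sample is much larger than the female sample, so raw counts cannot be compared bin by bin. The following sketch assumes that "normalized" means dividing each bin count by the sample size; the text does not specify the normalization, and the function name, bin width, and sample values are illustrative only.

```python
import math


def normalized_histogram(values, bin_width, max_value):
    """Fraction of values in each bin [i*w, (i+1)*w).

    Values at or above max_value are counted in the last bin.
    """
    if not values:
        raise ValueError("values must be non-empty")
    n_bins = math.ceil(max_value / bin_width)
    counts = [0] * n_bins
    for v in values:
        if v < 0:
            raise ValueError("elapsed times must be non-negative")
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


# HYPOTHETICAL elapsed times in days, with samples of different sizes.
male = [10, 20, 30, 110]
female = [60, 120]
print(normalized_histogram(male, 50, 150))
print(normalized_histogram(female, 50, 150))
```

Because each list of fractions sums to one, the two shapes can be compared directly even though the samples differ in size.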

Figure 3. Mean and median $\Delta$ t for male and female publications in 1998 and 2018.

In Figure 3, there is a notable difference between the median $\Delta$ t for males and females in both 1998 and 2018. The median was examined in addition to the mean $\Delta$ t because it is less sensitive to outliers. However, the median and mean $\Delta$ t alone were not sufficient to determine whether the two samples were inherently different.
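The reason for reporting both statistics can be seen in a small example with made-up numbers: a single paper with a very long review shifts the mean far more than the median. The helper name and the values are hypothetical.

```python
from statistics import mean, median


def summarize(delta_t_days):
    """Return the mean and median elapsed time for a list of days."""
    return {"mean": mean(delta_t_days), "median": median(delta_t_days)}


# HYPOTHETICAL elapsed times in days.
typical = [60, 70, 80, 90, 100]
with_outlier = typical + [700]  # one paper with a very long review

print(summarize(typical))
print(summarize(with_outlier))
```

In this example the outlier more than doubles the mean while moving the median by only a few days, which is why the long tails seen in the histograms make the median a useful companion to the mean.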

To investigate the differences between the subgroups further, statistical methods were used to determine whether there was a meaningful difference between the observed distributions of the two sample groups. A Kolmogorov-Smirnov (KS) test was used to decide whether the female and male distributions come from the same parent distribution. The null hypothesis states that both samples are drawn from the same parent distribution [2]. For each test the first author looked at the p-value, the probability of obtaining a difference between the two distributions at least as large as the one observed if the null hypothesis were true. The p-values found were 0.027 and 0.019 for 1998 and 2018 respectively. Because both fall below the conventional 0.05 significance level, we reject the null hypothesis: it is unlikely that the two distributions come from the same parent distribution. However, although the two distributions were found to differ, the KS test does not explain how or why they differ. Figures 1 and 2 were used to visualize the distributions and to describe reasons why the shapes of the histograms are different. The possible implications of the shape of the distributions are explained in the next section.
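For readers unfamiliar with the test, the two-sample KS statistic is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below is a standard-library illustration, not the code used for this analysis, which the text does not identify. The p-value uses the asymptotic Kolmogorov series with a commonly quoted small-sample correction, which is an assumption here. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        # Empirical CDF: fraction of each sample that is <= x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series.

    ASSUMPTION: uses the effective sample size n_a*n_b/(n_a+n_b) with the
    correction factor sqrt(n) + 0.12 + 0.11/sqrt(n); exact small-sample
    p-values can differ noticeably.
    """
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


# HYPOTHETICAL elapsed times in days for two small groups.
male = [40, 55, 62, 70, 85, 90, 110, 130]
female = [60, 75, 88, 95, 120, 140, 150, 200]
d = ks_statistic(male, female)
print(d, ks_p_value(d, len(male), len(female)))
```

A small p-value from this procedure says only that the gap between the two cumulative distributions is unlikely under the null hypothesis; it carries no information about where or why the distributions differ.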

Discussion

Through the analysis of $\Delta$ t, we found a clear but not overwhelming difference between males and females. On average, it takes about two weeks longer for papers with female lead authors to be accepted than for papers with male lead authors. However, we believe that two weeks, in most cases, is not a large enough time difference to cause concern. Even though we do not believe the offset is large enough to burden female authors, it highlights other issues that must be addressed. Female first authorship has increased dramatically over the last twenty years; however, academic literature is still dominated by male researchers. Another issue we would be interested in examining is the role that nationality or race plays with respect to $\Delta$ t. Looking at females and males as broad subgroups helps us identify patterns within the field of astrophysics, but does not take intersectionality into account. Individuals who identify with more than one minority group, such as women of color, may be discriminated against more than individuals who identify with one minority group.

We have found, as have others, that it is extremely difficult to process this type of data on larger scales. To remedy the issues with misgendering and intersectionality, we recommend that more affluent journals collect voluntary demographic data from the authors publishing their work. We hope that the work we are contributing to the field of astrophysics will help us look at all aspects of academia that are affected by discrimination and encourage others to consider the scope of these issues.

Acknowledgments

The first author would like to thank Gender API for providing a powerful interface for determining gender. Without this tool, it is unlikely that a project of this scope could have been completed. The first author would also like to thank Leigh Yeh and Lasse Nordahl for helping create a program to sort through PDFs and record the crucial information needed for this project. Finally, the first author would like to thank Nathan Whitehorn, Ph.D., for help with and instruction in the proper statistical methods to determine whether the female and male distributions were statistically related. Without the help of these groups and individuals, these conclusions would not exist.

References

[1] Berg, J. 2019. "Examining author gender data." Science, 363(6422): 7.

[2] Pratt, J. W. and Gibbons, J. D. 1981. "Kolmogorov-Smirnov Two-Sample Tests." Concepts of Non-Parametric Theory. J. W. Pratt and J. D. Gibbons, eds, New York, NY: Springer-Verlag.

Correlation of Time from Submission to Acceptance of Astronomical Papers with Gender of Lead Author