Presentation #313.04 in the session Highlighting Open Science Success Stories and Challenges from Researchers and Community.
Over the past couple of decades, astronomy has become much more dependent on archival data. Thanks to all-sky surveys such as 2MASS, WISE, Gaia, TESS and soon Rubin, an entire observational PhD thesis can be completed without collecting new data. However, many of the archives tasked with delivering astronomical data to the community have not advanced much in their design since the last century. Data is spread over multiple platforms and is siloed by catalog or mission rather than by object. There are no centralized archives for data products such as stellar spectra, and stellar multiplicity catalogs have not evolved much past text files. Extracting data from publications is tedious, as not all tables are accessible on CDS and there are no enforced formatting standards for published data. There are also no standards for published object names, making cross-checking tables a task often conducted by hand, object by object. All these issues combine to act as a barrier between those who want to use the data and the institutions that store and distribute it. With multiple initiatives aimed at providing an “open data” environment, funded entities promise to make their data available via a website, but with no guidance or requirements for making it usable by someone with no insider knowledge of how it was obtained. When these barriers prevent amateur, citizen, community college or K-12 scientists from participating in new discoveries, this becomes an equity, diversity, and inclusion issue as well.
Over the past seven years I have been assembling data and developing the web application for the Starchive. It contains over 30,000 stars and brown dwarfs, 270,000 fluxes and photometric measurements, 140,000 coordinates, 1.4 million stellar parameters and 22,000 references. I am still in the process of adding stellar parameters, spectra, light curves, radial velocity (RV) time series and high-contrast images. It’s a significant undertaking that will serve research programs across multiple disciplines. During the tedious and time-consuming construction of the database content, I have often wondered why such a resource does not already exist. SIMBAD/CDS is useful for looking up stellar coordinates and some parameters, but I still sift through multiple VizieR-stored catalogs and publications to address my science investigations. In my talk I will discuss how my experiences building the Starchive database and web app have influenced my views of the types of archives, software resources and personnel we as a community need to make astronomy accessible to all for the next decade and beyond.