This LISA presentation describes some new features of the ADS and gives examples of how to use it beyond basic searching. Use the full text search, affiliation search, and visualization tools to become an ADS power user.
The Astrophysics Data System (ADS) offers several new powerful ways to explore how research data is being used. Our full text index allows searching the full text of more than 6 million articles, including ones that may have access restrictions as well as many older ones which may be historically significant. We continue to improve the breadth and quality of our external linking, providing links to both data and software used in the process of writing articles. We have recently implemented our affiliation search, which gives users the ability to search and filter results by organizations, allowing for better author disambiguation and improving the ability to create institutional bibliographies. In addition, our visualization tools allow a variety of insights into the data which can be used to cluster and display results potentially leading to new scientific conclusions.
The ADS offers a full text index of almost 7 million articles, corresponding to just over 40% of our holdings. The majority of these articles are from the modern literature, including those published on arXiv; however, there are also several hundred thousand articles that are better described as historical literature, typically published before 1970. Figure 1 shows a histogram of these articles.
To search the full text index, use the “full:” qualifier in the ADS search box. This qualifier searches a single index that combines the title, abstract, keywords, body, and acknowledgements. Each of these fields can also be searched individually, which gives users the flexibility to search only a subset of the index if desired. Search results are shown with a snippet of surrounding text for context, both for open access articles and for articles with access restrictions.
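As an illustration, a full text query can be assembled programmatically. The sketch below only builds the query string and a request URL; it does not contact any server. The endpoint and the parameter names `q`, `fl`, and `rows` are assumptions based on the public ADS API and should be checked against its current documentation (a real request also needs an API token). The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed ADS search endpoint; not taken from this paper.
ADS_SEARCH_URL = "https://api.adsabs.harvard.edu/v1/search/query"


def field_query(field: str, text: str) -> str:
    """Return a fielded phrase query such as full:"gravitational lens"."""
    escaped = text.replace('"', '\\"')  # keep embedded quotes inside the phrase
    return f'{field}:"{escaped}"'


def search_url(query: str, rows: int = 10) -> str:
    """Build a request URL; 'q', 'fl' and 'rows' are assumed parameter names."""
    params = {"q": query, "fl": "bibcode,title", "rows": rows}
    return f"{ADS_SEARCH_URL}?{urlencode(params)}"


query = field_query("full", "gravitational lens")
print(query)
print(search_url(query))
```

The same `field_query` helper can target one of the individual fields instead of "full" when only a subset of the index should be searched.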
Providing links to external resources is one of the most powerful features of the ADS. Of our 15.4 million records, 70% have DOI links that let users view the articles at publisher websites. Direct access to PDF full-text documents is provided for approximately 30% of our records. In addition, we have data links for 270,000 records, software links for 9,000 records, and a number of other types of external links, such as those to SIMBAD, NED, and numerous other data archives. These links are provided both in search results lists and in the individual article display. Figure 2 shows the icons displayed in results lists and the typical types of links listed within each.
The ADS currently indexes about 35 bibliographic groups, which allow projects, telescopes, and institutions to maintain bibliographies that can be used as filters or collections in the ADS. To ensure that these bibliographic groups are properly represented in our system, we ask that they:
Be stored in an ADS library
Be regularly maintained
Provide clear criteria for inclusion
Bibliographic group selection can be applied either directly at search time using the search field “bibgroup” or by using the filters in the panel to the left of search results lists.
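A minimal sketch of applying the "bibgroup" field at search time is shown below. The group names "HST" and "Chandra" are illustrative only, the helper names are hypothetical, and the boolean operators are assumed to follow the usual ADS query syntax.

```python
def bibgroup_clause(group: str) -> str:
    """Return a single clause such as bibgroup:"HST"."""
    return f'bibgroup:"{group}"'


def any_bibgroup(groups) -> str:
    """OR several bibgroup clauses together inside parentheses."""
    return "(" + " OR ".join(bibgroup_clause(g) for g in groups) + ")"


# Illustrative group names; combine the group filter with a full text term.
groups = any_bibgroup(["HST", "Chandra"])
query = groups + ' AND full:"exoplanet"'
print(query)
```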
One of the features added to the ADS since the new interface launched in 2015 is the ability to search by institutional affiliation. The 15 million publications in the ADS have more than 35 million combined author affiliations. We have long wanted to have these data in a searchable format, and we introduced a new curated affiliation feature in early 2019. The project involved two steps: matching existing publisher-provided affiliation strings to unique, curated affiliation identifiers and institution strings stored in an internal affiliation database, and constructing a pipeline that matches the publisher-provided affiliation strings in incoming new publications to the appropriate entry in this database.
We currently assign identifiers with parent/child relationships, such as an academic department within a university. A child may have multiple parents, but we do not allow a child to have children of its own. This restriction has required a few adjustments to keep the hierarchy useful. For instance, so that University of California schools can identify departments, we have assigned the schools parent status, even though the “University of California System” should really be the parent level. Likewise, NASA’s Goddard Space Flight Center is at a parent level, as are France’s CNRS institutions, to allow for further subdivision. A schema that allows more complex relationships between institutions is under development in conjunction with work by the ROR Community (https://ror.org).
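The two rules described above (a child may have several parents, and a child may not have children) can be sketched as a small data structure. This is not the ADS implementation; the class name, method names, and the "University A" / "Joint Institute" entries are hypothetical and serve only to illustrate the constraints.

```python
from collections import defaultdict


class AffiliationHierarchy:
    """Two-level hierarchy: parents and children, no grandchildren."""

    def __init__(self):
        self.parents_of = defaultdict(set)   # child  -> set of parents
        self.children_of = defaultdict(set)  # parent -> set of children

    def add_child(self, parent, child):
        # A node that is already a child may not receive children.
        if self.parents_of.get(parent):
            raise ValueError(f"{parent!r} is a child and cannot have children")
        # A node that already has children may not become a child.
        if self.children_of.get(child):
            raise ValueError(f"{child!r} has children and cannot become a child")
        self.parents_of[child].add(parent)
        self.children_of[parent].add(child)


tree = AffiliationHierarchy()
tree.add_child("UCLA", "IGPP")
# A child with two parents is allowed.
tree.add_child("University A", "Joint Institute")
tree.add_child("University B", "Joint Institute")

try:
    tree.add_child("IGPP", "Some Lab")  # would create a grandchild
except ValueError as err:
    print("rejected:", err)
```

Making a campus such as UCLA a parent, rather than a child of a system-wide entry, is what keeps its departments addressable under this two-level rule.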
The two primary ways to search by affiliation are with the fields:
“aff” - searches the raw affiliation string word-by-word
“inst” - searches the canonical institution name listed in our mapping of organizations to identifiers
The advantage of using “inst” is that it returns all variations of an institution that we have been able to map together. In addition, other identifiers (such as ROR and GRID) work with the “inst” search. Our controlled institutional vocabulary is maintained as a GitHub repository, available at https://github.com/adsabs/CanonicalAffiliations.
To search directly for a department with our controlled institutional vocabulary, use a slash (“/”) to specify the parent/child hierarchy, e.g. “inst:UCLA/IGPP”. Alternatively, the Institution filter in the left panel of search results lists also allows filtering from the institution level down to the department level.
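The two affiliation fields and the slash notation can be illustrated with a small query builder. The helper names are hypothetical, and quoting values that contain spaces is an assumption about the query syntax; only the "aff" and "inst" fields and the "inst:UCLA/IGPP" form come from the text above.

```python
from typing import Optional


def fielded(field: str, value: str) -> str:
    """Return field:value, quoting the value only if it contains whitespace."""
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{field}:{value}"


def inst_query(parent: str, child: Optional[str] = None) -> str:
    """Canonical institution search, optionally narrowed to a department."""
    path = f"{parent}/{child}" if child else parent
    return fielded("inst", path)


def aff_query(text: str) -> str:
    """Search of the raw, publisher-provided affiliation string."""
    return fielded("aff", text)


print(inst_query("UCLA"))
print(inst_query("UCLA", "IGPP"))
print(aff_query("Goddard Space Flight Center"))
```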
Visualization options in the ADS provide ways to illustrate relations between authors, papers, and concepts. They are available from the “Explore” pull-down menu. Visualizations include the “paper network,” which clusters search results based on the connections between papers in the citation network. This visualization can also show histograms of the number of papers published over time in each of the detected groups. The “author network” uses the co-authorship information of a list of papers to detect groups of authors and the connections between them. Visualizations are interactive, allowing users to explore the papers associated with each author, collaboration, or topic detected. Visualizations can easily be restricted to a subset of the results and can be reconfigured to group results by number of occurrences, number of citations, or number of downloads. Examples are shown in Figure 3.
Author Network: takes the top 200 most frequently appearing authors within your result set, and displays color-coded clusters of authors representing collaborations based on co-authorship frequency analysis.
Paper Network: clusters groups of papers that share a significant number of references, and assigns keywords to those groups by looking for shared, unique words in their titles.
Concept Cloud: clusters words from the titles and abstracts of your search results, counts their frequencies, and compares them to the same word’s frequency across the entire ADS corpus.
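The idea behind the Concept Cloud, comparing a word's frequency in the search results with its frequency in the whole corpus, can be sketched as follows. This is not the ADS algorithm: the simple ratio used as a score, the `floor` value for unseen words, the function names, and the corpus frequencies in the example are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count lower-cased alphabetic words across a list of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def relative_frequency(texts, corpus_freq, floor=1e-6):
    """Rank words by (frequency in results) / (frequency in corpus)."""
    counts = word_counts(texts)
    total = sum(counts.values())
    if total == 0:
        return []
    scores = {
        word: (n / total) / corpus_freq.get(word, floor)
        for word, n in counts.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up titles and corpus-wide frequencies, for illustration only.
titles = ["Dark matter halos", "Dark matter in galaxies"]
corpus_freq = {"dark": 0.01, "matter": 0.01, "in": 0.05,
               "halos": 0.001, "galaxies": 0.005}

for word, score in relative_frequency(titles, corpus_freq):
    print(f"{word:10s} {score:8.2f}")
```

Words that are common everywhere (such as "in") score low, while words that are unusually frequent in the result set rise to the top.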
Using the ADS beyond simple searching provides several powerful ways to enhance research. A number of resources explain how best to use these features, including our help text, our blogs, and a number of videos on our YouTube channel. A list of those resources is provided below.