# Investigating Quasar Emission-Line Variance Using a Variational Autoencoder

Published on Jun 01, 2020

Quasars (QSOs) are not perfectly understood despite decades of study. Quasar spectra are quite diverse: while most show the same emission lines, those lines do not always appear in the same ratios or with the same shapes. Characterizing the variations in the appearance of emission lines is important for understanding the physical processes that drive them. Empirical studies of large samples of quasar spectra can clarify the most significant differences among objects and the physical processes associated with them. One statistical technique commonly applied to large datasets is principal component analysis (PCA), which identifies the directions of greatest variance in a dataset and can be used to reduce its dimensionality. PCA is inherently limited by the fact that it is a linear analysis. Quasar spectra show significant variation in the widths of their emission lines, and changing a line's width is a nonlinear transformation of the spectrum. An alternative to PCA is an autoencoder, which can function like a nonlinear generalization of PCA. An autoencoder consists of two neural networks: an encoder, which maps each input into a low-dimensional latent space, and a decoder, which maps points in the latent space back to the input space. By training the model to condense and then reconstruct its inputs, the autoencoder can learn an efficient and informative lower-dimensional representation of the input space in a nonlinear way. We study a sample of quasar spectra using an autoencoder variant called a variational autoencoder (VAE), which combines the autoencoder with Bayesian inference by modeling the latent space as a probability distribution. Drawing from the latent probability distributions allows the VAE to function as a generative model, with potential applications to spectral synthesis models like SimBAL (Leighly et al., 2018). We collect a sample of rest-frame ultraviolet quasar spectra from the SDSS DR14 Quasar Catalog, removing all but the emission-line features. We study this sample using PCA and VAE analysis, comparing how well each method characterizes and accurately reconstructs the variations in the dataset. We find that the VAE is not significantly more accurate than PCA at reconstructing the spectral features of this data, potentially suggesting that the most significant sources of variance in these spectra are more linear than not.
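
As a rough illustration of the linearity limitation described in the abstract, the sketch below applies PCA to synthetic Gaussian line profiles. It does not use the SDSS spectra or the authors' pipeline: the wavelength grid, the amplitude and width ranges, and the helper names `gaussian_line` and `pca_reconstruction_error` are all assumptions made for this example. It compares how well a single principal component reconstructs a set of lines that differ only in amplitude (a linear change) with a set that differ only in width (a nonlinear change).

```python
import numpy as np


def pca_reconstruction_error(X, n_components):
    """Relative RMS error after keeping only the leading principal components.

    X has shape (n_spectra, n_pixels). The error is normalized by the RMS of
    the mean-subtracted data, so 0 means perfect reconstruction.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components]
    X_hat = (Xc @ V.T) @ V + mean
    return np.sqrt(np.mean((X - X_hat) ** 2)) / np.sqrt(np.mean(Xc ** 2))


rng = np.random.default_rng(0)
# Hypothetical wavelength offsets from line center (arbitrary units).
wave = np.linspace(-50.0, 50.0, 400)


def gaussian_line(amplitude, sigma):
    """A single Gaussian emission-line profile on the `wave` grid."""
    return amplitude * np.exp(-0.5 * (wave / sigma) ** 2)


n_spectra = 200
amplitudes = rng.uniform(0.5, 2.0, n_spectra)   # assumed range
sigmas = rng.uniform(3.0, 12.0, n_spectra)      # assumed range

# Dataset 1: only the line strength varies (a linear change).
amp_only = np.array([gaussian_line(a, 6.0) for a in amplitudes])
# Dataset 2: only the line width varies (a nonlinear change).
width_only = np.array([gaussian_line(1.0, s) for s in sigmas])

err_amp = pca_reconstruction_error(amp_only, 1)
err_width = pca_reconstruction_error(width_only, 1)
err_width_3 = pca_reconstruction_error(width_only, 3)

print(f"amplitude-only, 1 component: {err_amp:.2e}")
print(f"width-only,     1 component: {err_width:.2e}")
print(f"width-only,     3 components: {err_width_3:.2e}")
```

In this toy setup, one component reconstructs the amplitude-only set to numerical precision, while the width-only set needs several components even though it also has just one underlying parameter. That gap is the kind of inefficiency a nonlinear encoder could in principle remove. It says nothing about how much of the variance in real quasar spectra is of this kind, which is the question the study itself addresses.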