Presentation #309.04 in the session AGN, QSOs, and Galactic Evolution.
We assemble the largest CIV absorption line catalog to date, leveraging machine learning to remove the need for visual inspection. For each absorption system within a quasar spectrum, we also provide a probability that quantifies how reliable the detection is. We use Gaussian processes to train a quasar continuum model for detecting CIV absorbers. Our training set is a subsample of DR7 spectra that had no detectable CIV absorption in the previous largest (visually inspected) CIV absorption catalog. We use Bayesian model selection to decide between our continuum model and our absorption-line models. Our catalog provides maximum a posteriori values and credible intervals for CIV redshift, column density, and Doppler parameter. We validated our pipeline and obtained a classification score of 87%. We find good purity and completeness values, both ∼ 80%, when a probability of ∼ 95% is used as the threshold. Our pipeline yields CIV redshifts and rest equivalent widths similar to those in our training set. Applying our algorithm to 185,425 selected quasar spectra from SDSS DR12, we produce a catalog of 113,775 CIV doublets with at least 95% probability. We detect CIV absorption systems in a redshift range of 1.37–5.1, including 33 systems at a redshift larger than 5 and 549 absorber systems with more than 95% reliability and a rest equivalent width greater than 2 Å. Our catalog can guide high-resolution follow-up observations and may be cross-matched with galaxy catalogs or other absorption catalogs to investigate the properties of the circumgalactic medium.
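As a rough illustration of the two statistical steps mentioned above, the sketch below turns model log evidences into posterior model probabilities (Bayesian model selection) and then computes purity and completeness at a probability threshold. It is not the pipeline used for the catalog: the function names, the equal model priors, the log-evidence values, and the toy labels are all assumptions chosen for the example.

```python
import math


def model_posteriors(log_evidences, log_priors):
    """Posterior probability of each model given log evidences and log priors.

    Subtracting the largest log value before exponentiating avoids underflow.
    """
    log_joint = [le + lp for le, lp in zip(log_evidences, log_priors)]
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


def purity_completeness(probabilities, is_true_absorber, threshold=0.95):
    """Purity and completeness of detections flagged at or above `threshold`."""
    flagged = [p >= threshold for p in probabilities]
    true_pos = sum(1 for f, t in zip(flagged, is_true_absorber) if f and t)
    n_flagged = sum(flagged)
    n_true = sum(1 for t in is_true_absorber if t)
    purity = true_pos / n_flagged if n_flagged else float("nan")
    completeness = true_pos / n_true if n_true else float("nan")
    return purity, completeness


# Hypothetical log evidences: [continuum only, continuum + CIV doublet].
log_evidences = [-1050.0, -1042.0]
log_priors = [math.log(0.5), math.log(0.5)]  # assumed equal model priors
posteriors = model_posteriors(log_evidences, log_priors)
p_absorber = posteriors[1]

# Toy validation set: absorber probabilities and hypothetical truth labels.
probs = [0.99, 0.97, 0.50, 0.96]
truth = [True, False, True, True]
purity, completeness = purity_completeness(probs, truth, threshold=0.95)
```

In this toy setup the absorption model is preferred whenever its posterior exceeds the chosen threshold; raising the threshold generally trades completeness for purity.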