Presentation #140.02 in the session “Spectro-Timing Analysis Methods”.
Developing effective machine learning training labels for classifying new time-domain astronomy data can be a daunting task, particularly for a class that has many representations. The task becomes harder when only a small portion of the data contains the desired class. Using statistical analysis to pre-classify a portion of the data can lessen the burden of manual classification. Using TESS data as an example, we applied a binary statistical classification to ~500,000 light curves to determine whether or not each light curve is that of an eclipsing binary (EB). After the first round of statistical classification, we restricted our inspection to all observations of any object that had at least one observation classified as an EB. This gave us 6,000 light curves to classify manually, which resulted in a balanced label set that included a variety of EBs and non-EBs, including non-EBs that were able to trick the statistical classifier.
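
The selection step described above (keep every observation of any object with at least one EB-flagged observation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record layout, the field names `object_id`, `sector`, and `flagged_eb`, and the function `select_for_manual_review` are hypothetical and chosen only for illustration. The statistical classifier itself is assumed to have already produced the flags.

```python
def select_for_manual_review(observations):
    """Return all observations of any object that has at least one
    observation flagged as an EB by the (assumed) statistical classifier."""
    # Objects with at least one flagged observation.
    flagged_objects = {
        obs["object_id"] for obs in observations if obs["flagged_eb"]
    }
    # Keep every observation of those objects, flagged or not, so the
    # review set also contains non-EB-looking curves of candidate objects.
    return [obs for obs in observations if obs["object_id"] in flagged_objects]


# Toy records; the IDs, sectors, and flags are made up for illustration.
observations = [
    {"object_id": "A", "sector": 1, "flagged_eb": True},
    {"object_id": "A", "sector": 2, "flagged_eb": False},
    {"object_id": "B", "sector": 1, "flagged_eb": False},
    {"object_id": "C", "sector": 3, "flagged_eb": True},
]

review_set = select_for_manual_review(observations)
```

In this toy input, both observations of object A are kept even though only one was flagged, while object B is dropped entirely. Keeping the unflagged observations of flagged objects is what lets the manual pass collect both confirmed EBs and curves that misled the statistical step.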