Presentation #123.02 in the session “Solar Physics Division (SPD): Analysis Tools and Solar Wind”.
Machine learning can be an efficient approach to discovering patterns in large datasets. Supervised learning techniques often surpass unsupervised approaches at classification tasks on complex data. However, labeling large datasets is a time-consuming process. In this study, we show that a convolutional neural network (CNN), trained on crudely labeled time sequences of astronomical images, can be leveraged to improve the quality of data labeling in a time-efficient manner that minimizes human intervention. Furthermore, a CNN trained to determine whether an event takes place within an image sequence can be repurposed, without changes, to determine the time at which the event occurs. We use SoHO/MDI videos of the solar photospheric magnetic field, approximately labeled into two classes: emergence or non-emergence of large bipolar magnetic regions. The complex interaction of solar magnetic elements often limits the ability of conventional image-processing techniques to identify this emergence, especially near the solar limb. Our results demonstrate that large datasets do not need to be perfectly labeled for supervised learning. Instead, labeling can be refined by reviewing only the cases in which the model's inference disagrees with the crude label. We also test the limits of the detection ability of our network by resampling the data both spatially and temporally to simulate other instruments.
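The label-refinement idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_for_review`, the 0.5 decision threshold, and the example values are all assumptions made for illustration. The sketch takes crude binary labels (1 for emergence, 0 for non-emergence) and the per-sequence emergence probabilities produced by a trained classifier, and returns only the sequences where the two disagree, ordered so that the most confident disagreements are reviewed first.

```python
import numpy as np

def flag_for_review(crude_labels, predicted_probs, threshold=0.5):
    """Return indices of sequences whose predicted class disagrees with the
    crude label, most confident disagreements first.

    Hypothetical helper: the name, arguments, and threshold are illustrative
    assumptions, not part of the original study.
    """
    crude_labels = np.asarray(crude_labels, dtype=int)
    predicted_probs = np.asarray(predicted_probs, dtype=float)

    # Convert probabilities to hard class predictions.
    predicted = (predicted_probs >= threshold).astype(int)

    # Keep only the sequences where the model and the crude label disagree.
    disagree = np.flatnonzero(predicted != crude_labels)

    # Rank by distance from the decision threshold (a simple confidence proxy).
    confidence = np.abs(predicted_probs[disagree] - threshold)
    return disagree[np.argsort(-confidence, kind="stable")]

# Made-up example: five image sequences with crude labels and model outputs.
crude = [1, 0, 1, 0, 1]
probs = [0.9, 0.95, 0.2, 0.1, 0.45]
review = flag_for_review(crude, probs)
```

In this example only three of the five sequences would be passed to a human for relabeling, which is the sense in which attention is restricted to false model inferences rather than the whole dataset.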