Presentation #431.02 in the session The Sun and Solar System III.
Solar flare prediction using modern machine learning models has been an active field of research for the past several years. Because solar flares are impulsive and episodic, the datasets used to train the models, whether from solar magnetic field or atmospheric imaging instruments, are highly imbalanced: there are always many more “non-flare” samples than “flare” samples for any given prediction window. This dataset imbalance has two major impacts: first, it forces adaptations of training algorithms or model parameters that lead to unacceptably high false-positive rates (FPR); and second, it skews the skill metrics used to evaluate the predictive performance of any model.
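To make the second impact concrete, the following minimal Python sketch computes three commonly used skill metrics, the True Skill Score (TSS), the Heidke Skill Score (HSS), and the F1 score, from the four entries of a confusion matrix. The function name `skill_scores` and all counts are hypothetical values chosen for illustration, not results from this work. The classifier is given the same true positive rate (0.8) and FPR (0.1) in both cases; only the ratio of non-flare to flare samples changes.

```python
def skill_scores(tp, fn, fp, tn):
    """Return TPR, FPR, TSS, HSS and F1 for one confusion matrix.

    tp/fn count flare samples predicted as flare/non-flare;
    fp/tn count non-flare samples predicted as flare/non-flare.
    """
    tpr = tp / (tp + fn)          # true positive rate (recall)
    fpr = fp / (fp + tn)          # false positive rate
    tss = tpr - fpr               # True Skill Score
    # Heidke Skill Score: improvement over random-chance agreement
    hss = 2 * (tp * tn - fp * fn) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )
    f1 = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of precision and recall
    return {"tpr": tpr, "fpr": fpr, "tss": tss, "hss": hss, "f1": f1}


# Hypothetical counts: identical TPR = 0.8 and FPR = 0.1 in both cases.
balanced = skill_scores(tp=800, fn=200, fp=100, tn=900)        # 1:1 classes
imbalanced = skill_scores(tp=800, fn=200, fp=5000, tn=45000)   # 50:1 classes

for name, scores in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, {key: round(value, 3) for key, value in scores.items()})
```

With these illustrative counts, TSS stays at 0.70 in both cases, while HSS falls from 0.70 to about 0.21 and F1 from about 0.84 to about 0.24, even though the classifier itself is unchanged. Because HSS and F1 depend on the class ratio, a high FPR is penalized much more heavily by them on imbalanced data than it is by TSS.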
Here we demonstrate a hybrid Convolutional Neural Network (CNN) and Extremely Randomized Trees (ERT) model that is trained and tested on fully imbalanced Solar Dynamics Observatory (SDO) Helioseismic and Magnetic Imager (HMI) vector magnetic field data, yet achieves a 48% reduction in FPR relative to traditional single-architecture models for a 12-hour forecasting window. The reduction in FPR is accompanied by only a slight reduction in true positive rate (-12%), leading to a slight decrease in the True Skill Score (TSS) but a large increase in the Heidke Skill Score (HSS) and F1 score. Adding the ERT stage to the “deep learning” CNN model has the further advantage of enabling a ranking of the magnetogram features that contribute to high-skill flare prediction. We find that the probability of flaring provided by the CNN model is the most predictive input, followed by the Schrijver R-parameter, measures of magnetic field topological complexity, and then the total unsigned vertical current and helicity. The resulting model could be transitioned to operations to increase the short-term forecasting skill of the human-in-the-loop solar flare prediction systems currently in use in space weather forecasting offices.
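As a rough illustration of how an ERT stage can rank its inputs, the sketch below trains scikit-learn's `ExtraTreesClassifier` on synthetic data and sorts the inputs by impurity-based importance. This is not the pipeline used in this work: the feature names, the synthetic data generator, and every hyperparameter are assumptions made for the example, and the "CNN probability" column is a random stand-in rather than the output of a trained network.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(0)
n_samples = 2000

# Hypothetical stand-ins for per-sample inputs to the ERT stage.
cnn_prob = rng.random(n_samples)        # placeholder for a CNN flare probability
r_param = rng.normal(size=n_samples)    # placeholder for a magnetogram feature
noise = rng.normal(size=n_samples)      # input unrelated to the label

# Synthetic labels: driven mostly by cnn_prob, weakly by r_param.
# Only the top 10% of scores are labeled "flare" to mimic class imbalance.
score = 4.0 * cnn_prob + 0.5 * r_param
labels = (score > np.quantile(score, 0.9)).astype(int)

features = np.column_stack([cnn_prob, r_param, noise])
names = ["cnn_flare_probability", "r_parameter", "unrelated_noise"]

# class_weight="balanced" is one common way to handle imbalance (an assumption here).
ert = ExtraTreesClassifier(n_estimators=200, class_weight="balanced", random_state=0)
ert.fit(features, labels)

# Impurity-based importances are normalized to sum to 1 across the inputs.
ranking = sorted(zip(names, ert.feature_importances_), key=lambda p: p[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this toy setup the CNN-probability stand-in should rank first simply because the labels were constructed to depend on it most strongly. Impurity-based importances can be biased when inputs are correlated or differ widely in scale, so a ranking of real magnetogram features would normally be cross-checked, for example with permutation importance.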
We also demonstrate that the SDO Atmospheric Imaging Assembly (AIA) extreme ultraviolet (EUV) images that are concurrent with the HMI magnetic field data can serve both as a replacement for the NOAA GOES X-ray flare catalog as the source of supervised learning data labels and as an additional data source for increasing the skill of ML flare prediction models.