Training a Convolutional Neural Network to Model the Early Universe

The Cosmic Microwave Background (CMB) radiation is a remnant of the birth of our Universe. For roughly the first 380,000 years, the Universe was too hot and dense for atoms to form, so protons and electrons roamed freely. As the Universe cooled, neutral hydrogen atoms formed and the CMB radiation we detect today was released. The first stars formed later, and their radiation ionized the hydrogen again. This period is called the Epoch of Reionization (EoR), and it lasted from about 200 million years to about 1 billion years after the Big Bang. The Hydrogen Epoch of Reionization Array (HERA) in South Africa aims to build a detailed three-dimensional model of the evolution of the Universe beginning at the EoR. The Universe is abundant in hydrogen, and HERA uses its large radio antenna array to detect fluctuations in neutral hydrogen gas, which emits at a characteristic wavelength of 21 cm. As the 21 cm signal travels across spacetime, it is redshifted. By observing the redshifted emission, we can measure the fraction of hydrogen that is ionized at a particular redshift. This allows us to use the 21 cm measurements to accurately reconstruct the optical depth to the last scattering surface of the CMB.
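The link between redshift and the observed 21 cm signal can be sketched in a few lines. This is only an illustration of the standard (1 + z) scaling: the function names and the example redshifts are my own choices for this sketch, not HERA parameters.

```python
# Rest-frame values of the neutral-hydrogen hyperfine ("21 cm") line.
REST_WAVELENGTH_CM = 21.106
REST_FREQUENCY_MHZ = 1420.406


def observed_wavelength_cm(z):
    """Wavelength seen today for emission at redshift z: stretched by (1 + z)."""
    return REST_WAVELENGTH_CM * (1.0 + z)


def observed_frequency_mhz(z):
    """Frequency seen today for emission at redshift z: lowered by (1 + z)."""
    return REST_FREQUENCY_MHZ / (1.0 + z)


# Example redshifts chosen for illustration only (assumption, not from the text).
for z in (6.0, 9.0, 12.0):
    print(f"z = {z:4.1f}: {observed_wavelength_cm(z):6.1f} cm, "
          f"{observed_frequency_mhz(z):6.1f} MHz")
```

Each observing frequency therefore corresponds to one redshift, which is why a set of frequency channels can be read as a sequence of 2-dimensional slices through the EoR.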

I took 2-dimensional slices of the EoR period from a toy model at a particular redshift to train a Convolutional Neural Network (CNN) to identify dense regions of hydrogen gas. The algorithm mimics how neurons in the visual cortex process images. It applies weights and biases to the pixels of an image and adjusts them during training so that it can make predictions about new data. The algorithm identifies what fraction of the test image corresponds to ionized regions, which lets us reconstruct the ionization history as we survey the different redshifts. The CNN normalizes the image after every convolution to make the learning process easier. After normalization, a MaxPooling function is applied in the first two convolution layers to reduce the image size and capture key features of the ionized regions in the image. The data is then passed through hidden layers, each of which takes the output of the previous layer as its input. I tested a variety of CNN architectures on 32×32 pixel images. The final CNN architecture begins with a convolutional layer of 16 neurons. The CNN takes an unorthodox approach by first increasing the number of neurons and then reducing them to a single output neuron. With 1,560 parameters and a noise level of 0.3 in the data, the network achieved an absolute error of 4.36%, which is promising and demonstrates a network that can deal with realistic corruption of data.
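The data-handling steps described above (a 32×32 slice, added noise, normalization, 2×2 max pooling, and the absolute error on the ionized fraction) can be sketched without any machine-learning library. This is a minimal illustration and not the network itself. Several things here are assumptions of mine: the toy slice is built from randomly placed circular bubbles, the noise level of 0.3 is treated as the standard deviation of Gaussian noise, per-image standardization stands in for the normalization layer, and the error is taken as the absolute difference between predicted and true ionized fraction.

```python
import random


def make_toy_slice(size=32, n_bubbles=4, radius=5, seed=0):
    """Hypothetical toy slice: 1.0 marks an ionized pixel, 0.0 a neutral one."""
    rng = random.Random(seed)
    image = [[0.0] * size for _ in range(size)]
    for _ in range(n_bubbles):
        cx, cy = rng.randrange(size), rng.randrange(size)
        for y in range(size):
            for x in range(size):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    image[y][x] = 1.0
    return image


def ionized_fraction(image):
    """Fraction of pixels that are ionized (the quantity to be predicted)."""
    n_pixels = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / n_pixels


def add_gaussian_noise(image, sigma=0.3, seed=1):
    """Corrupt each pixel with Gaussian noise (sigma = 0.3 is an assumption)."""
    rng = random.Random(seed)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


def standardize(image):
    """Shift to zero mean and scale to unit variance (stand-in for normalization)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5 or 1.0
    return [[(p - mean) / std for p in row] for row in image]


def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [
        [max(image[y][x], image[y][x + 1], image[y + 1][x], image[y + 1][x + 1])
         for x in range(0, len(image[0]) - 1, 2)]
        for y in range(0, len(image) - 1, 2)
    ]


def absolute_error(predicted, true):
    """Absolute difference between predicted and true ionized fraction."""
    return abs(predicted - true)


clean = make_toy_slice()
label = ionized_fraction(clean)          # training target for this slice
noisy = add_gaussian_noise(clean, 0.3)   # what the network would be shown
pooled = max_pool_2x2(standardize(noisy))
print(len(pooled), len(pooled[0]))       # 16 16
```

Pooling halves each side of the image, so two pooling stages take a 32×32 input down to 8×8 before the later layers reduce it to a single output value.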