Quasar spectra are used to study the behavior of the Lyman-alpha forest at high redshift. However, the Lyman-alpha forest itself makes the underlying quasar continuum difficult to predict, because the blue-side spectra ([1020, 1216] Angstroms) of high-redshift quasars are contaminated by Lyman-alpha forest absorption. To recover uncontaminated quasar continua, we use high-resolution quasar spectra at low redshift (z<1) from the HST Spectroscopic Legacy Archive (HSLA) and propose a deep learning pipeline that takes the red-side spectrum ([1216, 1600] Angstroms) as input and predicts the whole quasar continuum ([1020, 1600] Angstroms) as output. So that the predicted continua generalize robustly across data sets from different astronomical surveys, we select HST quasar spectra using only one constraint: the spectral flux must be well defined over [1020, 1600] Angstroms, with an overall median signal-to-noise ratio of at least five. In addition, we apply a normalization-standardization step before the deep neural network, which reduces the absolute fractional flux error (AFFE) by approximately half. To train the network, we use Principal Component Analysis (PCA) and a Gaussian Mixture Model (GMM) to categorize the HST quasar spectra into four classes and to synthesize mock quasar spectra for the training data set. Our neural network model achieves an average AFFE of 0.8% on the training quasar spectra, which is approximately ten times better than traditional PCA-based prediction methods, and an average AFFE of 6.3% on the testing quasar spectra. We apply the model to SDSS-DR16 quasar spectra at high redshift (z>2.5) and find that the mean flux in the Lyman-alpha forest decreases with increasing redshift. This trend confirms that absorption by intergalactic neutral hydrogen in the Lyman-alpha forest increases with redshift.
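As an illustration of the error metric quoted above, the following minimal sketch computes an absolute fractional flux error for a single spectrum. It assumes the AFFE is the mean over wavelength pixels of |F_pred - F_true| / F_true, which the abstract does not spell out. The function name, the toy wavelength grid, and the flux values are hypothetical and are not taken from the HSLA or SDSS data.

```python
import numpy as np


def absolute_fractional_flux_error(predicted, true):
    """Mean of |predicted - true| / |true| over wavelength pixels.

    Assumed definition of the AFFE; the true continuum must be non-zero
    at every pixel for the ratio to be defined.
    """
    predicted = np.asarray(predicted, dtype=float)
    true = np.asarray(true, dtype=float)
    if predicted.shape != true.shape:
        raise ValueError("predicted and true continua must have the same shape")
    if np.any(true == 0.0):
        raise ValueError("true continuum must be non-zero at every pixel")
    return float(np.mean(np.abs(predicted - true) / np.abs(true)))


# Toy example (made-up values): a prediction that is uniformly 5% too high.
wavelength = np.linspace(1020.0, 1600.0, 5)  # rest-frame Angstroms
true_continuum = np.array([1.0, 2.0, 4.0, 2.0, 1.0])
predicted_continuum = 1.05 * true_continuum

affe = absolute_fractional_flux_error(predicted_continuum, true_continuum)
print(f"AFFE = {100.0 * affe:.1f}%")
```

Under this assumed definition, the quoted averages of 0.8% and 6.3% would correspond to this per-spectrum quantity averaged over the training and testing sets, respectively.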