Galaxy clusters trace massive halos over cosmic time. Cluster cosmology requires robust mass estimates for a sufficiently large number of clusters. Diffuse X-ray and Sunyaev-Zel'dovich (SZ) emission provide better mass proxies than optical observables, yet most galaxy clusters are observed only in ground-based imaging surveys. We present a semi-supervised deep learning approach that estimates X-ray- or SZ-derived cluster masses from ugriz-band images from SDSS DR12. We first build a feature extractor by pretraining a residual neural network (ResNet) to classify the photometric band of each input image. Using the pretrained feature extractor, we then train a convolutional neural network to predict the masses of 1515 clusters for which X-ray- or SZ-derived masses are available. This two-stage scheme greatly alleviates the shortage of labeled training data. The performance of our network is comparable to that of the optical richness estimates given by the redMaPPer cluster-finding algorithm. We further perform attribution tests on the network for model interpretation. Our semi-supervised approach is well suited to exploiting the flood of data expected from ongoing and future multi-wavelength surveys.
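
The pretraining task needs no mass labels, because the photometric band of each image serves as its own label. The following is a minimal sketch of how such a dataset could be assembled, not the pipeline used in this work. It assumes each cluster cutout is stored as five image planes in u, g, r, i, z order. The function names, the toy cluster identifiers, and the toy mass value are hypothetical, and the networks themselves are omitted.

```python
import random

BANDS = ["u", "g", "r", "i", "z"]  # SDSS photometric bands


def make_pretext_samples(cutouts):
    """Split each 5-band cutout into single-band images labeled by band index.

    Assumption: each cutout is a list of 5 planes in BANDS order, and each
    plane is a list of pixel rows. The band index is the pretext label.
    """
    samples = []
    for cutout in cutouts:
        if len(cutout) != len(BANDS):
            raise ValueError("expected one image plane per band")
        for band_index, plane in enumerate(cutout):
            samples.append((plane, band_index))
    return samples


def split_labeled(cluster_ids, masses):
    """Separate clusters with a known mass from those without one.

    `masses` maps a cluster id to its X-ray- or SZ-derived mass; clusters
    missing from the mapping can still be used for pretraining.
    """
    labeled = [(cid, masses[cid]) for cid in cluster_ids if cid in masses]
    unlabeled = [cid for cid in cluster_ids if cid not in masses]
    return labeled, unlabeled


# Toy data: three 4x4 cutouts with random pixel values (illustrative only).
rng = random.Random(0)
toy_cutouts = [
    [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in BANDS]
    for _ in range(3)
]
pretext = make_pretext_samples(toy_cutouts)

# Hypothetical ids and a made-up log-mass value, not taken from any catalog.
labeled, unlabeled = split_labeled(["c0", "c1", "c2"], {"c1": 14.5})
```

In this sketch every cutout contributes five pretext samples whether or not its mass is known, while only the labeled subset would enter the subsequent mass regression.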