Presentation #302.12 in the session Computation, Data Handling, Image Analysis — iPoster Session.
The design of Montage enables it to process images equally well on desktop machines and in parallel on clouds and clusters. The high volume of modern data sets requires parallel processing, and we are addressing this need with two all-sky data sets (WISE and TESS) on the Amazon Elastic Compute Cloud (EC2).
The WISE data set contains over 7 Terapixels in four bands and is being used to create Hierarchical Progressive Survey (HiPS) maps intended for interactive visualization at various spatial scales. Such maps are in wide use for data exploration and outreach. The TESS data set contains many more images stacked in a time sequence, 103 Terapixels in all. Our goal is to create a single co-added map for addressing important topics in low surface brightness astronomy, such as galaxy formation. The requirements for both have been informed by successful prototyping efforts. Both require parallel platforms and large-scale storage: estimated total processing times are the equivalent of 126 CPU days for WISE and 492 CPU days for TESS, with 64 TB storage for WISE and 210 TB for TESS.
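To give a feel for what these CPU-day estimates mean in practice, the following minimal sketch converts them to wall-clock time. The core counts and the `efficiency` parameter are illustrative assumptions, not figures from this work; only the CPU-day and storage numbers come from the estimates above.

```python
# CPU-day and storage estimates quoted in the abstract.
ESTIMATES = {
    "WISE": {"cpu_days": 126, "storage_tb": 64},
    "TESS": {"cpu_days": 492, "storage_tb": 210},
}


def wall_clock_days(cpu_days, cores, efficiency=1.0):
    """Idealized wall-clock days for a job of `cpu_days` spread over `cores`.

    `efficiency` (0 < efficiency <= 1) is a hypothetical parallel-efficiency
    factor; 1.0 means perfect scaling with no I/O or scheduling overhead.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return cpu_days / (cores * efficiency)


if __name__ == "__main__":
    # Core counts below are assumed for illustration only.
    for name, est in ESTIMATES.items():
        for cores in (64, 256, 1024):
            days = wall_clock_days(est["cpu_days"], cores)
            print(f"{name}: {cores} cores -> {days:.2f} days "
                  f"({est['storage_tb']} TB storage)")
```

Real runs would fall short of this ideal scaling because of data staging and I/O, which is one reason the storage figures matter as much as the CPU figures.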
Prototyping efforts have used Montage command-line tools in a set of job scripts run on single machines or in parallel on a local IPAC SLURM cluster. For the cloud, formal workflow managers add data staging, task dependency management, task scheduling and balancing, failure recovery and performance monitoring. We have a long history of collaboration with the developers of the Pegasus Workflow Manager (ISI/USC), a mature tool widely used in many disciplines. Recently, we demonstrated how the Pegasus Python API can be used in a Jupyter notebook to construct a Montage processing plan suitable for execution on a wide variety of platforms.
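Task dependency management, one of the services a workflow manager provides, can be illustrated with a small sketch. This is not the Pegasus API and not the actual processing plan; it uses only the Python standard library (`graphlib`, Python 3.9+), and the two-image task graph is hypothetical. The stage names loosely follow the usual Montage mosaic sequence (reprojection, difference fitting, background modeling, background correction, co-addition).

```python
from graphlib import TopologicalSorter

# Hypothetical two-image mosaic: each task maps to the tasks it depends on.
DEPENDENCIES = {
    "project_a": set(),
    "project_b": set(),
    "diff_ab": {"project_a", "project_b"},
    "fitplane_ab": {"diff_ab"},
    "bgmodel": {"fitplane_ab"},
    "background_a": {"project_a", "bgmodel"},
    "background_b": {"project_b", "bgmodel"},
    "add": {"background_a", "background_b"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks within one wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose inputs are done
        waves.append(ready)
        sorter.done(*ready)  # mark the wave complete to unlock dependents
    return waves


if __name__ == "__main__":
    for i, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(f"wave {i}: {', '.join(wave)}")
```

A full workflow manager layers data staging, scheduling across nodes, and failure recovery on top of this kind of ordering; the sketch shows only the dependency-resolution step.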
To manage the economics of running such workflows on the cloud, we have been experimenting with approaches to controlling resources. Traditional processing uses local resources that have already been paid for, whether or not they are actively being used. On the cloud, we must release resources whenever they are not in use. So, in addition to building the images, we are developing best practices for this type of processing.