2026 Technical Course:  Using MsPASS for Data Processing on HPC and Cloud Systems

Date(s): July 13-17, 2026
Location: Virtual

Course Description:

This course is an evolution of the MsPASS course taught over the past two years. Facilitated by the EarthScope Consortium, it introduces participants to MsPASS, a powerful, modern framework for processing seismology data on desktop, High-Performance Computing (HPC), and cloud systems. The course teaches seismologists how to manage waveform, source, and receiver metadata effectively using MongoDB or DocumentDB, the Amazon Web Services equivalent. Students will learn modern computing cluster concepts and how to use MsPASS to process large data sets in parallel. With the transfer of the seismology archive to cloud storage, the course will also teach students how to use GeoLab to create workflows that generate reproducible data sets from archival seismic data. Session content will be as follows: (1) introduction to data analysis with MsPASS, (2) parallel processing with MsPASS, with application to retrieval of event data from the EarthScope cloud archives, and (3) parallel data processing for data reduction on HPC or cloud systems.

Primary Audience:  Any seismologist, primarily graduate students and postdocs.

Secondary Audience:  Seismologists at any level who can handle the computing elements of the course.

Learning Objectives:

  • Gain experience using MsPASS for seismic data analysis.
  • Perform parallel processing workflows using MsPASS.
  • Work with seismic data archives.
  • Apply cloud-based workflows for seismic data access (Session 2 focus).

Participant Commitment:

Participants attend 3 online sessions (4 hours each). Exercises require approximately 3–4 hours per session (12–16 hours total offline work).

Prerequisites, Computer and Data:

Coursework/Content Knowledge:

  • Basic seismology.

Scientific Computing Skills Required:

  • Python (Functional)
  • ObsPy (Beginning–Developing)
  • Jupyter Notebooks (Functional)
  • Linux/UNIX Shell Scripting (Functional)
  • Signal Processing (Developing)

Hardware:

  • Hardware sufficient to work with GeoLab; modest desktop or laptop recommended for installing a local copy of MsPASS.

Software:

  • Web browser.

Internet:

  • Performance sufficient to participate in Zoom sessions.

Brief Agenda:

The tentative agenda is listed below and is subject to change.

Pre-Course Work
  • Verify GeoLab user account
  • Run example processing workflow in GeoLab
Session 1 Introduction to data analysis with MsPASS (4 hours)
  • Data object concepts in MsPASS
  • Data, Metadata, and their relation to the archaic concept of trace headers
  • Data management with the MsPASS document database (MongoDB or DocumentDB)
    • Document Database concepts
    • CRUD operations
    • Mongo Query Language
    • Indexes
  • Data editing with MsPASS – the “kill” concept
  • Error logging in MsPASS
Session 2 Cloud data access (4 hours)
  • Parallel Processing Concepts
  • Cloud Computing and MsPASS service abstraction
  • Cloud object store concepts
  • Organization of the EarthScope object store
  • Application to retrieve a segmented data set for a set of sources from the continuous archive
Session 3 Parallel processing (4 hours)
  • Input-output (IO) concepts
  • IO bound versus compute bound workflows and how to tell the difference
  • Running MsPASS on a High-Performance Computing (HPC) cluster
  • Review example workflows for data processing

Assessment:

Course assessment will be conducted via Moodle, similar to the previous year. Exercises will be revised but overall length and content will remain similar.

Instructors:

Gary Pavlis, Indiana University

Ian Wang,

Tammy Bravo, EarthScope