GeoLab cloud compute hub now open to all users

Tags: cloud platform

The EarthScope-operated data systems of the NSF National Geophysical Facility are migrating to cloud services. To learn more about this effort and find resources, visit earthscope.org/data/cloud.

GeoLab is now open to everyone in our community! We officially opened the doors during the NSF NGF Community Science Conference last week — here’s the key information we covered there if you missed it.

GeoLab is a browser-based JupyterHub notebook environment hosted alongside the NSF NGF cloud data archive. You can use it to explore and analyze data without downloading, prototype intensive data processing workflows before implementing them, and build notebooks that students and collaborators can run without having to configure software environments or copy curated datasets.

To access it, all you need to do is visit the GeoLab page and log in with your EarthScope User Account.

Getting started

We have several resources available to help you get started. The GeoLab Documentation contains instructions for navigating the basics. We offer a structured, asynchronous Cloud Foundations course to walk you through working in GeoLab. Throughout the summer, we’ll also be increasing the number of example and tutorial notebooks available in our Learning Hub.

There are two routes to direct support. For technical IT support from EarthScope staff, submit a request through the Help Desk. To help the community share knowledge about scientific data workflows, we’ve also created a new GeoLab Community Forum, where you can post using your EarthScope User Account.

There is no fee to use GeoLab, just a usage quota over a rolling 30-day window. Each user has 50 GB of file storage available in their home directory, plus ample scratch space and the ability to link to a personal S3 bucket.

The GeoLab environment includes a number of common software dependencies, but you can also install additional Python packages or build your own environment. GitHub is a great way to pull in custom environments, work collaboratively with teams, or let students easily load prepared materials. (The documentation has more information on all of this!)
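
As a rough illustration of the first step, the sketch below checks which packages a notebook needs are missing from the current environment and prints a pip command for them. It assumes pip is available in the GeoLab image. The package list and the `missing_packages` helper are hypothetical, and the documentation remains the authority on the recommended install route.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the import names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of top-level import names a notebook depends on.
wanted = ["json", "obspy", "numpy"]
missing = missing_packages(wanted)

if missing:
    # Printed rather than executed, so nothing is installed without review.
    print(f"{sys.executable} -m pip install --user " + " ".join(missing))
else:
    print("All requested packages are already importable.")
```

Note that an import name does not always match the name used to install a package, so check the printed command before running it.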

Just the beginning

The development of GeoLab is a close partnership with 2i2c, which manages the infrastructure. That development is not over: GeoLab will evolve over time as we learn from community feedback and experience.

Infographic titled "Cloud On-Ramp Construction" with three progress bars. The top bar represents data systems, with progress running from lifting existing data systems into the cloud, to API/SDK development, to powering up data architecture for full cloud capabilities. The second bar, labeled "GeoLab JupyterHub", runs from open access to GeoLab to fully leveraging cloud-native data analysis methods. The third bar, titled "Training and Documentation", runs from initial training opportunities, to introductory learning resources, to extensive how-to resources.

Your options for accessing data in GeoLab will also be evolving and expanding, but you will already see clear advantages. The ability to query and read data without downloading a copy is a huge time saver that can streamline the way you explore and process data. For seismic data, dataselect and ObsPy will work significantly faster in GeoLab than on your own computer because GeoLab runs adjacent to the data archive in the cloud. For geodetic data, check out the API and SDK sections of our documentation to learn about our growing set of new tools to interact with the archive.
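
To make the seismic case concrete, here is a minimal sketch of building a dataselect request for a single channel and time window using only the Python standard library. The endpoint shown is the long-standing public FDSN dataselect address and is an assumption here, as are the example station and the `build_dataselect_url` helper. The documentation may recommend a different access path inside GeoLab.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Assumed endpoint: the public FDSN dataselect web service.
DATASELECT_URL = "https://service.iris.edu/fdsnws/dataselect/1/query"


def build_dataselect_url(net, sta, loc, cha, start, duration):
    """Build an FDSN dataselect query URL for one channel and time window."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    params = {
        "net": net,
        "sta": sta,
        "loc": loc,
        "cha": cha,
        "starttime": start.strftime(fmt),
        "endtime": (start + duration).strftime(fmt),
    }
    # Keep ":" unescaped so the timestamps stay readable in the URL.
    return DATASELECT_URL + "?" + urlencode(params, safe=":")


# Example request: ten minutes of vertical broadband data (illustrative values).
url = build_dataselect_url(
    "IU", "ANMO", "00", "BHZ", datetime(2024, 1, 1), timedelta(minutes=10)
)
print(url)
```

A request like this returns miniSEED, and ObsPy's `read()` function accepts a URL, so the result can be loaded into a stream for analysis without saving a local copy first.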

We hope to see the community share these kinds of resources, as well. Now that GeoLab is open for your experimentation, send us your feedback and discuss it in the community forum! We’re eager to find out how you put this tool to work — and what we can do to further support that work.