SciPy 2025

Justus Magin

Justus Magin is a research engineer working at the Laboratoire d’Océanographie Physique et Spatiale (LOPS) in Brest, France, where he assists scientists in making computations scalable. He is also an Xarray maintainer and contributes to many projects in the Pangeo ecosystem, most notably pint-xarray and xdggs.


Sessions

07-08
13:30
240min
Hierarchical Data Analysis with Xarray DataTree & Zarr
Deepak Cherian, Negin Sobhani, Ian Hunt-Isaak, Eniola Awowale, Tom Nicholas, Joe Hamman, Justus Magin

Xarray provides data structures for multi-dimensional labeled arrays and a toolkit for scalable data analysis on large, complex datasets. Many real-world datasets have a hierarchical or heterogeneous structure and are best organized as groups of related data arrays. Through xarray.DataTree, the Xarray data model now supports opening datasets with a hierarchical structure of groups, such as HDF5 files and Zarr stores. This expanded data model is general enough to manage data across different scientific disciplines, including geosciences and biosciences. This hands-on tutorial focuses on intermediate and advanced workflows using Xarray to analyze real-world hierarchical data.

Tutorials
Room 315
07-09
13:55
30min
Using Discrete Global Grid Systems in the Pangeo ecosystem
Tina Odaka, Jean-Marc Delouis, Justus Magin, Anne Fouilloux, Benoît Bovy, Alexander Kmoch

Over the past few years, Discrete Global Grid Systems (DGGS) that subdivide the Earth into (roughly) equally sized faces have seen increased popularity. However, their in-memory representation is different from traditional projection-based data, which consists of either an evenly shaped rectangular grid (aka raster) or discrete geometries (aka vector), and thus requires specialized tooling. In particular, this includes libraries that can work on the numeric cell ids defined by the specific DGGS.

xdggs is a library that provides a unified interface for xarray that allows working with and visualizing a variety of DGGS-indexed data sets.

Earth, Ocean, Geo, Climate, and Atmospheric Science
Room 318
07-10
16:30
30min
The brave new world of slicing and dicing Xarray objects.
Deepak Cherian, Justus Magin, Benoît Bovy

We illustrate the power and flexibility of a new extension point in Xarray's data model: "custom indexes" that allow Xarray users to neatly handle complex grids and enable at least one new data model (vector data cubes). We present a whirlwind tour of specific examples to illustrate the power of this feature, and aim to stimulate experimentation during the sprints.

Earth, Ocean, Geo, Climate, and Atmospheric Science
Room 315
0min
Fast and scalable general geospatial regridding
Justus Magin

Being able to regrid between various grid types is very important in geoscience research. While the scientific Python ecosystem includes numerous geospatial regridding packages, most of them are tailored to only a few specific grid types. Additionally, very few of them can use distributed computation frameworks like Dask to regrid grids that are too big to fit into memory.

grid-indexing and grid-weights are a set of Rust-based libraries that implement regridding between arbitrary grids using an R-tree and rely on Dask to scale to larger-than-memory grids.
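To make the idea of weight-based regridding concrete, the following is a minimal pure-Python sketch for one-dimensional cells. It is not the API of grid-indexing or grid-weights: the function names overlap_weights and regrid are hypothetical, and the brute-force loop over all cell pairs stands in for the R-tree search that a real implementation would use to find overlapping cells quickly.

```python
def overlap_weights(source_edges, target_edges):
    """Return weights[i][j]: the fraction of target cell i covered by source cell j."""
    weights = []
    for t_lo, t_hi in zip(target_edges[:-1], target_edges[1:]):
        row = []
        for s_lo, s_hi in zip(source_edges[:-1], source_edges[1:]):
            # Length of the intersection of the two intervals (0 if disjoint).
            overlap = max(0.0, min(t_hi, s_hi) - max(t_lo, s_lo))
            row.append(overlap / (t_hi - t_lo))
        weights.append(row)
    return weights


def regrid(values, weights):
    """Apply the weights: each target value is a weighted sum of source values."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Four unit-width source cells regridded onto two cells of width 2.
source_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
target_edges = [0.0, 2.0, 4.0]
values = [1.0, 3.0, 5.0, 7.0]

weights = overlap_weights(source_edges, target_edges)
regridded = regrid(values, weights)
print(regridded)
```

Most entries of the weight matrix are zero, which is why a spatial index pays off: it restricts the overlap computation to the few source cells that can intersect a given target cell.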

Earth, Ocean, Geo, Climate, and Atmospheric Science