Justus Magin
Justus Magin is a research engineer working at the Laboratoire de l’Oceanographie Physique et Spatiale (LOPS) in Brest, France, where he assists scientists in making computations scalable. He is also an Xarray maintainer and contributes to many projects in the Pangeo ecosystem, most notably to pint-xarray and xdggs.
Sessions
Xarray provides data structures for multi-dimensional labeled arrays and a toolkit for scalable data analysis on large, complex datasets. Many real-world datasets have a hierarchical or heterogeneous structure and are best organized as groups of related data arrays. Through xarray.DataTree, the xarray data model now supports opening datasets with a hierarchical structure of groups, such as HDF5 files and Zarr stores. This expanded data model is general enough to manage data across different scientific disciplines, including geosciences and biosciences. This hands-on tutorial focuses on intermediate and advanced workflows using xarray to analyze real-world hierarchical data.
Over the past few years, Discrete Global Grid Systems (DGGS), which subdivide the Earth into (roughly) equally sized faces, have risen in popularity. However, their in-memory representation differs from traditional projection-based data, which consists of either evenly shaped rectangular grids (raster) or discrete geometries (vector), and thus requires specialized tooling. In particular, this includes libraries that can work on the numeric cell ids defined by the specific DGGS.
xdggs is a library that provides a unified interface for xarray that allows working with and visualizing a variety of DGGS-indexed data sets.
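To give a feel for what "working on numeric cell ids" can mean, here is a small sketch in plain Python. It does not use xdggs; it assumes HEALPix with the nested numbering scheme as the example DGGS (the abstract does not name a specific grid), and the helper names are hypothetical. In that scheme each cell splits into four children, so moving between refinement levels reduces to integer arithmetic on the ids.

```python
def n_cells(level: int) -> int:
    """Number of cells on the sphere at a HEALPix refinement level."""
    return 12 * 4**level


def parent(cell_id: int) -> int:
    """Id of the enclosing cell one level coarser (nested scheme)."""
    return cell_id >> 2


def children(cell_id: int) -> list[int]:
    """Ids of the four cells one level finer (nested scheme)."""
    return [4 * cell_id + k for k in range(4)]


cell = 27
print(n_cells(2), parent(cell), children(parent(cell)))
```

Because these relations depend on the grid and its numbering, a library working with several DGGS has to dispatch such operations per grid type.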
We illustrate the power and flexibility of a new extension point in Xarray's data model: "custom indexes", which allow Xarray users to neatly handle complex grids and enable at least one new data model (vector data cubes). We present a whirlwind tour of specific examples of this feature and aim to stimulate experimentation during the sprints.
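The entry point for this mechanism can be sketched with the index class that ships with Xarray. The example below attaches a PandasIndex to a coordinate that is not a dimension, which makes label-based selection on it possible; a custom index would be passed to set_xindex in the same way. The dataset, the coordinate name "station_id", and the values are invented for illustration, and this is not taken from the talk itself.

```python
import xarray as xr
from xarray.indexes import PandasIndex

# A coordinate that labels the "station" dimension but has no index yet.
ds = xr.Dataset(
    {"temperature": ("station", [11.5, 14.2, 9.8])},
    coords={"station_id": ("station", ["a", "b", "c"])},
)

# Attach an index to the coordinate; a custom Index subclass could be
# supplied here in place of PandasIndex.
indexed = ds.set_xindex("station_id", PandasIndex)

# Label-based selection now works on the newly indexed coordinate.
selected = indexed.sel(station_id="b")
print(selected["temperature"].item())
```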