SciPy 2023

UXarray, a Python library for unstructured climate and weather data
07-13, 14:20–14:50 (America/Chicago), Grand Salon C

UXarray aims to provide Xarray-styled functionality for unstructured grid datasets. It supports loading and representing unstructured grids by pairing existing Xarray functionality with new routines written specifically for operating on unstructured grids. In this talk, we will present the current capabilities of the library: reading and writing unstructured grids; reading datasets; basic grid operations and the need to speed up their computation; and integration operations, with details on the speedups obtained by using Numba and Python indexing. We will also demonstrate how the library can be used to visualize unstructured grids.


After less than a year of development, UXarray has already become a popular Python repository with active community engagement, with more than 10 forks and 77 stars on GitHub.

The UXarray project aims to bridge the gap between traditional operations on structured grids and modern standards for unstructured grids, such as the UGRID specification. Global climate models have traditionally used rectangular latitude-longitude grids for their data layout, but these grids lead to computational challenges at high resolutions because lines of longitude converge at the poles. Modeling centers worldwide have therefore adopted unstructured grids, which allow a quasi-uniform distribution of data over the sphere. However, analyzing data on these grids is far more difficult than on latitude-longitude grids, and groups often have to apply lossy regridding to their data before traditional tools can be used. To partly address this problem, groups worldwide have moved towards adopting standards for unstructured grid data, such as the UGRID specification developed in the context of the Climate and Forecast (CF) conventions.
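The convergence of lines of longitude can be made concrete with a little arithmetic. The sketch below assumes a spherical Earth with a mean radius of 6371 km and a hypothetical 0.25-degree grid; the function name is illustrative only. It computes the east-west width of a grid cell, which shrinks with the cosine of latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def cell_width_km(lat_deg: float, dlon_deg: float) -> float:
    """East-west width of a latitude-longitude cell centered at lat_deg."""
    return EARTH_RADIUS_KM * math.cos(math.radians(lat_deg)) * math.radians(dlon_deg)


# Compare the width of a 0.25-degree cell at the equator and near the pole.
for lat in (0.0, 60.0, 85.0, 89.875):
    print(f"lat {lat:7.3f}: {cell_width_km(lat, 0.25):8.3f} km")
```

Cells in the row nearest the pole are hundreds of times narrower than cells at the equator, which is the non-uniformity that quasi-uniform unstructured grids avoid.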

Most climate models output data in the NetCDF format. The CF conventions are an important standard for organizing the metadata of these files, and they include details on how to describe a rectangular latitude-longitude grid. The UGRID specification describes how a NetCDF file can represent an unstructured grid, but it has potential issues. The UGRID specification is currently under consideration for inclusion in the CF conventions.

Our new Python library, UXarray, supports operations directly on unstructured grid data, which reduces the need to create regular-grid copies of unstructured grid output and simplifies the workflow. Unstructured grids can be provided in files that follow various conventions, such as UGRID, SCRIP, and EXODUS. These conventions define and represent the attributes and variables that describe the unstructured grid topology in different ways. Moreover, the UGRID convention does not enforce standard names for most attributes and variables, apart from a few required ones. UXarray unifies all of these conventions at the data loading step by representing grids internally in the UGRID convention, regardless of the original grid file type. It also uses a set of standardized names for topology attributes and variables, while still giving the user access to the original attribute and variable names from the grid definition file. Together, these features lay the foundation for developing fast and efficient algorithms for climate scientists around the world.
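The idea of mapping convention-specific names to one standardized set while keeping the originals can be sketched as follows. This is not UXarray's implementation: the lookup tables, the standardized names, and the `standardize` function are all hypothetical, and the source names are only typical examples from SCRIP and UGRID-style files. A real conversion also has to restructure the data, not just rename it.

```python
# Hypothetical lookup tables: source variable name -> standardized name.
# The entries are illustrative examples, not an authoritative list.
NAME_MAPS = {
    "scrip": {
        "grid_center_lon": "face_lon",
        "grid_center_lat": "face_lat",
    },
    "ugrid": {
        "Mesh2_face_x": "face_lon",
        "Mesh2_face_y": "face_lat",
        "Mesh2_face_nodes": "face_node_connectivity",
    },
}


def standardize(variables: dict, convention: str):
    """Return (renamed variables, standardized name -> original name)."""
    mapping = NAME_MAPS[convention]
    renamed = {}
    original_names = {}
    for name, values in variables.items():
        # Variables without a known mapping pass through unchanged.
        new_name = mapping.get(name, name)
        renamed[new_name] = values
        original_names[new_name] = name
    return renamed, original_names


# Usage: a tiny SCRIP-style input with two cell centers.
grid, originals = standardize(
    {"grid_center_lon": [0.0, 90.0], "grid_center_lat": [10.0, 10.0]},
    "scrip",
)
```

After the call, downstream code can rely on `face_lon` and `face_lat` whatever the input convention, while `originals` still records the names used in the source file.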

Our design for UXarray aims to maintain Xarray interoperability, which allows us to use various Xarray-compatible packages. UXarray uses Numba for loop optimizations and faster computation. We also provide examples and performance metrics that showcase interoperable read/write operations, reading of grids and their corresponding data, the efficiency and optimization built into UXarray, and visualization.
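As an illustration of the kind of loop-heavy computation involved, integrating a face-centered field over an unstructured grid can be approximated as a sum of value times face area. The sketch below is an assumption about the general technique, not UXarray's code: the `integrate` function and the eight-face mesh are hypothetical. The loop is written in plain Python so the example has no dependencies; an explicit loop of this shape is the sort of code a JIT compiler such as Numba is designed to accelerate.

```python
import math


def integrate(face_values, face_areas):
    """Approximate the integral of a face-centered field as sum(value * area)."""
    if len(face_values) != len(face_areas):
        raise ValueError("one value and one area are required per face")
    total = 0.0
    for value, area in zip(face_values, face_areas):
        total += value * area
    return total


# Hypothetical mesh: eight equal faces (the octants) tiling the unit sphere,
# each with area 4*pi/8.
areas = [4.0 * math.pi / 8.0] * 8
ones = [1.0] * 8

# Integrating the constant field 1 should recover the sphere's surface area.
surface_area = integrate(ones, areas)
```

Integrating a constant field of ones is a convenient sanity check, because the result must equal the total area of the mesh.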

Overall, UXarray aims to simplify the workflow for climate and weather scientists working with unstructured grids and allow them to efficiently analyze and visualize their data.

Rajeev Jain is a Principal Research Software Specialist at Argonne National Laboratory, located in the suburbs of Chicago, with a focus on managing multi-disciplinary simulation, scalability, and computation for applications-oriented problems.

He is a quick learner who loves to solve complex problems and readily adapts to new challenges. His work encompasses a range of scientific domains, from simulating physical phenomena to developing deep-learning-enabled precision medicine for cancer and providing data analysis tools for the geoscience community.

To learn more about Rajeev Jain's work and research, you can visit his profile page on the Argonne website: https://www.anl.gov/profile/rajeev-jain

LinkedIn: https://linkedin.com/in/rajeeja