SciPy 2023

Deepak Cherian


Sessions

07-11
13:30
240min
Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Deepak Cherian, Thomas Nicholas, Negin Sobhani, Anderson Banihirwe, Jessica Scheick, Don Setiawan, Scott Henderson

Xarray provides data structures for multi-dimensional labeled arrays and a toolkit for scalable data analysis on large, complex datasets with many related variables. Xarray combines the convenience of labeled data structures inspired by Pandas with NumPy-like multi-dimensional arrays to provide an intuitive and scalable interface for scientific analysis. This tutorial will introduce data scientists already familiar with Xarray to intermediate and advanced topics, such as applying SciPy/NumPy functions that have no Xarray equivalent, advanced indexing concepts, and wrapping other array types in the scientific Python ecosystem.
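
As a rough illustration of one topic named in the abstract (applying a SciPy function that has no Xarray equivalent), here is a minimal sketch using `xr.apply_ufunc` with `scipy.signal.detrend`. The data, the variable name `temperature`, and the dimension names `time` and `station` are made up for this example and are not taken from the tutorial material.

```python
import numpy as np
import xarray as xr
from scipy import signal

# Hypothetical data: a noisy linear trend along "time" at three stations.
time = np.arange(50)
rng = np.random.default_rng(0)
data = xr.DataArray(
    0.5 * time[:, None] + rng.normal(size=(50, 3)),
    dims=("time", "station"),
    coords={"time": time, "station": ["a", "b", "c"]},
    name="temperature",
)

# apply_ufunc moves the core dimension ("time") to the last axis before
# calling the wrapped function, which matches detrend's default axis=-1.
detrended = xr.apply_ufunc(
    signal.detrend,
    data,
    input_core_dims=[["time"]],
    output_core_dims=[["time"]],
)

print(detrended.dims)
```

The result is still a labeled `DataArray`, with the core dimension `time` placed last, so label-based selection such as `detrended.sel(station="a")` should continue to work.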

Tutorials
Classroom 203
07-13
15:50
30min
Tidy Geospatial Cubes
Emma Marshall, Deepak Cherian, Scott Henderson

The open-source project Xarray combines labeled data structures inspired by Pandas with NumPy-like multi-dimensional arrays to provide an intuitive and scalable interface for scientific analysis. Xarray has strong user bases in the physical sciences and the geospatial community. However, new users commonly struggle to fit their dataset into the Xarray model, and to conceptualize and construct an Xarray object that makes subsequent analysis steps easy (“dataset wrangling”). We take inspiration from the “tidy data” concept for dataframes, “datasets structured to facilitate analysis” (Wickham, 2014), and attempt a definition of tidy data for the labeled array objects provided by Xarray.

Earth, Ocean, Geo, and Atmospheric
Grand Salon C
0min
Xarray with GPUs
Negin Sobhani, Deepak Cherian, Max Jones

We will present multiple demonstrations of how easily existing Xarray workflows can be translated to the GPU using the CuPy and cupy-xarray packages. Our intent is to galvanize community interest around this capability and emphasize recent developments in the ecosystem. The demos will begin with a simple showcase of Xarray wrapping CuPy on a single GPU, and will gradually increase in complexity to show Xarray wrapping Dask wrapping CuPy for multi-node GPU computations on NCAR computing resources.

Earth, Ocean, Geo, and Atmospheric