John Kirkham
I earned my B.S. and M.S. in Physics. After graduating, I worked at the Howard Hughes Medical Institute for 5 years on image processing problems, particularly in neuroscience. During that work I became more involved in open source, with particular interest in packaging, storage, and distributed array processing. I then joined the NVIDIA RAPIDS team, where there has been good overlap with these past interests as well as new ones.
Sessions
A key feature of the Python data ecosystem is its reliance on simple but efficient primitives that follow well-defined interfaces, so that tools work seamlessly together (cf. http://data-apis.org/). NumPy provides an in-memory representation for tensors. Dask provides parallelisation of tensor access. Xarray provides metadata linking tensor dimensions. Zarr provides a missing feature, namely scalable, persistent storage for annotated hierarchies of tensors. Defined through a community process, the Zarr specification enables the storage of large out-of-memory datasets locally and in the cloud. Implementations exist in C++, C, Java, JavaScript, Julia, and Python.
In this talk, we will examine the new CUDA package layout for Conda (as included in conda-forge). We will show how CUDA components have been broken out and share how this affects development and package building. We will walk through the changes made to the conda-forge infrastructure to incorporate these new packages, then examine recipes that use the new packages and what was needed to update them. Finally, we will provide guidance on how to use these new packages in recipes or in library development.
The array API standard (https://data-apis.org/array-api/) is a common specification for Python array libraries, such as NumPy, PyTorch, CuPy, Dask, and JAX.
This standard will make it straightforward for array-consuming libraries, like scikit-learn and SciPy, to write code that uniformly supports all of these libraries. This will allow, for instance, running the same code on the CPU and GPU.
This talk will cover the scope of the array API standard, the supporting tooling (a library-independent test suite and a compatibility layer), the work completed so far, and the plans going forward.
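As a rough illustration of what "code that uniformly supports all of these libraries" can look like, here is a minimal sketch of an array-consuming function that looks up its array namespace at call time instead of importing a specific library. The `__array_namespace__` method is the hook defined by the array API standard. The helper names `get_namespace` and `standardize` are hypothetical, and the NumPy fallback is an assumption made here so the sketch also runs on NumPy releases that do not expose the hook. Libraries that do not yet implement the hook themselves would need something like the compatibility layer mentioned above.

```python
import numpy as np


def get_namespace(x):
    """Return the array API namespace for x (hypothetical helper).

    Uses the standard's `__array_namespace__` hook when the array
    provides it; otherwise falls back to NumPy (an assumption made
    for this sketch, not part of the standard).
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def standardize(x):
    """Center x to zero mean and scale it to unit standard deviation.

    Only functions from the looked-up namespace are called, so the
    same code could run on any array type that exposes them.
    """
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)


data = np.asarray([1.0, 2.0, 3.0, 4.0])
result = standardize(data)
```

Passing an array from another conforming library to `standardize` would, in principle, dispatch to that library's `mean` and `std`, which is how the same code could run on either the CPU or the GPU.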