SciPy 2023

Thomas J. Fan

Thomas J. Fan is a Staff Software Engineer at Quansight Labs and is a maintainer for scikit-learn, an open-source machine learning library for Python. Previously, Thomas worked at Columbia University to improve interoperability between scikit-learn and AutoML systems. He is a maintainer for skorch, a neural network library that wraps PyTorch. Thomas has a Master's in Mathematics from NYU and a Master's in Physics from Stony Brook University.

Sessions

07-12
11:25
30min
Can There Be Too Much Parallelism?
Thomas J. Fan

Numerical Python libraries can run computations on many CPU cores through various parallel interfaces. Combining several levels of parallelism at the same time can cause oversubscription, where more threads or processes compete for the CPU than there are cores, and degrade performance. This talk explores the programming interfaces that libraries such as NumPy, SciPy, and scikit-learn expose to control parallelism. We will learn about the parallel primitives used in these libraries, such as OpenMP and Python's multiprocessing module, and see how to configure them to avoid oversubscription. Finally, we will look at the overall landscape for configuring parallelism and highlight paths for improving the user experience.
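
As a rough illustration of the oversubscription problem described in this abstract (a minimal sketch, not material from the talk): when several worker processes each start a native thread pool, one common approach is to split the available cores between them. The helper names `threads_per_worker` and `limit_native_threads` are hypothetical; the environment variables are the ones read by OpenMP, OpenBLAS, and MKL, and they generally only take effect if set before the numerical libraries are imported.

```python
import os


def threads_per_worker(n_workers, n_cores=None):
    """Split the available cores evenly across worker processes (hypothetical helper)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if n_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        n_cores = os.cpu_count() or 1
    # Never go below one thread per worker, even with more workers than cores.
    return max(1, n_cores // n_workers)


def limit_native_threads(n_threads):
    """Cap native thread pools through environment variables (hypothetical helper).

    Assumption: this runs before NumPy, SciPy, or scikit-learn are imported,
    since the underlying runtimes usually read these variables at load time.
    """
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n_threads)


# Example: 4 worker processes on an assumed 16-core machine.
budget = threads_per_worker(n_workers=4, n_cores=16)
limit_native_threads(budget)
```

With this budget, 4 processes times 4 threads matches the 16 cores instead of launching 4 times 16 threads.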

General Track
Amphitheater 204
07-14
11:25
30min
Python Array API Standard: Toward Array Interoperability in the Scientific Python Ecosystem
Aaron Meurer, Thomas J. Fan, Stephannie Jimenez Gacha, John Kirkham, Stephan Hoyer, Tyler Reddy, Leo Fang, Matthew Barber, Ralf Gommers, Andreas Mueller, Athan Reines, Mario, Alexandre Passos, Travis E Oliphant, Saul Shanabrook

The array API standard (https://data-apis.org/array-api/) is a common specification for Python array libraries, such as NumPy, PyTorch, CuPy, Dask, and JAX.

This standard will make it straightforward for array-consuming libraries, like scikit-learn and SciPy, to write code that uniformly supports all of these libraries. This will allow, for instance, running the same code on the CPU and GPU.

This talk will cover the scope of the array API standard, the supporting tooling (a library-independent test suite and a compatibility layer), the work completed so far, and the plans going forward.
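
To make the idea of library-independent code concrete, here is a minimal sketch, not taken from the talk. The standard defines an `__array_namespace__` method on conforming array objects; an array-consuming function can call it and then use only functions from the returned namespace. `ToyArray` and `toy_namespace` are hypothetical pure-Python stand-ins so the example runs without any array library installed, and they simplify the standard (for instance, they return plain floats where the standard returns zero-dimensional arrays).

```python
import math
from types import SimpleNamespace


def standardize(x):
    """Center and scale x using only functions from its own array namespace."""
    xp = x.__array_namespace__()  # entry point defined by the array API standard
    return (x - xp.mean(x)) / xp.std(x)


class ToyArray:
    """Hypothetical stand-in for a conforming array type (illustration only)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, api_version=None):
        return toy_namespace

    def __sub__(self, scalar):
        return ToyArray(v - scalar for v in self.values)

    def __truediv__(self, scalar):
        return ToyArray(v / scalar for v in self.values)


def _mean(x):
    return sum(x.values) / len(x.values)


def _std(x):
    # Population standard deviation (no degrees-of-freedom correction).
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x.values) / len(x.values))


# Hypothetical namespace exposing just the two functions used above.
toy_namespace = SimpleNamespace(mean=_mean, std=_std)

result = standardize(ToyArray([1, 2, 3]))
```

The same `standardize` function would work unchanged on any array type whose namespace provides `mean` and `std`, which is the kind of uniform support the abstract describes.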

General Track
Amphitheater 204