SciPy 2023

Can There Be Too Much Parallelism?
07-12, 11:25–11:55 (America/Chicago), Amphitheater 204

Numerical Python libraries can run computations on many CPU cores with various parallel interfaces. When we simultaneously use multiple levels of parallelism, it may result in oversubscription and degraded performance. This talk explores the programming interfaces used to control parallelism exposed by libraries such as NumPy, SciPy, and scikit-learn. We will learn about parallel primitives used in these libraries, such as OpenMP and Python's multiprocessing module. We will see how to control parallelism in these libraries to avoid oversubscription. Finally, we will look at the overall landscape for configuring parallelism and highlight paths for improving the user experience.

Numerical Python libraries such as NumPy, SciPy, and PyTorch can run computations on multiple CPU cores. These libraries expose a wide range of programming interfaces to control parallelism. These interfaces include environment variables, library-specific APIs, and context managers such as threadpoolctl. While reviewing the interfaces for controlling parallelism, we will learn about the many parallel primitives used in these libraries. We will cover lower-level primitives such as pthreads or OpenMP and higher-level primitives such as Python's multithreading and multiprocessing modules. Libraries that require lower-level parallel primitives need to go through a compilation step with languages and tools such as Numba, Cython, C++, or Rust. When we use multiple forms of parallelism, controlling how many cores your program uses is essential to prevent oversubscription. We will learn how libraries such as Dask, Ray, and scikit-learn handle mixing their parallelism with user-provided parallel routines. Finally, we will zoom out to see the overall landscape for controlling parallelism and highlight possible paths to improve the user and developer experience. This is an intermediate talk for software and machine learning engineers who want to understand and configure parallelism in the PyData stack.
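As a taste of the environment-variable interface the talk covers, here is a minimal sketch of capping the thread pools used by BLAS and OpenMP. The helper function `limit_native_threads` is a hypothetical name introduced here for illustration; the environment variables themselves are the real ones read by OpenMP, OpenBLAS, MKL, and BLIS, and they must be set before the numerical library is first imported, because the native runtimes read them at load time.

```python
import os

# Hypothetical helper (name is ours): cap native thread pools via
# environment variables. Must run BEFORE importing NumPy/SciPy/etc.,
# since BLAS and OpenMP runtimes read these variables at load time.
def limit_native_threads(n: int) -> None:
    for var in (
        "OMP_NUM_THREADS",       # OpenMP runtimes
        "OPENBLAS_NUM_THREADS",  # OpenBLAS
        "MKL_NUM_THREADS",       # Intel MKL
        "BLIS_NUM_THREADS",      # BLIS
    ):
        os.environ[var] = str(n)

# Cap native pools at one thread each, e.g. before fanning work out
# with multiprocessing, so the two levels of parallelism don't
# oversubscribe the CPU cores.
limit_native_threads(1)
# import numpy as np  # NumPy's BLAS would now run single-threaded
```

For finer control after import, the `threadpoolctl` package mentioned in the abstract offers a context manager that adjusts already-loaded thread pools at runtime.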

Thomas J. Fan is a Staff Software Engineer at Quansight Labs and is a maintainer for scikit-learn, an open-source machine learning library for Python. Previously, Thomas worked at Columbia University to improve interoperability between scikit-learn and AutoML systems. He is a maintainer for skorch, a neural network library that wraps PyTorch. Thomas has a Master's in Mathematics from NYU and a Master's in Physics from Stony Brook University.
