07-07, 13:30–17:30 (US/Pacific), Room 315
Scientific researchers need reproducible software environments for complex applications that can run across heterogeneous computing platforms. Modern open source tools like pixi provide automatic reproducibility for all dependencies while offering a high-level interface well suited to researchers.
This tutorial will provide a practical introduction to using pixi to easily create scientific and AI/ML environments that benefit from hardware acceleration across multiple machines and platforms. The focus will be on applications using the PyTorch and JAX Python machine learning libraries with CUDA enabled, as well as on deploying these environments to production settings in Linux container images.
As artificial intelligence (AI) and machine learning (ML) become a standard part of the scientific toolkit, the need for robustly reproducible scientific computing environments that support hardware acceleration, e.g. with CUDA, becomes more important. Historically, however, just installing a working CUDA environment on a single machine, let alone on multiple platforms with different requirements, was considered a particularly difficult and painful task. This led to many scientific machine learning workflows being reliably runnable only on particular machines and, even worse, in environments that were not reproducible across time.
With significant recent advancements by the NVIDIA open source team and the conda-forge open source community, the entire CUDA stack, from compilers to runtime libraries, is now distributed on conda-forge. This significantly reduces the overhead of installing CUDA dependencies, but packaging and distribution of binaries alone does not solve the problem of reproducibility. With automatic multi-platform, hash-level lock file support for all dependencies available on package indexes (like PyPI and conda-forge), highly efficient solving strategies, and a high-level user interface, pixi provides a missing piece of the scientific researcher's toolkit. With pixi, researchers can easily specify their hardware acceleration requirements, the multiple computational environments needed for their experiments, and the required software dependencies, and then quickly solve for a multi-platform lock file of all dependencies, down to the compiler level. This makes it possible to define multiple hardware accelerated environments that can run AI/ML workflows across heterogeneous machines with different GPU types and CUDA compatibility.
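The kind of environment specification described above can be sketched in a pixi.toml manifest. The snippet below is a hypothetical example, not the tutorial's actual configuration: the project name, feature names, platform list, and version pins are all illustrative, and it assumes the conda-forge CPU and CUDA builds of PyTorch:

```toml
# pixi.toml -- hypothetical manifest (names, platforms, and pins illustrative)
[project]
name = "ml-tutorial"
channels = ["conda-forge"]
platforms = ["linux-64", "win-64"]

[dependencies]
python = "3.12.*"

# GPU feature: the host machine must provide a CUDA 12-compatible driver
[feature.gpu.system-requirements]
cuda = "12"

[feature.gpu.dependencies]
pytorch-gpu = "*"

# CPU fallback for machines without a compatible GPU
[feature.cpu.dependencies]
pytorch-cpu = "*"

# Two named environments built from the features above
[environments]
gpu = ["gpu"]
cpu = ["cpu"]
```

Solving this manifest produces a single multi-platform pixi.lock file that pins every dependency in every environment by hash, so the same environments can be recreated on any of the listed platforms.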
This tutorial is targeted at scientific researchers who use Python for scientific computing and rely on hardware accelerated workflows in their research, with a particular focus on AI/ML. No prior expertise with hardware accelerator systems is assumed. The tutorial will begin with an introduction to pixi as a computational environment manager and explore the features it provides beyond the package managers more commonly used for Python dependencies. It will then cover adding CUDA requirements to pixi environments and provide participants with exercises for solving environments and running simple AI/ML workflows using the PyTorch and JAX machine learning libraries. Later exercises will move to more complex environment requirements. The tutorial will conclude with examples and exercises focused on deploying pixi workflows to production environments by distributing pixi environments in Linux container images.
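One common pattern for this container deployment step can be sketched as a Dockerfile. This is a minimal sketch under a few assumptions, not the tutorial's exact recipe: it uses the prefix-dev pixi base image, assumes a pixi.toml/pixi.lock pair exists in the build context, and the "train" task name is hypothetical:

```dockerfile
# Hypothetical Dockerfile sketch; image tag and task name are illustrative
FROM ghcr.io/prefix-dev/pixi:latest

WORKDIR /app
# Copy the manifest and lock file first so the solved environment layer caches
COPY pixi.toml pixi.lock ./
# Reproduce the environment exactly as pinned in pixi.lock
RUN pixi install --locked
# Copy the rest of the project
COPY . .

# Run a task defined in pixi.toml (hypothetical "train" task)
CMD ["pixi", "run", "train"]
```

Because the image is built from the lock file, the container reproduces the same environment that was solved and tested on the development machine.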
Tutorial participants will code all examples themselves. Participants will also be given time to explore solutions to their own hardware accelerated Python workflows. To make the tutorial more practical and interactive, cloud GPU resources will be requested from industry partners so that participants have hardware accelerated resources on which to run their own examples.
pixi is the only tool that needs to be installed prior to the start of the tutorial. Install instructions are provided on the pixi documentation website, but can be summarized as:

* Linux, macOS: curl -fsSL https://pixi.sh/install.sh | bash
* Windows: powershell -ExecutionPolicy ByPass -c "irm -useb https://pixi.sh/install.ps1 | iex"
Participants should be familiar with Python programming for science and with using external dependencies in their work. The tutorial will use machine learning workflows as examples; while familiarity with machine learning may help with conceptual understanding of the tasks, no prior machine learning knowledge is required to complete the tutorial. No prior expertise with CUDA is assumed.