07-11, 14:20–14:50 (US/Pacific), Ballroom
Over the past year, a growing number of Python libraries have adopted Rust and PyO3 to improve performance significantly. What's the catch? In this talk, we will discuss how the Data Science team at Capital One has been thinking about the power of Rust-backed Python and whether the benefits justify the complexity.
- Introduction (5 minutes)
- Who am I?
- Data Science at Capital One Canada & our expectations of ourselves
- We expect Data Scientists to be comfortable reading/writing Python and SQL
- Using Rust with Python (5 minutes)
- Why?
- At PyCon 2023, we attended talks by the developers of Pydantic and Robyn.
- Our takeaway was that Rust could enable us to introduce significant performance improvements with very few user-facing changes to the API.
- A brief introduction to PyO3 and Polars
- Why?
- Benchmarking – what did we gain? (5 minutes)
- Performance benefits for various coding challenges
- Rewriting an entire package in Rust
- Rewriting concrete subclasses in Rust
- Using Polars’ plugin architecture to write specific functions in Rust
- The Big Tradeoff (10 minutes)
- How do we maintain code in a language only a few of us know?
- Are we trying to use an airplane to travel a single city block? Let’s talk about whether the scale of the efficiency gain matches the problem space.
- Introducing the “get a coffee” scale from HDBSCAN and layering on our benchmark results.
- What will all of this change if it works perfectly? We want to ensure that refactoring our code will enable the team to experiment more often.
- Equilibrium (5 minutes)
- Are we using Rust? Where?
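
The outline's idea of layering benchmark results onto a "get a coffee" scale can be sketched roughly as below. This is a minimal illustration, not the team's benchmark code: the bucket thresholds, the function names (`coffee_bucket`, `changes_bucket`, `benchmark`), and the two pure-Python stand-in workloads are all assumptions made for the example. The point it tries to show is that a speedup matters most when it moves a job from one bucket to another.

```python
import timeit

# Hypothetical upper bounds in seconds; the exact cut-offs are assumptions.
COFFEE_SCALE = [
    (60, "interactive"),
    (10 * 60, "get a coffee"),
    (60 * 60, "over lunch"),
    (8 * 60 * 60, "overnight"),
]


def coffee_bucket(seconds):
    """Return the first bucket whose upper bound covers the runtime."""
    for upper_bound, label in COFFEE_SCALE:
        if seconds <= upper_bound:
            return label
    return "longer than overnight"


def changes_bucket(before_seconds, after_seconds):
    """True if the optimisation moves the job into a different bucket."""
    return coffee_bucket(before_seconds) != coffee_bucket(after_seconds)


def benchmark(func, data, repeat=3):
    """Best-of-N wall-clock time in seconds for a single call of func(data)."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))


# Stand-in workloads (hypothetical): a baseline and a candidate rewrite.
def sum_of_squares_loop(values):
    total = 0
    for v in values:
        total += v * v
    return total


def sum_of_squares_builtin(values):
    return sum(v * v for v in values)


data = list(range(100_000))
baseline = benchmark(sum_of_squares_loop, data)
candidate = benchmark(sum_of_squares_builtin, data)

print(f"baseline:  {baseline:.4f}s ({coffee_bucket(baseline)})")
print(f"candidate: {candidate:.4f}s ({coffee_bucket(candidate)})")
print(f"bucket changed: {changes_bucket(baseline, candidate)}")
```

With these assumed thresholds, both stand-ins finish well inside the "interactive" bucket on typical hardware, so even a large relative speedup would not change how a user experiences the job. A run that drops from an hour to a minute would.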
Akshay Gupta is a Data Scientist and Python developer at Capital One, where he works on building libraries that cover a variety of functions from developer enablement to predictive modelling. Akshay's background is in mathematics & statistics, and he has been in the field for 7 years.