SciPy 2024

Naty Clementi

Naty is a senior software engineer at Voltron Data. She is a former academic with a Masters in Physics and PhD in Mechanical and Aerospace Engineering to her name. She is currently contributing to Ibis, but in the past has also contributed and maintained Dask. She is also an active member of Pyladies and a one of the directors of Women Who Code DC.

The speaker's profile picture

Sessions

07-08
08:00
240min
Intro to Ibis: blazing fast analytics with DuckDB, Polars, Snowflake, and more, from the comfort of your Python repl.
Gil Forsyth, Phillip Cloud, Naty Clementi, Jim Crist-Harif

Tabular data is ubiquitous, and pandas has been the de facto tool in Python for analyzing it. However, as data size scales, analysis using pandas may become untenable. Luckily, modern analytical databases (like DuckDB) are able to analyze this same tabular data, but perform orders-of-magnitude faster than pandas, all while using less memory. Many of these systems only provide a SQL interface though; something far different from pandas’ dataframe interface, requiring a rewrite of your analysis code.

This is where Ibis comes in. Ibis is a pure-Python open-source library that provides a dataframe interface to many popular databases and analytics tools (DuckDB, Polars, Snowflake, Spark, etc...). This lets users analyze data using the same consistent API, regardless of which backend they’re using, and without ever having to learn SQL. No more pains rewriting pandas code to something else when you run into performance issues; write your code once using Ibis and run it on any supported backend.

https://ibis-project.org/
https://github.com/ibis-project/ibis

Tutorials
Ballroom A
07-11
10:45
30min
Ibis + DuckDB geospatial: a match made on Earth
Naty Clementi

Geospatial data is becoming more present in data workflows today, and plenty of Python tools allow us to work with it. In the past year, a new contender emerged: DuckDB introduced an extension for analyzing geospatial data. Everyone in the data world has been buzzing about DuckDB (~15k stars on GitHub), and now this duck quacks geospatial data too. But wait a minute, isn’t DuckDB all SQL? Yes, but fear not, Ibis has you covered! Ibis is a Python library that provides a dataframe-like interface, enabling users to write Python code to construct SQL expressions that can be executed on multiple backends, like DuckDB. In this talk, you will learn how to leverage the benefits of DuckDB geospatial while remaining in the Python ecosystem (yes, we will do a live demo). This is an introductory talk; everyone is welcome, and no previous experience with spatial databases or geospatial workflows is needed.

Earth, Ocean, Geo, and Atmospheric Science
Room 315