Jay Chia
Jay is a cofounder of Eventual and a primary contributor to the Daft open-source project. Prior to Eventual, he was a software engineer building large-scale ML data systems for computational biology at Freenome and self-driving cars at Lyft. He hails from the sunny island nation of Singapore, and used to command a platoon of tanks in the Singapore military.
Sessions
Python is a popular language for data engineering workloads. In data engineering, developers rely on a query engine to efficiently retrieve data, process it, and then write the results to a destination storage system or application.
The Python API for Apache Spark (PySpark) is currently the most popular framework for data engineering at large scale. However, PySpark depends heavily on the JVM, which causes significant friction during development.
In this talk, we discuss our work on the Daft Python Dataframe (www.getdaft.io), a distributed Python query engine built with Rust. We will take a deep dive into Daft's architecture and talk about how the strong synergy between Python and Rust gives Daft key advantages as a query engine.