SciPy 2024

Mathieu Guillame-Bert

I am a research engineer at Google Zurich. My work centers on moving ML research into production. Notably, I lead efforts to research decision forest technologies and bring them to production, making them accessible, scalable, and performant.

Before joining Google, I completed a postdoctoral fellowship at Carnegie Mellon University's Auto Lab. In 2012, I earned my doctorate in France at the INRIA Research Lab as part of the PRIMA team. I graduated from Imperial College London and ENSIMAG, a French Grande École, in 2009.

In my spare time, I delve into various hobbies such as tinkering with electronics, woodworking, 3D printing, and creating video games.

Sessions

07-08
08:00
240min
A hands-on forecasting guide: from theory to practice
Ian Spektor, Diego Kiedanski, Mathieu Guillame-Bert

Forecasting is central to decision-making in virtually all technical domains. For instance, predicting product sales in retail, forecasting energy demand, and anticipating customer churn all have tremendous value across different industries. However, the landscape of forecasting techniques is as diverse as it is useful, and different techniques and expertise are suited to different types and sizes of data.
In this hands-on workshop, we give an overview of forecasting concepts, popular methods, and practical considerations. We’ll walk you through data exploration, data preparation, feature engineering, statistical forecasting (e.g., STL, ARIMA, ETS), forecasting with tabular machine learning models (e.g., decision forests), forecasting with deep learning methods (e.g., TimesFM, DeepAR), meta-modeling (e.g., hierarchical reconciliation and relational modeling, ensembles, resource models), and how to safely evaluate such temporal models.

Tutorials
Ballroom D
07-12
10:45
30min
Safe, fast, and easy time series preprocessing with Temporian
Mathieu Guillame-Bert, Ian Spektor

Temporal data is ubiquitous in data science and plays a vital role in machine learning pipelines and business decisions. Preprocessing temporal data using generic data tools can be tedious, lead to inefficient computation, and be prone to errors.
Temporian is an open-source library for safe, simple, and efficient preprocessing and feature engineering of temporal data. It supports common temporal data types, including non-uniformly sampled, multivariate, multi-index, and multi-source data. Temporian favors interactive development in notebooks and integration with other machine learning tools, and can run at scale using distributed computing.
This talk, aimed at data scientists and machine learning practitioners, will showcase Temporian’s key features along with its powerful API, and demonstrate its advantages over generic data preprocessing libraries for handling temporal data.

Data Science and AI/Machine Learning
Ballroom