SciPy 2024

Enhancing Predictive Analytics with tsbootstrap and sktime
07-08, 13:30–17:30 (US/Pacific), Room 317

Explore tsbootstrap and sktime in our 4-hour tutorial, focusing on enhancing time series forecasting and analysis. Discover how tsbootstrap's bootstrapping methods improve uncertainty quantification in time series data, integrating with sktime's forecasting models. Learn practical applications in various domains, boosting predictive accuracy and insights. This interactive session will provide hands-on experience with these tools, offering a deep dive into advanced techniques like probabilistic forecasting and model evaluation. Join us to expand your expertise in time series analysis, applying innovative methods to tackle real-world data challenges.


In our comprehensive 4-hour tutorial at SciPy 2024, participants will delve into tsbootstrap and sktime, two tools for advanced time series forecasting and analysis. The session combines theoretical grounding with hands-on application, so that attendees not only understand the concepts but also know how to implement them effectively.

We will begin with an introductory overview that lays a foundation in time series analysis and emphasizes the role of accurate uncertainty quantification, a core requirement for dependable forecasting. We will then take a deep dive into the inner workings of tsbootstrap, showcasing specialized bootstrapping techniques such as Block, Sieve, and Markov Bootstrap. Participants will learn how these techniques preserve temporal dependencies, enabling more precise uncertainty estimates.
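To make the idea of preserving temporal dependencies concrete, here is a minimal sketch of a moving block bootstrap written with the Python standard library only. It is an illustration under stated assumptions, not tsbootstrap's implementation or API: the function name `moving_block_bootstrap`, its parameters, and the toy series are all hypothetical.

```python
import random


def moving_block_bootstrap(series, block_length, rng):
    """Resample `series` by concatenating randomly chosen overlapping blocks.

    Hypothetical helper for illustration; not part of tsbootstrap.
    """
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    last_start = n - block_length  # largest index at which a full block fits
    resampled = []
    while len(resampled) < n:
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        # Copying a contiguous block keeps the within-block ordering intact,
        # which is how short-range dependence survives the resampling.
        resampled.extend(series[start:start + block_length])
    return resampled[:n]  # trim the last block so the lengths match


rng = random.Random(0)  # seeded for reproducibility
series = [float(t) for t in range(20)]  # toy series: 0.0, 1.0, ..., 19.0
sample = moving_block_bootstrap(series, block_length=5, rng=rng)
```

Resampling individual observations independently would destroy the ordering within the series; drawing whole blocks keeps neighboring observations together. The block length controls how much of the dependence structure is retained.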

Next, we will showcase the forecasting models provided by sktime and explain how its architectural design allows it to interoperate smoothly with tsbootstrap. Attendees will be guided through sktime's modular framework, which lets analysts tailor and extend the tool's capabilities to meet specific project demands.

The tutorial will also include hands-on exercises and illustrations with stylized use cases, helping participants appreciate the versatility and real-world applicability of tsbootstrap and sktime. We will then delve into the methodologies that underpin these tools, focusing on probabilistic forecasting and the enhancement it receives from tsbootstrap, alongside critical evaluation metrics, model tuning strategies, and the importance of prediction intervals in expressing forecast uncertainty.
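As a rough illustration of how resampling can turn a point forecast into a prediction interval, the sketch below builds a percentile interval around a naive last-value forecast by resampling past one-step changes. This is a deliberately simple i.i.d. residual bootstrap written for this example, not the method used by tsbootstrap or sktime; the names `percentile` and `naive_bootstrap_interval` and the sample data are hypothetical.

```python
import random


def percentile(sorted_values, q):
    """Return the value at fraction q (0 to 1) of an already sorted list."""
    idx = round(q * (len(sorted_values) - 1))
    idx = min(len(sorted_values) - 1, max(0, idx))
    return sorted_values[idx]


def naive_bootstrap_interval(series, coverage, n_boot, rng):
    """Percentile interval for a one-step-ahead naive (last-value) forecast."""
    # For the naive forecast, past one-step changes are the past forecast errors.
    errors = [b - a for a, b in zip(series, series[1:])]
    point = series[-1]
    # Simulate possible next values by adding a resampled error to the forecast.
    simulated = sorted(point + rng.choice(errors) for _ in range(n_boot))
    alpha = 1.0 - coverage
    lower = percentile(simulated, alpha / 2)
    upper = percentile(simulated, 1 - alpha / 2)
    return point, lower, upper


rng = random.Random(42)  # seeded for reproducibility
series = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.0, 15.0]  # made-up data
point, lower, upper = naive_bootstrap_interval(series, coverage=0.9, n_boot=1000, rng=rng)
```

The interval width reflects the spread of past errors: a series with larger historical swings yields a wider interval for the same nominal coverage. Resampling errors independently ignores any dependence between them, which is the limitation that time-series-specific bootstraps aim to address.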

The session will demonstrate how to use tsbootstrap and sktime for advanced time series analysis and forecasting. Moreover, the tutorial will serve as a platform for community building, introducing the collaborative ecosystem surrounding these tools and inviting participants to contribute to their ongoing development.

We will wrap up with a forward-looking discussion on the future of time series analysis, touching upon planned features, potential integrations, and the evolving nature of these tools to accommodate an ever-changing data environment. Attendees' perspectives on future directions for the field will be warmly welcomed.

By the conclusion of this tutorial, attendees will be familiar with both tsbootstrap and sktime. They will have the skills and knowledge needed to advance in time series analysis and to apply these tools in their respective fields.


Prerequisites

Must-Haves:

• Python Proficiency: Basic knowledge of Python programming is essential, including an understanding of syntax, control structures, functions, and basic data structures like lists and dictionaries.
• Familiarity with Jupyter Notebooks: Since the tutorial exercises will be conducted in Jupyter Notebooks, attendees should know how to use them for Python scripting.
• Software Setup: Participants need Python installed on their computers, along with Jupyter Notebooks and the tsbootstrap and sktime libraries. Basic installation guidance will be provided at the beginning of the tutorial.
• Enthusiasm to Learn: Active engagement and a willingness to participate in hands-on exercises and discussions are crucial for getting the most out of the tutorial :)

Nice-to-Haves:

• Foundational Statistics Knowledge: Understanding basic statistical concepts related to time series analysis can enrich the learning experience, but it's not a barrier to entry.
• Experience with Data Manipulation: Experience with pandas or similar libraries is advantageous for data handling during the exercises but not mandatory for participation.
• Basic Understanding of Machine Learning: Familiarity with fundamental machine learning concepts will be beneficial for grasping some of the advanced topics, but initial exposure to these concepts will also be provided during the tutorial.

Installation Instructions

Please see the instructions here to set up your computer to run the code locally: https://github.com/astrogilda/tsbootstrap-sktime-tutorial-scipy-2024

I am a machine learning researcher and software developer specializing in time series analysis and constrained optimization. After obtaining my Ph.D. in Astronomy in 2021, I transitioned to industry as an MLE at a V2X SaaS startup. Since then, I've co-founded a consulting firm and am currently enhancing scheduling logistics for an innovative startup.

As an advocate for open-source software, I created tsbootstrap, the first Python library dedicated to time series bootstrapping. I'm actively developing additional libraries focusing on time series analysis and streamlining end-to-end machine learning for astronomers. I thrive on engaging in conferences and workshops within the scientific computing community.

When I'm not immersed in code or data, you'll find me at the gym lifting weights or embracing the thrill of skydiving. I'm always eager to discuss technology, science, or extreme sports – feel free to connect with me during the conference!