SciPy 2025

Building LLM-Powered Applications for Data Scientists and Software Engineers
07-08, 13:30–17:30 (US/Pacific), Ballroom C

This workshop is designed to equip software engineers with the skills to build and iterate on generative AI-powered applications. Participants will explore key components of the AI software development lifecycle through first-principles thinking, including prompt engineering, monitoring, evaluations, and handling non-determinism. The session focuses on using multimodal AI models to build applications, such as one that queries PDFs, while providing insights into the engineering challenges unique to AI systems. By the end of the workshop, participants will know how to build a PDF-querying app, but the techniques learned will generalize to a wide variety of generative AI applications.

If you're a data scientist, machine learning practitioner, or AI enthusiast, this workshop can also be valuable for learning about the software engineering aspects of AI applications, such as lifecycle management, iterative development, and monitoring, which are critical for production-level AI systems.


What You'll Learn:

  • How to integrate AI models and APIs into a practical application.
  • Techniques to manage non-determinism and optimize outputs through prompt engineering.
  • How to monitor, log, and evaluate AI systems to ensure reliability.
  • The importance of handling structured outputs and using function calling in AI models.
  • The software engineering side of building AI systems, including iterative development, debugging, and performance monitoring.
  • Practical experience in building an app to query PDFs using multimodal models.

What is Unique About This Session:

This workshop bridges the gap between software engineering and generative AI development. While many AI workshops focus solely on model usage or tuning, this session covers the entire AI software lifecycle, from prompt engineering to monitoring and tracing. Participants will learn how to manage non-determinism and create production-ready AI applications, giving them the knowledge to tackle the software engineering challenges of AI-powered apps. The hands-on approach ensures that attendees walk away with practical skills and a functional app.

Workshop Prerequisite Knowledge:

  • Basic programming knowledge in Python.
  • Familiarity with REST APIs.
  • Experience working with Jupyter Notebooks or similar environments (preferred but not required).
  • No prior experience with AI or machine learning is required.
  • Most importantly, a sense of curiosity and a desire to learn!

If you have a background in data science, ML, or AI, this workshop will help you understand the software engineering side of building AI applications.

We will introduce you to some modern frameworks in the workshop, but the emphasis will be on first principles and on using vanilla Python and LLM calls to build AI-powered systems.

All tutorial material will be in this GitHub repository.


Installation Instructions

We will be using GitHub Codespaces, so no installation is required!


Hugo Bowne-Anderson is an independent data and AI consultant with extensive experience in the tech industry. He is the host of the industry podcast Vanishing Gradients, where he explores cutting-edge developments in data science and artificial intelligence.
As a data scientist, educator, evangelist, content marketer, and strategist, Hugo has worked with leading companies in the field. His past roles include Head of Developer Relations at Outerbounds, a company committed to building infrastructure for machine learning applications, and positions at Coiled and DataCamp, where he focused on scaling data science and online education respectively.
Hugo's teaching experience spans from institutions like Yale University and Cold Spring Harbor Laboratory to conferences such as SciPy, PyCon, and ODSC. He has also worked with organizations like Data Carpentry to promote data literacy.
His impact on data science education is significant, having developed over 30 courses on the DataCamp platform that have reached more than 3 million learners worldwide. Hugo also created and hosted the popular weekly data industry podcast DataFramed for two years.
Committed to democratizing data skills and access to data science tools, Hugo advocates for open source software both for individuals and enterprises.
