SciPy 2025

Adam Symington

Dr. Adam Symington is a geospatial data scientist, currently working for Synmax. Adam has published a book on geospatial data visualisation, “PythonMaps, Geospatial Visualization with Python”, and runs the PythonMaps project, which is designed to spread the love of geospatial data through eye-catching visualisation. Prior to his career in data science, Adam was a computational materials scientist at the University of Bath in England, where he used machine learning and other statistical techniques to predict the properties of materials. During Adam's time in academia, he taught Python programming, developed a Python programming course and published open source software.

  • Geospatial data visualisation in Python
Adam Thompson
  • AI as a Detector: Lessons in Real Time Pulsar Discovery
Akshay Agrawal

Akshay Agrawal is currently building marimo, a new kind of reactive notebook for Python that is reproducible, git-friendly (stored as Python files), executable as a script, and deployable as an app.

He is both a researcher, focusing on machine learning and optimization, and an engineer, having contributed to several open source projects, including TensorFlow during his time at Google, and CVXPY, of which he is a maintainer. He holds a PhD from Stanford University, where he was advised by Stephen Boyd, as well as a BS and MS in computer science from Stanford.

  • marimo: an open-source reactive Python notebook
Albert Steppi

Albert Steppi (@steppi) is a Senior Software Engineer at Quansight Labs and a maintainer of the SciPy library. Previously he worked as a Machine Learning Scientist at Lendbuzz and before that as a Scientific Software Developer in the Laboratory of Systems Pharmacology at Harvard Medical School. He earned a PhD in Statistics from Florida State University in 2018 with research focusing on network-aware bioinformatics analysis and biomedical text mining. He is broadly interested in numerical mathematics and scientific and statistical computing.

  • SciPy’s New Infrastructure for Probability Distributions and Random Variables
Alessio Buccino

I am an engineer and software developer focused on methods and analysis tools for neuroscience research, especially for extracellular electrophysiology. I am passionate about science, software, and engineering, and my mission is to support neuroscientists and facilitate their research efforts by providing state-of-the-art analysis methods and software tools. Among these, I am the core developer of several open-source scientific tools, including SpikeInterface, a widely used software framework to unify and simplify the analysis of extracellular electrophysiology data.

In March 2022, I joined the Allen Institute for Neural Dynamics team as an electrophysiology pipeline development engineer consultant, with the goal of building open-source and computationally efficient processing pipelines to analyze large amounts of electrophysiological data. Since July 2020, I have been working part-time at CatalystNeuro, a consulting company with the mission of facilitating collaborations in neuroscience and standardizing data analysis and data storage solutions.

Previously, I was a Postdoctoral Fellow at the Bio Engineering Lab at ETH, working on multimodal approaches to probe neural activity and to construct detailed biophysical models. Before that I was at the Center for Integrated Neuroplasticity (CINPLA) at the University of Oslo, where I received my PhD.

  • SpikeInterface: Streamlining End-to-End Spike Sorting Workflows
Alex Monahan

Hello, I'm Alex! I am a customer software engineer at MotherDuck and I blog for DuckDB Labs. My background is Industrial and Systems Engineering from Virginia Tech, but I've decided I prefer working with data!

I joined MotherDuck 2 years ago after 9 years at Intel. I started at Intel as an industrial engineer, later became a technical analyst, and then jumped into a data scientist role. Back in 2020 I discovered DuckDB while building an internal self service analytics platform and became one of DuckDB's biggest Twitter fans! I have been diving deeper into Duck-themed databases ever since.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
Alexander Kaszynski

Hello! My name is Alex Kaszynski and I'm a software developer and aerospace engineer, as well as the co-creator of PyVista and the creator of PyAnsys.

I'm an American living in the United States working for AFRL and Pasteur Labs & ISI.

Besides coding, I also enjoy presenting and demoing Python libraries, especially 3D visualization, particularly in its application to CAE and automation. You can also find some blog articles I've written at DEV.to.

  • 3D Visualization with PyVista
Alexander Kmoch

Alex is an Associate Professor in Geoinformatics and a Distributed Spatial Systems Researcher with many years of experience in geospatial data management and web- and cloud-based geoprocessing, with a particular focus on land use, soils, hydrology, hydrogeology and water quality data. His interests include Discrete Global Grid Systems (DGGS), OGC standards and web services for environmental and geo-scientific data sharing, modelling workflows and interactive geo-scientific visualisation. He is also the European co-chair of the OGC DGGS working group.

  • Using Discrete Global Grid Systems in the Pangeo ecosystem
Allison Ding

Allison is a Developer Advocate for GPU-accelerated AI APIs, libraries, and tools at NVIDIA, with a specialization in advanced data science techniques and large language models (LLMs). She brings over eight years of hands-on experience as a data scientist, focusing on managing and delivering end-to-end data science solutions. Her academic background includes a strong emphasis on data science, natural language processing (NLP), and generative AI. Allison holds a master’s degree in Applied Statistics from Cornell University and a master’s degree in Computer Science from San Francisco Bay University.

  • Unlocking AI Performance with NeMo Curator: Scalable Data Processing for LLMs
  • Scaling Clustering for Big Data: Leveraging RAPIDS cuML
Ana Comesana

As a Scientific Engineering Associate at Lawrence Berkeley National Laboratory, Ana conducts multidisciplinary research focused on the development of innovative solutions, including tools to accelerate jet fuel research or autonomously design semantic models and data infrastructure for buildings. Ana enjoys using machine learning and data science to discover complex patterns and ultimately advance scientific research.

  • Advanced Machine Learning Techniques for Predicting Properties of Synthetic Aviation Fuels using Python
Anant Mittal
  • AI for Scientific Discovery
Angus Hollands

Angus Hollands is an Open Source Applications Engineer at 2i2c. He was previously a post-doctoral researcher in the Computational High Energy Physics group at Princeton University. He has a long-standing history of working collaboratively in open source projects, such as Executable Books, Jupyter, scikit-hep, and Blender. He is motivated by open-source, open-science, and the FAIR principles to build a more accessible, empowering future for scientific research and publication. His scientific background is in nuclear structure, the subject of his PhD at the University of Birmingham.

He/Him.

  • Jupyter Book 2.0 – A Next-Generation tool for sharing for Computational Content
Anil Sharma

Director of Software Engineering with 23 years of IT experience building ML and traditional software applications.

  • Agentic-Ai and latency implications
Anjali Datta
  • Building machine learning pipelines that scale: a case study using Ibis and IbisML
Anne Fouilloux
  • Using Discrete Global Grid Systems in the Pangeo ecosystem
Antoni Liria Sala
  • Retrieval Augmented Generation (RAG) for LLMs
Archit Datar

Research Investigator and Data Scientist at Celanese Corporation (former DuPont)
PhD, Chemical Engineering, Graduate Minor in Statistical Machine Learning

  • Show your work: Tutorial on building and hosting web applications
Avik Basu
  • Open Code, Open Science: What’s Getting in Your Way?
Axel Sirota

Axel Sirota is an experienced AI leader, engineer, educator and consultant for global technology organizations.

For close to 15 years, Axel has been at the forefront of AI. He founded and grew a successful training consultancy, teaching AI, GenAI, and Python to Fortune 500 companies such as Intuit, Salesforce, Barclays, Netflix, Apple, and Yahoo. Axel has been a keynote speaker throughout South America and is known as one of the leading voices on AI Safety and AI technologies. He is eager to work with and support leaders around the world in making thoughtful decisions about AI policy and how it impacts society.

Axel’s passion for education is evident in his 50+ published online courses across Pluralsight, O’Reilly Media, and LinkedIn Learning.

Axel received his Master’s Degree in Mathematical Sciences - Probability and Statistics from Universidad de Buenos Aires, Buenos Aires.

  • Polyglot RAG: Building a Multimodal, Multilingual, and Agentic AI Assistant
Bane Sullivan
  • 3D Visualization with PyVista
Ben Miller

Lead developer for the Rydberg Interactive Quantum Module (RydIQule), an open-source tool for research labs to model a broad range of Quantum Sensing experiments. Working for the Army Research Lab researching Quantum Sciences. Passion for bridging the gap between physics and software, and building open tools to accelerate research. Current interests include quantum technology (including its limitations) and classical simulation of quantum circuits.

  • RydIQule: A Package for Modelling Quantum Sensors
Benoît Bovy
  • Using Discrete Global Grid Systems in the Pangeo ecosystem
  • The brave new world of slicing and dicing Xarray objects.
Brigitta Sipőcz
  • Reliable executable tutorials -- CI/CD challenges
Brodie Vidrine

Brodie Vidrine is a software engineer working for NOAA's National Centers for Environmental Information. Before working to modernize government software, Brodie spent 17 years writing code for Ascend Math, an online math tutorial program.

  • Breaking Out of the Loop: Refactoring Legacy Software with Polars
Bryce Adelstein Lelbach

Bryce Adelstein Lelbach has spent over a decade developing programming languages, compilers, and software libraries. He is a Principal Architect at NVIDIA, where he leads programming language efforts and drives the technical roadmap for NVIDIA's compute compilers and libraries. Bryce is one of the leaders of the C++ community. He has served as chair of INCITS/PL22, the US standards committee for programming languages and the Standard C++ Library Evolution group. Bryce served as the program chair for the C++Now and CppCon conferences for many years. On the C++ Committee, he has personally worked on concurrency primitives, parallel algorithms, executors, and multidimensional arrays. He is one of the founding developers of the HPX parallel runtime system. Outside of work, Bryce is passionate about airplanes and watches.

  • cuTile, the New/Old Kid on the Block: Python Programming Models for GPUs
C.A.M. Gerlach

Python and Spyder core developer, specializing in docs, infra, and UI. Python Docs Team and PEP Editor. Star✦Fleet Commander. Former NASA-funded ML researcher.

  • Remote development for students and indie researchers with Spyder
Cainã Max Couto da Silva

I’m a data scientist and AI engineer with 10+ years of experience across academic research and industry, building GenAI and machine learning solutions for research labs, startups, and Fortune 500 companies. I’m also a passionate educator, contributing to data training programs as a professor and consultant, and an active open-source contributor and speaker at conferences like PyData.

  • Building an AI Agent for Natural Language to SQL Query Execution on Live Databases
Carlos Cordoba

I got involved with Spyder in 2010 as a volunteer, and became its maintainer in 2013. I worked for Anaconda from 2015 to 2017. In my first two years there I created conda packages for Qt, VTK, Boost, Pandoc, Graphviz, CMake, etc. (most of my recipes were used by the Conda-forge team when the project started). In my final year I led a team of three developers working on Spyder. I joined Quansight in 2018 and left in 2022. There I managed a team of five developers (hired by Quansight), also in charge of maintaining Spyder. Since then I've been working on Spyder full time thanks to a CZI grant awarded to the project at the end of 2022.

  • Remote development for students and indie researchers with Spyder
Carol Willing

Carol Willing is a three-time Python Steering Council member, a Python Core Developer, PSF Fellow, and a Project Jupyter core contributor. In 2019, she was awarded the Frank Willison Award for technical and community contributions to Python. As part of the Jupyter core team, Carol was awarded the 2017 ACM Software System Award for Project Jupyter's lasting influence. She's also a leader in open science and open-source governance, serving on the Quansight Labs Advisory Board and the CZI Open Science Advisory Board. She's driven to make open science accessible through open tools and learning materials. She recently served as Noteable's VP of Engineering.

  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Charles R Harris

Charles has been a foundational contributor to the scientific Python ecosystem since 2002, with work spanning Numeric, Numarray, NumPy, and SciPy. He has been a frequent participant at SciPy, having missed only two conferences since 2003. Charles brings both deep historical knowledge and a steady commitment to the community. Before retiring in 2013, Charles worked as a mathematician and problem solver at the Space Dynamics Laboratory. He continues to serve as a core NumPy maintainer and is currently the NumPy release manager, a role he sees as "keeping the wheels turning" while supporting the next generation of contributors. Charles holds a B.A. in Physics from Columbia College and a Ph.D. in Mathematics from the University of Utah.

  • Keynote by Charles R. Harris
Charles Turner

Charles is a Research Software Engineer at ACCESS-NRI, where he works in the Model Evaluation and Diagnostics team, helping make it easier to access and analyse climate data. He previously worked in Air Quality, where he produced tools to analyse air pollution data, and has a PhD in Oceanography.

When not in front of a computer, he enjoys routinely injuring himself in a variety of sports.

  • Python for Climate Science: Using Intake to provide easy access to Climate Model data
Charlie Becker

Associate Research Scientist in the MILES (Machine Integration and Learning for Earth Systems) group at NSF NCAR.

  • Physical XAI - Going Beyond Traditional XAI Methods in Earth System Science
Charlotte Wickham

Charlotte Wickham is a Developer Educator at Posit, where she focuses on Quarto. Before Posit, she taught Statistics and Data Science at Oregon State University.

  • From One Notebook to Many Reports: Automating with Quarto
Chirag

Chirag Shah is an environmental data science and full-stack software engineer with a keen interest in climate science and cutting-edge technologies. As an integral part of both the Atmospheric Radiation Measurement Facility (ARM) Data Center and the U.S. Geological Survey (USGS) Core Science Systems group, he works closely with product owners, scientists, and researchers to design and develop next-generation tools for managing, analyzing, and visualizing scientific data. These tools are aimed at advancing research in Earth, Climate, and Environmental sciences.

Chirag has a keen interest in various areas of software development, such as data management, data analytics, artificial intelligence, machine learning, the Internet of Things (IoT), and machine-to-machine communication. Chirag aims to harness technology's power to solve real-world challenges and is committed to staying at the forefront of technological advancements.

  • Accelerating scientific data releases: Automated metadata generation with LLM agents
Chris Holdgraf

Chris is the Executive Director of 2i2c. He is on the Jupyter Executive Council as well as the Jupyter Foundation Board. He has been a co-lead of several projects within the Jupyter ecosystem for over ten years (particularly the JupyterHub and Binder projects, as well as the Jupyter Book project), with a focus on how infrastructure can support interactive computing workflows in research and education. He’s interested in the boundary between technology, open-source software, and research and education workflows, as well as how open communities can support and extend these workflows in a way that makes science more impactful and inclusive. He was previously a post-doctoral researcher in the Department of Statistics at UC Berkeley, and a Community Architect with the Division of Data Science at Berkeley. His background is in cognitive and computational neuroscience, where he used predictive models to understand the auditory system in the human brain.

  • Jupyter Book 2.0 – A Next-Generation tool for sharing for Computational Content
Christopher Lamb, VP of Software for Compute Platforms at NVIDIA

Christopher Lamb is VP of Software for Compute Platforms at NVIDIA, where he runs a worldwide team building platforms for AI and parallel high-performance computing deployed across cloud, enterprise datacenter, edge and embedded applications. His team is responsible for the CUDA platform and the DGX/HGX cluster stack for AI. He holds six patents in parallel processing and earned a BS in Computer Engineering from the University of Illinois at Urbana-Champaign.

  • Python at the Speed of Light: Accelerating Science with CUDA Python
Daniel Chen

Daniel Chen is a data science educator working in developer relations at Posit, PBC, and a lecturer at the University of British Columbia. He specializes in teaching and advocating for modern data science tools and workflows.

  • Shiny for Python: Building Production-Ready Dashboards in Python
Debarshi Datta

Debarshi Datta, PhD, is an Assistant Professor in Data Science at the Christine E. Lynn College of Nursing, Florida Atlantic University. He completed his PhD in Experimental Psychology at the Charles E. Schmidt College of Science, Florida Atlantic University. Dr. Datta has experience developing AI-driven decision support systems in healthcare data, including understanding problem statements, handling disputes, exploratory data analysis, building models, data visualization, and data storytelling. Dr. Datta’s current research focuses on data-driven domains like AI/ML to understand population-based disease prognosis. His primary research contribution has been finding the severity of the disease, decision-making for developing a model that comprehends the most significant features predicting mortality, and severity of disease utilizing traditional AI/ML techniques such as decision trees, random forest classifiers, XGBoost, and deep learning. In other research, he is building a model for early prediction of dementia. Dr. Datta has received many intramural grants, including Early Prediction of Alzheimer's Disease and Related Dementias on Preclinical Assessment Data using Machine Learning tools, Seed Funding from Smart Health for COVID-19 research, NSF I-Corps Customer Discovery Funding, and the ALL of US Institutional Champion Award, among many others.

  • A Hands-on Tutorial towards building Explainable Machine Learning using SHAP, GINI, LIME, and Permutation Importance
Deepak Cherian

Deepak Cherian is an Xarray maintainer and Forward Engineer at Earthmover. Previously he was an oceanographer at the National Center for Atmospheric Research. He helps build and maintain many parts of the scientific Python ecosystem, including Xarray, dask, zarr and related projects.

  • Hierarchical Data Analysis with Xarray DataTree & Zarr
  • The brave new world of slicing and dicing Xarray objects.
Deepyaman Datta

Deepyaman is an experienced data practitioner and tool builder. He was a Senior Staff Software Engineer at Voltron Data on the Ibis team. Before their acquisition by Voltron Data, he was a Founding Machine Learning Engineer at Claypot AI, working on their real-time feature engineering platform. Prior to that, he led data engineering teams and asset development across a range of industries at QuantumBlack, AI by McKinsey.

Deepyaman is passionate about building and contributing to the broader open-source data ecosystem. Outside of his day job, he helps maintain Kedro, an open-source Python framework for building production-ready data science pipelines.

  • Building machine learning pipelines that scale: a case study using Ibis and IbisML
  • Python is all you need: an overview of the composable, Python-native data stack
Denis Leshchev

Denis Leshchev is a Senior Application Engineer at NVIDIA. Dr. Leshchev joined NVIDIA in 2024 and works as an application engineer for computational instruments, where he focuses on customer adoption of hardware and software platforms targeting real-time AI, autonomous instruments, and tying high-speed sensor I/O to GPU-accelerated compute. Dr. Leshchev has an extensive background in building scientific instrumentation, data acquisition and control systems, as well as data processing pipelines. He holds a PhD in Physics from Universite Grenoble Alpes (Grenoble, France).

  • Edge processing of X-ray ptychography: enabling real-time feedback for high-speed data acquisition
Dr. Malvika Sharan

Dr Malvika Sharan is co-Director of Open Life Science (OLS) and a senior researcher at The Alan Turing Institute. Her expertise includes designing people-centered collaboration and knowledge sharing models for data science and AI research communities. Through initiatives such as The Turing Way and OLS's training and capacity building programs, she has co-created and supported inclusive communities that empower researchers at all career stages to learn from and participate in open source and open science initiatives. Malvika is an active contributor to several open science initiatives, a Software Sustainability Institute fellow and one of the 2024 100 Brilliant Women in AI Ethics™. Originally from India, Malvika spent over a decade in Germany and now resides in the UK.

  • Keynote by Dr. Malvika Sharan
Dr. Subhosit Ray
  • A Hands-on Tutorial towards building Explainable Machine Learning using SHAP, GINI, LIME, and Permutation Importance
Dr. Yusra AlSayyad

Yusra AlSayyad is the Deputy Associate Director of the Data Management subsystem for the Vera C. Rubin Observatory, where she coordinates for the Science Pipelines and Campaign Management teams, and still writes Python daily. She is a researcher at Princeton University and has been part of Rubin’s data management effort since 2012. She holds a Ph.D. in astronomy from the University of Washington and a master’s in computer science. Before grad school, she worked in information management for a consulting firm specializing in turnaround and restructuring. With Rubin soon entering survey operations, she brings lessons from over a decade of building scientific Python infrastructure for an observatory, astronomers, and the broader research community.

  • Keynote by Dr. Yusra AlSayyad
Draga Doncila Pop

I'm currently a PhD student working on timelapse microscopy data analysis, and I've been learning and working with Python for almost a decade now! I love the open-source community and everything it has to offer the world, and I've been lucky enough to make my own contributions to the community as a core developer for napari - an n-dimensional image viewer written entirely in Python. I'm passionate about making coding more accessible for scientists who want to make their own lives easier, and I love teaching everything from the fundamentals to the nitty gritty.

  • Create custom image visualization and analysis tools with napari
Dylan Wootton
  • Vega-Altair: A Structured Way to Build Interactive Charts
Elise Chavez

I am a graduate student at the University of Wisconsin-Madison working on a PhD in High Energy Experimental Particle Physics with the CMS experiment. I am primarily interested in software development for science and I hope to go into research software engineering so I can support scientific development through robust and sustainable software. I also work with Fermi National Accelerator Laboratory (Fermilab).

  • Enabling Innovative Analysis on Heterogeneous Clusters through HTCdaskgateway
Elliot Marx

Elliot Marx is one of the co-founders of Chalk. He started his career at Affirm, where he built the early risk and credit data infrastructure system (the inspiration for Chalk). He then co-founded Haven Money, which Credit Karma acquired to power its banking products. He holds a B.S. and M.S. in Computer Science from Stanford University.

  • Real-time ML: Accelerating Python for inference (< 10ms) at scale
Emily Dorne

Emily Dorne is a lead data scientist at DrivenData where she develops machine learning models for social impact. Her expertise lies in classifying animals in camera trap videos to support conservationists, identifying harmful algal blooms to support water quality managers, and helping data scientists consider the ethical implications of their work. She is passionate about using data for social good and has previously worked at the Gates Foundation, Stanford Center for International Development, and the Brookings Institution.

  • Zamba: Computer vision for wildlife conservation
Eniola Awowale
  • Hierarchical Data Analysis with Xarray DataTree & Zarr
Enrique Molina-Giménez

Enrique Molina Giménez is a PhD candidate at Universitat Rovira i Virgili (URV), currently contributing to the EU HORIZON EXTRACT and UNICO I+D CLOUDLESS research projects. His work focuses on distributed systems and cloud computing, particularly exploring smart provisioning strategies and infrastructure cost reduction. With a transversal background spanning both industry and academia, Enrique brings hands-on experience from software development to systems engineering.

  • Processing Cloud-optimized data in Python (Dataplug)
  • Processing Cloud-optimized data in Python with Serverless Functions (Lithops, Dataplug)
Eric Ma

As Senior Principal Data Scientist at Moderna, Eric leads the Data Science and Artificial Intelligence (Research) team to accelerate science to the speed of thought. Prior to Moderna, he was at the Novartis Institutes for Biomedical Research conducting biomedical data science research with a focus on using Bayesian statistical methods in the service of discovering medicines for patients. Prior to Novartis, he was an Insight Health Data Fellow in the summer of 2017 and defended his doctoral thesis in the Department of Biological Engineering at MIT in the spring of 2017.

Eric is also an open-source software developer and has led the development of pyjanitor, a clean API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He is also on the core developer team of NetworkX and PyMC. In addition, he gives back to the community through code contributions, blogging, teaching, and writing.

His personal life motto is found in the Gospel of Luke 12:48.

  • Building with LLMs Made Simple
  • Network Analysis Made Simple
Erik Welch

Erik Welch is a senior system software engineer on the RAPIDS cuGraph team at NVIDIA. He has 20 years' experience using Python as a scientist, engineer, and open-source developer on a wide range of data and high-performance computing problems. He primarily works on nx-cugraph, an accelerated backend to NetworkX, and is a primary maintainer of the popular toolz library.

  • Lessons Learned from Adding Backend Dispatching to NetworkX and scikit-image
Fernando Cervantes Sanchez

PhD in Computer Science focused on bioimage understanding through computational intelligence methods.

I currently work as a Systems Analyst in the Research IT department of The Jackson Laboratory, where my main role is assisting people with the integration of machine learning methods into their image analysis pipelines.

  • Scaling-up deep learning inference to large-scale bioimage data
  • An Active Learning plugin in Napari to fine tune models for large-scale bioimage analysis
Filippo Balzaretti

Postdoctoral fellow at Stanford / SLAC National Laboratory with expertise in ab-initio computational modeling, surface science simulations, and machine learning techniques. Passionate about advancing interdisciplinary approaches to materials science and energy conversion, leveraging principles from conventional algorithms and Artificial Intelligence frameworks. Excited about developing new quantum chemistry methods to model materials properties and validate them through collaborative experimental work.

  • Can Scientific Python Tools Unlock the Secrets of Materials? The Electrons That Machine-Learning Can't Handle
Frank Strug

I am a physics PhD student at the University of Illinois Chicago (UIC) and part of the CMS group there. My areas of research are developing intuitive Python tools for accelerating high energy physics analyses and measuring entanglement of top quark systems at the LHC.

  • KvikUproot - Reading and Deserializing High Energy Physics Data with KvikIO and CuPy
Franklin Koch

Franklin is Lead Developer at Curvenote and core maintainer of MyST, working to redefine scientific communication.

  • Jupyter Book 2.0 – A Next-Generation tool for sharing for Computational Content
Guen Prawiroatmodjo

Guen is a software engineer at MotherDuck on the Ecosystems team. Guen has broad experience with software engineering, data engineering and data science with Python in the context of scientific data acquisition, analysis and computation for experimental physics. She has given introductory talks and workshops on SQL and DuckDB at various conferences, hackathons and events.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
Heberto Mayorquin

Heberto Mayorquin holds a BSc in Physics, an MSc in Complexity Science, and a PhD in Computational Neuroscience. After a brief stint in the private sector optimizing SQL queries, he returned to science by joining CatalystNeuro. At CatalystNeuro, he helps neuroscience labs standardize their data—from extracting information buried in proprietary binary formats to streamlining metadata documentation and optimizing data layouts for long-term cloud storage. Within the organization, he serves as the lead maintainer of NeuroConv and is also a maintainer of SpikeInterface. His focus is on developing open-source tools and workflows that make it easier for researchers to share and reuse their own data, as he believes that open collaboration is a catalyst for scientific progress.

  • SpikeInterface: Streamlining End-to-End Spike Sorting Workflows
Henry Schreiner

Henry Schreiner is a Computational Physicist / Research Software Engineer in High Energy Physics at Princeton University. He specializes in the interface between high-performance compiled codes and interactive computation in Python, in software distribution, and in interface design. He has previously worked on computational cosmic-ray tomography for archaeology and high performance GPU model fitting. He is currently a member of the IRIS-HEP project, developing tools for the next era of the Large Hadron Collider (LHC).

He is a maintainer/core developer for pypa/build, scikit-build, cibuildwheel, pybind11, meson-python, nox, and plumbum for Python. He is an admin of Scikit-HEP, and a lead designer on boost-histogram, hist, UHI, vector, uproot-browser, Particle, and DecayLanguage packages there. He is also the lead author of the Scientific-Python Development guide and Scientific-Python/cookie. He is the primary author of CLI11, a C++ library used by Microsoft terminal and many others. He is also the lead web developer for IRIS-HEP. He is also the author of Modern CMake and a variety of CMake, GPU, and Python training courses and classes.

  • Packaging a Scientific Python Project
Hon. Kathryn D. Huff, PhD

Dr. Kathryn Huff is currently an Associate Professor at the University of Illinois at Urbana-Champaign in the Department of Nuclear, Plasma, and Radiological Engineering. Previously Dr. Huff led the Office of Nuclear Energy as the Assistant Secretary. Before joining the Department of Energy, she was a professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign, where she led the Advanced Reactors and Fuel Cycles Research Group. She was previously a postdoctoral fellow in both the Nuclear Science and Security Consortium and the Berkeley Institute for Data Science at the University of California - Berkeley. She received her Ph.D. in nuclear engineering from the University of Wisconsin-Madison in 2013 and her undergraduate degree in physics from the University of Chicago. Her research has focused on modeling and simulation of advanced nuclear reactors and fuel cycles.

She has previously been an active member of the American Nuclear Society, a past chair of the Nuclear Nonproliferation and Policy Division as well as the Fuel Cycle and Waste Management Division, and recipient of both the Young Member Excellence and Mary Jane Oestmann Professional Women's Achievement awards. Through leadership within Software Carpentry, SciPy, the Hacker Within, and the Journal of Open-Source Software, she has also advocated for best practices in open, reproducible scientific computing.

  • Keynote by Hon. Kathryn D. Huff, PhD
Ian Hunt-Isaak

I recently completed my PhD, during which I built software to manage the acquisition of combined epifluorescence and single-cell Raman spectroscopy time-lapse data. I extensively used Xarray and Zarr in both the data acquisition and analysis of this project. I have also presented multiple workshops on using Python for scientific data analysis and on using the SciPy stack (including Xarray) for microscopy data.

During graduate school, I discovered a passion for contributing to open-source scientific projects, which led me to my current role as an Xarray Community Developer at Earthmover. In this role, I am focused on improving Xarray for use cases in biological research.

  • Hierarchical Data Analysis with Xarray DataTree & Zarr
  • Xarray across biology. Where are we and where are we going?
Inessa Pawson

Inessa is building bridges between people, open science, and open source software. She is passionate about making Python accessible for learners at all levels and has led numerous newcomer sprints, study groups, and tutorials. Inessa currently serves on the NumPy Steering Council and PyOpenSci Advisory Board. In her role as Open Source Program Manager at OpenTeams, Inessa has launched and actively supports several educational initiatives focused on widening the open source contributor pipeline. She is perpetually fascinated by incentive design, collaborative intelligence, and jazz.

  • Open Code, Open Science: What’s Getting in Your Way?
  • AI for Scientific Discovery
  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Irina Demeshko

Irina Demeshko is a senior software engineer at NVIDIA working on cuNumeric and Legate projects. Before NVIDIA, Irina was a research scientist and team leader of the Co-Design team at the Los Alamos National Laboratory. Her work and research interests are in the area of new HPC technologies and programming models.

  • Scaling NumPy for Large-Scale Science: The cuPyNumeric Approach
Jacob Tomlinson

Jacob Tomlinson is a senior software engineer at NVIDIA. His work involves maintaining open source projects including RAPIDS and Dask. He also tinkers with kr8s in his spare time. He lives in Exeter, UK.

  • Teaching Python with GPUs: Empowering educators to share knowledge that uses GPUs
  • EffVer: Versioning code by the effort required to upgrade
Jay Qi

Jay Qi is a lead data scientist at DrivenData where he helps mission-driven organizations and institutions leverage machine learning, data science, and data engineering for social impact. He has worked on applying machine learning to a wide range of scientific contexts, including hydrological modeling, spacecraft dynamics, and wildlife conservation. Before DrivenData, Jay modeled failures of industrial machines using sensor data at Uptake, and he has a background in aerospace engineering and computational fluid dynamics. Jay is also an active open source software maintainer and contributor, working on projects including cookiecutter-data-science, cloudpathlib, and erdantic.

  • Zamba: Computer vision for wildlife conservation
Jean-Marc Delouis

Senior research engineer at CNRS, specializing in data analysis across astrophysics and oceanography. With extensive expertise in AI, scattering transform, and statistical modeling, I contribute to cutting-edge projects such as Planck’s SRoll algorithm and the development of the FOSCAT library on PyPI. I currently lead projects integrating deep learning and dimensionality reduction techniques for Earth and space science applications, including sea ice analysis, galaxy mapping and turbulence modeling.

  • Using Discrete Global Grid Systems in the Pangeo ecosystem
Jeremiah Paige
  • Open Code, Open Science: What’s Getting in Your Way?
  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Jim Kitchen

Jim Kitchen is the lead engineer on the team that built the Anaconda Toolbox and Anaconda Code add-ins for Excel.

  • Develop Pythonic spreadsheets running Python in and out of the grid
Jim Pivarski
  • Thinking in arrays
Joe Cheng

Joe Cheng is CTO and first employee at Posit (formerly known as RStudio) and the creator of Shiny, a reactive web framework for creating data and AI applications using Python or R. He has been writing and maintaining open source software at the intersection of data analysis and the web for over 15 years.

  • Keeping LLMs in Their Lane: Focused AI for Data Science and Research
Joe Hamman

Joe Hamman is a climate scientist, engineer, and the co-founder and CTO of Earthmover, where he leads the development of Arraylake, a cloud platform for scientific data teams. Previously, he was a founder and Technology Director at CarbonPlan and a scientist at the Climate and Global Dynamics Laboratory at the National Center for Atmospheric Research. He holds a Ph.D. in Civil and Environmental Engineering from the University of Washington, and is a licensed Professional Engineer in Washington State. He co-founded the Pangeo Project and is a core developer of both the Xarray and Zarr-Python projects.

  • Hierarchical Data Analysis with Xarray DataTree & Zarr
John Kirkham
  • Reproducible Machine Learning Workflows for Scientists with pixi
Jon Mease
  • Vega-Altair: A Structured Way to Build Interactive Charts
Jonas Eschle

Physicist at CERN with a dedicated focus on machine learning, statistical tools, and software engineering.

  • zfit: scalable pythonic likelihood fitting
Jonathan Starr
  • AI for Scientific Discovery
Juanita Gomez

Juanita Gomez is a Ph.D. candidate in Computer Science at UC Santa Cruz, where her research focuses on improving the security of scientific open source software in collaboration with the Open Source Program Office (OSPO) at UCSC. She is a passionate programmer, mathematician, and open-source advocate, former developer of Spyder IDE at Quansight and current community leader for the Scientific Python project, a community effort to better coordinate and support scientific Python libraries.

  • Towards Robust Security in Scientific Open Source Projects
Julie Barnum

Hello, my name is Julie Barnum and I work at LASP in the Data Systems division. My background is in Applied Physics (Missouri State University, BS, 2015) and Atmospheric Science (Colorado State University, MS, 2018). I've worked at LASP now for ~6.5 years. During my time here I've worn several hats, but currently work as a project manager on four projects: Lead of the Python in Heliophysics Community (PyHC; pyhc.org), Lead for the Magnetospheric Multiscale (MMS) Science Data Center (SDC), Lead for the developing Heliophysics Software Search Interface (HSSI), and Co-coordinator for LASP's Boulder Solar Alliance Research Experience for Undergraduates (BSA REU) program.

My interests lie mainly in open-source software, open science, and community building as it relates to the Heliophysics domain. In my spare time, I also greatly enjoy lifting weights, dancing, outdoor mountain activities, travelling, and learning languages (conversationally proficient in French, learning Korean now).

  • Open-source science-specific Research Software Engineer Communities: benefits and lessons learned
Julie Hollek
  • SciPy 2026
  • Organizing Conferences in These Times
Justus Magin

Justus Magin is a research engineer working at the Laboratoire de l’Oceanographie Physique et Spatiale (LOPS) in Brest, France, where he assists scientists in making computations scalable. He is also an Xarray maintainer and contributes to many projects in the Pangeo ecosystem, most notably to pint-xarray and xdggs.

  • Hierarchical Data Analysis with Xarray DataTree & Zarr
  • Using Discrete Global Grid Systems in the Pangeo ecosystem
  • The brave new world of slicing and dicing Xarray objects.
Katrina Riehl

Dr. Katrina Riehl is a Principal Technical Product Manager at NVIDIA supporting CUDA and Python educational initiatives. For over two decades, Katrina has worked extensively in the fields of scientific computing, machine learning, data science, and visualization. Most notably, she has helped lead data science initiatives at the University of Texas Austin Applied Research Laboratory, Anaconda, Apple, Expedia Group, Cloudflare, and Snowflake.

She is an active volunteer in the Python open-source scientific software community, serving as a NumFOCUS Board member 2018-2024 and President 2021-2024. She continues to serve the NumFOCUS community on the NumFOCUS Advisory Council.

  • The Accelerated Python Developer's Toolbox
  • GPU Accelerated Python
Kedar Dabhadkar

Data scientist at Lam Research with >6 years of experience in statistical data analysis, engineering, and machine learning. Independently researches applications of LLMs and statistical modeling to science and engineering domains. Built Fast Dash, an open-source Python library that transforms Python functions into interactive web applications.

  • Show your work: Tutorial on building and hosting web applications
Kevin Lee

Kevin Lee is a senior technical content developer on the Deep Learning Institute Team at NVIDIA. Kevin’s work focuses on raising awareness and driving adoption for GPU-accelerated technologies by creating developer-focused hands-on training with an emphasis on Data Science, Computer Vision, and Large Language Models. Prior to NVIDIA, Kevin led a risk analytics team at Morgan Stanley and taught Data Science and Machine Learning at the University of California, Berkeley.

  • Bring Accelerated Computing to Data Science in Python
Kevin Lin
  • Generative AI in Education
Kyle Sunden

Kyle is a Research Software Engineer with Matplotlib under the NASA ROSES grant.

Kyle holds a PhD in Chemistry from the University of Wisconsin where he did nonlinear spectroscopy.

  • Dynamic Data with Matplotlib
Leah Wasser

Leah is the Executive Director and Founder of pyOpenSci, an open source Python community that makes it easier for scientists to make, find, contribute to, and maintain scientific open source software. She is also a maintainer of the package stravalib.

Leah is a heartfelt, open science advocate and is also passionate about breaking down barriers of entry to participating in the open source ecosystem. Education has always been a core part of the programs she's built.

  • Open Code, Open Science: What’s Getting in Your Way?
  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Leland McInnes

Leland McInnes is a researcher at the Tutte Institute for Mathematics and Computing. He is the author of a number of open source tools for data analysis and machine learning, including UMAP, HDBSCAN, PyNNDescent, EVōC, DataMapPlot and Toponymy.

  • DataMapPlot: Rich Tools for UMAP Visualizations
Lorena Barba

Lorena A. Barba is a professor of mechanical and aerospace engineering at the George Washington University in Washington, DC. Her research interests include computational fluid dynamics, high-performance computing, and computational biophysics. An international leader in computational science and engineering, she is also a long-standing advocate of open source software for science and education, and is well known for her courses and open educational resources. Barba served (2014–2021) on the Board of Directors for NumFOCUS, a US public charity that supports and promotes world-class open-source scientific software. She is an expert in research reproducibility, and was a member of the National Academies study committee on Reproducibility and Replicability in Science. She served as Reproducibility Chair for the SC19 (Supercomputing) Conference, is Editor-in-Chief of IEEE Computing in Science & Engineering, was founding editor and Associate EiC (2016–2021) for the Journal of Open Source Software, and is EiC of The Journal of Open Source Education. She was General Chair of the global JupyterCon 2020 and was named Jupyter Distinguished Contributor in 2020.

  • Embracing GenAI in Engineering Education: Lessons from the Trenches
Luigi Cruz

Luigi Cruz is a computer engineer working as a staff engineer at the SETI Institute. He created the CUDA-accelerated digital signal processing backend called BLADE currently in use at the Allen Telescope Array (ATA) and Very Large Array (VLA) for beam forming and high-spectral resolution observations. Luigi is also the maintainer of multiple open-source projects like the PiSDR, an SDR-specialized Raspberry Pi image, CyberEther, a heterogeneous accelerated signal visualization library, and Radio Core, a Python library for demodulating SDR signals using the GPU with the help of CuPy.

  • AI as a Detector: Lessons in Real Time Pulsar Discovery
Madicken
  • SciPy 2026
Mark Wolfman

Mark is a beamline scientist in the spectroscopy group at the Advanced Photon Source, collaborating closely with visiting researchers to execute cutting edge scientific experiments across a variety of disciplines. His research background emphasizes in-situ measurements, where chemical states are measured inside an operating battery in order to better understand the otherwise inaccessible dynamic processes. Combining high-resolution imaging, spectroscopy, and diffraction provides insights into the local interactions that drive the energy storage in cutting edge battery technologies. This research provides a foundation for future technological development that will deliver faster, more efficient, and safer energy storage solutions.

Mark completed his PhD in the chemistry department at the University of Illinois Chicago with Jordi Cabana, where he studied particle-level dynamics for layered cathodes. During this time, Mark spent a year as a visiting graduate student at the Advanced Photon Source through the U.S. Department of Energy’s Science Graduate Research Program. The result of this collaboration was a new cell for three-dimensional imaging of operating Li-ion batteries. He built upon his graduate research as a postdoctoral appointee in the Interfacial Processes group, working closely with Tim Fister to include additional 3D imaging in working cells and high-temperature preparation of cutting edge battery materials. Throughout his work, Mark has written many scientific software packages to aid in data analysis and visualization.

  • Probing the Hidden World of Battery Chemistry With X-rays
Matt Haberland

Matt Haberland is an Associate Professor at Cal Poly, San Luis Obispo, and a maintainer of SciPy and NumPy.

  • SciPy’s New Infrastructure for Probability Distributions and Random Variables
Matthew Feickert

Matthew is a research scientist in experimental high energy physics and data science at the University of Wisconsin-Madison Data Science Institute (a "data physicist"). He works as a member of the ATLAS collaboration on searches for physics beyond the standard model with experiments performed at CERN's Large Hadron Collider (LHC) in Geneva, Switzerland. He also serves on the executive board of the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) where he is a researcher and the Analysis Systems Area lead. He is also a topical editor for physics and data science for the Journal of Open Source Software.

Matthew has served on the SciPy Organizing Committee since 2020, with roles as co-chair of the Physics and Astronomy specialized track and co-chair of the Program Committee.

  • Reproducible Machine Learning Workflows for Scientists with pixi
Michael Chow

I'm a data science tool builder at Posit, where I work on open source tools for data analysis (like siuba).

Previously, I worked as a consultant building out a data team for Caltrans (and love all things GTFS).

I received a Ph.D. in Cognitive Psychology from Princeton University, and am interested in what drives expert data science performance. This led me to build DataCamp Signal, adaptive tests of data science skill.

  • User guides: engaging new users, delighting old ones
Michael Horrell

PhD in Statistics from the University of Chicago

Previously:
- Head of Data Science at Uptake
- Technical Staff at SentiLink

Currently:
- AI and Data Consultant at AlixPartners

  • GBNet: Gradient Boosting packages integrated into PyTorch
Mihai Maruseac

Supply chain security @ Google OSS Security Team

Mihai Maruseac is a member of the Google Open Source Security team (GOSST), working on Supply Chain Security, mainly on GUAC. Before joining GOSST, Mihai created the TensorFlow Security team after joining Google, moving from a startup to incorporate Differential Privacy (DP) within Machine Learning (ML) algorithms. Mihai has a PhD in Differential Privacy from UMass Boston.

  • From Model to Trust: Building upon tamper-proof ML metadata records
Naty Clementi
  • Teaching Python with GPUs: Empowering educators to share knowledge that uses GPUs
Negin Sobhani

Negin Sobhani is a High Performance Computing consultant and computational atmospheric scientist at the National Center for Atmospheric Research (NCAR). She has extensive experience developing and supporting open-source tools and infrastructure that improve the performance and accessibility of Earth System models, bridging the gap between data science, atmospheric science, and software engineering. Her broader work encompasses the development of large-scale distributed training, optimization of resource utilization, and data pipelines across advanced computing environments for geoscience applications.

  • Scaling AI/ML Workflows on HPC for Geoscientific Applications.
  • Hierarchical Data Analysis with Xarray DataTree & Zarr
Nezar Abdennur

I am an Assistant Professor in the Department of Genomics and Computational Biology and the Department of Systems Biology at UMass Chan Medical School.

I lead a computational research group (https://abdenlab.org) with a dual mandate. My group's biological research focuses on the 3D organization of the genome (3C/Hi-C technologies), its relationship to the epigenome, and the resulting manifold influences on cellular fate, differentiation, aging, and disease. My group's open-source interests are in supporting foundational software infrastructure to improve genomic and multi-omic data science, especially in the scientific Python ecosystem.

  • Breaking the silo: composable bioinformatics through cross-disciplinary open standards
Noor Aftab

Noor Aftab is the Global Program Lead at Amazon Web Services (AWS), where she oversees the Strategic Customer Program for Amazon S3, managing some of the largest cloud, data, and analytics workloads globally. With a foundation as a data scientist and extensive experience in cloud technologies, Noor combines technical expertise with strategic leadership to help organizations optimize their data infrastructure and deploy AI-driven solutions at scale.

An advocate for diversity, innovation, and inclusivity in technology, Noor has spoken at 13 global locations, sharing her insights on AI adoption, advancing innovation, and creating equitable pathways for underrepresented groups in tech. Her talks highlight how data and AI can transform industries, empower communities, and foster inclusion across the tech ecosystem.

Noor is the founder of the International Women Economic Council (IWEC), a global platform empowering women through mentorship, networking, and professional development opportunities. By connecting women with industry leaders and fostering career growth, IWEC is dedicated to breaking barriers and promoting gender equality in the workplace.

She actively contributes to the advancement of open science through her involvement with NumFOCUS, PyData, and community-driven initiatives, where she fosters collaboration, inclusivity, and accessibility in data science and AI. Noor also leads initiatives like the IEEE Hour of Power training program, equipping professionals with practical AI and data science skills.

Her contributions to business leadership and women’s empowerment have earned her numerous accolades, including the Australia Alumni Excellence Award and the Asia Pacific HRM Congress Award. Noor’s work has been featured in prominent media outlets such as the BBC, Martha’s Vineyard Times, and Hindustan Times, highlighting her global impact in advancing diversity and innovation in technology.

GitHub: aftabn81 | Website: www.nooraftab.com

  • Unlocking the Missing 78%: Inclusive Communities for the Future of Scientific Python
Oleksandr Yardas

Oleksandr is a PhD student at the University of Illinois Urbana-Champaign working on numerical methods for time-dependent neutron transport.

  • Burning fuel for cheap! Transport-independent depletion in OpenMC
Patrick Kidger

Patrick is a tech lead on ML for protein optimization at Cradle.bio, and founded much of the open-source scientific JAX ecosystem. He has previously worked as an ML researcher at Google X, held a visiting appointment at Imperial College London, and received a PhD from Oxford on neural differential equations.

  • An introduction to the JAX scientific ecosystem
Pedro Garcia Lopez

Pedro Garcia Lopez is a professor in the Computer Engineering and Mathematics Department at the University Rovira i Virgili (Spain). He leads the “Cloud and Distributed Systems Lab” research group and coordinates large European research projects. In particular, he leads CloudStars (2023-2027), NearData (2023-2025), and CloudSkin (2023-2025), and he participates as a partner in EXTRACT (2023-2025). He also coordinated FP7 CloudSpaces (2013-2015), H2020 IOStack (2015-2017), and H2020 CloudButton (2019-2022). Pedro Garcia Lopez is one of the main architects and leaders of the Lithops project, which was created in collaboration with IBM in the Cloudbutton.eu project. Pedro is the main author of the "Serverless End Game" and "Dataplug" papers and co-author of the paper on "Transparent serverless execution of Python multiprocessing applications".

  • Processing Cloud-optimized data in Python (Dataplug)
  • Processing Cloud-optimized data in Python with Serverless Functions (Lithops, Dataplug)
Peter Fackeldey
  • Thinking in arrays
Peter Sobolewski

Peter is a napari core developer and a Systems Analyst in the Imaging Applications and Machine Learning team in Research IT at The Jackson Laboratory. In his day job, Peter supports users of open source imaging applications and workflows, both on local devices and on HPC. He also runs workshops to help users ease into the various applications. As a napari core developer, he focuses on user-facing bugs and issues, UI/UX, documentation, etc.

  • Scaling-up deep learning inference to large-scale bioimage data
  • Create custom image visualization and analysis tools with napari
Quynh L. Nguyen
  • Scaling NumPy for Large-Scale Science: The cuPyNumeric Approach
Raymond Hawkins
  • ReSCU-Nets: recurrent U-Nets for segmentation of multidimensional microscopy data
Rodrigo Fernandez-Gonzalez
  • ReSCU-Nets: recurrent U-Nets for segmentation of multidimensional microscopy data
Rowan Cockett

Rowan is the CEO and founder of Curvenote (https://curvenote.com), where we build tools to free science from static PDF documents such that the scientific community can share more interactive, reproducible, and richly-linked scientific content. Curvenote provides an all-in-one publishing platform for researchers, societies and institutes, with a focus on computational research.

Rowan is also on the steering council for Jupyter Book and MyST Markdown, which is part of Project Jupyter and provides widely used open-source tools for authoring and sharing scientific content. Rowan has a Ph.D. in computational geophysics from the University of British Columbia (UBC). While at UBC, Rowan helped start SimPEG (https://simpeg.xyz), a large-scale simulation and parameter estimation package for geophysical processes (electromagnetics, fluid-flow, gravity, etc.), which is used in industry, national labs, and universities globally.

Rowan has won multiple awards for innovative dissemination of research and open-educational resources, including a geoscience modeling application, Visible Geology, that has been used by more than a million geoscience students to interactively explore conceptual geologic models. In his previous role as the VP of Cloud Architecture at Seequent, Rowan ran a large software team working on computational software platforms, visualization tools, and version control systems for geoscientists.

  • SciPy Proceedings: An Exemplar for Publishing Computational Open Science
  • Jupyter Book 2.0 – A Next-Generation tool for sharing for Computational Content
Ruben Arts

Former robotics engineer now solving package management, so others don't have to experience what I had to go through. I'm a core maintainer of pixi and love sharing our work through talks, podcasts, or videos.

  • Reproducible Science Made Easy: Package Management with Pixi
  • Reproducible Machine Learning Workflows for Scientists with pixi
Ryan C Cooper

Ryan C. Cooper is an Associate Professor-in-Residence at the University of Connecticut. His background is in mechanics and materials science with an emphasis on numerical simulations and engineering education. He has been using Jupyter and GitHub to enhance the classroom experience for over six years. Prof. Cooper has developed free open source materials for computational work in engineering and volunteered with the NumPy documentation team. Ryan is an integral part of the AI in the School of Engineering committee. He has a Ph.D. from Columbia University and spent two and a half years at Oak Ridge National Laboratory as a postdoctoral researcher.

  • Generative AI in Engineering Education: A Tool for Learning, Not a Replacement for Skills
Sanjiban Sengupta

Sanjiban is a Doctoral Student at CERN, affiliated with the University of Manchester. He is researching optimization strategies for efficient Machine Learning Inference for the High-Luminosity phase of the Large Hadron Collider at CERN within the Next-Gen Triggers Project. Previously, he was a Summer Student at CERN in 2022, and also contributed at CERN-HSF via the Google Summer of Code Program in 2021. In the development of SOFIE, he was particularly involved in the development of the Keras and PyTorch parsers, storage functionalities, machine learning operators based on the ONNX standard, Graph Neural Networks support, etc. Moreover, he volunteered as a Mentor for the contributors of Google Summer of Code 2022, and again in 2023, 2024 and 2025, and for the CERN Summer Students of 2023 working on CERN’s ROOT Data Analysis Project.

Previously, Sanjiban spoke at PyCon India 2023 about Python interfaces for Meta’s Velox Engine. He also presented a talk on the Velox architecture at PyCon Thailand 2023. He has been contributing to open-source projects in data science and engineering, including ROOT, Apache Arrow, and Substrait.

  • Challenges and Implementations for ML Inference in High-energy Physics
Sanket Verma

I’m a software engineer deeply invested in the open-source scientific ecosystem. I love working with OS communities and contributing to various projects. I wear multiple hats in the open-source software world—writing code, creating processes and workflows, steering various committees, organising technical meet-ups and global conferences, and many more.

When I’m not working, you can find me trekking in the Himalayas. Mountains are my second favourite thing, right after computers. I also play video games, watch movies, play football and run marathons.

  • Learning the art of fostering open-source communities
Sarah Kaiser

Sarah has spent most of her career developing technology in the lab, from virtual reality hardware to satellites. She got her PhD in Physics by starting plasma fires with lasers, Python, and Jupyter Notebooks. She has also written tech books for folks of all ages, including ABCs of Engineering and Learn Quantum Computing with Python and Q#. As a Cloud Developer Advocate for Python at Microsoft and a Python Software Foundation Fellow, she finds all kinds of new ways to build and break OSS tools for data science and machine learning. When not at her split ergo keyboard, she loves boating in the Seattle area, laser cutting everything, and playing with her German Shepherd, Chewie.

  • Develop Pythonic spreadsheets running Python in and out of the grid
  • Getting all your snakes in a grid: collaborating and teaching with Python in Excel and the Anaconda Toolbox
Sarah Purpura

Meteorologist and Software Engineer with a passion for leveraging technology to enhance environmental data analysis and decision-making.

  • From Legacy to Leading-Edge: Revamping NCEI Software for the Cloud Era
Sean W. Freeman

Sean Freeman is an Assistant Professor of Atmospheric and Earth Science at The University of Alabama in Huntsville (UAH), having started that appointment in Spring 2023. Before coming to UAH, he received undergraduate degrees in Computer Science and Meteorology from Florida State University and MS and PhD degrees in Atmospheric Science from Colorado State University. Sean's research interests are primarily in clouds and storms, in particular, understanding the kinds of environments that support cloud development and severe weather. He uses numerical weather modeling and advanced data science tools such as cloud tracking to uncover the basic building blocks of convection from models and observations, as well as new measurements of convective inflows and outflows with drones. He has served as a tobac lead developer since 2021 and has chaired the tobac steering committee since 2023. Outside of work, he enjoys photography, hiking, traveling, baking, curling, and watching college football.

  • tobac: Tracking Atmospheric Phenomena on Multiscale, Multivariate Diverse Datasets
Seher Karakuzu

I am a scientific software developer at Brookhaven National Laboratory. I contribute to the Bluesky code stack, a library for experiment control and collection of scientific data and metadata.

  • Edge processing of X-ray ptychography: enabling real-time feedback for high-speed data acquisition
Shaurya Agarwal

Shaurya Agarwal has been tinkering with Data, Cloud technologies, Machine Learning, Data Science and now GenAI for over 21 years. Now a Director with PwC India, Shaurya brings his expertise and experience to solve critical problems across a very wide set of domains.

  • The-Silmaril: Practice #ontology engineering with Python (and other languages).
Simon Adorf
  • GPUs & ML – Beyond Deep Learning
Siu Kwan Lam
  • Numba v2: Towards a SuperOptimizing Python Compiler
Siyu Qian

Data Scientist at Capital One

  • Retrieval Augmented Generation (RAG) for LLMs
Stacy Irwin
  • Keeping Python Fun: Using Robotics Competitions to Teach Data Analysis and Application Development
Stefan Krawczyk
  • Building LLM-Powered Applications for Data Scientists and Software Engineers
Stefanie Molin

Stefanie Molin is a software engineer at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also a core developer of numpydoc and the author of “Hands-On Data Analysis with Pandas: A Python data science handbook for data collection, wrangling, analysis, and visualization,” which is currently in its second edition and has been translated into Korean and Chinese. She holds a bachelor’s of science degree in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

  • (Pre-)Commit to Better Code
  • Introduction to Data Analysis Using Pandas
Steve Purves

A scientific software developer experienced in Python, modern web development, and cloud technology and computing, I've spent many years developing tools, research code, and products for scientists and researchers in various fields, including earth science, semiconductors, and bio-medical research.

  • Jupyter Book 2.0 – A Next-Generation tool for sharing for Computational Content
Steve Van Tuyl
  • Real-world Impacts of Generative AI in the Research Software Engineer and Data Scientist Workplace
Sukhada Kulkarni

I am a Senior Manager in Data Science at Capital One, specializing in machine learning, generative AI, and Retrieval-Augmented Generation (RAG). I am passionate about leveraging AI to solve complex business challenges and drive innovation.

  • Retrieval Augmented Generation (RAG) for LLMs
Tetsuo Koyama

Interested in scientific computing and visualization with computer graphics.
Developer team member of PyVista.
Experience as a speaker:
- PyConJP 2019 speaker "Introduction to FEM Analysis with Python"
- PyConJP 2020 speaker "How to plot unstructured mesh file on Jupyter Notebook"
- SciPy Japan 2020 speaker "Translation Project of Mayavi2 documents"
- PyConJP 2021 speaker "Visualize 3D scientific data in a Pythonic way like Matplotlib"

  • 3D Visualization with PyVista
  • Open Code, Open Science: What’s Getting in Your Way?
  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Thomas J. Fan

Thomas J. Fan is a Member of Technical Staff at Modal and a maintainer of scikit-learn, an open-source machine learning library for Python. At scikit-learn, he led the development of DataFrame interoperability and GPU support through PyTorch. Previously, Thomas was a Columbia University researcher who worked on improving interoperability between machine learning frameworks and AutoML systems.

  • Dive into Flytekit's Internals: A Python SDK to Quickly Bring your Code Into Production
Tim Monko

I am a postdoctoral researcher at the University of Minnesota studying brain development in the Department of Pediatrics. When I'm not at the lab bench, I spend most of my time supporting other biologists with my passion: microscopy and bio-image analysis. I learned Python to unify image processing and data analysis and have fallen in love with the open-source community. To this end, I have begun to contribute to the n-dimensional image viewer napari and its plugin ecosystem with my own tool, napari-ndev, intended to (batch) process bioimages from start to finish with no coding necessary. I hope to bring accessible, high-quality, reproducible science to all, regardless of their experience with programming.

  • Create custom image visualization and analysis tools with napari
  • From the outside, in: How the napari community supports users and empowers transition to contribution
Tina Odaka

Tina Odaka is a research engineer at IFREMER, working in the UMR-LOPS (Laboratoire d’Océanographie Physique et Spatiale). She leads the IAOCEA project, focusing on the hybridization of model, satellite, and in-situ data for oceanography using scattering transform techniques. She also leads the Pangeo-Fish project, a software package leveraging the Pangeo environment to help biologists efficiently compute fish tracks from biologging in-situ data and Earth science datasets.

Her research interests include optimizing scientific computing workflows in oceanography, from high-performance and cloud-based computing to their practical applications in policy decision-making. She actively contributes to open-source geospatial science, developing scalable and reproducible tools for large-scale oceanographic data analysis.

  • Using Discrete Global Grid Systems in the Pangeo ecosystem
Tom Nicholas

Tom Nicholas is a core developer of Xarray, and the original author of xarray.DataTree. He has made numerous contributions throughout the Pangeo stack, including to VirtualiZarr, Cubed, xGCM, and pint-xarray. He currently works on the open-source Pangeo stack full-time at Earthmover. Prior to that he worked at a non-profit on open-source tools for monitoring carbon dioxide removal, and as a Research Software Engineer in Ryan Abernathey's Climate Data Science Lab at Columbia University. He first started using the open-source scientific Python stack during his PhD, when he was studying plasma turbulence in nuclear fusion reactors. He has delivered many Xarray tutorials, including at SciPy 2022, 2023, and 2024.

  • VirtualiZarr and Icechunk: How to build a cloud-optimised datacube of archival files in 3 lines of xarray
  • Hierarchical Data Analysis with Xarray DataTree & Zarr
  • Cubed: Scalable array processing with bounded-memory in Python
Tom White

Tom White is an independent software engineer. His long-term professional interest centres around large-scale distributed storage and processing. Over the last few years he has focused on big data infrastructure for scientists, including GATK, Scanpy, sgkit, and most recently Cubed. In a previous life Tom wrote “Hadoop: The Definitive Guide”, published by O’Reilly. He lives in the Brecon Beacons in Wales with his family.

  • Cubed: Scalable array processing with bounded-memory in Python
Trevor Manz
  • Breaking the silo: composable bioinformatics through cross-disciplinary open standards
Tudor Garbulet

Software Engineer at Oak Ridge National Laboratory, specializing in generative AI and machine learning applications for scientific data processing. Dedicated to designing scalable AI architectures, collaborating with research teams, and integrating AI-driven solutions to enhance data workflows.

  • Accelerating scientific data releases: Automated metadata generation with LLM agents
Vyas Ramasubramani

Vyas Ramasubramani is an experienced scientist and open-source developer. He has presented scientific software numerous times, both at SciPy and other academic conferences. He has also presented scientific results at various national conferences. He has been a developer for the cuDF project for four years and currently leads its development along with its various subprojects including cudf.pandas and cudf-polars.

  • Accelerated DataFrames for all: Bringing GPU acceleration to pandas and Polars
Wolf Vollprecht
  • Reproducible Science Made Easy: Package Management with Pixi
Xinling

Data Scientist at Capital One

  • Retrieval Augmented Generation (RAG) for LLMs
Yuvi
  • Towards a more sustainable and reliable mybinder.org
hugo bowne-anderson

Hugo Bowne-Anderson is an independent data and AI consultant with extensive experience in the tech industry. He is the host of the industry podcast Vanishing Gradients, where he explores cutting-edge developments in data science and artificial intelligence.
As a data scientist, educator, evangelist, content marketer, and strategist, Hugo has worked with leading companies in the field. His past roles include Head of Developer Relations at Outerbounds, a company committed to building infrastructure for machine learning applications, and positions at Coiled and DataCamp, where he focused on scaling data science and online education respectively.
Hugo's teaching experience spans from institutions like Yale University and Cold Spring Harbor Laboratory to conferences such as SciPy, PyCon, and ODSC. He has also worked with organizations like Data Carpentry to promote data literacy.
His impact on data science education is significant, having developed over 30 courses on the DataCamp platform that have reached more than 3 million learners worldwide. Hugo also created and hosted the popular weekly data industry podcast DataFramed for two years.
Committed to democratizing data skills and access to data science tools, Hugo advocates for open source software both for individuals and enterprises.

  • Escaping Proof-of-Concept Purgatory: Building Robust LLM-Powered Applications
  • Building LLM-Powered Applications for Data Scientists and Software Engineers
nate stemen

Nate is a Member of Technical Staff at Unitary Foundation working to make quantum computers useful, usable, and accessible. He mostly works on quantum error mitigation tooling, and is passionate about open source and the benefits it provides the scientific and technology ecosystem. In his spare time Nate enjoys rock climbing, running, and building community.

  • Noise-Resilient Quantum Computing with Python