SciPy 2024

Aakash Varambhia

Aakash Varambhia works as a Data Science Lead at Johnson Matthey, focusing on delivering innovative imaging data science solutions for research and development. He completed his DPhil at the University of Oxford, specializing in using advanced tools for quantitative electron microscopy. His expertise covers analyzing various data types like images, timeseries, and spectra, all aimed at improving materials design processes.

  • Delivering state of the art imaging data science to aid research and development at Johnson Matthey
Adam Stewart

Adam J. Stewart's research interests lie at the intersection of machine learning and Earth science, especially remote sensing. He is the creator and lead developer of the popular TorchGeo library, a PyTorch domain library for working with geospatial data and satellite imagery. His current research focuses on building foundation models for multispectral imagery and weather forecasting. He received his B.S. from the Department of Earth and Atmospheric Sciences at Cornell University and his Ph.D. from the Department of Computer Science at the University of Illinois Urbana-Champaign. He currently works as a postdoctoral researcher at the Technical University of Munich under the guidance of Prof. Xiaoxiang Zhu.

  • TorchGeo: Advancing Earth Observation Through Machine Learning
Adam Thompson

Adam Thompson is a Principal Technical Product Manager at NVIDIA where he focuses on building hardware and software platforms targeting real-time AI, smart sensors, and tying high speed sensor I/O to GPU-accelerated compute. His work advances edge and datacenter/cloud collaborative workloads that integrate Digital Twins of instruments and AI training/fine-tuning deployments.

Adam is also the creator of cuSignal – a GPU-accelerated signal processing library written in Python. With over 400,000 downloads, cuSignal is widely used in the sensor processing communities and, as of CuPy v13, is fully integrated into the CuPy library.

He holds a Master's degree in Electrical and Computer Engineering from Georgia Tech and a Bachelor's degree in Electrical Engineering from Clemson University.

In his free time, Adam enjoys baking, listening to (and discovering!) indie music, modern lit, pour-over coffee techniques, and teaching.

  • Coming Online: Enabling Real-Time and AI-Ready Scientific Discovery
Akshay Gupta

Akshay Gupta is a Data Scientist and Python developer at Capital One, where he works on building libraries that cover a variety of functions from developer enablement to predictive modelling. Akshay's background is in mathematics & statistics, and he has been in the field for 7 years.

  • Picking your battles: when is a compiled language like Rust beneficial for Data Scientists?
Alex Monahan

Alex is a forward deployed software engineer at MotherDuck and writes blogs and docs part time for DuckDB Labs. He has a bachelor's in Industrial and Systems Engineering from Virginia Tech. Alex recently joined MotherDuck after 9 years at Intel. After starting at Intel as an industrial engineer, Alex later became a technical analyst, and then moved into a data scientist role. Back in 2020 Alex discovered DuckDB while building an internal self-service analytics platform. It was such a perfect fit that he quickly integrated it and began using it in multiple projects. Alex also became one of DuckDB's biggest Twitter fans! He has been diving deeper into duck-themed databases ever since.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
  • How to bootstrap a Data Warehouse with DuckDB
Alexander Kaszynski
  • 3D Visualization with PyVista
Ali Martz

Ali Martz is a Graduate Research Assistant pursuing a M.S. in Mechanical Engineering at Oregon State University. Her research focuses on the optimization of sustainable aviation fuel design.

  • Python for early-stage design of sustainable aviation fuels
Anant Mittal

Anant Mittal is a Ph.D. student at the Paul G. Allen School of Computer Science & Engineering, University of Washington, advised by Prof. James Fogarty. He also holds a graduate research assistant position at the eScience Institute, Scientific Software Engineering Center (SSEC). His research interests include designing and building interactive systems for real-world human-computer interaction impact and evaluating them through mixed methods. His Ph.D. focuses on building systems for communication and collaboration in settings where multiple stakeholders have different roles. He has presented several papers and posters at conferences and given invited talks.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Andrew Huang

I'm an atmospheric scientist, Python developer, and open source contributor working on the HoloViz ecosystem.

I am the lead developer of the Panel chat components, which make it easy to build an interface for interacting with Large Language Models (LLMs). I have shared applicable examples of integrating Panel chat components with LangChain, OpenAI, Mistral, and LlamaCpp at https://holoviz-topics.github.io/panel-chat-examples/.

Connect with me on https://www.linkedin.com/in/huangandrew12/.

  • From RAGs to riches: Build an AI document inquiry web-app
Ankur Ankan

Ankur Ankan is a postdoctoral researcher at Radboud University in the Netherlands, where his research focuses on causal inference. His main interest lies in developing practical methods for causal inference, along with developing software tools for it. He also started and maintains the Python package pgmpy, which offers tools for probabilistic and causal inference in graphical models.

  • Introduction to Causal Inference using pgmpy
Anshul Tambay

Anshul Tambay is a Technical Program Manager with the UW Scientific Software Engineering Center (SSEC) at the eScience Institute. He aims to develop open-source infrastructure that bolsters research across a variety of disciplines.

Prior to joining SSEC, Anshul worked as a Data Analyst at Northwestern University’s Center for Neighborhood Engaged Research and Science, focusing on community violence intervention programs in Chicago. His other experience includes working in support at a tele-health software company and on a development study in Ethiopia, evaluating a mobile phone-based experience sampling method of measuring time use. Anshul received his B.A. in Economics and Mathematics from Grinnell College in Iowa.

Outside of work, Anshul enjoys pickup sports, reading longform journalism, and cooking. He is a passionate supporter of Bay Area sports and Leeds United and is interested in learning more about statistical inference in sports.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Anuj Sinha

Anuj is a graduate student specializing in Data Science at the University of Washington, Seattle. Currently, he is working as a Research Graduate Scholar at the Scientific Software Engineering Center at the eScience Institute, UW. Before joining UW, he worked as a Software Developer at Goldman Sachs, India, for 4 years.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Bane Sullivan
  • 3D Visualization with PyVista
Benoit Hamelin
  • Vector space embeddings and data maps for cyber defense
Bill Little

Bill Little, creator of GeoVista, is a software engineer working at the UK Met Office and a core developer on SciTools, which includes Cartopy and Iris.

  • 3D Visualization with PyVista
Biola Adeyemi

Bachelor of Engineering in Agricultural and Biosystems Engineering, with a career driven by a passion for exploring the intersection of machine learning/AI and biological systems. I am interested in building and testing statistical learning methods that can do a better job of analyzing data collected through sensors and/or quantitative techniques to understand the relationships between plant, soil & water resources, nutrients & energy, and the environment.

  • Image analysis and visualization in Python with scikit-image, napari, and friends
Brandyn Lucca

Brandyn Lucca is a postdoctoral scholar at the Applied Physics Laboratory, University of Washington (Seattle, WA). His academic background includes a BSc in marine biology (University of Rhode Island), and both an MSc and PhD in Marine and Atmospheric Science (Stony Brook University). Brandyn's research focuses on using active acoustics to study environmental variability in the spatiotemporal distributions of marine organisms, and on better understanding how sound scatters from different types of animals through physics-based and numerical methods.

  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Braxton Cuneo

Braxton Cuneo is an Assistant Professor at Seattle University's CS department and a member of the PSAAP Center for Exascale Monte Carlo Neutron Transport (CEMeNT). Braxton specializes in GPU and parallel processing, with a focus upon making resource/execution management more efficient and ergonomic.

  • Dante’s Externo: Injecting Python Functions into a Template-Driven CUDA C++ Framework
Bryan Van de Ven

Bryan is a Senior Systems Software Engineer at NVIDIA, where he works on Python tools for distributed GPU computing. Previously he worked at Microsoft, and also at Anaconda, where he created the conda package manager and co-created the Bokeh visualization library.

  • Interactive data visualizations with Bokeh (in 2024)
Caesar Tuguinay
  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Chi Wang

Chi Wang is a principal researcher in Microsoft Research. He has worked on large language model and AI frameworks, automated machine learning, machine learning for systems, scalable solutions for data science and data analytics, and knowledge mining from text data and graph data (with a SIGKDD Data Science/Data Mining PhD Dissertation Award). Chi is the creator of AutoGen, a popular and rapidly growing open-source framework for enabling next-gen AI applications (with an Open100 award and TheSequence’s pick of 5 favorite AI papers in 2023). Chi is the creator of FLAML, a fast open-source library for AutoML & tuning used widely inside and outside Microsoft.

  • Building Multi-Agent Generative-AI Applications with AutoGen
Christine Smit

I'm a principal software engineer at the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC). Our prime directive is to archive earth science data and make that data available to the public for free. Since joining the GES DISC, I've mainly focused on the services end of public data access, working on tools that allow users to do some initial data exploration and visualization without having to download, understand, and open raw data files. I'm happy to wax poetic about metadata, interoperability, and well designed colorbars.

  • Free, public, standardized Zarr stores of geospatial data in the cloud for all! Now in Beta.
Christopher Davis

Chris is a Professor of Teaching in the Math Department at the University of California, Irvine. Although his research background is in theoretical math, Chris has been teaching introductory programming courses in the math department since 2015. Chris first began contributing to Vega-Altair in 2021, and is currently one of the active maintainers of the library. Sample videos of Chris’s teaching are available at https://youtu.be/n61BNVCuTgM?si=ZYJLh73UgDCAXhZv (from 2013, on cryptography labs using the Sage mathematical software) and at https://youtu.be/Ph--xNiz3kM?si=qtLagdb0oFzHme1t (from a few years ago, on estimating probabilities using Matlab).

  • Data Visualization with Vega-Altair
Cliff Kerr

Dr. Cliff Kerr is a Senior Research Scientist at the Institute for Disease Modeling, part of the Bill & Melinda Gates Foundation, where he works on COVID-19, STIs, and family planning. Previously, he completed a B.S. in neuroscience and a Ph.D. in physics, was a lecturer in scientific computing at the University of Sydney, co-founded two startups (on data analytics and health economics), worked on a DARPA project teaching robots to pick up balls, and developed an algorithm that composes music in real time based on brain activity recordings. He lives in New York.

  • Starsim: A flexible framework for agent-based modeling of health and disease
Connor Stone

I am a postdoctoral fellow at the Université de Montréal in Canada. I apply statistics to astronomical problems and I'm not afraid to develop some open source software along the way! I study galaxies, strong gravitational lensing, and machine learning.

  • Development of AstroPhot: Fitting Everything Everywhere all at Once in Astronomical Images
Cordero Core

Cordero Core is a highly skilled senior software engineer with over 14 years of experience successfully delivering innovative software solutions to healthcare, e-commerce, aerospace, and security industries. He has achieved a patent for his groundbreaking work in computational microscopy and digital pathology that has revolutionized imaging and analysis techniques for medical professionals. Cordero maintains active involvement in his field through mentoring startups and entrepreneurs as an advisory board member for the Journal of Small Business and Enterprise Development. He is currently focused on creating software solutions that enable scientific research, data management, and collaboration through his role as a Senior Software Engineer at the eScience Institute.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Dan Schult

Dan Schult is the Charles G Hetherington Professor of Mathematics at Colgate University. He comes to scientific computing from studies of spreading processes. In some circles he is better known as a founding developer of NetworkX. He has attended SciPy a number of times and it is always worthwhile.

  • Sparse arrays in scipy.sparse
Darren Vengroff

Dr. Vengroff is a Computer Scientist with 20+ years of experience in Data Science, Machine Learning, Algorithms, and Software Development. He is the creator and principal maintainer of several open-source projects including censusdis and divintseg.

Dr. Vengroff has worked with organizations large and small, ranging from tech startups to the Bill and Melinda Gates Foundation, Microsoft, and Amazon. His recent work centers on metrics of diversity and integration (e.g. an interactive map of diversity and integration in the U.S.), and modeling techniques to identify systematic bias in areas including home valuation, eviction, and food accessibility. He holds a B.S.E. from Princeton University and an Sc.M. and Ph.D. from Brown University.

Dr. Vengroff's blog can be found at https://datapinions.com.

  • Working with U.S. Census Data in Python: Discovery, Analysis, and Visualization
  • An Introduction to Impact Charts
Dewey Dunnington

Dewey Dunnington (Ph.D., P.Geo.) is a software engineer and geoscientist based in Nova Scotia, Canada. As a software engineer he works on all things Apache Arrow at Voltron Data, Inc., including standards for geospatial data connectivity, R bindings for Apache Arrow, and Arrow Database Connectivity (ADBC). As a geoscientist, he has worked in contaminated site remediation, taught Applied Geomorphology at Acadia University, and has authored more than a dozen articles on lake water and sediment geochemistry. Dewey is an Apache Arrow Project Management Committee member, an RStudio-certified tidyverse instructor, an NSERC Postgraduate Scholarship (Doctoral) recipient, and maintainer of dozens of R, Python, C, and C++ libraries at the intersection of geoscience, geospatial data, and enterprise data connectivity.

  • Introducing nanoarrow: the world's tiniest Arrow Implementation
Dharhas Pothina

Dharhas Pothina is the CTO at Quansight, where he helps clients wrangle their data using the PyData stack. His background includes expertise in computational modeling, big data/high performance computing, visualization, and geospatial analysis. He has been part of the HoloViz (hvPlot) and Dask communities for over 10 years, has given many talks and workshops on distributed computing and big data visualization, and actively leads large-scale data science projects at Quansight.

  • Data of an Unusual Size (2024 edition): A practical guide to analysis and interactive visualization of massive datasets
  • From RAGs to riches: Build an AI document inquiry web-app
Dhavide Aruliah

Dhavide Aruliah has been teaching & mentoring both in academia and in industry for three decades. His career has grown around bringing learners from where they are to where they need to be mathematically & computationally. He was a university professor (Applied Mathematics & Computer Science) at Ontario Tech University before moving to industry where he oversaw training programs supporting the PyData stack at Anaconda Inc. and later at Quansight LLC. He has taught over 40 undergraduate- & graduate-level courses at five Canadian universities as well as numerous Software Carpentry & PyData tutorial workshops. Video examples of his teaching include:

  • https://www.youtube.com/watch?v=CW8hTe21LPg
  • https://www.youtube.com/watch?v=LjQlmee58hg
  • https://www.youtube.com/watch?v=uZFQsv4WO8M
  • Determining Climate Risks with NASA Earthdata Cloud
Diego Kiedanski
  • A hands-on forecasting guide: from theory to practice
Dingrui Lei
  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Don Setiawan

Don Setiawan is a Senior Research Software Engineer at the University of Washington, eScience Institute, Scientific Software Engineering Center (SSEC). He has expertise in Python programming, web development, geospatial data analytics, and cloud-based data engineering. He is interested in building scalable, open software to facilitate scientific discovery across fields and enforce software best practices. He has been involved with various open-source software projects with the Ocean Observatories Initiative (OOI), the U.S. Integrated Ocean Observing System (IOOS), the National Oceanic and Atmospheric Administration (NOAA), and the National Aeronautics and Space Administration (NASA).

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Doris Lee

Doris Lee is currently leading Python and data science product efforts at Snowflake. Previously, Doris was the CEO and co-founder of Ponder, the company behind the open source project Modin. Ponder was acquired by Snowflake in 2023. Doris received her Ph.D. from the UC Berkeley RISE Lab and School of Information in 2021, where she developed tools that help data scientists explore and understand their data. She was named to the Forbes 30 Under 30 list for Enterprise Technology in 2023.

  • Scaling your data science workflows with Modin
Elena Felder

Elena Felder works on ecosystem integrations at MotherDuck, a DuckDB-powered serverless data warehouse. Diving into data and helping rows, columns, and numbers tell interesting stories has always been a personal hobby! After working mostly in Java for many years, Elena keeps being awed at how dozens of lines of Java code can often be replaced by a Python one-liner. Prior to embarking on a duck-shaped startup adventure, she led Cloud Java frameworks integrations at Google.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
Emily Dorne

Emily Dorne is a lead data scientist at DrivenData where she develops machine learning models for social impact. Her expertise lies in classifying animals in camera trap videos to support conservationists, identifying harmful algal blooms to support water quality managers, and helping data scientists consider the ethical implications of their work. She is passionate about using data for social good and has previously worked at the Bill & Melinda Gates Foundation, Stanford Center for International Development, and the Brookings Institution.

  • Using Satellite Imagery to Identify Harmful Algal Blooms and Protect Public Health
Eniola Awowale

I am an earth scientist and software developer at NASA. I create tools to help solve ecological and environmental issues.

  • Simplifying analysis of hierarchical HDF5 and NetCDF4 files with xarray-datatree
Eric Heiden

Eric is a research scientist at NVIDIA where he develops the open-source Python library Warp. His research interests lie in the intersection of simulation and robotics, particularly differentiable simulators that can be used to reduce the reality gap and control dynamical systems through optimization.
He received his Ph.D. in Computer Science from the University of Southern California under the supervision of Prof. Gaurav Sukhatme.

  • Warp: Advancing Simulation AI with Differentiable GPU Computing in Python
Eric Ma

As Principal Data Scientist at Moderna, Eric leads the Data Science and Artificial Intelligence (Research) team to accelerate science to the speed of thought. Prior to Moderna, he was at the Novartis Institutes for Biomedical Research conducting biomedical data science research with a focus on using Bayesian statistical methods in the service of discovering medicines for patients. Prior to Novartis, he was an Insight Health Data Fellow in the summer of 2017 and defended his doctoral thesis in the Department of Biological Engineering at MIT in the spring of 2017.

Eric is also an open-source software developer and has led the development of pyjanitor, a clean API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He is also on the core developer team of NetworkX and PyMC. In addition, he gives back to the community through code contributions, blogging, teaching, and writing.

His personal life motto is found in the Gospel of Luke 12:48.

  • How to foster an open source culture within your data science team
  • LlamaBot: a Pythonic interface to Large Language Models
Erick Martins Ratamero

I manage the Imaging Applications team at The Jackson Laboratory, working in the Research IT department and providing support on imaging data analysis and management to 60+ research groups. My main area of expertise is microscopy data analysis and management; in my previous life I was the data person at the CAMDU core facility at Warwick Medical School. I have a PhD in Analytical Science from the University of Warwick, where I developed computational models of bacterial cell division.

  • Image analysis and visualization in Python with scikit-image, napari, and friends
  • Expanding the OME ecosystem for imaging data management
Franz Kiraly
  • Enhancing Predictive Analytics with tsbootstrap and sktime
Fritz Lekschas

Fritz Lekschas is a computer scientist researching scalable visual exploration of biomedical data. As the Head of Visualization Research at Ozette Technologies, he is leading the development of web-based data visualization and exploration tools for analyzing high-dimensional single-cell data. Fritz earned his PhD in computer science from Harvard University, where he was advised by Hanspeter Pfister and Nils Gehlenborg. He has published more than twenty peer-reviewed papers and his work has been recognized with several awards.

In his free time, Fritz likes to work on open-source tools for visual data exploration like Jupyter Scatter.

  • Bring your __repr__’s to life with anywidget
Gil Forsyth

Gil Forsyth is a software engineer at Voltron Data. He followed the common career path of Japanese language specialist -> administrative assistant -> mechanical engineer -> computational fluid dynamicist -> data scientist -> software engineer -> machine learning engineer -> software engineer. Gil contributes to several projects in the PyData ecosystem and is a core maintainer of xonsh and Ibis. He served as the program chair for the Scientific Computing with Python (SciPy) conference from 2017 to 2020.

  • Ibis: because SQL is everywhere and so is Python
  • Intro to Ibis: blazing fast analytics with DuckDB, Polars, Snowflake, and more, from the comfort of your Python repl.
Giordon Stark
  • How the Scientific Python ecosystem helps answering fundamental questions of the Universe
Gordon Watts

Gordon Watts is a professor of physics at the University of Washington, Seattle, a member of the ATLAS experiment at the Large Hadron Collider at CERN, and deputy director of the National Science Foundation's IRIS-HEP Software Institute. He has extensive lecture and tutorial teaching experience in classrooms, labs, and informal tutorial settings. One of his main ATLAS responsibilities is helping to bring Python-based analysis techniques to the ~3000 physicists who are part of the ATLAS experiment.

  • How the Scientific Python ecosystem helps answering fundamental questions of the Universe
  • Thinking In Arrays
Guen Prawiroatmodjo

Guen is a software engineer at MotherDuck on the Ecosystems team. Previously, she was a Sr. Quantum Measurement Engineer at Microsoft. Guen has broad experience with software engineering, data engineering and data science with Python in the context of scientific data acquisition, analysis and computation for experimental physics and biotech. She has given introductory talks and workshops on quantum computing with Python at various conferences, hackathons and events.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
  • How to bootstrap a Data Warehouse with DuckDB
Hajime Takeda

Hajime is a data professional with five years of expertise in marketing, retail, and eCommerce, working across Japan and the United States.

As a Data Analyst at Procter and Gamble and MIKI HOUSE Americas, Hajime has led data-driven strategy formulation and implemented technology initiatives such as e-commerce expansion, advertising optimization, and the identification of growth opportunities.

  • Introduction to Causal Inference with Machine Learning
Heinrich Peters
  • Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python
Henry Schreiner
  • Scikit-build-core: A modern build-backend for CPython C/C++/Fortran/Cython extensions.
Ian Spektor

Lead Machine Learning Engineer @ Tryolabs | Founding Engineer @ Puppeteer AI | CTO @ Buen Provecho

Currently building
- Temporian, an open-source Python library for preprocessing and feature engineering of temporal data
- Puppeteer, an actually useful AI platform for the healthcare industry
- Buen Provecho, a startup fighting food waste in Latin America

  • Safe, fast, and easy time series preprocessing with Temporian
  • A hands-on forecasting guide: from theory to practice
Irfan Alibay

Dr Irfan Alibay has been a maintainer for the MDAnalysis project (https://www.mdanalysis.org/) since 2020. For his day job, he acts as the Science Lead for the Open Free Energy initiative (https://openfree.energy/) where he leverages open source tools to accurately estimate protein-ligand binding affinities.

Examples of previous talks from Irfan include an MDAnalysis Bioexcel webinar: https://www.youtube.com/watch?v=1Wot83DSt4E and the MDAnalysis 2023 User Group Meeting state of the Union: https://zenodo.org/records/8388971.

  • Towards MDAnalysis 3.0: a fast, interoperable, and extensible community-driven ecosystem for handling molecular simulation data
Isaac Corley

Isaac Corley is a Ph.D. student in Electrical Engineering at the University of Texas at San Antonio (UTSA) under the supervision of Prof. Peyman Najafirad. His research is at the intersection of 3D computer vision and remote sensing for single and multiview reconstruction of buildings using aerial and satellite imagery. He is passionate about open-source software development and a maintainer of TorchGeo.

  • TorchGeo: Advancing Earth Observation Through Machine Learning
Ishika Khandelwal

Ishika is a graduate student at the University of Washington, Seattle, specializing in Data Science. She is currently working as a Research Graduate Scholar at the Scientific Software Engineering Center at the eScience Institute, UW. Before this, Ishika worked as a Decision Analytics Associate at ZS Associates, India, where she played a pivotal role in informing strategic decision-making processes and driving impactful outcomes for a pharma client, ultimately contributing to the success of their business objectives. Her passion lies at the confluence of Software Development Engineering (SDE) and Data Science best practices, and she is driven by a relentless pursuit of knowledge and a thirst for mastery in these dynamic disciplines.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Jacob Smith

Jacob Smith is a data scientist in the nuclear nonproliferation division at Oak Ridge National Laboratory. He completed his B.S. in computer science at Tennessee Technological University in 2018. His research interests include visual analytics, automated analysis, and machine learning.

  • Supporting Greater Interactivity in the IPython Visualization Ecosystem
James A. Bednar

Dr. James A. Bednar is the Director of Professional Services at Anaconda, Inc. Over a 10-year career of lecturing at the University of Edinburgh (UK), he received multiple nominations for teaching awards and published more than 50 scientific papers. He founded the HoloViz project and manages the Datashader, Param, and Colorcet packages within it.

  • hvPlot and Panel: Easy data visualization, data exploration, and data apps
Jay Chia

Jay is a cofounder of Eventual and a primary contributor to the Daft open-source project. Prior to Eventual, he was a software engineer building large scale ML data systems for computational biology at Freenome and self-driving cars at Lyft. He hails from the sunny island nation of Singapore, and used to command a platoon of tanks in the Singapore military.

  • Building Daft: Python + Rust = a better distributed query engine
Jean-Christophe Fillion-Robin

Jean-Christophe Fillion-Robin is an open-source enthusiast known as the original author of scikit-build. Currently, he holds the position of distinguished engineer at Kitware Inc, where he spearheads the development of commercial applications leveraging "3D Slicer". Additionally, Jean-Christophe maintains python-cmake-buildsystem, providing a CMake-based alternative build system tailored for CPython.

  • Scikit-build-core: A modern build-backend for CPython C/C++/Fortran/Cython extensions.
  • ITK-Wasm: Universal spatial analysis and visualization
Jeremiah Paige
  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
Jessica Scheick
  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Jim Crist-Harif
  • Intro to Ibis: blazing fast analytics with DuckDB, Polars, Snowflake, and more, from the comfort of your Python repl.
Joanna Piper Morgan

Piper is a mechanical engineering PhD student at Oregon State University where she works with Dr. Kyle Niemeyer in the Center for Exascale Monte Carlo Neutron Transport (CEMeNT). Her current work focuses on enabling both Nvidia and AMD GPU support in Monte Carlo / Dynamic Code (MC/DC), a high-performance, fully transient neutron transport application written in Python and built for rapid numerical methods exploration. She also works on developing novel transient deterministic neutron transport algorithms for GPU accelerators. She has previously worked at Advanced Micro Devices (AMD) and Los Alamos, Argonne, and Thomas Jefferson National Labs. Piper has her MS in mechanical engineering from Oregon State University and her BS in mechanical engineering from the Oregon Institute of Technology. She expects to graduate with her PhD sometime in 2025.

  • Monte Carlo/Dynamic Code: Performant and Portable High-Performance Computing at Scale via Python and Numba
Jon Mease

Jon is a visualization software engineer with past experience leading and contributing to a variety of open source Python visualization projects including plotly.py, HoloViews, and Datashader. Jon is an active maintainer of Vega-Altair and the creator of VegaFusion (which scales Vega-Altair to large datasets) and VlConvert (which provides static image export of Vega-Altair visualizations). Jon has shared his experience through a variety of talks at past SciPy and PyData conferences. A full list of talks is available at https://jonmmease.dev/talks/.

  • Data Visualization with Vega-Altair
Jordão Bragantini
  • Image analysis and visualization in Python with scikit-image, napari, and friends
  • ultrack: large-scale versatile cell tracking in Python under segmentation uncertainty
Jorge Paz Soldan Palma

Postdoctoral fellow at the University of South Carolina. Research interests are thermodynamic modeling, density functional theory, calorimetry, and molten salts.

  • Uncertainty quantification and propagation of the NaCl-KCl-MgCl2 pseudoternary system for molten salt application
Josh Borrow

I am a Research Software Engineer at the University of Pennsylvania, USA, working on data management and visualisation for the Simons Observatory. I also have interests in numerical galaxy formation simulations.

I was previously a postdoctoral researcher in astrophysics at the MIT Kavli Institute in Massachusetts, USA. I did my PhD at the Institute for Computational Cosmology at the University of Durham in the UK.

  • Making Research Data Flow with Python
Joshua Taillon

Dr. Joshua Taillon is a staff scientist within the NIST Office of Data and Informatics, working in the Data Science group as a Materials Research Engineer. Drawing on his extensive background in materials characterization, his professional interests lie at the intersection of materials characterization and data science, utilizing machine learning, artificial intelligence, and state-of-the-art signal/data processing techniques to facilitate greater understanding of material systems.

Prior to this appointment, Josh was an NRC Postdoctoral Associate in NIST's Microscopy and Microanalysis Research Group. During this time, his research included the development and application of novel data acquisition and processing schemes in both electron and ion-beam microscopy. He received a B.S. from Cornell University, and as an NSF Graduate Research Fellow, received his Ph.D. in Materials Science and Engineering from the University of Maryland where he specialized in analytical transmission electron microscopy and focused ion beam nanotomography.

  • HyperSpy – Your Multidimensional Data Analysis Toolbox
Juan Cabanela

I am an Astrophysicist by training and a Computer Programmer by necessity. I have recently transitioned from teaching primarily Physics and Astrophysics to teaching Computer Science.

  • Building Complex Web Apps with Jupyter Widgets
Justin Braaten

Justin is a developer relations engineer supporting the Google Earth Engine project. Justin has a background in satellite image data processing for ecological applications and a passion for educating and inspiring people to use geospatial technologies to tackle environmental challenges.

  • Bridging the gap between Earth Engine and the Scientific Python Ecosystem
Karthik Venkataramani

Karthik Venkataramani is a postdoctoral scholar working in the Civil and Environmental Engineering department and the eScience institute at the University of Washington, Seattle. Dr. Venkataramani's research work focuses on developing machine learning tools and models for geospatial applications, and he is currently working on refining Digital Elevation Models (DEMs) using deep learning approaches. Prior to this, Dr. Venkataramani worked as a Postdoctoral Researcher at the NASA Jet Propulsion Laboratory on the Observational Products for End-Users from Remote Sensing Analysis project, which generates a near-global suite of analysis ready data products from synthetic aperture radar (SAR) and optical data. Dr. Venkataramani received his MS and PhD in Electrical and Computer Engineering from Virginia Tech.

GitHub: https://github.com/kvenkman
LinkedIn: https://www.linkedin.com/in/karthikvenkataramani/

  • Determining Climate Risks with NASA Earthdata Cloud
Kevin Lacaille

Kevin Lacaille works as a senior software engineer at Spexi Geospatial, a crowd-sourced drone imagery company, where he combines the worlds of GIS and computer vision to create an open marketplace for ultra high-resolution Earth imagery. Kevin specializes in building image processing solutions for science and educational teams and communicates his workflows with blog posts and webinars. Kevin has over 8 years of public speaking experience ranging from conferences to academic lecturing. Recently, he presented hands-on tutorials at PyCon 2023 and SciPy 2022. In the past, Kevin has presented at L3Harris Engineers Week 2021, and the Canadian Astronomical Society Conference 2019. Portfolio: www.lacaille.dev

  • Hobby Drones, Urban Forests: A Geospatial Journey to Greener Cities
Kyle Barron

Kyle is a software engineer at Development Seed where he builds open source tools and infrastructure that process and visualize geospatial data. He has expertise in cloud-native geospatial vector data formats, speeding up Python and JavaScript applications from Rust, spatial indexes, and efficient data pipelines. Kyle holds a B.A. in Economics with a minor in Mathematics from the University of California, Los Angeles, which he earned in 2017.

Kyle previously worked as a software engineer at Unfolded and then Foursquare, building browser-based geospatial data visualizations on the web for vector and raster data.

  • Lonboard: Fast, interactive geospatial vector data visualization in Jupyter
Kyle Niemeyer
  • Monte Carlo/Dynamic Code: Performant and Portable High-Performance Computing at Scale via Python and Numba
  • Python for early-stage design of sustainable aviation fuels
Lars Grüter

Lars is currently working as a freelance and core developer for the image processing library scikit-image. With an education in electrical engineering and a focus in health and sensor technologies, he has worked as a research assistant on adaptive ultrasound imaging at the TU Dresden. As a student, he started contributing to the scientific Python ecosystem and discovered his interest for signal processing, Linux, and especially Python’s scientific ecosystem.

  • Image analysis and visualization in Python with scikit-image, napari, and friends
Leah Wasser

I am the Executive Director and Founder of pyOpenSci - a non profit organization that is devoted to helping scientists tackle the world's greatest challenges by empowering them with the skills and tools needed to make their science more open and collaborative. We run an open peer review process for scientific Python software and also develop training resources around open science topics. We have been doing significant work in the Python ecosystem to bridge the technical understanding gap between the broader packaging community and what scientists need.

I've been teaching data-intensive topics for almost 20 years and am passionate about translating technical topics to beginners. I'm also a maintainer of the package stravalib. When I'm not working on all things Python, I'm outside on the trails, climbing mountains with my rescue pup, or at the gym doing CrossFit.

  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
  • The power of community in solving scientific Python’s most challenging problems
Lucas Sterzinger
  • Simplifying analysis of hierarchical HDF5 and NetCDF4 files with xarray-datatree
Luigi Cruz

Luigi Cruz is a computer engineer working as a staff engineer at the SETI Institute. He created the CUDA-accelerated digital signal processing backend called BLADE currently in use at the Allen Telescope Array (ATA) and Very Large Array (VLA) for beam forming and high-spectral resolution observations. Luigi is also the maintainer of multiple open-source projects like the PiSDR, an SDR-specialized Raspberry Pi image, CyberEther, a heterogeneous accelerated signal visualization library, and Radio Core, a Python library for demodulating SDR signals using the GPU with the help of CuPy.

  • Coming Online: Enabling Real-Time and AI-Ready Scientific Discovery
Luis Lopez

Luis A. López is a Research Software Engineer at the National Snow and Ice Data Center (NSIDC) in Boulder, Colorado. He is a passionate advocate of open science and open source, and a collaborator in projects like NASA Openscapes and ITS_LIVE. He’s always happy to help scientists find ways to make their workflows simpler and more efficient.

  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Maarten Breddels

Maarten Breddels is an entrepreneur and ex-scientist mainly working with Python, C++, and Javascript in the Jupyter ecosystem. He is the creator of Solara, ipyvolume, and Vaex and Co-founder of Widgetti. His expertise includes fast numerical computation, API design, 3D visualization, and building data apps. He has a Bachelor's in ICT, a Master's, and Ph.D. in Astronomy, and he likes to solve real problems.

  • Building Complex Web Apps with Jupyter Widgets
Madhav Kashyap

Madhav Kashyap is a Graduate Student at the University of Washington majoring in Computational Linguistics and Natural Language Processing. His recent work as a Graduate Research Assistant at the UW Scientific Software Engineering Center (eScience Institute) has been to develop open-source software used by oceanographers in measuring seafloor tectonic shifts to the centimeter level. His Thesis focuses on system optimizations for faster Information Retrieval in Large Language Model workflows. As a Backend Software Engineer at Akamai, he has industry experience coding robust Python and Go systems powering enterprise cybersecurity.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Mathieu Guillame-Bert

I am a research engineer at Google Zurich. My work centers around moving ML research into production. Notably, I lead efforts to research decision forest technologies and bring them to production, making them accessible, scalable, and performant.

Before joining Google, I completed a postdoctoral fellowship at Carnegie Mellon University's Auto Lab. In 2012, I earned my doctorate in France at the INRIA Research Lab as part of the PRIMA team. I graduated from Imperial College London and "French Grande Ecole" ENSIMAG in 2009.

In my spare time, I delve into various hobbies such as tinkering with electronics, woodworking, 3D printing, and creating video games.

  • Safe, fast, and easy time series preprocessing with Temporian
  • A hands-on forecasting guide: from theory to practice
Matt Craig

I teach Physics and Astronomy at a small undergraduate-only state university. I started using Python over 10 years ago, just at the time project Jupyter was announced. Since then, I've been an enthusiastic user of Jupyter widgets in performing accessible, reproducible science and have contributed to the core ipywidgets package.

  • Building Complex Web Apps with Jupyter Widgets
Matt McCormick

Matt McCormick, Ph.D. is a distinguished engineer on Kitware’s Medical Computing Team located in Carrboro, North Carolina. His experience spans multiple medical, biological, material science, and geospatial imaging applications. As a subject matter expert, he manages and makes technical contributions to scientific image analysis projects. He has been a principal investigator and a co-investigator of several research grants from the National Institutes of Health (NIH), led engagements with United States national laboratories, and he has led various commercial projects providing advanced software for medical devices.

Matt specializes in diagnostic ultrasound imaging, with an emphasis on radio-frequency-based signal characterization. Many of his projects focus on designing and developing innovative, artificial intelligence (AI) solutions for tissue characterization, elastography, and low-cost, portable systems.

In addition to his projects, Matt also leads the development of the Insight Toolkit (ITK), a high performance, N-dimensional image processing library written in C++ with interfaces in Python and JavaScript. In development for over two decades, ITK is a collaborative effort of the world’s best research software engineers. The open source project provides hundreds of advanced algorithm modules, covering topics in image processing, registration, segmentation, quantification, and reconstruction. Matt has led the community through two major revisions, ITK 4.0 and ITK 5.0, and he has coordinated over 30 releases of the toolkit, each comprising developments from 30 to 60 individual contributors.

Recently, Matt architected a port of the toolkit to WebAssembly, called itk.js, to couple ITK with interactive browser visualizations built on vtk.js. These are built into a next generation open-source software system for medical and scientific image, mesh, and point set visualization, the itk-vtk-viewer.

Matt is also involved in Kitware’s Open Source Software Sustainability RoadShow which visits organizations like the Chan-Zuckerberg Initiative and NVIDIA to present our lessons learned on software sustainability.

Matt also serves as a reviewer for journals, such as the Institute of Electrical and Electronics Engineers (IEEE) Transactions on Medical Imaging and the Journal of Open Source Software. He is also an active contributor to conferences for the Medical Imaging Computing and Computer Assisted Intervention (MICCAI) and Scientific Computing in Python (SciPy) communities. Additionally, Matt mentors Kitware interns and serves on Ph.D. thesis committees.

During his studies at the University of Wisconsin-Madison, Matt’s research covered vascular mechanics, signal processing, medical imaging physics, and computing. His doctoral thesis focused on the characterization of carotid plaque (a primary cause of stroke) with diagnostic ultrasound. The principal aim of his research was to develop algorithms to quantify local deformation in the plaque from raw ultrasound image data. The code developed for this purpose makes extensive use of ITK and the Visualization Toolkit (VTK).

While earning his bachelor’s degree at Marquette, Matt interned at Boston Scientific Corporation, where he worked on peripheral vascular nitinol stents.

Matt received his Ph.D. and master’s degree in biomedical engineering from the University of Wisconsin-Madison in 2011 and 2007, respectively. In 2005, he received his bachelor’s degree in biomedical engineering from Marquette University.

  • ITK-Wasm: Universal spatial analysis and visualization
Matthew Feickert

Matthew is a postdoctoral researcher in experimental high energy physics and data science at the Data Science Institute at the University of Wisconsin-Madison (a “data physicist”). He works as a member of the ATLAS collaboration on searches for physics beyond the standard model with experiments performed at CERN's Large Hadron Collider (LHC) in Geneva, Switzerland. He also serves on the executive board of the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) where he is a researcher and the Analysis Systems Area lead. Matthew has been involved with the SciPy conference since 2019 and serves as a member of the SciPy 2024 Organizing Committee.

  • How the Scientific Python ecosystem helps answering fundamental questions of the Universe
Matthew Rocklin

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled to improve Python's scalability with Dask for large organizations.

Matthew holds a bachelor's degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.

  • Website: https://matthewrocklin.com
  • Dask: https://dask.org/
  • Coiled: https://coiled.io
  • Dask in Production
Max Jones
  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Mehdi Ouazza

Mehdi (aka mehdio) is a data enthusiast with nearly a decade of experience in data engineering for companies of all sizes. He's not your average data guy, injecting humor and fun into his work to make complex topics easier to digest. When he's not actively contributing to the data community through his blog, YouTube, and social media, you can find him off-beat, marching to the beat of his own data drum. In 2023, Mehdi joined MotherDuck as a developer advocate, bringing his data engineering expertise to supercharge DuckDB.

  • All the SQL a Pythonista needs to know: an introduction to SQL and DataFrames with DuckDB
Mine Çetinkaya-Rundel

I am a Professor of the Practice at the Department of Statistical Science and an affiliated faculty in the Computational Media, Arts, and Cultures program at Duke University. Additionally, I work as a Developer Educator at Posit, PBC. My work focuses on innovation in statistics and data science pedagogy, with an emphasis on computing, reproducible research, student-centered learning, and open-source education.

  • Unlocking Dynamic Reproducible Documents: A Quarto Tutorial for Scientific Communication
Nathan Goldbaum

I am a software engineer at Quansight Labs where I help maintain NumPy and contribute to open source software on behalf of consulting clients. My background is in astrophysics; I completed my PhD in 2015 at UC Santa Cruz and worked as a postdoc and research scientist at the University of Illinois. During my academic career I became increasingly involved in community open source projects, contributing to projects across the scientific Python ecosystem and serving as a maintainer of the yt project. Since then I've transitioned from academia to industry, but I still believe strongly in open science and the importance of community research software to building reproducible scientific workflows.

  • My NumPy year: From no CPython C API experience to shipping a new DType in NumPy 2.0
Nathan Martindale

Nathan Martindale is a data scientist in the nuclear nonproliferation division at Oak Ridge National Laboratory. He completed both his B.S. (2018) and M.S. (2020) degrees in computer science at Tennessee Tech University, with his graduate studies focusing on machine learning. His research interests include natural language processing, interactive machine learning, visual analytics, and HCI.

  • Supporting Greater Interactivity in the IPython Visualization Ecosystem
Naty Clementi

Naty is a senior software engineer at Voltron Data. She is a former academic with a Masters in Physics and PhD in Mechanical and Aerospace Engineering to her name. She is currently contributing to Ibis, but in the past has also contributed to and maintained Dask. She is also an active member of PyLadies and one of the directors of Women Who Code DC.

  • Ibis + DuckDB geospatial: a match made on Earth
  • Intro to Ibis: blazing fast analytics with DuckDB, Polars, Snowflake, and more, from the comfort of your Python repl.
Negin Sobhani
  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Nezar Abdennur

I am an Assistant Professor in the Department of Genomics and Computational Biology and the Department of Systems Biology at UMass Chan Medical School.

I lead a computational research group (https://abdenlab.org) with a dual mandate. My group's biological research focuses on the 3D organization of the genome (3C/Hi-C technologies), its relationship to the epigenome, and the resulting manifold influences on cellular fate, differentiation, aging, and disease. My group's open-source interests are in supporting foundational software infrastructure to improve genomic and multi-omic data science, especially in the scientific Python ecosystem.

  • Bring your __repr__’s to life with anywidget
Nicholas Ursa

Software Engineer at MotherDuck. Previously data at The New York Times, Better.com. M.Sc CompSci (Columbia)

  • How to bootstrap a Data Warehouse with DuckDB
Nick Lenssen
  • Simplifying analysis of hierarchical HDF5 and NetCDF4 files with xarray-datatree
Nicole Brewer

Nicole Brewer is a research software engineer and PhD student in History and Philosophy of Science at Arizona State University. Her dissertation research focuses on current and better practices for computational research workflows in Jupyter Notebooks. She was a Better Scientific Software fellow in 2023, which resulted in the website Jupyter4Science, a knowledge base of original and curated content about Jupyter Notebooks as they are used in scientific contexts. Site content includes the first widgets tutorial she gave at SciPy 2023.

  • Building Complex Web Apps with Jupyter Widgets
Niki Burggraf

Niki Burggraf is a Senior Software Engineer for the UW Scientific Software Engineering Center (SSEC) at eScience Institute. With over 6 years of experience building and maintaining cloud web services, Niki is excited to bring her industry knowledge to the scientific software engineering sphere.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Ondřej Čertík

Principal Compiler Engineer at GSI Technology. Former scientist at Los Alamos National Laboratory.

Ondřej is the original author of SymPy, SymEngine, LFortran and LPython.

Website: https://ondrejcertik.com/

  • LPython: Novel, Fast, Retargetable Python Compiler
Patricia A. Loto

Since 2021, Patricia Loto has collaborated on various projects as a member of the Metadocencia accessibility team, including the Science Core Bilingual Development project and the Mapping of Communities, Organizations, and Open Science Resources in Latin America. She holds a Bachelor's degree in Information Systems and a Diploma in Data Science, Machine Learning, & its Applications from the FAMAF of the National University of Cordoba. She has taught computational tools and data analysis at varying levels (for researchers, students, and even people with no formal programming background) at numerous workshops, conferences, and at the Department of Statistical Calculus and Biometry at the Faculty of Agrarian Sciences of the National University of the Northeast. She is also certified to teach programming by The Carpentries and as a Tidyverse Instructor by RStudio. She enjoys learning in community and is an active member of communities such as R-Ladies, The Carpentries, Latin-R, OLS, and The Turing Way, where she contributes and learns from others.

• Machine Learning with Tidymodels in LatinR: https://www.youtube.com/watch?v=1ATHGwDPXQs

• First Steps in R: https://www.youtube.com/watch?v=plE4owAKYNA

• Linkedin: https://www.linkedin.com/in/patricia-loto/

• Website: https://patricia-loto.netlify.app/

• Github: https://github.com/PatriLoto

  • Determining Climate Risks with NASA Earthdata Cloud
Patrick Hoefler

Patrick Hoefler is a member of the pandas core team and a Dask maintainer. He is currently working at Coiled where he focuses on Dask development and the integration of a logical query planning layer into Dask. He holds an MSc degree in Mathematics and is working towards an MSc in Software Engineering at the University of Oxford.

  • Pandas + Dask DataFrame 2.0 - Comparison to Spark, DuckDB and Polars
Pavithra Eswaramoorthy

Pavithra Eswaramoorthy is a Developer Advocate at Quansight, where she works to improve the developer experience and community engagement for several open source projects in the PyData community. She has contributed to the Bokeh visualization library and currently contributes to the Nebari (adjacent to the Jupyter community), conda-store (part of the conda ecosystem), and Ragna (a RAG orchestration framework) projects. Pavithra has been involved in the open source community for over 5 years, notably as a maintainer of the Dask library and an administrator for Wikimedia’s OSS programs. In her spare time, she enjoys a good book and hot coffee. :)

  • Data of an Unusual Size (2024 edition): A practical guide to analysis and interactive visualization of massive datasets
  • Interactive data visualizations with Bokeh (in 2024)
  • From RAGs to riches: Build an AI document inquiry web-app
Pawan Negi

Pawan Negi is a post-doctoral researcher at the Department of Applied
Mathematics, Illinois Institute of Technology. He has been teaching
mathematics courses to undergraduate students since last year. He earned his
PhD at IIT Bombay. He has used ParaView extensively for various visualization
tasks for many years.

  • Automate your research with automan
Peter Sun

Dr. Peter Sun is a Physical Chemistry Postdoc in the John Marohn Group at Cornell University. His research focuses on developing simulation and reconstruction of 3D images and fabricating nanoscale detection devices for magnetic resonance force microscopy experiments.

  • mrfmsim: a modular simulation platform for magnetic resonance force microscopy experiments
Phillip Cloud

I'm Phillip Cloud, a software engineer. I work on Ibis full-time at Voltron Data. I like a lot of things, including Dune, jazz and puns. Let's chat!

  • Intro to Ibis: blazing fast analytics with DuckDB, Polars, Snowflake, and more, from the comfort of your Python repl.
Pierre Raybaut

Pierre Raybaut is a long-term advocate of Python in a scientific context, renowned as the creator of Spyder, the Scientific Python IDE, and other pivotal projects like Python(x,y) and WinPython. These tools have been instrumental in making Python a leading language for scientific computing.

Pierre's academic journey began with an engineer's degree from the Institut d’Optique Graduate School, specializing in laser physics. He further advanced his expertise by earning a PhD in optics and photonics from Université Paris-Saclay, where he developed software for simulating regenerative amplification in ultra-short pulse lasers.

Professionally, Pierre has held diverse and impactful roles. He served as a research engineer at THALES Avionics, a principal software developer at CEA (the French Alternative Energies and Atomic Energy Commission), and managed the Laser MégaJoule timing and fiducial system project. Eventually, he became the Head of a Research Laboratory at CEA before transitioning to Codra, an industrial software company, where he currently holds the position of Chief Technology Officer (CTO).

Beyond his work on Spyder, Pierre is deeply involved in the open-source software community. He has created tools such as guidata, PlotPy, PlotPyStack and DataLab, and has contributed to numerous other projects.

  • From Spyder to DataLab: 15 years of scientific software crafting in Python
Prabhu Ramachandran

Prabhu Ramachandran is a faculty member at the Department of Aerospace
Engineering, IIT Bombay. He has run several workshops at SciPy which have been
generally well received. See here https://www.youtube.com/watch?v=r6OD07Qq2mw
and https://www.youtube.com/watch?v=2dd4BduDkG8. Prabhu has been using Python
for more than two decades and has been teaching Python and Python related
tools in various capacities for many years. Prabhu is also the main author of
automan which he wrote to save himself from dealing with the drudgery of
management of hundreds of simulation results for one of his papers. Prabhu
also gave a talk on automan at SciPy 2022 titled "The (Surprising) Road to
Reproducibility: Automation!" which you can see here
(https://www.youtube.com/watch?v=zvBotV6r9AY).

  • Automate your research with automan
Pryce Turner

Bioinformatics Solutions Architect at Union AI

  • Orchestrating Bioinformatics Workflows Across a Heterogeneous Toolset with Flyte
Qiusheng Wu

Dr. Qiusheng Wu is an Associate Professor in the Department of Geography & Sustainability at the University of Tennessee, Knoxville. In addition, he holds positions as an Amazon Visiting Academic and a Senior Research Fellow at the United Nations University. Specializing in geospatial data science and open-source software development, Dr. Wu is particularly focused on leveraging big geospatial data and cloud computing to study environmental changes, with an emphasis on surface water and wetland inundation dynamics. He is the creator of several open-source packages designed for advanced geospatial analysis and visualization, including geemap, leafmap, and segment-geospatial. For a closer look at his open-source contributions, please visit his GitHub repositories at https://github.com/opengeos.

  • SAMGeo: Automated Segmentation of Remote Sensing Imagery with the Segment Anything Model
  • Bridging the gap between Earth Engine and the Scientific Python Ecosystem
Quinn Brencher

Quinn is a PhD student in Civil and Environmental Engineering at the University of Washington. Quinn's research involves developing methods to study changing arctic and alpine landscapes with satellite remote sensing data, particularly radar data. This work is at the intersection of data science, geoscience, and remote sensing. Some of Quinn's previous experiences include TAing a Geospatial Data Analysis in Python course, leading a project at the GeoSmart Machine Learning Hackweek, and collaborating on open source remote sensing software (e.g. https://github.com/SnowEx/spicy-snow).

  • Github Actions for Scientific Data Workflows
Reka Anna Horvath

Freelancing software engineer.
Current focus areas: code re-usability, workflow automation, AI-assisted coding.
Experience ranging from big fintech organizations with microservice architectures to a startup focused on developer productivity tools.

  • Cookiecutter: Project Templates and Much More
Richard Iannone

Rich is a software engineer who writes software packages for data analysis and data visualization workflows. Through this, he really wants to help people accomplish things that were difficult before. He’s been at Posit Software for six years and has been immensely enjoying his time there. Before that, he did many science-y things before switching into full-time open source development. As far as outdoor activities are concerned, Rich enjoys meeting up with friends and wandering through the many valleys and ravines of the Greater Toronto Area.

  • Great Tables for Everyone
Rick Ratzel
  • No-Code-Change GPU Acceleration for Your Pandas and NetworkX Workflows
Ryan C Cooper

Ryan C. Cooper is an Associate Professor-in-Residence at the University of Connecticut. His background is in mechanics and materials science with an emphasis on numerical simulations and engineering education. He has been using Jupyter and GitHub to enhance the classroom experience for over six years. Prof. Cooper has developed free and open source materials for computational work in engineering and volunteered with the NumPy documentation team. Ryan is an integral part of the AI in the School of Engineering committee. He has a Ph.D. from Columbia University and spent two and a half years at Oak Ridge National Laboratory as a postdoctoral researcher.

  • Teaching and Learning Scientific Computing in the age of ChatGPT
Samapriya Roy

A Google Developer Expert for Google Earth Engine and Senior Product Manager at MAXAR, I lead Developer Relations and champion open data access apart from working on core APIs and infrastructure. I leverage geospatial expertise as an affiliate Faculty at the University of Hawaiʻi at Mānoa and I am a Designated Campus Colleague at the University of Arizona. Passionate about community building, I created the "Awesome Google Earth Engine Community Catalog," a thriving data commons. My research explores big data analysis and geospatial applications, while I advocate for science communication and empower researchers through collaborative platforms and speaking engagements.

  • Bridging the gap between Earth Engine and the Scientific Python Ecosystem
Sankalp Gilda

Dr. Sankalp Gilda is a distinguished data scientist and software developer with a profound passion for enhancing the domain of time series analysis. Holding an academic foundation in Astrophysics, Dr. Gilda has cultivated an extensive understanding of predictive modeling, data analysis, and statistical methodologies.

A fervent proponent of open-source software, Dr. Gilda thrives on engaging with the global developer community, aiming to refine and introduce innovations within existing technologies. His active participation in conferences and workshops underscores his commitment to sharing insights and absorbing new knowledge from his peers.

Beyond his professional pursuits, Dr. Gilda is a true adventure enthusiast, a trait he believes fosters a balanced and inventive approach to his technical work. His enthusiasm for scuba diving and nascent interest in skydiving reflect his zest for life and continual quest for new experiences.

  • Enhancing Predictive Analytics with tsbootstrap and sktime
Santiago Soler

Physicist with a PhD in Geophysics. Currently postdoc at UBC. Develops Fatiando a Terra and SimPEG: open-source Python libraries for Geophysics.

  • Pooch: a friend to fetch your data files
Scott Henderson

Scott Henderson is a senior research scientist in the University of Washington (UW) Department of Earth and Space Sciences and a data science fellow at the eScience Institute. He has worked on numerous NASA-funded efforts to develop open cloud computing solutions for data-intensive research. He is a lead organizer for ‘Hackweeks’ hosted by the UW eScience Institute, which are designed as participant-driven events to promote collaboration and open science.

  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Sean Law

I am a Principal Data Scientist currently working with a multi-talented R&D team at a Fortune 500 finance firm. I have experience producing cutting edge methodologies, building high-performance predictive models, developing rapid prototypes, and I am an inventor on several finance-related patents.

Additionally, I co-organize the monthly PyData Ann Arbor data science meetup and I am also the creator and core maintainer of STUMPY, a powerful and scalable open source Python package for modern time series analysis.

  • STUMPY: Modern Time Series Analysis with Matrix Profiles
Sebastian Raschka

My name is Sebastian Raschka, and I am a machine learning and AI researcher. Next to being a researcher, I also have a strong passion for education and am best known for my bestselling books on machine learning using open-source software.

After my PhD, I joined the University of Wisconsin-Madison as a professor in the Department of Statistics, where I focused on deep learning and machine learning research until 2023.

Taking a yearlong break from academia, I joined Lightning AI in 2022, where I am now a Staff Research Engineer focusing on the intersection of AI research, software development, and large language models (LLMs).

If you are interested in learning more about me or my projects, please visit my website at https://sebastianraschka.com

  • Pretraining and Finetuning LLMs from the Ground Up
Soham Kishor Butala

I'm Soham, a Data Science Graduate from the University of Washington. With four years of diverse experience at Deloitte and AWS, I've delved into software engineering, data engineering, and application security. I'm deeply passionate about Data Engineering and always eager to embrace new technologies. Beyond the screen and code, I find solace in the great outdoors; hiking is not just an activity for me but a way to rejuvenate my spirit. And when it comes to mental exercises, who can resist the allure of a thrilling game of chess? Looking forward to connecting and exploring the vast horizons of technology and beyond.

  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Stanley Seibert

Stanley Seibert is the Senior Director of Community Innovation at Anaconda, where he manages many of Anaconda's open source project maintainers. He has been a contributor to the Numba Python compiler project for 10 years, and has worked with a wide range of technologies on past projects, including GPU computing, sparse linear algebra for graph analytics, and the MLIR compiler framework. Stanley was previously trained as a high-energy physicist, working on experimental software and hardware for the study of neutrinos and dark matter.

  • PIXIE: Blending Just-in-time and Ahead-of-time compilation in scientific Python applications
Stuart Archibald
  • PIXIE: Blending Just-in-time and Ahead-of-time compilation in scientific Python applications
Tetsuo Koyama

Interested in scientific computing and visualization with computer graphics.
Developer team member of PyVista.
Experience as a speaker:
- PyConJP 2019 speaker "Introduction to FEM Analysis with Python"
- PyConJP 2020 speaker "How to plot unstructured mesh file on Jupyter Notebook"
- SciPy Japan 2020 speaker "Translation Project of Mayavi2 documents"
- PyConJP 2021 speaker "Visualize 3D scientific data in a Pythonic way like Matplotlib"

  • 3D Visualization with PyVista
Tim Diller

Tim holds B.S. and Ph.D. degrees in Mechanical Engineering from The University of Texas at Austin and an M.S. in Course 2 (Mechanical Engineering) from the Massachusetts Institute of Technology. Between his Master's and Doctoral degrees, Tim spent 5 years working at the Michelin Americas Research & Development Corporation in Greenville, South Carolina, first as a test engineer, instrumenting tire/vehicle systems and writing software to manage the flow of test data, and eventually doing modeling and simulations of tire/vehicle systems for handling performance.

After returning to his roots to pursue a Ph.D. in Austin, measuring and modeling the emission of particulates from diesel engines, Tim signed on with Enthought and spent 12½ years writing software for clients in engineering disciplines from consumer products to oil exploration and chemical manufacturing, then managing software teams, then managing digital transformations for large customer accounts in semiconductor and specialty materials manufacturing.

Early in his career at Enthought, Tim started teaching courses in Python for mid-career scientists and engineers and helped to develop the curriculum for what is now the Enthought Academy. Throughout his career with its many turns, Tim has exhibited a passion for engineering, software, and improving human potential through education.

In October 2023, Tim founded Diller Digital to serve the market for high-quality interactive training using the Enthought Academy curriculum in scientific computing in Python after Enthought refocused their business on consulting and product offerings. Tim's goal in founding Diller Digital is to elevate the value and dignity of the work of scientists and engineers by giving them digital tools and the skills to learn new tools or even build their own to take their work to a new level.

  • A Practical Introduction to NumPy
Timo Metzger

Timo is a technical writer and project manager at makepath. He started contributing to Bokeh in 2020 and loves helping others succeed in the world of Open Source.

  • Interactive data visualizations with Bokeh (in 2024)
Tom Nicholas

Tom currently works at [C]Worthy, a non-profit building the computation tools needed to ensure safe, effective ocean-based carbon dioxide removal.

Before that he was a Research Software Engineer working in Ryan Abernathey's Climate Data Science Lab at Lamont Doherty Earth Observatory, Columbia University.

He first started using the open-source scientific Python stack during his PhD, when he was studying plasma turbulence in nuclear fusion reactors.

He is a member of the xarray core development team, and also works on Cubed, xGCM, pint-xarray, and xarray-datatree. He is heavily involved with the Pangeo community for Big Data Geoscience.

  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
  • Simplifying analysis of hierarchical HDF5 and NetCDF4 files with xarray-datatree
Tom Vo

Hi, my name is Tom Vo. I am a software engineer in the LLNL Climate Program and a member of the Energy Exascale Earth System Model (E3SM) and Simplifying ESM Analysis Through Standards (SEATS) projects. I contribute to numerous open-source scientific software packages for climate science, including leading the development of Xarray Climate Data Analysis Tools (xCDAT). I was formerly a member of the Earth System Grid Federation (ESGF) Project, where I was the lead full-stack web developer for MetaGrid, ESGF’s next-generation search portal for climate data.

My interests and expertise are in scientific Python package development, full-stack web-development, and DevOps engineering.

  • xCDAT (Xarray Climate Data Analysis Tools): A Python package for simple climate data analysis on structured grids
Trevor Manz

Hi, I'm Trevor 👋 I'm a PhD student and visualization researcher in the HIDIVE Lab at Harvard Medical School. My work focuses on developing interactive visualization tools that enable computational biologists to more effectively explore their data.

I aspire to make the web platform more accessible to Python hackers, and help other Pythonistas enrich their existing workflows with interactive visualizations. I maintain anywidget and am involved in the Zarr community. Last year was my first SciPy and I'm very excited to be returning.

  • Bring your __repr__’s to life with anywidget
  • anywidget: custom Jupyter Widgets made easy
Valentina Staneva

Valentina Staneva is a Senior Data Scientist and Data Science Fellow at the eScience Institute, Paul G. Allen School of Computer Science & Engineering, University of Washington. As part of her role, she collaborates with researchers from a wide range of domains on extracting information from large data sets of various modalities, such as time series, images, videos, audio, and text. She is involved in data science education for audiences with a broad range of experience, and regularly teaches workshops on introductory and advanced topics. She supports open science and reproducible research, and strives to help others adopt better data science workflows.

  • Github Actions for Scientific Data Workflows
  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Vangelis Kourlitis

Vangelis is a postdoctoral researcher at the Technical University of Munich and a member of the ATLAS Collaboration at CERN. He currently directs the data analytics group of ATLAS providing technical leadership on the development of the data analysis software and formats producing the results of hundreds of physics publications per year. His research is focused on enabling efficient analysis of terabytes of experimental data through array-oriented programming methods.

  • How the Scientific Python ecosystem helps answering fundamental questions of the Universe
  • Thinking In Arrays
Vani Mandava

Vani Mandava is the Head of Engineering for the UW Scientific Software Engineering Center within the eScience Institute. She is responsible for setting up the SSEC organization and working with PIs to define the priorities and scope of software infrastructure that will strengthen the scientific software community. Before joining UW in 2022, Vani spent over two decades at Microsoft. Her career spanned engineering and product roles in client, server, and services products across Microsoft Office, Bing AdCenter, Microsoft Academic Search, and Microsoft Research Open Data. As Director for Data Science at Microsoft Research, she led Cloud, Data Science, and Trustworthy AI research collaborations with partners in academia and government.

  • Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Vi Rapp
  • Python for early-stage design of sustainable aviation fuels
Victor Dibia

Victor Dibia is a Principal Research Software Engineer at the Human-AI eXperiences (HAX) team at Microsoft Research. Victor is a core contributor to the AutoGen project and developed AutoGen Studio - a low code tool for building multi-agent applications.
Victor's research has been published at conferences such as ACL, EMNLP, AAAI, and CHI, receiving multiple best paper awards. His work has gained recognition in media outlets like the Wall Street Journal and VentureBeat.

  • Building Multi-Agent Generative-AI Applications with AutoGen
Vyas Ramasubramani
  • No-Code-Change GPU Acceleration for Your Pandas and NetworkX Workflows
Wietze Suijker

Wietze works at Space Intelligence as a Product Architect, providing data on forest coverage and carbon storage to achieve zero deforestation and mass restoration.
He is a data engineer and the scrum master of the team that maintains Space Intelligence’s data platform.
The data platform builds on the Pangeo stack and uses machine learning workflows to provide satellite data products at scale.
Previously, he worked at a Water and IT company as the technical lead for a range of satellite-data-driven projects.

  • Xarray: Friendly, Interactive, and Scalable Scientific Data Analysis
Wu-Jung Lee

Wu-Jung Lee is a scientist at the Applied Physics Laboratory, University of Washington in Seattle, WA, USA. She has an interdisciplinary background, including undergraduate degrees in Electrical Engineering and Life Science from National Taiwan University and a PhD from the MIT-WHOI Joint Program in Oceanography. Her research spans two primary areas, acoustical oceanography and animal echolocation, with a goal of advancing acoustic sensing technology to better observe and understand the marine ecosystem. Dr. Lee loves going to sea despite being very prone to motion sickness. Outside of work, she enjoys spending time in the mountains and drawing.

  • Echostack: An open-source Python software toolbox that democratizes water column sonar data and processing
Zac Hatfield-Dodds

Zac grew up in Australia eating dark chocolate, reading books, and occasionally indulging in both at the same time. By day he works at Anthropic, an AI safety and research company in San Francisco; by night he (co-) maintains Hypothesis and Pytest, and contributes to a range of other Python projects. You can read more about him at https://zhd.dev/

  • Introduction to Property-Based Testing
eli knaap

Eli is a Senior Research Scientist and the Associate Director of the Center for Open Geographical Science at San Diego State University. He is a spatial data scientist trained in stratification sociology, urban economics, and quantitative geography, whose research focuses on social inequality and spatial structure in neighborhoods, cities, and regions. Eli is a core developer for PySAL, QuantEcon, and OTURNS.

  • geosnap: The Geospatial Neighborhood Analysis Package
isabel zimmerman

Isabel Zimmerman is a software engineer at Posit, PBC where she works primarily on building open-source Python tools for MLOps tasks. She also serves as an Editor at pyOpenSci, where she helps facilitate reviewing open scientific software in the Python ecosystem. Outside of computers, Isabel spends most of her time teaching her dogs new tricks or trying to learn how to sew.

  • Create Your First Python Package: Make Your Python Code Easier to Share and Use
  • From Code to Clarity: Using Quarto for Python Documentation
nterrel

5th year PhD candidate in Physical Chemistry working in Adrian Roitberg's lab at the University of Florida. Research interests involve deep learning techniques and applications to chemical computation and simulation.

  • Atomistic uncertainty driven data generation in ANI neural network potentials