SciPy 2024

Starsim: A flexible framework for agent-based modeling of health and disease
07-10, 10:45–11:15 (US/Pacific), Room 317

Agent-based models (ABMs) are powerful tools for understanding how people behave and interact. However, many ABMs are slow, cumbersome to use, or both. Here we introduce Starsim, an open-source, high-performance ABM specialized for modeling different diseases (such as HIV, STIs, and tuberculosis). Built on NumPy, Numba, and SciPy, Starsim's performance rivals ABMs implemented in Java or C++, but with the convenience provided by Python: specifically, the ability to quickly implement and refine new disease modules. Starsim can also be extended to other applications in which people interact on timescales from days to decades, including economics and social science.


Background

Agent-based models (ABMs) allow for detailed simulation of people and their interactions. Over the last several years, we developed a suite of ABMs that have been widely used in different health areas, including COVID-19 (Covasim), human papillomavirus (HPVsim), and family planning (FPsim). Recently, we codified the principles of these models into a standalone tool called "Starsim". Starsim is capable of simulating communicable and non-communicable diseases (such as HIV and diabetes), plus other health areas (such as reproductive health). It is also capable of simulating vital dynamics, making it suitable for applications in which human populations change and interact over time, such as social science and economics.

Methods

Starsim is a modeling framework containing modules representing different diseases (including their natural history, transmission, and effects of co-infection), contact networks (including sexual, respiratory, and maternal transmission), demographics (including births and deaths), and interventions (including testing, treatment, and vaccines). These modules are flexible and can interact. For example, HIV and tuberculosis can be modeled within a single simulation, with interventions targeting both.
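The modular idea can be illustrated with a minimal sketch. The classes below (`ToyDisease`, `ToySim`) and all parameter values are hypothetical and are not Starsim's API; they only show how two disease modules might share one population and influence each other within a single simulation.

```python
import numpy as np

class ToyDisease:
    """Hypothetical disease module; not part of Starsim's API."""
    def __init__(self, name, beta, init_prev):
        self.name = name
        self.beta = beta            # toy per-timestep transmission scale
        self.init_prev = init_prev  # initial prevalence
        self.infected = None

    def initialize(self, n_agents, rng):
        self.infected = rng.random(n_agents) < self.init_prev

    def step(self, sim, rng):
        # Toy well-mixed transmission: risk scales with current prevalence
        p = self.beta * self.infected.mean() * sim.rel_sus(self.name)
        self.infected |= rng.random(sim.n_agents) < p

class ToySim:
    """Hypothetical simulation that owns the population and runs each module."""
    def __init__(self, n_agents, diseases, seed=0):
        self.n_agents = n_agents
        self.rng = np.random.default_rng(seed)
        self.diseases = {d.name: d for d in diseases}
        for d in diseases:
            d.initialize(n_agents, self.rng)

    def rel_sus(self, name):
        """Per-agent relative susceptibility; lets one module affect another."""
        rel = np.ones(self.n_agents)
        if name == "tb" and "hiv" in self.diseases:
            # Illustrative multiplier only, not an epidemiological estimate
            rel[self.diseases["hiv"].infected] = 2.0
        return rel

    def run(self, n_steps):
        for _ in range(n_steps):
            for d in self.diseases.values():
                d.step(self, self.rng)
        return {name: d.infected.mean() for name, d in self.diseases.items()}

sim = ToySim(10_000, [ToyDisease("hiv", 0.05, 0.02), ToyDisease("tb", 0.3, 0.01)])
prevalence = sim.run(n_steps=20)  # final prevalence of each disease
```

For brevity, this sketch uses a single shared random number generator and no contact networks, demographics, or interventions.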

We designed Starsim to be easy to use for both end-users and developers. We follow a philosophy of "Common tasks should be simple" to ensure that the library's API is as straightforward as possible. All of our code is open-source. To simplify development, we use Sciris, which is an open-source library for scientific computing that provides additional flexibility and ease-of-use on top of NumPy, including array operations, parallel computing, and high-performance data types. We use NetworkX to implement interactions between agents, and SciPy to provide flexible distributions of random numbers for determining the outcome of stochastic events (such as infections or deaths).

Results

Agent-based models are typically simulated by looping over time and, within each timestep, over all agents, but this approach is slow in Python. Instead, we use NumPy arrays to represent the properties of each agent, producing a roughly 30x performance gain. For operations that cannot be represented as array operations, we use Numba to achieve a further 10x performance gain. Together, these optimizations result in a roughly 300x performance gain over a typical Python ABM. In practice, this means that realistic scenarios of hundreds of thousands of agents over thousands of timesteps can be simulated in less than a minute on a standard laptop.
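The difference between the two styles can be sketched as follows. This is a toy infection update with hypothetical function names and parameters, not Starsim code: both functions apply the same rule, one with a per-agent Python loop and one as a single NumPy array operation. Numba is not shown.

```python
import numpy as np

def step_loop(infected, draws, p_infect):
    """Per-agent Python loop: the slow pattern."""
    new_infected = infected.copy()
    for i in range(len(infected)):
        if not infected[i] and draws[i] < p_infect:
            new_infected[i] = True
    return new_infected

def step_array(infected, draws, p_infect):
    """The same update expressed as one array operation."""
    return infected | (draws < p_infect)

rng = np.random.default_rng(1)
n_agents = 100_000
infected = rng.random(n_agents) < 0.05  # toy initial state: about 5% infected
draws = rng.random(n_agents)            # one uniform draw per agent

loop_result = step_loop(infected, draws, p_infect=0.1)
array_result = step_array(infected, draws, p_infect=0.1)
```

Both calls return identical arrays; timing them (for example with `timeit`) shows the gap on a given machine, which will vary with hardware and population size.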

Starsim's most notable technical feature is how it handles random numbers. Rather than use a central random number generator (RNG) to determine event outcomes, Starsim uses a separate RNG for each timestep and event type, and also draws separate numbers for each agent, an approach called common random numbers. As a result, a change to one part of the simulation does not affect any unrelated parts: independent events stay independent. Scenarios can therefore be compared much more precisely, without the stochastic noise that is usually present when comparing ABM outputs.
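A minimal sketch of the idea, assuming one RNG seeded per event type and timestep; the event names, seeding scheme, and function names are hypothetical and do not reflect Starsim's implementation. It compares a baseline scenario with one that adds a vaccination event, and checks whether the infection draws change.

```python
import numpy as np

N_AGENTS = 5
BASE_SEED = 42
EVENTS = {"infection": 0, "death": 1, "vaccination": 2}  # hypothetical event types

def event_draws(event, timestep):
    """One uniform draw per agent from an RNG dedicated to (event, timestep)."""
    rng = np.random.default_rng([BASE_SEED, EVENTS[event], timestep])
    return rng.random(N_AGENTS)

def run_central(with_vaccination, n_steps=3):
    """Single shared RNG: an extra event shifts every later draw."""
    rng = np.random.default_rng(BASE_SEED)
    infection_draws = []
    for t in range(n_steps):
        if with_vaccination:
            rng.random(N_AGENTS)  # vaccination consumes draws from the shared stream
        infection_draws.append(rng.random(N_AGENTS))
    return np.array(infection_draws)

def run_separate(with_vaccination, n_steps=3):
    """Dedicated RNG per event and timestep: infection draws are unaffected."""
    infection_draws = []
    for t in range(n_steps):
        if with_vaccination:
            event_draws("vaccination", t)  # uses its own stream
        infection_draws.append(event_draws("infection", t))
    return np.array(infection_draws)

# Do the infection draws match between the two scenarios?
central_same = np.array_equal(run_central(False), run_central(True))
separate_same = np.array_equal(run_separate(False), run_separate(True))
```

With the shared stream, adding vaccination changes which random numbers the infection step receives, so the two scenarios diverge even for agents who were never vaccinated. With dedicated streams, the infection draws are identical, so any difference in outcomes is attributable to the intervention. The sketch assumes a fixed population; handling births and deaths would additionally require tying each draw to a stable agent identifier.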

Conclusions

We developed Covasim, the original tool in the Starsim suite of ABMs, to rapidly respond to policy questions early in the COVID-19 pandemic. Since 2020, we have trained over 200 academics and health officials in more than a dozen countries to use these tools. We found that the Starsim approach is simple enough that most users were able to quickly learn these tools, and flexible enough to model users' requested policy scenarios. Starsim can be rapidly adapted to new contexts thanks to its modular structure, array-based computation, parallelization, and pre-loading of commonly used population data. While not intended to be a completely general-purpose ABM like Mesa or AgentPy, we believe Starsim fills an important role due to its high performance, modular structure, and careful handling of independent events.

Dr. Cliff Kerr is a Senior Research Scientist at the Institute for Disease Modeling, part of the Bill & Melinda Gates Foundation, where he works on COVID-19, STIs, and family planning. Previously, he completed a B.S. in neuroscience and a Ph.D. in physics, was a lecturer in scientific computing at the University of Sydney, co-founded two startups (on data analytics and health economics), worked on a DARPA project teaching robots to pick up balls, and developed an algorithm that composes music in real time based on brain activity recordings. He lives in New York.