SciPy 2024

The power of community in solving scientific Python’s most challenging problems
07-10, 16:05–16:35 (US/Pacific), Room 317

Scientific software drives open research. However, developing and maintaining a Python package is a tricky endeavor. You need to navigate a thorny packaging ecosystem, often in an academic environment that doesn’t traditionally value software. pyOpenSci has learned that an inclusive community can be empowered to make Python packaging more accessible, and that constructive peer review supports maintainers in creating better software, while also providing academic credit. In this talk you’ll learn:

  1. How to build consensus around thorny topics like packaging.
  2. Where to find beginner-friendly packaging support.
  3. How constructive peer review can support better code.
  4. How to get involved with pyOpenSci.

Background

This talk shares what pyOpenSci has learned from building a constructive, open peer review process that supports maintainers, along with resources that make Python packaging more accessible and easier to navigate. Our insights are based on five years of experience working with over 200 community members, who have evaluated 58 packages developed by over 56 package maintainers.

I’ll discuss how we:

  • Built an inclusive and supportive volunteer-led software review process, providing support and credit to maintainers developing scientific Python software.
  • Guided our community to create accessible, accurate and community-vetted (pure Python) packaging resources.

The talk concludes with a discussion of synergistic partnerships with other open source communities (e.g. Astropy) to leverage knowledge and resources.

Methods

We created our peer review program in 2018 and spent several years piloting review approaches that integrated the needs of the scientific Python community. We then developed a peer review guide that defines roles and processes so that volunteers can drive the review process, and we created a mentorship program that makes space for scientists from diverse backgrounds to participate.

To establish clear and accessible packaging guidance, we conducted a needs assessment through conversations with scientists to understand the tool landscape and pain points. Our collaborations with packaging experts, maintainers, and users of all levels allowed us to reach consensus on best practices and to ensure accuracy. We facilitated inclusive discussions and fostered agreement on packaging recommendations by prioritizing beginner-friendly content and moderating tone.

Results

As of Feb 2024, ~200 people have contributed to pyOpenSci. Our volunteer-led peer review program has processed over 58 packages, with 39 packages (26 accepted, 13 in review) in our pipeline. Through collaboration and moderation, complex packaging tools and approaches have been distilled into a beginner-friendly and accurate Packaging Guide. The guide was built through open review: some guidebook pages received hundreds of comments, drawing on community expertise and building consensus.

We're actively forging partnerships with open source communities (e.g. Astropy, SunPy, PyHeliophysics). These collaborations show how open source communities can draw on each other's expertise.

Conclusion

pyOpenSci has learned a tremendous amount from the community over the past 5 years, and we’re just getting started. We’re excited to share some of our most valuable lessons in this talk. Python software is the backbone of open science, but community is the foundation of the scientific Python ecosystem. Attend this talk both to learn more about our journey and to share your journey with us. Our goal is for you to leverage our work so that, together, we can pave the way for a smoother, more streamlined packaging experience. We know that community-supported software leads to better science.

I am the Executive Director and Founder of pyOpenSci, a nonprofit organization devoted to helping scientists tackle the world's greatest challenges by empowering them with the skills and tools needed to make their science more open and collaborative. We run an open peer review process for scientific Python software and also develop training resources around open science topics. We have been doing significant work in the Python ecosystem to bridge the gap in technical understanding between the broader packaging community and what scientists need.

I've been teaching data-intensive topics for almost 20 years and am passionate about translating technical topics for beginners. I'm also a maintainer of the package stravalib. When I'm not working on all things Python, I'm outside on the trails, climbing mountains with my rescue pup, or at the gym doing CrossFit.
