SciPy 2023

Thar Be Dragons - Ethical, Legal, and Policy Challenges when Measuring Open Source
07-12, 14:35–15:05 (America/Chicago), Amphitheater 204

Open source researchers face growing challenges in navigating the data that open source communities inherently create by working in the open. While mining software repositories for insights into open source practices isn't new, there is no clear path for moving beyond code analysis into ecosystem-level research. This talk will outline the current ethical, legal, and policy challenges that community leaders and researchers in academia and industry face, as well as the ambiguous areas decision makers should be aware of.


Challenges outlined may include:

Ethical
- Academia: quantitative and qualitative open source data is not (usually) subject to IRB review
- Could anti-aliasing across datasets create opportunities for harm to members of open source communities?

Legal
- When does information become a dataset?
- Can I use this data? Which license for what?

Policy
- Can umbrella foundations "opt in" communities and projects to ecosystem-scale research?
- How can communities and projects create clear boundaries about how and where they want the "data exhaust" they release to be used?

amanda casari is a developer relations engineer in the Open Source Programs Office at Google, where she is co-leading research and engineering to better understand risk and resilience in open source ecosystems. She was named an External Faculty member of the Vermont Complex Systems Center in 2021. amanda is persistently fascinated by the difference between the systems we aim to create and the ones that emerge, and pie.