SciPy 2024

Wu-Jung Lee

Wu-Jung Lee is a scientist at the Applied Physics Laboratory, University of Washington in Seattle, WA, USA. She has an interdisciplinary background, including undergraduate degrees in Electrical Engineering and Life Science from National Taiwan University and a PhD from the MIT-WHOI Joint Program in Oceanography. Her research spans two primary areas, acoustical oceanography and animal echolocation, with a goal of advancing acoustic sensing technology to better observe and understand the marine ecosystem. Dr. Lee loves going to sea despite being very prone to motion sickness. Outside of work, she enjoys spending time in the mountains and drawing.

Sessions

07-11
14:20
30min
Echostack: A flexible and scalable open-source software suite for echosounder data processing
Don Setiawan, Caesar Tuguinay, Soham Kishor Butala, Brandyn Lucca, Valentina Staneva, Wu-Jung Lee, Dingrui Lei

Water column sonar data collected by echosounders are essential for fisheries and marine ecosystem research, enabling the detection, classification, and quantification of fish and zooplankton from many different ocean observing platforms. However, broad use of these data has been hindered by a lack of modular software tools for flexibly composing data processing workflows that incorporate the powerful analytical tools of the scientific Python ecosystem. We address this gap by developing Echostack, a suite of open-source Python software packages that leverage existing distributed computing and cloud-interfacing libraries to support intuitive and scalable data access, processing, and interpretation. These tools can be used individually or orchestrated together, which we demonstrate in example use cases for a fisheries acoustic-trawl survey.

Earth, Ocean, Geo, and Atmospheric Science
Room 315
0min
Prefect Workflows for Scaling Scientific Data Pipelines
Valentina Staneva, Soham Kishor Butala, Don Setiawan, Wu-Jung Lee

With the influx of large data from multiple instruments and experiments, scientists are wrangling complex data pipelines that are context-dependent and non-reproducible. In this talk, we will share our experience leveraging the Prefect orchestration framework to allow scientists and data managers without cyberinfrastructure experience to execute complex data workflows on a variety of local and cloud platforms by editing existing recipes. We hope this will serve as a guide to others embarking on streamlining workflows through Prefect or simply wanting to see how modern orchestration tools can be applied in the scientific context.

General