SciPy 2023

vak: a neural network framework for researchers studying animal acoustic communication
07-12, 10:45–11:15 (America/Chicago), Grand Salon C

Research on animal acoustic communication is being revolutionized by deep learning. In this talk we present vak, a framework that allows researchers in this area to easily benchmark deep neural network models and apply them to their own data. We'll demonstrate how research groups are using vak through examples with TweetyNet, a model that automates annotation of birdsong by segmenting spectrograms. Then we'll show how adopting Lightning as a backend in version 1.0 has allowed us to incorporate more models and features, building on the foundation we put in place with help from the scientific Python stack.


Are humans unique among animals? We speak languages, but is speech somehow like other animal behaviors, such as birdsong? Questions like these are answered by studying how animals communicate with sound. This research requires cutting-edge computational methods and big team science across a wide range of disciplines, including ecology, ethology, bioacoustics, psychology, neuroscience, linguistics, and genomics. As in many other domains, this research is being revolutionized by deep learning algorithms. Deep neural network models make it possible to answer questions that were previously impossible to address, in part because these models automate analysis of very large datasets.
Within the study of animal acoustic communication, multiple models have been proposed for similar tasks, often implemented as research code with different libraries, such as Keras and PyTorch. This situation has created a real need for a framework that allows researchers to easily benchmark models and apply trained models to their own data. To address this need, we developed vak [1], a neural network framework designed for this research community, built with core libraries of the scientific Python stack such as NumPy, SciPy, pandas, and Dask.
In this talk, we will show how vak makes it easy for researchers to work with neural network models through a simple command-line interface and TOML configuration files. As an example, we will demonstrate how we used vak to benchmark a neural network model, TweetyNet [2], that automates annotation of birdsong by segmenting spectrograms. Using vak allowed us to tune hyperparameters and determine the minimal amount of expensive human-annotated data we needed for accurate model performance. We will show how TweetyNet and vak made it possible to relate the complex syntax of canary song to the hidden states of neural activity in the canary brain, and how these tools are being used by other researchers in neuroscience and bioacoustics.
Then we will demonstrate how in version 1.0 of vak we have significantly extended its generality, in large part by adopting the Lightning library as a backend. We will show how we are using version 1.0 of vak to reduce the segment error rate of TweetyNet, minimizing the need to clean up predictions with post-processing. In addition, we'll walk through how we're using vak to compare the performance of TweetyNet with other neural network architectures proposed for similar tasks. Finally, we will show work in progress on incorporating other families of neural network models into vak, including generative and unsupervised learning algorithms for dimensionality reduction and similarity measurements. Both authors are experienced public speakers [3], and the combination of cutting-edge neural network models in Python with studies of birds, their song, and the vocalizations of other charismatic animals is sure to make for an entertaining and informative talk.
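For readers unfamiliar with the segment error rate mentioned above: a metric of this kind is commonly computed as the edit distance between the predicted and reference sequences of segment labels, divided by the length of the reference sequence. The following is a minimal sketch under that assumed definition. The function names `levenshtein` and `segment_error_rate` are hypothetical, chosen for illustration, and this is not vak's own implementation.

```python
def levenshtein(source, target):
    """Edit distance between two label sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    # previous[j] holds the distance between the source prefix seen so far
    # and the first j labels of target.
    previous = list(range(len(target) + 1))
    for i, source_label in enumerate(source, start=1):
        current = [i]
        for j, target_label in enumerate(target, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete a label from source
                    current[j - 1] + 1,  # insert a label into source
                    previous[j - 1] + (source_label != target_label),  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def segment_error_rate(predicted, reference):
    """Edit distance normalized by the length of the reference sequence.

    Assumed definition for illustration; hypothetical helper, not part of vak.
    """
    if len(reference) == 0:
        raise ValueError("reference sequence must not be empty")
    return levenshtein(predicted, reference) / len(reference)


# One spurious extra segment "b" in the prediction: one edit over four reference labels.
rate = segment_error_rate("abbcd", "abcd")
print(rate)  # 0.25
```

Under this definition, a rate of 0 means the predicted label sequence matches the reference exactly, and the rate can exceed 1 when the prediction contains many spurious segments.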

[1] https://github.com/vocalpy/vak
[2] https://elifesciences.org/articles/63853, https://github.com/yardencsGitHub/tweetynet
[3] https://nicholdav.info/talks/, https://yardencsgithub.github.io/talks/

Engineer with Embedded Intelligence, a research and development group in the DC area. Developer and maintainer of https://github.com/vocalpy. More at https://nicholdav.info/

Researcher of living and artificial neural systems, behavior, memory, and computation. Assistant professor at the Weizmann Institute of Science, Israel.
https://www.weizmann.ac.il/brain-sciences/labs/cohen/