SciPy 2025

An Active Learning plugin in Napari to fine tune models for large-scale bioimage analysis
07-10, 11:25–11:55 (US/Pacific), Room 317

The “napari-activelearning” plugin provides a framework for fine-tuning deep learning models for large-scale bioimage analysis, such as digital pathology Whole Slide Images (WSI). It was developed to ease the integration of deep learning tools into bioimage analysis workflows, and it implements Active Learning to reduce the time spent labeling samples when fine-tuning models. Because the plugin is integrated into Napari and builds on next-generation file formats (Zarr), it is suitable for fine-tuning deep learning models on large-scale images with little image preparation.


Introduction

This talk introduces the “napari-activelearning” plugin and briefly describes the human-in-the-loop Active Learning method it implements. The talk is intended for people interested in using transfer learning to fine-tune models, such as Cellpose, for large-scale image analysis. The target audience is attendees with a general understanding of machine learning; no advanced knowledge of transfer learning is required.

Attendees will learn about the Active Learning framework implemented in this plugin and about the capabilities of this open-source project. The code of the “napari-activelearning” plugin can be found at https://github.com/TheJacksonLaboratory/activelearning, and the plugin can be installed via pip from PyPI. A previous demo of this plugin was presented at Virtual I2K 2024 (https://www.youtube.com/watch?v=mllzxHQuIY0&list=PLdA9Vgd1gxTbvxmtk9CASftUOl_XItjDN&index=12).

Motivation

Adoption of deep learning methods for image analysis has grown rapidly in recent years. Part of this success is due to transfer learning methods, which allow models trained on large volumes of data to be used in tasks where annotated data is scarce.
However, transferring learning from one task to another still requires high-quality human-labeled data. This becomes a challenge when the target domain offers volumes of data large enough to overwhelm the annotator, e.g. tissue labeling in high-resolution microscopy Whole Slide Images (WSI).
The “napari-activelearning” plugin was developed to ease the constraints of applying transfer learning methods to large volumes of large-scale image data.

Methods

Next Generation File Formats, such as Zarr, have been increasingly adopted by the bioimage analysis community. The Zarr format stores large-scale image data as independent n-dimensional tiles, also called chunks, either on local disk or in cloud storage. Because chunks are the units of storage, accessing a specific region of the image only requires loading the chunks that overlap that region into memory. This is useful when running model inference on larger-than-memory image data.
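The benefit of chunked storage can be illustrated with a little index arithmetic. The sketch below does not use the Zarr library or the plugin's code; the function name `chunks_for_region`, the image size, and the chunk size are all hypothetical values chosen for illustration. It only shows which chunks of a regular chunk grid would need to be read to serve one small patch.

```python
from itertools import product


def chunks_for_region(region, chunk_shape):
    """Return the grid indices of every chunk that overlaps `region`.

    region: one (start, stop) pair per axis, with `stop` exclusive.
    chunk_shape: chunk size along each axis.
    """
    per_axis = []
    for (start, stop), size in zip(region, chunk_shape):
        first = start // size        # chunk holding the first pixel
        last = (stop - 1) // size    # chunk holding the last pixel
        per_axis.append(range(first, last + 1))
    return list(product(*per_axis))


# Hypothetical whole slide image and chunking (illustrative values only).
image_shape = (100_000, 80_000)
chunk_shape = (1024, 1024)

# A 512 x 512 patch that a model would process in one inference step.
region = ((5000, 5512), (2000, 2512))
needed = chunks_for_region(region, chunk_shape)

# Total number of chunks in the grid, using ceiling division per axis.
total = 1
for dim, size in zip(image_shape, chunk_shape):
    total *= -(-dim // size)

print(f"{len(needed)} of {total} chunks must be loaded: {needed}")
```

Under these assumed sizes, only the few chunks that overlap the patch are touched, and the rest of the image never has to be held in memory.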
To reduce the amount of data presented to the human annotator, the plugin uses concepts from the Active Learning framework. This field studies human-in-the-loop learning workflows that avoid overwhelming the annotator. This is achieved by computing Acquisition Functions, which assist in selecting the samples that the model predicts with low confidence. Once these samples are annotated by a human, they can improve the model’s performance after fine-tuning.
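As a rough illustration of an acquisition function, the sketch below ranks image patches by the entropy of their predicted class probabilities and picks the most uncertain ones for annotation. Entropy is one common choice and is used here as an assumption; the abstract does not say which acquisition function the plugin implements. The patch names, the probabilities, and the helper `select_for_annotation` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a probability vector; higher means less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, k):
    """Return the ids of the k patches with the highest entropy score."""
    scores = {patch_id: entropy(p) for patch_id, p in predictions.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical per-patch class probabilities produced by a model.
predictions = {
    "patch_a": [0.98, 0.02],
    "patch_b": [0.55, 0.45],
    "patch_c": [0.80, 0.20],
    "patch_d": [0.50, 0.50],
}

# Only the two least confident patches are sent to the annotator.
to_label = select_for_annotation(predictions, k=2)
print(to_label)
```

In a human-in-the-loop workflow, the selected patches would be labeled, added to the training set, and used for the next round of fine-tuning.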
Napari is a user-friendly visualizer for n-dimensional data whose capabilities can be extended through plugins. It already offers tools for data annotation and is compatible with Next Generation File Formats such as Zarr.

Results

A brief overview of the “napari-activelearning” plugin’s graphical interface shows the tool integrated in Napari and general usage of the plugin controls.

Conclusion

While this plugin was developed with the goal of easing adoption of deep learning models in bioimage analysis projects, it is not restricted to these imaging modalities. Moreover, it can be applied as a transfer learning tool for methods that lack an existing user interface or that are not adapted to work with large-scale image data.

PhD in Computer Science focused on bioimage understanding through computational intelligence methods.

I currently work as a Systems Analyst in the Research IT department of The Jackson Laboratory, where my main role is assisting people with the integration of machine learning methods into their image analysis pipelines.
