SciPy 2024

Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python
07-12, 13:15–13:45 (US/Pacific), Ballroom

Model Share AI (AIMS) is an easy-to-use Python library designed to streamline collaborative ML model development, model provenance tracking, and model deployment, along with a host of other functions aimed at maximizing the real-world impact of ML research. AIMS features collaborative project spaces, allowing users to analyze and compare their models in a standardized fashion. Model performance and various model metadata are automatically captured to facilitate provenance tracking and allow users to learn from and build on previous submissions. Additionally, AIMS allows users to deploy ML models built in Scikit-Learn, TensorFlow Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps with minimal code. The ability to deploy models with minimal effort and to make them accessible to non-technical end-users through web apps has the potential to make ML research more applicable to real-world challenges.


Background/Motivation

Machine learning (ML) has the potential to revolutionize a wide range of research areas and industries, providing data-driven solutions to important societal problems. However, researchers and practitioners lack easy-to-use, structured pathways to collaboratively develop and rapidly deploy ML models. Model Share AI (AIMS) tackles this issue by supporting collaborative model development, model metadata analytics, and model deployment, along with a host of other functions aimed at maximizing the real-world impact of ML research.

Method/Implementation

AIMS is an open-source Python library that generates user-owned AWS cloud backend resources, allowing users to evaluate, compare, and analyze their ML models and to deploy models into live REST APIs. It also enables the creation of Model Playground pages (collaborative project spaces) on the AIMS website, where users can track model performance and model metadata and access automatically generated web apps that produce predictions from deployed models. The ModelPlayground() class acts as a local representation of a Model Playground page and its associated REST API, providing a range of methods to configure, change, and query Model Playground resources.

The cloud backend hosts model objects and associated artifacts in S3 storage, while REST APIs are deployed as serverless Lambda functions in user-owned AWS accounts. Runtime models are automatically packaged into Docker containers that run on Lambda. Additionally, certain metadata are stored in a centralized Redis database that powers the AIMS website. The website hosts user profile pages, model pages, web apps, example code, and official documentation, as well as user-generated code and documentation for specific ML projects.
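As an illustrative pseudocode sketch of this lifecycle (the import path, method names, and parameters below are assumptions based on the description above, not the verified AIMS API; actually running this would also require AWS credentials):

```python
# Sketch only -- names and signatures are assumptions, not the verified AIMS API.
from aimodelshare import ModelPlayground

# Local representation of a Model Playground page and its associated REST API
playground = ModelPlayground(input_type="tabular",
                             task_type="classification",
                             private=False)

# Deploying a trained model: artifacts are stored in user-owned S3, and the
# runtime model is packaged into a Docker container running on AWS Lambda.
playground.deploy_model(model, preprocessor=preprocessor)

# Playground resources can later be queried or reconfigured via further methods.
playground.update_runtime_model(version=2)
```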

Results/Functionality

A key feature of AIMS is its focus on collaborative model development and crowd-sourced model improvement, enabling teams to iterate quickly by building on each other's progress, even across ML libraries. For supervised learning tasks, users can collaboratively submit models to Experiments or Competitions associated with a Model Playground project to track model performance and rank submissions in standardized leaderboards. Model versions are available for each Model Playground, and comprehensive model metadata are automatically extracted for each submitted model. In addition to standard evaluation metrics, this includes all hyperparameter settings for Scikit-Learn models and model architecture data (such as layer types and dimensions, number of parameters, optimizers, loss functions, and memory size) for Keras and PyTorch models. Users can also submit any additional metadata they choose to capture. Model metadata are integrated into Competition and Experiment leaderboards, enabling users to analyze which types of models tend to perform well for a specific ML task. Users can either visually explore leaderboards on their Model Playground page or download them into Pandas data frames to run their own analyses. A set of AIMS methods is also available to visualize model metadata; for example, models can be compared in a color-coded layout showing differences in model architectures and hyperparameter settings. Users can additionally instantiate models locally from the AIMS model registry.

Finally, AIMS allows users to deploy ML models built in Scikit-Learn, TensorFlow Keras, PyTorch, and ONNX into live REST APIs rapidly with minimal code. Each deployed model is associated with a Model Playground page on the AIMS website and a REST API endpoint hosted in a serverless AWS backend.
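The leaderboard analysis described above might look like the following sketch. The column names and values here are invented for illustration; in practice the data frame would be downloaded from a Competition or Experiment leaderboard rather than constructed by hand:

```python
import pandas as pd

# Illustrative stand-in for a downloaded AIMS leaderboard
# (column names are assumptions, not the actual leaderboard schema).
leaderboard = pd.DataFrame({
    "model_type": ["RandomForest", "RandomForest", "GradientBoosting", "MLP"],
    "accuracy":   [0.91, 0.89, 0.93, 0.87],
    "f1_score":   [0.90, 0.88, 0.92, 0.85],
    "n_params":   [12_000, 15_000, 9_000, 52_000],
})

# Which model families tend to perform well on this task?
summary = (leaderboard
           .groupby("model_type")["accuracy"]
           .agg(["mean", "max", "count"])
           .sort_values("mean", ascending=False))
print(summary)
```

Because the leaderboard is an ordinary data frame, any Pandas workflow (grouping, plotting, joining with external data) applies directly.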
End-users can either manually upload data to generate predictions through an automatically generated web app on the Model Playground page, or they can programmatically query the REST API associated with the model. Taken together, these functions are designed to streamline the management, collaborative development, and deployment of ML models, enhancing their discoverability, reproducibility, and traceability throughout their lifecycle.
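Programmatically querying a deployed model's endpoint can be sketched with the Python standard library as follows. The URL, payload shape, and auth header are placeholders, since real endpoints and request schemas are generated per Model Playground:

```python
import json
import urllib.request

# Placeholder endpoint -- real AIMS endpoints are generated per playground.
api_url = "https://example.execute-api.us-east-1.amazonaws.com/prod/m"

# One row of feature data to predict on (payload structure is illustrative).
payload = json.dumps({"data": [[5.1, 3.5, 1.4, 0.2]]}).encode("utf-8")

request = urllib.request.Request(
    api_url,
    data=payload,
    headers={"Content-Type": "application/json",
             "authorizationToken": "<your-token>"},  # placeholder credential
    method="POST",
)

# urllib.request.urlopen(request) would send the request and return the
# model's predictions; it is omitted here because the endpoint is a placeholder.
print(request.full_url, request.get_method())
```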

Impact

AIMS provides a versatile yet easy-to-use approach to collaborative model development, model metadata analytics, and model deployment in a single, tightly integrated workflow. Compared to existing solutions, AIMS significantly lowers the barriers to entry for ML research, making it attractive to a broad audience of researchers, educators, and data scientists.