07-09, 11:25–11:55 (US/Pacific), Room 315
GBNet
Gradient Boosting Machines (GBMs) are widely used for their predictive power and interpretability, while Neural Networks offer flexible architectures but can be opaque. GBNet is a Python package that integrates XGBoost and LightGBM with PyTorch. By leveraging PyTorch’s auto-differentiation, GBNet enables novel architectures for GBMs that were previously exclusive to pure Neural Networks. The result is a greatly expanded set of applications for GBMs and an improved ability to interpret expressive architectures due to the use of GBMs.
GitHub: https://github.com/mthorrell/gbnet
Audience
GBNet significantly expands the functionality of XGBoost and LightGBM, two of the most popular Machine Learning packages. The talk will be of interest to most data scientists, ML practitioners, and researchers who use GBMs. Practitioners who primarily use Neural Networks will also be interested, because GBM robustness and interpretability may be attractive features in the building blocks they use to approach problems.
The audience will learn about GBNet Modules and how to use them, primarily via examples. The examples will focus on model building and interpretability. Forecasting and ordinal regression examples are available on the GitHub page (https://github.com/mthorrell/gbnet/tree/main/examples). Embedding examples will be part of the talk.
In addition, GBM users will learn more about PyTorch, and PyTorch users will learn more about GBMs.
Outline
Background & Motivation
Gradient Boosting Machines (GBMs) such as XGBoost and LightGBM are among the most popular and powerful general-purpose Machine Learning (ML) algorithms. However, existing implementations of GBMs are architecturally limited. Applications just off the path of standard GBM problems (primarily regression, ranking, and classification) are not solvable out-of-the-box with standard packages. Deep neural networks (DNNs), on the other hand, offer rich architectural possibilities but, at least for tabular problem types, often fall short of GBMs in predictive power, interpretability, and robustness. GBNet provides PyTorch modules that wrap XGBoost and LightGBM, enabling new and rich architectural possibilities for users of those packages. GBNet allows GBMs to be applied to new problem types, bringing strong predictive performance, better interpretability, and improved robustness.
Software Description
GBNet provides PyTorch Modules that wrap XGBoost and LightGBM for insertion into PyTorch’s computational graph. The wrappers feed GBM predictions into the PyTorch graph and retrieve the resulting gradients and hessians for GBM updates. Because GBNet passes exact gradient and hessian information to both packages efficiently, GBNet models can fit the same models as XGBoost and LightGBM and are roughly as scalable as those underlying packages.
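The gradient and hessian retrieval described above can be illustrated with plain PyTorch. The following is a minimal sketch, not GBNet's actual implementation: the helper name `grad_and_hess` and the toy data are assumptions made for illustration. It treats the GBM's raw predictions as a leaf tensor and asks autograd for the first derivative of the loss and the diagonal of the second derivative, which are the two per-row quantities that custom objectives in gradient boosting libraries are built around.

```python
import torch


def grad_and_hess(preds, loss_fn):
    """Return d(loss)/d(preds) and the diagonal of d^2(loss)/d(preds)^2.

    Hypothetical helper for illustration only; it assumes the loss is a sum
    of independent per-row terms, so the Hessian is diagonal.
    """
    preds = preds.detach().clone().requires_grad_(True)
    loss = loss_fn(preds)
    # First derivative; keep the graph so it can be differentiated again.
    (grad,) = torch.autograd.grad(loss, preds, create_graph=True)
    # With a diagonal Hessian, differentiating sum(grad) yields its diagonal.
    (hess,) = torch.autograd.grad(grad.sum(), preds)
    return grad.detach(), hess.detach()


# Toy data: raw (logit-scale) predictions and binary labels.
y = torch.tensor([1.0, 0.0, 1.0])
raw = torch.tensor([0.5, -1.0, 2.0])


def logistic_loss(p):
    # Summed binary cross-entropy written in terms of raw scores.
    return (torch.log1p(torch.exp(p)) - y * p).sum()


grad, hess = grad_and_hess(raw, logistic_loss)
```

For the logistic loss the result matches the textbook closed forms, sigmoid(p) - y for the gradient and sigmoid(p)(1 - sigmoid(p)) for the hessian, but the same helper works for any twice-differentiable loss whose Hessian is diagonal in the predictions, without deriving the formulas by hand.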
Building with GBNet is nearly the same as building with PyTorch. GBNet Modules can be mixed and combined with standard PyTorch Modules to create expressive architectures that use, for example, PyTorch Linear layers, an XGBoost component, and a LightGBM component in a single model.
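To make the idea of mixing components concrete, here is a minimal sketch in plain PyTorch. It does not use GBNet's API: `StandInTreeModule`, its `boost_step` method, and the toy data are hypothetical placeholders. The stand-in simply stores per-row predictions and nudges them along the gradient, where a real GBM-backed module would instead fit new trees to the gradient and hessian. The sketch shows the division of labor: a standard optimizer updates the Linear layer, while the tree-like component is updated by its own step from the same backward pass.

```python
import torch


class StandInTreeModule(torch.nn.Module):
    """Hypothetical placeholder for a GBM-backed module (not GBNet's API)."""

    def __init__(self, n_rows, out_dim):
        super().__init__()
        # Plain leaf tensor, deliberately not a Parameter, so the PyTorch
        # optimizer never touches it.
        self.preds = torch.zeros(n_rows, out_dim, requires_grad=True)

    def forward(self, x):
        # Ignores x and returns stored predictions; a real module would
        # evaluate its trees on x.
        return self.preds

    def boost_step(self, lr=0.5):
        # Stand-in for a boosting round: move predictions along the gradient.
        with torch.no_grad():
            self.preds.sub_(lr * self.preds.grad)
        self.preds.grad = None


class TrendPlusTrees(torch.nn.Module):
    """Linear trend plus a tree-like component, summed in one graph."""

    def __init__(self, n_rows):
        super().__init__()
        self.trend = torch.nn.Linear(1, 1)
        self.trees = StandInTreeModule(n_rows, 1)

    def forward(self, t):
        return self.trend(t) + self.trees(t)


torch.manual_seed(0)
t = torch.arange(8, dtype=torch.float32).unsqueeze(1)
y = 2.0 * t + torch.tensor([1.0, -1.0] * 4).unsqueeze(1)  # trend + pattern

model = TrendPlusTrees(n_rows=8)
opt = torch.optim.SGD(model.trend.parameters(), lr=0.005)

losses = []
for _ in range(20):
    opt.zero_grad()
    loss = 0.5 * ((model(t) - y) ** 2).sum()
    loss.backward()           # gradients for both components at once
    opt.step()                # PyTorch updates the Linear layer
    model.trees.boost_step()  # the tree-like component updates itself
    losses.append(loss.item())
```

The training loop looks like an ordinary PyTorch loop with one extra call per iteration, which is the sense in which building such a hybrid stays close to building a pure PyTorch model.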
Because GBNet wraps XGBoost and LightGBM, native features of those packages also come for free in GBNet. In particular, (1) categorical inputs are supported in GBNet without using PyTorch embeddings and (2) SHAP values can be generated for interpretability.
Use Cases
A limited number of use cases will be covered in the talk:
- Forecasting - A hybrid model combining linear trends with seasonal patterns modeled by XGBoost achieves superior performance compared to standard methods like Meta’s Prophet. SHAP values can be used to understand periodicity and trend.
- Embeddings & Contrastive Learning - GBNet allows embeddings to be trained using tree-based methods, supporting applications such as contrastive learning and word embeddings, tasks traditionally dominated by neural networks. Low-dimensional embeddings can be fit to understand exact model dynamics.
PhD in Statistics from the University of Chicago
Previously:
- Head of Data Science at Uptake
- Technical Staff at SentiLink
Currently:
- AI and Data Consultant at AlixPartners