07-07, 08:00–12:00 (US/Pacific), Ballroom A
The advancement of AI systems heightens the need for interpretability to address transparency, bias, risk, and regulatory compliance. This workshop teaches core interpretability techniques, including SHAP (game-theoretic feature attribution), GINI (decision-tree impurity analysis), LIME (local surrogate models), and Permutation Importance (feature shuffling), which provide global and local explanations of model decisions. Through hands-on building of interpretability tools and visualization techniques, we explore how these methods enable bias detection and clinical trust in healthcare diagnostics and support risk-assessment strategies in finance. These techniques are essential for building interpretable AI and addressing the challenges posed by black-box models.
The rapid adoption of artificial intelligence (AI) systems across industries has created an urgent need for transparency in algorithmic decision-making. As organizations deploy machine learning (ML) models for critical applications ranging from healthcare diagnostics to financial risk assessment, the opacity of these systems poses significant challenges to accountability, fairness, and regulatory compliance. Contemporary AI systems achieve remarkable predictive accuracy at the cost of interpretability. A 2025 analysis of Fortune 500 companies revealed that 78% of deployed ML models function as black boxes, with decision processes inaccessible even to their developers.
Explainable AI (XAI) makes AI-based decision processes comprehensible to human stakeholders and thus serves as a critical bridge between advanced computational capabilities and ethical implementation. In this workshop, we will explore the technical foundations, methodological innovations, and practical implementations of interpretable ML techniques, with a particular focus on SHAP (SHapley Additive exPlanations), GINI impurity-based analysis, LIME (Local Interpretable Model-agnostic Explanations), and Permutation Importance. Through detailed analysis of real-world applications, theoretical frameworks, and emerging research directions, we demonstrate how these tools enable practitioners to maintain model performance while meeting growing demands for explainability in high-stakes environments.
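To give a concrete sense of how such techniques look in code, the sketch below applies LIME to a single prediction of a random forest. It is a minimal illustration rather than the workshop's actual notebook: the synthetic dataset, feature names, and model choice are assumptions made for the example.

```python
# Minimal LIME sketch: explain one prediction of a black-box classifier
# with a local linear surrogate. Data, feature names, and model are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# LIME perturbs the neighborhood of one instance and fits an interpretable
# surrogate there, so the explanation is local to that single prediction.
explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["negative", "positive"],
                                 mode="classification")
explanation = explainer.explain_instance(X[0], model.predict_proba,
                                         num_features=5)
print(explanation.as_list())  # feature conditions and their local weights
```

The printed list pairs human-readable feature conditions with signed weights, indicating how each feature pushed this particular prediction toward or away from the positive class.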
Interpretable AI promotes transparency by highlighting which features drive a black-box model's decisions. This can increase trust in sectors that directly affect human lives, such as healthcare, finance, and criminal justice. By understanding how an AI model makes decisions, stakeholders can check that it treats minority and protected groups (defined by race, religion, gender, disability, or ethnicity) fairly.
This course is intended for data scientists and analysts who want to learn how to interpret black-box models such as decision trees, random forests, and other ensemble models. It covers two broad families of interpretability methods: model-dependent and model-agnostic techniques. Model-dependent techniques such as GINI importance rely on the internals of the underlying algorithm, whereas model-agnostic techniques such as SHAP, LIME, and Permutation Importance can analyze any model after training, as the sketch below illustrates.
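As a rough contrast between the two families (using an assumed random forest on synthetic data, not the workshop's dataset), the GINI importances are read directly off the fitted trees, while scikit-learn's permutation_importance treats the same model as a black box:

```python
# Sketch contrasting model-dependent and model-agnostic feature importance.
# The random forest and synthetic data are assumptions for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Model-dependent: GINI (mean decrease in impurity), read from the fitted trees.
print("GINI importances:", model.feature_importances_)

# Model-agnostic: shuffle each feature and measure the drop in test accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
print("Permutation importances:", result.importances_mean)
```

Because permutation importance only needs predictions and a scoring function, the same call would work unchanged for a neural network or any other trained estimator.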
The course will provide thorough hands-on experience by teaching how to code these methods and by working through real-world examples. Emphasis will be placed on visualization techniques, such as interpreting Summary Plots, Beeswarm Plots, Waterfall Plots, and Interaction Feature Maps (Dependence Plots), to understand how individual features and their interactions influence model outcomes.
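With the shap library, these visualizations amount to a few calls on a fitted model. The sketch below is illustrative only: the gradient-boosting classifier, synthetic data, and feature names are assumptions, shown to indicate the kinds of plots the workshop covers.

```python
# Sketch of the SHAP visualizations mentioned above: beeswarm/summary,
# waterfall, and dependence (scatter) plots. Model and data are illustrative.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 4)),
                 columns=["age", "bmi", "glucose", "blood_pressure"])
y = ((X["glucose"] + 0.5 * X["bmi"]
      + rng.normal(scale=0.5, size=500)) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)   # exact Shapley values for tree models
shap_values = explainer(X)              # one attribution per feature per row

shap.plots.beeswarm(shap_values)              # global summary of feature effects
shap.plots.waterfall(shap_values[0])          # local explanation of one prediction
shap.plots.scatter(shap_values[:, "glucose"],  # dependence plot, colored by the
                   color=shap_values)          # strongest interacting feature
```

The beeswarm gives a global view of how each feature pushes predictions up or down, the waterfall decomposes a single prediction into per-feature contributions, and the scatter (dependence) plot shows how one feature's attribution varies with its value and with an interacting feature.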
Prior experience with model interpretability is not required, but attendees should have a basic knowledge of ML models to get the most out of the tutorial. Familiarity with core Python data science libraries, such as NumPy, Pandas, and Scikit-Learn, is essential. The tutorial will be presented in Jupyter Notebook, enabling participants to follow along, execute examples, and complete exercises independently. A GitHub repository will be made available after the workshop, with instructions for setting up the Python environment and the required packages.
By the end of the tutorial, attendees will know how to apply the different interpretability techniques, compare their respective strengths and weaknesses, and build a strong foundation for using them in real-world scenarios.
Python 3.x
Libraries: scikit-learn, shap, lime, matplotlib, seaborn, numpy, pandas
Basic knowledge of machine learning models and feature importance
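Before the session, a short check along the following lines can confirm that the listed packages are installed; this is an illustrative snippet, not the official setup script from the workshop repository.

```python
# Quick, illustrative environment check for the packages listed above.
import importlib

for pkg in ["numpy", "pandas", "sklearn", "shap", "lime", "matplotlib", "seaborn"]:
    module = importlib.import_module(pkg)
    print(f"{pkg}: {getattr(module, '__version__', 'version not reported')}")
```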
Debarshi Datta, PhD is an Assistant Professor in Data Science at the Christine E. Lynn College of Nursing, Florida Atlantic University. He completed his PhD in Experimental Psychology at the Charles E. Schmidt College of Science, Florida Atlantic University. Dr. Datta has experience developing AI-driven decision support systems for healthcare data, including framing problem statements, handling disputes, exploratory data analysis, model building, data visualization, and data storytelling. His current research applies data-driven AI/ML methods to population-based disease prognosis. His primary research contributions involve assessing disease severity and developing models that identify the most significant features predicting mortality and disease severity, using traditional AI/ML techniques such as decision trees, random forest classifiers, XGBoost, and deep learning. In other work, he is building a model for early prediction of dementia. Dr. Datta has received many intramural grants and awards, including Early Prediction of Alzheimer's Disease and Related Dementias on Preclinical Assessment Data Using Machine Learning Tools, Smart Health seed funding for COVID-19 research, NSF I-Corps Customer Discovery funding, and the All of Us Institutional Champion Award, among others.