
Mentored by: Applied Materials
Framework and application for evaluating explainability methods in CV

End-to-end explainability evaluation system supporting GradCAM, IG, SmoothGrad, FusionGrad, segmentation-based explainability, metric evaluation (Sensitivity, Completeness, Non-Sensitivity, Sparsity), robustness checks, dual-channel SEM, and integration with Quantus/Captum.
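The core flow pairs an explainer (e.g. Captum's Integrated Gradients) with a Quantus metric over the same batch. Below is a minimal sketch of that flow, assuming a PyTorch classifier and Quantus's keyword batch-call convention; the function and variable names are illustrative, not the project's actual API.

```python
import torch
import quantus
from captum.attr import IntegratedGradients

# Illustrative sketch only: `model` is assumed to be a PyTorch image classifier and
# `images`/`labels` a small normalized batch; names are examples, not the project's API.
def evaluate_ig_sparseness(model: torch.nn.Module,
                           images: torch.Tensor,
                           labels: torch.Tensor) -> list:
    model.eval()

    # 1) Explain: Integrated Gradients attributions via Captum.
    ig = IntegratedGradients(model)
    attributions = ig.attribute(images, target=labels)

    # 2) Evaluate: score the attributions with a Quantus complexity metric (Sparseness),
    #    using Quantus's keyword batch-call convention on numpy arrays.
    metric = quantus.Sparseness()
    return metric(model=model,
                  x_batch=images.detach().cpu().numpy(),
                  y_batch=labels.detach().cpu().numpy(),
                  a_batch=attributions.detach().cpu().numpy(),
                  device="cpu")
```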
Software Engineer

Applied Materials
Software Engineer

Applied Materials
Cohort: Data Science Bootcamp 2025 (Data)
Responsibilities:
Designed and implemented a unified Python framework for evaluating explainability metrics across classification, regression, segmentation, and detection models.
Standardized heterogeneous model outputs to enable consistent Quantus-based XAI evaluation.
Built a Factory–Registry adapter layer that auto-detects and loads TensorFlow, PyTorch, and ONNX models with a unified prediction/gradient API.
Developed detection wrappers (YOLO, Faster-RCNN, RetinaNet) that convert bounding boxes into scalar targets compatible with gradient-based XAI (a minimal sketch of this reduction follows the list).
Upgraded legacy TensorFlow SavedModels to TF 2.19 to restore gradient flow for Integrated Gradients and CAM-based methods.
Integrated proprietary Applied Materials models and internal explainers into the unified evaluation pipeline.
Implemented a preprocessing/normalization pipeline for multi-channel SEM and Topography TIFF datasets while preserving gradient paths.
Added support for both single-image and site-level evaluation modes with consistent preprocessing semantics.
Built an automated metric-execution engine for running explainers and computing faithfulness, robustness, localisation, complexity and axiomatic metrics.
Generated automated radar-plot summaries and streamed progress/results through the backend job-runner architecture.
...and more contributions not listed here
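The detection wrapper mentioned in the list above needs to hand gradient-based explainers a single differentiable scalar per image, even though detectors emit boxes, objectness, and class scores. A minimal sketch of that reduction for a YOLO-style raw output tensor follows; the output layout, the wrapper class, and the reduction choice are assumptions for illustration, not the project's actual code.

```python
import torch
import torch.nn as nn

class DetectionScalarWrapper(nn.Module):
    """Hypothetical wrapper: reduce detector output to one differentiable scalar per image."""

    def __init__(self, detector: nn.Module, target_class: int):
        super().__init__()
        self.detector = detector
        self.target_class = target_class

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # Assumed YOLO-style raw output: (batch, num_boxes, 5 + num_classes),
        # laid out as [x, y, w, h, objectness, class_0, class_1, ...].
        raw = self.detector(images)
        objectness = raw[..., 4]
        class_score = raw[..., 5 + self.target_class]
        # Score each box by objectness * class score, keep the best box per image:
        # one scalar that gradient-based explainers (IG, CAM variants) can differentiate.
        return (objectness * class_score).max(dim=1).values
```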
Responsibilities:
Researched the Axiomatic category of Quantus XAI metrics.
Fixed the NonSensitivity metric and implemented it in the application (PR: "Fix NonSensitivity metric").
Studied the Detecting Concepts chapter of the book Interpretable ML and presented it to the group.
Created wrappers for AMAT's explain functions to match the interface expected by the Quantus metrics library.
Studied the Complexity metric ahead of its implementation in the application.
Prototyped a system that catches errors from metrics in a generic way and displays them to the user (a minimal sketch follows the list).
...and more contributions not listed here
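The generic error handling mentioned in the last item above can stay small: each metric runs inside a guard that converts exceptions into structured results the UI can display, so one failing metric does not abort the whole evaluation. A minimal sketch with hypothetical names:

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class MetricResult:
    """Outcome of one metric run: either a value or a user-facing error message."""
    name: str
    value: Optional[Any] = None
    error: Optional[str] = None

def run_metric_safely(name: str, metric_fn: Callable[[], Any]) -> MetricResult:
    """Run a single metric and turn any failure into a displayable result."""
    try:
        return MetricResult(name=name, value=metric_fn())
    except Exception as exc:  # deliberately broad: report the failure, do not abort the run
        return MetricResult(name=name, error=f"{type(exc).__name__}: {exc}")

# Usage (hypothetical):
# result = run_metric_safely("NonSensitivity", lambda: metric(model=m, x_batch=x, y_batch=y, a_batch=a))
```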
Responsibilities:
Designed a stable 0–1 scoring function that can wrap any regression model, so faithfulness metrics produce accurate and comparable scores across different models.
Implemented a consistent input-processing layer supporting PyTorch, TensorFlow, YOLO, and ONNX models, ensuring repeatable and framework-agnostic explainability results.
Designed a wrapper that converts multi-channel mask outputs into a single differentiable scalar, allowing segmentation models to be explained using gradient-based methods (a minimal sketch follows the list).
Exposed and wired key explainer and metric parameters so users can highlight and tune the most influential settings directly from the application UI.
Developed a custom pixel-removal pipeline for 2-channel SEM images, iterating through multiple experiments and refinements until achieving a stable faithfulness evaluation.
Led the research and integration of Quantus faithfulness metrics into the application, enabling reliable quantitative evaluation of explanation quality across a wide range of computer vision models (both Applied and external).
After exploring alternative designs, implemented a loading mode that preserves gradient flow and exposes selected internal layers, enabling gradient-based explainability without changing model code.
Built GUI components and a heatmaps library that display original images, overlays, and metric tables, and allow users to save and revisit multiple explanation runs per model.
Designed and implemented a dynamic form system that fetches explainer and metric metadata from the server and sends back validated, structured configurations, keeping the frontend and backend loosely coupled.
Designed and ran experiments comparing several explainability methods on Applied’s regression and segmentation models, analyzing their behavior and summarizing insights for the team.
...and more contributions not listed here
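The segmentation wrapper referenced in the list above reduces a multi-channel mask to one differentiable scalar so classification-style explainers can be reused unchanged. A minimal sketch, assuming a PyTorch segmentation model that returns per-class logits of shape (batch, classes, H, W); the class name and the mean-probability reduction are illustrative choices:

```python
import torch
import torch.nn as nn

class MaskToScalar(nn.Module):
    """Hypothetical wrapper: collapse a multi-channel segmentation mask into one scalar."""

    def __init__(self, seg_model: nn.Module, target_class: int):
        super().__init__()
        self.seg_model = seg_model
        self.target_class = target_class

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        logits = self.seg_model(images)        # assumed shape: (batch, classes, H, W)
        probs = torch.softmax(logits, dim=1)   # per-pixel class probabilities
        # Mean probability of the target class over all pixels gives one differentiable
        # scalar per image, so gradient-based explainers (IG, GradCAM) can be applied.
        return probs[:, self.target_class].mean(dim=(1, 2))
```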
Responsibilities:
Structured the backend flow (model → explanation → metrics → UI streaming), making the system modular, scalable, and capable of running long XAI jobs smoothly.
Developed a unified model interface that enables supporting multiple model formats and tasks within one consistent explainability pipeline.
Implemented model-loading logic that preserves differentiability across formats, ensuring gradient-based XAI methods operate reliably.
Designed normalization methods that stabilize Quantus metrics such as Complexity and Sparsity, enabling consistent, comparable scores across diverse models and heatmap behaviors.
Integrated the Randomisation metric into the evaluation flow and verified its behavior by inspecting model kernels, confirming that the scores provide a reliable check of model behavior.
Implemented spider-chart generation and supporting logic to present multi-metric results clearly and enable intuitive comparison.
Added live progress, heatmap, and metric streaming to the UI, enabling responsive interaction during long-running explainability evaluations.
Built metric postprocessing that aggregates, scales, and normalizes outputs into uniform, interpretable 0–1 evaluation ranges (a minimal sketch follows the list).
...and more contributions not listed here
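The postprocessing mentioned in the last item above amounts to aggregating per-sample scores and mapping each metric onto a shared 0–1 scale (flipping metrics where lower is better) so they can be plotted together on one spider chart. A minimal sketch with assumed per-metric bounds; the function and its parameters are illustrative:

```python
import numpy as np

def postprocess_metric(scores, lower: float, upper: float, higher_is_better: bool = True) -> float:
    """Aggregate per-sample metric scores and map them onto a shared 0-1 scale."""
    value = float(np.nanmean(scores))              # aggregate over samples, ignoring NaNs
    value = (value - lower) / (upper - lower)      # min-max scale with assumed per-metric bounds
    value = float(np.clip(value, 0.0, 1.0))
    return value if higher_is_better else 1.0 - value   # flip metrics where lower is better
```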