Implemented Monitors

ASH

🔗 https://openreview.net/pdf?id=ndYXTEL6cZz

💻 monitizer.monitors.ASHMonitor

ℹ️ The method, Activation Shaping (ASH), modifies a model’s internal activations at test time by pruning a large portion of the lowest activation values at a chosen layer (everything below a given percentile). Depending on the variant, it then leaves the surviving activations unchanged, rescales them, or binarizes them. This reshaping emphasizes the structural differences between in-distribution and OOD inputs at the feature level.

⚙️ -m ash or -m ash_b or -m ash_p or -m ash_s or -m ash_rand (for the pruning strategies)

🔧 Parameters: threshold, layer, percentile
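As a rough illustration of the three shaping variants described in the ASH paper (prune, scale, binarize), the following NumPy sketch operates on a single activation vector. It is not Monitizer's implementation; the function names and the `percentile` default are assumptions made for this example, and the input is assumed to be non-negative (post-ReLU).

```python
import numpy as np

def ash_prune(activations, percentile=90.0):
    """ASH-P: zero out every activation below the given percentile."""
    cutoff = np.percentile(activations, percentile)
    return np.where(activations >= cutoff, activations, 0.0)

def ash_scale(activations, percentile=90.0):
    """ASH-S: prune, then scale the survivors by exp(sum_before / sum_after)."""
    pruned = ash_prune(activations, percentile)
    return pruned * np.exp(activations.sum() / pruned.sum())

def ash_binarize(activations, percentile=90.0):
    """ASH-B: prune, then set every survivor to sum_before / number_kept."""
    kept = ash_prune(activations, percentile) > 0
    return np.where(kept, activations.sum() / kept.sum(), 0.0)

rng = np.random.default_rng(0)
acts = rng.random(100) + 0.1  # stand-in for positive post-ReLU activations
pruned = ash_prune(acts)
```

The shaped activations are then passed through the remaining layers, and a logit-based score is computed on the result.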

Box

🔗 https://ebooks.iospress.nl/publication/55170

💻 monitizer.monitors.BoxMonitor

ℹ️ This paper introduces an OOD detection method that monitors a neural network’s hidden layer activations by constructing multidimensional “boxes” representing the range of activation values seen during training. At inference, inputs are flagged as OOD if their activations fall outside these learned boxes, indicating novel or unseen behavior.

⚙️ -m box

🔧 Parameters: layer_indices, clustering_threshold
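The box idea can be sketched in a few lines. This simplified version builds a single box per class and skips the clustering step that `clustering_threshold` suggests; `fit_boxes` and `is_ood` are hypothetical names for this example, not part of Monitizer.

```python
import numpy as np

def fit_boxes(features, labels):
    """Per class, record the min and max of every monitored neuron."""
    boxes = {}
    for c in np.unique(labels):
        class_features = features[labels == c]
        boxes[int(c)] = (class_features.min(axis=0), class_features.max(axis=0))
    return boxes

def is_ood(feature, predicted_class, boxes):
    """Flag the input if any neuron leaves the box of the predicted class."""
    lower, upper = boxes[predicted_class]
    return bool(np.any(feature < lower) or np.any(feature > upper))

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 4))      # stand-in for hidden-layer activations
labels = np.repeat([0, 1], 100)
boxes = fit_boxes(train, labels)
```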

DICE

🔗 https://arxiv.org/abs/2111.09805

💻 monitizer.monitors.DICEMonitor

ℹ️ This paper proposes a sparsification-based OOD detection method called DICE, which ranks network weights by their contribution and selectively uses only the most salient weights during inference. By pruning less important weights, the method reduces noise in the network’s output, leading to more stable and distinct responses for out-of-distribution inputs. This weight-level sparsification sharpens the output distribution, improving the separability between in-distribution and OOD data.

⚙️ -m dice

🔧 Parameters: threshold, percentile
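A minimal sketch of the sparsification step, assuming a final linear layer with weights of shape (classes, features) and contributions measured as weight times mean training activation. The names and the `keep_fraction` parameter are illustrative assumptions, not Monitizer's API.

```python
import numpy as np

def dice_mask(weights, mean_features, keep_fraction=0.1):
    """Keep only the weights with the largest average contribution."""
    contribution = weights * mean_features   # shape (classes, features)
    k = max(1, int(round(keep_fraction * contribution.size)))
    cutoff = np.sort(contribution, axis=None)[-k]
    return contribution >= cutoff

def sparse_logits(feature, weights, bias, mask):
    """Logits computed with the masked (sparsified) weight matrix."""
    return (weights * mask) @ feature + bias

rng = np.random.default_rng(0)
weights = rng.normal(size=(10, 50))
bias = np.zeros(10)
train_features = rng.random((200, 50))   # stand-in for penultimate features
mask = dice_mask(weights, train_features.mean(axis=0))
```

An OOD score (for example the energy of the sparse logits) is then computed on the output of `sparse_logits`.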

Energy

🔗 https://arxiv.org/abs/2302.02914

💻 monitizer.monitors.EnergyMonitor

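ℹ️ The method scores inputs with an energy function computed from the logits (the negative, temperature-scaled log-sum-exp) instead of the traditional softmax confidence. In-distribution inputs tend to receive lower energy than OOD inputs.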

⚙️ -m energy

🔧 Parameters: threshold, temperature
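For reference, the energy of a logit vector can be computed as below. This is a generic sketch of the formula E(x) = -T · log Σ exp(f_i(x) / T), with `energy_score` as a hypothetical helper name.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Negative temperature-scaled log-sum-exp of the logits."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max()  # subtract the maximum for numerical stability
    return float(-temperature * (m + np.log(np.exp(z - m).sum())))

confident = energy_score([10.0, 0.0, 0.0])
flat = energy_score([0.0, 0.0, 0.0])
```

A confident prediction yields a lower energy than a flat logit vector, which is what the threshold exploits.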

Entropy

🔗 https://arxiv.org/abs/1908.05569

💻 monitizer.monitors.EntropyMonitor

ℹ️ The method introduces the IsoMax loss function, a drop-in replacement for the standard SoftMax loss, and scores inputs by the entropy of the network’s output distribution: high entropy indicates an uncertain prediction and therefore a likely OOD input.

⚙️ -m entropy

🔧 Parameters: threshold
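Assuming the score is the Shannon entropy of the softmax output (an assumption about the monitor, not taken from its source code), a minimal version looks like this:

```python
import numpy as np

def softmax_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())

uniform = softmax_entropy([0.0, 0.0, 0.0, 0.0])
peaked = softmax_entropy([8.0, 0.0, 0.0, 0.0])
```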

Gaussian

🔗 https://doi.org/10.1007/978-3-030-88494-9_14

💻 monitizer.monitors.GaussMonitor

ℹ️ The method introduces a lightweight approach for OOD detection in deep neural networks by modeling the activation patterns of neurons using Gaussian distributions. It monitors the activation values of neurons during inference and compares them to the learned Gaussian models to identify deviations indicative of OOD inputs.

⚙️ -m gauss

🔧 Parameters: layer_indices, thresholds
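One simple way to realize this idea is to fit a mean and standard deviation per neuron and count how many neurons fall outside a k-sigma interval. The sketch below makes that assumption; the function names and the value of `k` are illustrative and do not come from Monitizer.

```python
import numpy as np

def fit_neuron_gaussians(train_activations):
    """Mean and standard deviation of every monitored neuron."""
    return train_activations.mean(axis=0), train_activations.std(axis=0)

def outlier_fraction(activation, mean, std, k=3.0):
    """Fraction of neurons whose value lies outside mean +/- k * std."""
    outside = np.abs(activation - mean) > k * std
    return float(outside.mean())

rng = np.random.default_rng(0)
train = rng.normal(loc=1.0, scale=0.5, size=(500, 16))
mean, std = fit_neuron_gaussians(train)
```

An input would be flagged when the fraction exceeds a chosen threshold.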

GradNorm

🔗 https://proceedings.neurips.cc/paper_files/paper/2021/file/063e26c670d07bb7c4d30e6fc69fe056-Paper.pdf

💻 monitizer.monitors.GradNormMonitor

ℹ️ The method detects OOD inputs using information from the gradient space. Specifically, it computes the vector norm of gradients backpropagated from the Kullback-Leibler (KL) divergence between the softmax output and a uniform distribution. This norm tends to be larger for in-distribution inputs than for OOD inputs.

⚙️ -m gradnorm

🔧 Parameters: threshold, temperature
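When the gradient is taken only with respect to the weights of a final linear layer, the L1 norm has a closed form: the L1 norm of the feature vector times the L1 distance between the softmax output and the uniform distribution, divided by the temperature. The sketch below uses that restriction as an assumption; `gradnorm_score` is a hypothetical name.

```python
import numpy as np

def gradnorm_score(feature, weights, bias, temperature=1.0):
    """L1 norm of d KL(uniform || softmax) / d weights for a linear last layer."""
    z = (weights @ feature + bias) / temperature
    z = z - z.max()
    probs = np.exp(z) / np.exp(z).sum()
    uniform = 1.0 / probs.size
    # The gradient is the outer product (probs - uniform) x feature / T,
    # so its L1 norm factorizes into the two sums below.
    return float(np.abs(feature).sum() * np.abs(probs - uniform).sum() / temperature)

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 4))
bias = rng.normal(size=3)
feature = rng.random(4)
score = gradnorm_score(feature, weights, bias)
```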

KLMatching

🔗 https://arxiv.org/abs/1911.11132

💻 monitizer.monitors.KLMatchingMonitor

ℹ️ This method compares the softmax output for a test input with mean class-conditional softmax distributions estimated from in-distribution data, and uses the minimum KL divergence to any of these distributions as the score.

⚙️ -m klmatching

🔧 Parameters: threshold
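A minimal sketch of this matching step, assuming the class-conditional templates are obtained by averaging softmax outputs grouped by predicted class. The helper names are hypothetical, and the synthetic data is chosen so that every class is predicted at least once.

```python
import numpy as np

def fit_class_templates(probs):
    """Average softmax output over all samples predicted as each class."""
    preds = probs.argmax(axis=1)
    return np.stack([probs[preds == c].mean(axis=0) for c in range(probs.shape[1])])

def kl_matching_score(p, templates, eps=1e-12):
    """Minimum KL(p || template) over all class templates (larger = more likely OOD)."""
    kl = np.sum(p * (np.log(p + eps) - np.log(templates + eps)), axis=1)
    return float(kl.min())

rng = np.random.default_rng(0)
labels = np.arange(300) % 3
logits = rng.normal(size=(300, 3)) + 5.0 * np.eye(3)[labels]
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
templates = fit_class_templates(probs)
```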

K-Nearest-Neighbors

🔗 https://proceedings.mlr.press/v162/sun22d.html

💻 monitizer.monitors.KNNMonitor

ℹ️ The method detects OOD samples by measuring distances between a test input’s normalized deep feature representation and those of the training set. It identifies a sample as OOD if the distance to its k-th nearest neighbor in the feature space exceeds a defined threshold.

⚙️ -m knn

🔧 Parameters: threshold
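The distance computation can be sketched with a brute-force search. The function name and the default `k` are assumptions for this example; a real deployment would typically use an approximate nearest-neighbor index rather than a full scan.

```python
import numpy as np

def knn_score(feature, train_features, k=5):
    """Distance from the normalized feature to its k-th nearest training feature."""
    z = feature / np.linalg.norm(feature)
    bank = train_features / np.linalg.norm(train_features, axis=1, keepdims=True)
    distances = np.linalg.norm(bank - z, axis=1)
    return float(np.sort(distances)[k - 1])

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 8)) + 3.0   # one cluster of training features
in_dist = knn_score(train[0], train)
far_away = knn_score(np.full(8, -3.0), train)
```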

Mahalanobis Distance

🔗 https://papers.nips.cc/paper_files/paper/2018/file/abdeb6f575ac5c6676b747bca8d09cc2-Paper.pdf

💻 monitizer.monitors.MDSMonitor

ℹ️ The method detects OOD samples by modeling the feature representations of a pre-trained neural network using class-conditional Gaussian distributions. It computes the Mahalanobis distance between a test sample’s features and the closest class distribution, using this distance as a confidence score.

⚙️ -m mds

🔧 Parameters: threshold, perturbation, layer_indices, weights
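The core computation, for a single layer and without the input perturbation or layer weighting that the parameters above hint at, might look as follows. The names are illustrative, and class labels are assumed to be the integers 0..C-1.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Class means plus one covariance matrix shared by all classes."""
    means = np.stack([features[labels == c].mean(axis=0) for c in range(labels.max() + 1)])
    centered = features - means[labels]
    covariance = centered.T @ centered / len(features)
    return means, np.linalg.pinv(covariance)

def mahalanobis_score(feature, means, precision):
    """Squared Mahalanobis distance to the closest class mean (larger = more likely OOD)."""
    diff = means - feature
    return float(np.einsum("ci,ij,cj->c", diff, precision, diff).min())

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 4)) + np.where(labels[:, None] == 0, 3.0, -3.0)
means, precision = fit_class_gaussians(features, labels)
```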

Maximum Softmax Probability

🔗 https://arxiv.org/abs/1610.02136

💻 monitizer.monitors.MSPMonitor

ℹ️ This method detects OOD inputs by examining the maximum softmax probability output by a neural network. It operates on the principle that in-distribution samples tend to produce higher maximum softmax probabilities compared to misclassified or OOD samples.

⚙️ -m msp

🔧 Parameters: threshold
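The score itself is a one-liner on top of a numerically stable softmax. `max_softmax_probability` is a hypothetical helper name used only for this sketch.

```python
import numpy as np

def max_softmax_probability(logits):
    """Largest class probability after applying softmax to the logits."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()  # subtract the maximum for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return float(probs.max())

uncertain = max_softmax_probability([0.0, 0.0, 0.0, 0.0])
confident = max_softmax_probability([6.0, 0.0, 0.0, 0.0])
```

An input would be treated as OOD when this value falls below the chosen threshold.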

Maximum Logit

🔗 https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Decoupling_MaxLogit_for_Out-of-Distribution_Detection_CVPR_2023_paper.pdf

💻 monitizer.monitors.MaxLogitMonitor

ℹ️ This method uses the largest logit (pre-softmax output) as the confidence score.

⚙️ -m maxlogit

🔧 Parameters: threshold

ODIN

🔗 https://arxiv.org/abs/1706.02690

💻 monitizer.monitors.ODINMonitor

ℹ️ The method enhances OOD detection by applying two key techniques to a pre-trained neural network: temperature scaling and input perturbation. Temperature scaling adjusts the softmax outputs to be less confident, while input perturbation involves adding small, carefully crafted noise to the input, which amplifies the distinction between in-distribution and OOD samples.

⚙️ -m odin

🔧 Parameters: threshold, perturbation, confidence_threshold
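To keep the example free of an autograd dependency, the sketch below applies both steps to a hypothetical linear softmax classifier, for which the input gradient can be written analytically. The function names, the toy model, and the default values for `temperature` and `epsilon` are assumptions for illustration only.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def odin_score(x, weights, bias, temperature=1000.0, epsilon=0.0014):
    """Max temperature-scaled softmax probability after a small input perturbation."""
    probs = softmax((weights @ x + bias) / temperature)
    top = probs.argmax()
    # For a linear model, d log p_top / dx = W^T (onehot(top) - p) / T.
    grad = weights.T @ (np.eye(len(probs))[top] - probs) / temperature
    # Step in the direction that increases the top-class probability.
    x_perturbed = x + epsilon * np.sign(grad)
    return float(softmax((weights @ x_perturbed + bias) / temperature).max())

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 5))
bias = np.zeros(3)
x = rng.normal(size=5)
```

With a deep network, the gradient would instead come from backpropagation through the model.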

RMD

🔗 https://arxiv.org/pdf/2106.09022.pdf

💻 monitizer.monitors.RMDMonitor

ℹ️ This method (Relative Mahalanobis Distance) refines the Mahalanobis distance by subtracting from each class-conditional distance the Mahalanobis distance to a single background Gaussian fitted to all training data, regardless of class. This removes the contribution of feature directions that do not discriminate between classes and improves robustness to hyperparameter choices.

⚙️ -m rmd

🔧 Parameters: threshold, layer_indices
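A single-layer sketch of the relative distance, following the formulation in the linked paper (class-conditional distance minus distance to a class-independent background Gaussian). The names are hypothetical, and labels are assumed to be the integers 0..C-1.

```python
import numpy as np

def fit_rmd(features, labels):
    """Class Gaussians with shared covariance plus one background Gaussian."""
    means = np.stack([features[labels == c].mean(axis=0) for c in range(labels.max() + 1)])
    centered = features - means[labels]
    precision = np.linalg.pinv(centered.T @ centered / len(features))
    bg_mean = features.mean(axis=0)
    bg_centered = features - bg_mean
    bg_precision = np.linalg.pinv(bg_centered.T @ bg_centered / len(features))
    return means, precision, bg_mean, bg_precision

def rmd_score(feature, means, precision, bg_mean, bg_precision):
    """Minimum over classes of (class distance - background distance)."""
    diff = means - feature
    class_dist = np.einsum("ci,ij,cj->c", diff, precision, diff)
    bg_diff = feature - bg_mean
    return float((class_dist - bg_diff @ bg_precision @ bg_diff).min())

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 4)) + np.where(labels[:, None] == 0, 3.0, -3.0)
params = fit_rmd(features, labels)
```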

ReAct

🔗 https://arxiv.org/abs/2111.12797

💻 monitizer.monitors.ReActMonitor

ℹ️ The method mitigates the overconfidence of neural networks on unfamiliar inputs by clipping high-magnitude activations in the penultimate layer during inference, which suppresses the anomalous activation patterns characteristic of OOD data.

⚙️ -m react

🔧 Parameters: threshold, layer, maximum
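The clipping step can be sketched as follows, assuming the clip value is chosen as a percentile of the activations observed on in-distribution data. The names and the percentile default are illustrative, not Monitizer's.

```python
import numpy as np

def fit_clip_value(train_activations, percentile=90.0):
    """Choose the clip value as a percentile of all training activations."""
    return np.percentile(train_activations, percentile)

def react(activations, clip_value):
    """Clip every activation from above at clip_value."""
    return np.minimum(activations, clip_value)

rng = np.random.default_rng(0)
train_acts = rng.random((500, 16)) * 2.0   # stand-in for penultimate activations
clip = fit_clip_value(train_acts)

spiky = np.full(16, 0.5)
spiky[3] = 40.0                            # one abnormally large activation
clipped = react(spiky, clip)
```

The clipped activations are then fed to the final layer, and a logit-based score is computed on the result.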

SHE

🔗 https://openreview.net/pdf?id=KkazG4lgKL

💻 monitizer.monitors.SHEMonitor

ℹ️ The method detects OOD samples by leveraging the energy function from Modern Hopfield Networks in a store-then-compare framework. It stores class-wise average representations from the penultimate layer of a neural network as prototypes for ID data. At inference, the method computes the Hopfield energy between a test sample’s representation and these stored prototypes; a higher energy indicates a greater discrepancy, suggesting the sample is OOD. This approach is hyperparameter-free and computationally efficient, relying solely on in-distribution data patterns without requiring model retraining.

⚙️ -m she

🔧 Parameters: threshold
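A minimal sketch of the store-then-compare idea, assuming the simplified energy is the negative inner product between the test feature and the stored pattern of its predicted class, and that only correctly classified training samples contribute to the patterns. All names are hypothetical.

```python
import numpy as np

def fit_patterns(features, labels, predictions, num_classes):
    """Store the mean feature of correctly classified samples for each class."""
    correct = labels == predictions
    return np.stack([features[correct & (labels == c)].mean(axis=0)
                     for c in range(num_classes)])

def she_energy(feature, predicted_class, patterns):
    """Negative inner product with the stored pattern (higher = more likely OOD)."""
    return -float(feature @ patterns[predicted_class])

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 8)) + np.where(labels[:, None] == 0, 3.0, -3.0)
predictions = labels.copy()
predictions[:5] = 1   # pretend a few training samples were misclassified
patterns = fit_patterns(features, labels, predictions, num_classes=2)
```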

Temperature Scaling

🔗 https://arxiv.org/pdf/1706.04599.pdf

💻 monitizer.monitors.TemperatureScalingMonitor

ℹ️ The method addresses the miscalibration often observed in modern neural networks, where predicted probabilities do not accurately reflect true correctness likelihoods. It introduces a single scalar parameter, the temperature T, applied to the logits (pre-softmax outputs) of the model. By dividing the logits by T before applying the softmax function, the method adjusts the confidence levels of the predictions without altering the model’s accuracy.

⚙️ -m temperature or -m scaling

🔧 Parameters: threshold
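The effect of the temperature is easy to see on a toy logit vector. `scaled_softmax` is a hypothetical helper for this sketch; in practice T is fitted on held-out data rather than chosen by hand.

```python
import numpy as np

def scaled_softmax(logits, temperature):
    """Softmax of logits / temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

logits = np.array([4.0, 1.0, 0.5])
original = scaled_softmax(logits, temperature=1.0)
softened = scaled_softmax(logits, temperature=2.0)
```

Because dividing by a positive T preserves the ordering of the logits, the predicted class stays the same while the top probability shrinks for T > 1.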

VIM

🔗 https://arxiv.org/abs/2203.10807

💻 monitizer.monitors.VIMMonitor

ℹ️ The method detects OOD inputs by creating a virtual logit that represents an unknown class, computed by projecting input features onto the orthogonal complement of the principal feature subspace and rescaling the norm of that residual. This virtual logit is appended to the original class logits, and its softmax probability is used as the OOD score.

⚙️ -m vim

🔧 Parameters: threshold, dimension
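The steps above can be sketched for a final linear layer with weights of shape (classes, features). The function names are hypothetical, `dimension` is assumed here to be the size of the principal subspace, and the synthetic data is constructed so that the logits ignore the last three feature dimensions.

```python
import numpy as np

def fit_vim(train_features, weights, bias, dimension):
    """Estimate the origin, the residual subspace, and the scaling factor alpha."""
    origin = -np.linalg.pinv(weights) @ bias
    centered = train_features - origin
    _, eigvecs = np.linalg.eigh(centered.T @ centered)   # ascending eigenvalues
    residual_basis = eigvecs[:, : centered.shape[1] - dimension]
    residual_norms = np.linalg.norm(centered @ residual_basis, axis=1)
    max_logits = (train_features @ weights.T + bias).max(axis=1)
    alpha = max_logits.sum() / residual_norms.sum()
    return origin, residual_basis, alpha

def vim_score(feature, weights, bias, origin, residual_basis, alpha):
    """Softmax probability of the virtual logit (larger = more likely OOD)."""
    logits = weights @ feature + bias
    virtual = alpha * np.linalg.norm((feature - origin) @ residual_basis)
    z = np.append(logits, virtual)
    z = z - z.max()
    return float(np.exp(z)[-1] / np.exp(z).sum())

rng = np.random.default_rng(0)
train = np.hstack([rng.normal(scale=3.0, size=(200, 3)),
                   rng.normal(scale=0.1, size=(200, 3))])
weights = np.hstack([rng.normal(size=(3, 3)), np.zeros((3, 3))])
bias = np.array([0.1, -0.2, 0.3])
origin, basis, alpha = fit_vim(train, weights, bias, dimension=3)

near = vim_score(train[0], weights, bias, origin, basis, alpha)
shifted = train[0].copy()
shifted[5] += 50.0   # move far along a direction the training data barely uses
far = vim_score(shifted, weights, bias, origin, basis, alpha)
```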
