Why Normalizing Flows Fail to Detect Out-of-Distribution Data

Summary

This paper aims to understand the fundamental reasons why normalizing flows fail at OOD detection, rather than treating flows as black boxes. The authors hypothesize that it is the inductive biases of the model that determine its OOD performance. Through experiments and visualizations of the transformations learned by individual coupling layers, they find that flows are biased towards learning graphical properties, such as local pixel correlations and generic image-to-latent-space transformations, rather than semantic properties. Based on these observations, the authors discuss two simple ways of changing the inductive biases for better OOD detection: changing the masking strategy and adding a bottleneck to the st-networks. Moreover, they show that flows are much better at OOD detection on image embeddings than on the original image datasets.

Motivation

To understand the fundamental reasons why normalizing flows fail at OOD detection.

Background

  1. Normalizing flows seem to be an ideal candidate for OOD detection: a simple generative model that provides an exact likelihood.
  2. Normalizing flows, like other deep generative models, can assign higher likelihood to OOD data than to in-distribution data.
  3. There are two lines of work on OOD detection with deep generative models: group anomaly detection (GAD) and point anomaly detection (PAD).
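
For reference, the exact likelihood in item 1 comes from the change-of-variables formula: a flow $f$ invertibly maps an image $x$ to a latent $z = f(x)$ with a simple base density $p_Z$, so

$$\log p_X(x) = \log p_Z\big(f(x)\big) + \log\left|\det \frac{\partial f(x)}{\partial x}\right|.$$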

Conclusion

Flows tend to learn representations that achieve high likelihood through generic graphical features and local pixel correlations, rather than discovering semantic structure that would be specific to the training distribution.

Future work:

  • A careful study of sampling in normalizing flows;
  • A full study of other types of flows.

Note

Abstract:

  • Flows fail at OOD detection;
  • Flows (based on coupling layers) learn local pixel correlations and generic image-to-latent-space transformations; the paper studies these mechanisms carefully rather than treating flows as black boxes;
  • It is the inductive biases of the model that determine its OOD performance, so we can bias flows towards learning semantic information;
  • The very ability of flows to fit images with high fidelity is harmful to OOD detection. (The authors connect their findings with prior work and provide new insights.)

Why do flows fail to detect OOD data?

Semantic information is needed as an inductive bias for OOD detection, but normalizing flows are biased towards learning graphical properties rather than semantic properties.

Flow latent space

Through visualization of the latent space, the authors find that the flow represents images based on their graphical appearance rather than their semantic content.

Transformations learned by coupling layers

The authors study the transformations learned by individual coupling layers.

The likelihood of a given image is high whenever the coupling layers can accurately predict the masked pixels. The two masking strategies expose different mechanisms (see the sketch after this list):

  • Checkerboard mask: the st-networks predict each masked pixel from its visible neighbors, exploiting local pixel correlations;
  • Horizontal mask: the st-networks reuse information about the masked half that earlier layers encoded in the visible half, i.e. coupling-layer co-adaptation.
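
As an illustration of the mechanism above, here is a minimal sketch of a RealNVP-style affine coupling layer with a checkerboard mask in PyTorch. The shapes and the MLP st-network are illustrative assumptions, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

def checkerboard_mask(h, w):
    """Binary mask: 1 on visible pixels, 0 on masked pixels."""
    idx = torch.arange(h).view(-1, 1) + torch.arange(w).view(1, -1)
    return (idx % 2).float()

class AffineCoupling(nn.Module):
    """One affine coupling layer: masked pixels are transformed with a
    scale/shift that the st-network predicts from the visible pixels."""
    def __init__(self, c, h, w, hidden=256):
        super().__init__()
        self.register_buffer("mask", checkerboard_mask(h, w).view(1, 1, h, w))
        d = c * h * w
        self.st = nn.Sequential(           # st-network (illustrative MLP)
            nn.Flatten(),
            nn.Linear(d, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * d),      # outputs log-scale s and shift t
        )

    def forward(self, x):
        x_visible = x * self.mask          # the half the layer may look at
        s, t = self.st(x_visible).chunk(2, dim=1)
        s = torch.tanh(s.view_as(x)) * (1 - self.mask)  # stabilized log-scale
        t = t.view_as(x) * (1 - self.mask)
        # Visible pixels pass through unchanged; masked pixels get an
        # affine map predicted from the visible ones.
        z = x_visible + (1 - self.mask) * (x * torch.exp(s) + t)
        log_det = s.flatten(1).sum(dim=1)  # exact log|det J| of this layer
        return z, log_det

layer = AffineCoupling(c=3, h=32, w=32)
z, log_det = layer(torch.randn(8, 3, 32, 32))
```

The better the st-network predicts the masked pixels from the visible ones, the higher the likelihood the layer can assign, regardless of whether the image is in-distribution.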

Changing biases in flows for better OOD detection

The authors discuss two simple ways of changing the inductive biases for better OOD detection:

  • Changing the masking strategy: checkerboard mask, horizontal mask, and cycle-mask;
  • st-networks with a bottleneck: restricting the capacity of the st-networks prevents coupling-layer co-adaptation (a sketch follows below).

While these changes do not fully solve the OOD detection failure, the results support the observation that preventing flows from exploiting pixel correlations and co-adaptation improves OOD performance.
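
A minimal sketch of the bottleneck idea, assuming an MLP st-network like the one above; the bottleneck width is a hypothetical choice:

```python
import torch.nn as nn

def bottleneck_st(d, bottleneck=16):
    """st-network whose hidden layer is far narrower than its input, so a
    coupling layer cannot simply pass along detailed pixel information."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(d, bottleneck), nn.ReLU(),   # information bottleneck
        nn.Linear(bottleneck, 2 * d),          # still outputs s and t
    )
```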

Out-of-distribution detection using image embeddings

Normalizing flows can detect OOD images when trained on high-level semantic representations instead of raw pixels.

Flows are much better at OOD detection on image embeddings than on raw image pixels.
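
A minimal sketch of this setup, assuming a torchvision ResNet-18 as the feature extractor and the `nflows` library for the flow; `train_loader`, `test_images`, and `threshold` are hypothetical placeholders, not the paper's exact pipeline:

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights
from nflows import transforms, distributions, flows

# 1. Pretrained feature extractor with the classification head removed.
encoder = resnet18(weights=ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()          # now outputs 512-d embeddings
encoder.eval()

# 2. A small autoregressive flow over the 512-d embedding space.
dim = 512
transform = transforms.CompositeTransform([
    transforms.MaskedAffineAutoregressiveTransform(features=dim, hidden_features=256),
    transforms.RandomPermutation(features=dim),
    transforms.MaskedAffineAutoregressiveTransform(features=dim, hidden_features=256),
])
flow = flows.Flow(transform, distributions.StandardNormal(shape=[dim]))

# 3. Maximum-likelihood training on in-distribution embeddings.
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
for images, _ in train_loader:            # hypothetical in-distribution loader
    with torch.no_grad():
        emb = encoder(images)
    loss = -flow.log_prob(emb).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# 4. Low log-likelihood under the flow flags an image as OOD.
with torch.no_grad():
    scores = flow.log_prob(encoder(test_images))   # hypothetical test batch
is_ood = scores < threshold               # threshold tuned on validation data
```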

 

Related resources:

苏剑林. (Aug. 11, 2018). 《细水长flow之NICE:流模型的基本概念与实现》[Blog post]. Retrieved from https://kexue.fm/archives/5776

苏剑林. (Aug. 26, 2018). 《细水长flow之RealNVP与Glow:流模型的传承与升华》[Blog post]. Retrieved from https://kexue.fm/archives/5807
