KS(conf): A Light-Weight Test if a Multiclass Classifier Operates Outside of Its Specifications

Authors
  • Sun, Rémy1
  • Lampert, Christoph H.2
  • 1 ENS Rennes, Bruz, France
  • 2 IST Austria, Klosterneuburg, Austria
Type
Published Article
Journal
International Journal of Computer Vision
Publisher
Springer-Verlag
Publication Date
Oct 10, 2019
Volume
128
Issue
4
Pages
970–995
Identifiers
DOI: 10.1007/s11263-019-01232-x
Source
Springer Nature
License
Green

Abstract

We study the problem of automatically detecting whether a given multi-class classifier operates outside of its specifications (out-of-specs), i.e., on input data from a different distribution than the one it was trained on. This is an important problem to solve on the road towards creating reliable computer vision systems for real-world applications, because the quality of a classifier’s predictions cannot be guaranteed if it operates out-of-specs. Previously proposed methods for out-of-specs detection make decisions on the level of single inputs. This, however, is insufficient to achieve a low false positive rate and a high true positive rate at the same time. In this work, we describe a new procedure named KS(conf), based on statistical reasoning. Its main component is a classical Kolmogorov–Smirnov test that is applied to the set of predicted confidence values for batches of samples. Working with batches instead of single samples allows increasing the true positive rate without negatively affecting the false positive rate, thereby overcoming a crucial limitation of single-sample tests. We show by extensive experiments using a variety of convolutional network architectures and datasets that KS(conf) reliably detects out-of-specs situations even under conditions where other tests fail. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with any classifier that outputs confidence scores, and requires no a priori knowledge about how the data distribution could change.
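The core mechanism of the abstract — comparing the empirical distribution of a batch's confidence values against a reference distribution with a Kolmogorov–Smirnov test — can be sketched as follows. This is a minimal illustration, not the authors' code: the simulated confidence distributions, the batch size, and the function name `ks_statistic` are all assumptions made for the example.

```python
import numpy as np

def ks_statistic(reference, batch):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    reference = np.sort(np.asarray(reference))
    batch = np.sort(np.asarray(batch))
    all_vals = np.concatenate([reference, batch])
    # Empirical CDFs of both samples, evaluated at every observed value.
    cdf_ref = np.searchsorted(reference, all_vals, side="right") / len(reference)
    cdf_bat = np.searchsorted(batch, all_vals, side="right") / len(batch)
    return float(np.max(np.abs(cdf_ref - cdf_bat)))

rng = np.random.default_rng(0)
# Hypothetical in-spec classifier confidences: clustered near 1, as is
# typical for a well-trained network on its own data distribution.
reference = 1.0 - rng.beta(2, 20, size=5000)   # e.g. from a validation set
in_spec = 1.0 - rng.beta(2, 20, size=100)      # batch from the same distribution
out_spec = rng.uniform(0.0, 1.0, size=100)     # batch from a shifted distribution

d_in = ks_statistic(reference, in_spec)
d_out = ks_statistic(reference, out_spec)
```

A deployment would compare the statistic against a threshold chosen for a desired false positive rate; here the out-of-spec batch yields a much larger statistic than the in-spec one, which is exactly the separation the batch-level test exploits.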
