Extreme value correction: a method for correcting optimistic estimations in rule learning
- Type: Published Article
- Journal: Machine Learning
- Publisher: Springer US
- Publication Date: Jun 26, 2018
- Volume: 108
- Issue: 2
- Pages: 297–329
- Identifiers: DOI: 10.1007/s10994-018-5731-3
- Source: Springer Nature
- License: Yellow
Abstract
Machine learning algorithms rely on their ability to evaluate the constructed hypotheses, both for choosing the optimal hypothesis during learning and for assessing the quality of the model afterwards. Since these estimates, in particular the former ones, are based on the training data from which the hypotheses themselves were constructed, they are usually optimistic. The paper presents three solutions: two for the artificial boundary cases with the smallest and the largest optimism, and a general correction procedure called extreme value correction (EVC), based on the extreme value distribution. We demonstrate the application of the technique to rule learning, specifically to estimating the classification accuracy of a single rule, and evaluate it on an artificial data set and on a number of UCI data sets. We observed that the correction successfully improved the accuracy estimates. We also describe an approach for combining rules into a linear global classifier and show that using EVC estimates leads to more accurate classifiers.
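
The optimism described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's EVC procedure. It only shows, under assumed settings, why the training accuracy of the best of many candidate rules overstates its true accuracy. The function names, the sample sizes, and the use of purely random "rules" on noise labels (true accuracy 0.5) are all assumptions made for illustration.

```python
import random


def best_rule_accuracy(n_examples, n_rules, rng):
    """Training accuracy of the best of n_rules random rules on noise labels."""
    labels = [rng.random() < 0.5 for _ in range(n_examples)]
    best = 0.0
    for _ in range(n_rules):
        # Hypothetical "rule": an independent coin flip per example,
        # so its true accuracy is exactly 0.5.
        predictions = [rng.random() < 0.5 for _ in range(n_examples)]
        correct = sum(p == y for p, y in zip(predictions, labels))
        best = max(best, correct / n_examples)
    return best


def mean_best_accuracy(n_examples=50, n_rules=100, n_trials=200, seed=0):
    """Average the selected (maximum) training accuracy over repeated trials."""
    rng = random.Random(seed)
    total = sum(
        best_rule_accuracy(n_examples, n_rules, rng) for _ in range(n_trials)
    )
    return total / n_trials


# Evaluating a single rule gives a roughly unbiased estimate near 0.5;
# picking the best of 100 rules inflates the estimate well above 0.5.
single = mean_best_accuracy(n_rules=1)
selected = mean_best_accuracy(n_rules=100)
print(f"one rule: {single:.3f}, best of 100 rules: {selected:.3f}")
```

Because the selected estimate is the maximum over many candidates, its distribution is that of an extreme value, which is the motivation the abstract gives for basing the correction on the extreme value distribution.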