Reduction Techniques for Instance Based Text Categorization

Authors
Publisher
IFIP Advances in Information and Communication Technology (AICT)
Publication Date
Disciplines
  • Computer Science

Abstract

Two common problems in instance-based learning for text categorization are the high dimensionality of the feature space and the choice of which instances to store for use during generalization. Both can be addressed with reduction methods. This paper compares three techniques for reducing the feature space and one algorithm for reducing storage requirements. The techniques were combined with the k-NN (k-Nearest Neighbors) classifier, one of the top-performing methods for text classification tasks. We describe the benefits of this combination and present results on the Reuters-21578 dataset.

Full text at Springer (may require registration or a fee).
