Research of Feature Selection for Text Clustering Based on Cloud Model

Authors
Publisher
ACADEMY PUBLISHER
Publication Date
Keywords
  • Feature Selection
  • Cloud Model
  • TF-IDF
  • K-Means Algorithm
Disciplines
  • Computer Science
  • Logic

Abstract

Text clustering is an unsupervised machine learning task, so the discriminability of class attributes cannot be measured during clustering, and traditional text feature selection methods cannot effectively handle the high dimensionality of text data. To overcome these weaknesses in existing feature selection, this paper proposes a new method that introduces cloud model theory into feature selection and constructs a cloud filter for clustering documents. The distribution of document words is modeled at a microscopic level. Using the digital characteristics of the cloud model, the separability between feature words can be computed more effectively. Experimental results with the K-means algorithm show that the method markedly improves the accuracy of text clustering.
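
The abstract does not spell out how the digital characteristics are obtained, so the following is only a minimal sketch, assuming the commonly used backward cloud generator (the variant without certainty degrees). It estimates the expectation Ex, entropy En, and hyper-entropy He from a list of numeric samples, which here stand in for one word's weights (for example TF-IDF values) across documents. The function name `backward_cloud`, the sample data, and the clamp to zero under the square root are illustrative assumptions, not the paper's own filter.

```python
import math
from statistics import mean, variance


def backward_cloud(samples):
    """Estimate cloud model characteristics (Ex, En, He) from samples.

    Assumption: standard backward cloud generator without certainty
    degrees; the paper's exact procedure is not given in the abstract.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Ex: the sample mean.
    ex = mean(samples)
    # En: mean absolute deviation from Ex, scaled by sqrt(pi / 2).
    en = math.sqrt(math.pi / 2) * mean([abs(x - ex) for x in samples])
    # He: square root of (sample variance - En^2). The difference can be
    # negative on real data, so it is clamped to zero here (an assumption).
    he = math.sqrt(max(0.0, variance(samples) - en ** 2))
    return ex, en, he


# Hypothetical weights of a single word across five documents.
word_weights = [1.0, 2.0, 3.0, 4.0, 5.0]
ex, en, he = backward_cloud(word_weights)
print(f"Ex={ex:.3f} En={en:.3f} He={he:.3f}")
```

A word whose weights are identical in every document yields En = 0 and He = 0 under this sketch, which suggests it carries no information for separating documents; how the paper turns these characteristics into a separability score is not stated in the abstract.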