HINMINE: heterogeneous information network mining with information retrieval heuristics

Authors
  • Kralj, Jan (1, 2)
  • Robnik-Šikonja, Marko (3)
  • Lavrač, Nada (1, 2)
  • (1) Jožef Stefan Institute, Jamova 39, Ljubljana, 1000, Slovenia
  • (2) Jožef Stefan International Postgraduate School, Jamova 39, Ljubljana, 1000, Slovenia
  • (3) University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, 1000, Slovenia
Type
Published Article
Journal
Journal of Intelligent Information Systems
Publisher
Springer US
Publication Date
Jan 28, 2017
Volume
50
Issue
1
Pages
29–61
Identifiers
DOI: 10.1007/s10844-017-0444-9
Source
Springer Nature
License
Yellow

Abstract

The paper presents an approach to mining heterogeneous information networks by decomposing them into homogeneous networks. The proposed HINMINE methodology builds on previous work that classifies nodes in a heterogeneous network in two steps. In the first step, the heterogeneous network is decomposed into one or more homogeneous networks using different connecting nodes. We improve this step with new methods inspired by the weighting of bag-of-words vectors commonly used in information retrieval. These methods assign larger weights to nodes that are more informative and characteristic of a specific class of nodes. In the second step, the resulting homogeneous networks are used to classify data by either network propositionalization or label propagation. We propose an adaptation of the label propagation algorithm that handles imbalanced data, and we test several classification algorithms in propositionalization. The new methodology is tested on three data sets with different properties. For each data set, we perform a series of experiments comparing the heuristics used in the first step and the classifiers that can be used in the second step when performing network propositionalization. Our results show that HINMINE, using different network decomposition methods, can significantly improve the performance of the resulting classifiers, and that the modified label propagation algorithm is beneficial when the data set is imbalanced.
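
The two steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `decompose` and `propagate`, the idf-style weight log(N / degree), the damping factor `alpha`, and the 1 / (class size) seeding used to counter imbalance are all assumptions chosen for illustration, since the abstract does not give the exact formulas.

```python
import math
from collections import defaultdict
from itertools import combinations


def decompose(links, weighting="idf"):
    """Step 1 (sketch): build a weighted homogeneous network.

    links maps each connecting node (e.g. an author) to the set of base
    nodes (e.g. papers) it touches. Two base nodes get an edge for every
    connecting node they share. With weighting="idf" (an assumed heuristic)
    a connecting node shared by many base nodes contributes less.
    """
    base_nodes = set().union(*links.values())
    n = len(base_nodes)
    edges = defaultdict(float)
    for neighbours in links.values():
        weight = math.log(n / len(neighbours)) if weighting == "idf" else 1.0
        for u, v in combinations(sorted(neighbours), 2):
            edges[(u, v)] += weight
    return dict(edges)


def propagate(edges, seeds, alpha=0.6, iters=50):
    """Step 2 (sketch): label propagation on the homogeneous network.

    seeds maps labelled nodes to their class. Seed scores are scaled by
    1 / (class size), an assumed way of keeping large classes from
    dominating when the data set is imbalanced.
    """
    nodes = sorted({n for edge in edges for n in edge} | set(seeds))
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        if w > 0:
            adj[u][v] = w
            adj[v][u] = w
    deg = {n: sum(adj[n].values()) for n in nodes}
    classes = sorted(set(seeds.values()))
    size = {c: sum(1 for lab in seeds.values() if lab == c) for c in classes}
    start = {n: {c: (1.0 / size[c] if seeds.get(n) == c else 0.0)
                 for c in classes} for n in nodes}
    score = {n: dict(start[n]) for n in nodes}
    for _ in range(iters):
        updated = {}
        for n in nodes:
            updated[n] = {}
            for c in classes:
                # Symmetrically normalised contribution from neighbours.
                spread = sum(w * score[m][c] / math.sqrt(deg[n] * deg[m])
                             for m, w in adj[n].items())
                updated[n][c] = alpha * spread + (1 - alpha) * start[n][c]
        score = updated
    return {n: max(classes, key=lambda c: score[n][c]) for n in nodes}


if __name__ == "__main__":
    # Hypothetical toy data: authors as connecting nodes, papers as base nodes.
    links = {
        "a1": {"p1", "p2"},
        "a2": {"p2", "p3"},
        "a3": {"p3", "p4"},
        "a4": {"p1", "p2", "p3", "p4"},  # touches every paper, so idf weight 0
    }
    network = decompose(links)
    print(propagate(network, {"p1": "A", "p4": "B"}))
```

In this toy example the connecting node `a4` is linked to every paper and therefore carries no information under the assumed idf-style weight, so the decomposed network reduces to the chain p1, p2, p3, p4 and each unlabelled paper takes the label of its nearer seed.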
