
A Machine Learning Approach for the Classification of Kidney Cancer Subtypes Using miRNA Genome Data

  • Ali, Ali Muhamed
  • Zhuang, Hanqi
  • Ibrahim, Ali
  • Rehman, Oneeb
  • Huang, Michelle
  • Wu, Andrew
Publication Date
Nov 29, 2018


survival. Thus, developing automated tools that can accurately determine kidney cancer subtypes is an urgent challenge. Researchers in the biomedical field have confirmed that miRNA dysregulation can cause cancer. In this paper, we propose a machine learning approach for the classification of kidney cancer subtypes using miRNA genome data. Through empirical studies, we found 35 miRNAs that possess distinct key features that aid in kidney cancer subtype diagnosis. In the proposed method, Neighbourhood Component Analysis (NCA) is employed to extract discriminative features from miRNAs, and Long Short-Term Memory (LSTM), a type of Recurrent Neural Network, is adopted to classify a given miRNA sample into kidney cancer subtypes. In the literature, only a couple of kidney cancer subtypes have been considered for classification. In the experimental study, we used miRNA quantitative read count data provided by The Cancer Genome Atlas (TCGA) data repository. The NCA procedure selected 35 of the most discriminative miRNAs. With this subset of miRNAs, the LSTM algorithm was able to classify kidney cancer miRNA samples into five subtypes with an average accuracy of around 95% and a Matthews Correlation Coefficient of around 0.92 over 10 runs of randomly grouped 5-fold cross-validation, both very close to the average performance obtained when using all miRNAs for classification.
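
The evaluation protocol described in the abstract (Matthews Correlation Coefficient over 10 runs of randomly grouped 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code: the function names `multiclass_mcc` and `repeated_kfold_indices`, the seed, and the plain (non-stratified) shuffling are assumptions made for illustration, and the MCC shown is the standard multiclass generalization, which the abstract does not spell out.

```python
import math
import random
from collections import Counter


def multiclass_mcc(y_true, y_pred):
    """Matthews Correlation Coefficient generalized to K classes.

    Hypothetical helper for illustration; returns 0.0 when the
    denominator vanishes (e.g. every prediction is the same class).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                    # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                         # true count per class
    p_counts = Counter(y_pred)                         # predicted count per class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = math.sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


def repeated_kfold_indices(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (run, fold, train_idx, test_idx) for repeated random k-fold CV.

    Assumption: samples are reshuffled once per run and dealt into
    n_splits folds without stratifying by class.
    """
    rng = random.Random(seed)
    for run in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::n_splits] for i in range(n_splits)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield run, fold, train_idx, test_idx
```

In such a setup, a classifier would be trained on `train_idx` and scored on `test_idx` for each of the 50 splits, and the accuracy and MCC values would then be averaged across splits.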
