Integrative sparse principal component analysis of gene expression data.

Authors
  • Liu, Mengque (1)
  • Fan, Xinyan (1)
  • Fang, Kuangnan (1)
  • Zhang, Qingzhao (1, 2)
  • Ma, Shuangge (1, 2, 3)
  • 1 Department of Statistics, School of Economics, Xiamen University, Xiamen, China
  • 2 Wang Yanan Institute of Economics Studies, Xiamen University, Xiamen, China
  • 3 Department of Biostatistics, Yale University, New Haven, Connecticut, United States of America
Type
Published Article
Journal
Genetic Epidemiology
Publisher
Wiley (John Wiley & Sons)
Publication Date
Dec 01, 2017
Volume
41
Issue
8
Pages
844–865
Identifiers
DOI: 10.1002/gepi.22089
PMID: 29114920
Source
Medline

Abstract

Dimension reduction techniques have been extensively adopted in the analysis of gene expression data. The most popular is perhaps principal component analysis (PCA). To generate more reliable and more interpretable results, sparse PCA (SPCA) has been developed. Because gene expression data typically have a "small sample size, high dimensionality" characteristic, results generated from a single dataset are often unsatisfactory. In contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform single-dataset analysis as well as "classic" meta-analysis and other multi-dataset techniques. In this study, we conduct integrative analysis by developing the integrative SPCA (iSPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance.
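
The idea of selecting sparse loadings jointly across datasets with a group penalty can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a rank-one alternating scheme in the style of regularized SVD, treats each gene's loadings across the datasets as one group (an assumption about how the groups are formed), and omits the contrasted penalties entirely. The function names `group_soft_threshold` and `integrative_sparse_pc`, the penalty level `lam`, and the simulated data are all hypothetical.

```python
import numpy as np


def group_soft_threshold(Z, lam):
    """Shrink each column of Z (one gene across M datasets) as a group.

    A column whose Euclidean norm is below lam is set to zero in every
    dataset at once, which yields a shared sparsity pattern.
    """
    norms = np.linalg.norm(Z, axis=0)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return Z * scale


def integrative_sparse_pc(datasets, lam, n_iter=100, tol=1e-8):
    """First sparse loading vector for each dataset (illustrative only).

    datasets: list of (n_m x p) arrays measuring the same p genes.
    Returns an (M x p) array with one unit-norm loading vector per row.
    """
    Xs = [X - X.mean(axis=0) for X in datasets]  # center each gene
    # Initialize with the ordinary (non-sparse) first PCA loading.
    V = np.vstack([np.linalg.svd(X, full_matrices=False)[2][0] for X in Xs])
    for _ in range(n_iter):
        Z = np.zeros_like(V)
        for m, X in enumerate(Xs):
            score = X @ V[m]
            nrm = np.linalg.norm(score)
            if nrm > 0:
                Z[m] = X.T @ (score / nrm)  # unpenalized loading update
        V_new = group_soft_threshold(Z, lam)  # joint selection step
        converged = np.linalg.norm(V_new - V) < tol
        V = V_new
        if converged:
            break
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    return V / np.where(norms > 0, norms, 1.0)


# Simulated example: three datasets, 50 genes, the first 5 carry signal.
rng = np.random.default_rng(0)
p = 50
datasets = []
for n in (40, 50, 60):
    factor = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    X[:, :5] += 5.0 * factor[:, None]
    datasets.append(X)

V = integrative_sparse_pc(datasets, lam=10.0)
support = np.flatnonzero(np.any(V != 0, axis=0))
print("selected genes:", support)
```

Because the threshold acts on the norm of a gene's loadings across all datasets, a gene is either kept in every dataset or dropped in every dataset. The loading magnitudes and signs remain free to differ between datasets.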
