Cluster and propensity based approximation of a network

Authors
  • Ranola, John Michael (1)
  • Langfelder, Peter (2)
  • Lange, Kenneth (1, 2, 3)
  • Horvath, Steve (2, 4)
  • (1) University of California, Biomathematics, Los Angeles, CA, USA
  • (2) UCLA, Human Genetics, Los Angeles, CA, USA
  • (3) UCLA, Statistics, Los Angeles, CA, USA
  • (4) UCLA, Biostatistics, Los Angeles, CA, USA
Type
Published Article
Journal
BMC Systems Biology
Publisher
Springer (Biomed Central Ltd.)
Publication Date
Mar 14, 2013
Volume
7
Issue
1
Identifiers
DOI: 10.1186/1752-0509-7-21
Source
Springer Nature
License
Green

Abstract

Background: The models in this article generalize current models for both correlation networks and multigraph networks. Correlation networks are widely applied in genomics research. In contrast to general networks, it is straightforward to test the statistical significance of an edge in a correlation network. It is also easy to decompose the underlying correlation matrix and generate informative network statistics such as the module eigenvector. However, correlation networks capture only the connections between numeric variables. An open question is whether one can find suitable decompositions of the similarity measures employed in constructing general networks. Multigraph networks are attractive because they support likelihood-based inference. Unfortunately, it is unclear how to adjust current statistical methods to detect the clusters inherent in many data sets.

Results: Here we present an intuitive and parsimonious parametrization of a general similarity measure such as a network adjacency matrix. The cluster and propensity based approximation (CPBA) of a network generalizes not only correlation network methods but also multigraph methods. In particular, it gives rise to a novel and more realistic multigraph model that accounts for clustering and provides likelihood-based tests for assessing the significance of an edge after controlling for clustering. We present a novel Majorization-Minimization (MM) algorithm for estimating the parameters of the CPBA. To illustrate the practical utility of the CPBA of a network, we apply it to gene expression data and to a bipartite network model for diseases and disease genes from the Online Mendelian Inheritance in Man (OMIM).

Conclusions: The CPBA of a network is theoretically appealing since a) it generalizes correlation and multigraph network methods, b) it improves likelihood-based significance tests for edge counts, c) it directly models higher-order relationships between clusters, and d) it suggests novel clustering algorithms. The CPBA of a network is implemented in Fortran 95 and bundled in the freely available R package PropClust.
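
The abstract does not spell out the parametrization, so the following Python sketch is only an illustration under stated assumptions. It assumes the approximation takes the form a_ij ≈ p_i · p_j · r[c_i, c_j], where p_i is a node propensity, c_i a cluster label, and r a symmetric matrix of cluster-to-cluster similarities, and that edge counts in the multigraph model are Poisson with that mean. The function names (`cpba_matrix`, `poisson_loglik`, `mm_update_propensities`) are hypothetical, and the update rule is derived here from an arithmetic-geometric mean bound on p_i · p_j with clusters and r held fixed. It is not the PropClust implementation and does not estimate the cluster labels or r.

```python
import numpy as np


def cpba_matrix(p, clusters, r):
    """Assumed CPBA form: a_ij = p_i * p_j * r[c_i, c_j], with a zero diagonal."""
    a = np.outer(p, p) * r[np.ix_(clusters, clusters)]
    np.fill_diagonal(a, 0.0)
    return a


def poisson_loglik(x, p, clusters, r):
    """Poisson log-likelihood over pairs i < j, dropping the constant ln(x_ij!) term."""
    mu = cpba_matrix(p, clusters, r)
    iu = np.triu_indices_from(x, k=1)
    return float(np.sum(x[iu] * np.log(mu[iu]) - mu[iu]))


def mm_update_propensities(x, p, clusters, r):
    """One MM step for the propensities, holding clusters and r fixed.

    Bounding p_i * p_j by a separable quadratic (AM-GM) gives the update
    p_i <- sqrt(p_i * d_i / sum_{j != i} r[c_i, c_j] * p_j),
    where d_i is the total edge count of node i.
    """
    r_full = r[np.ix_(clusters, clusters)].astype(float)
    np.fill_diagonal(r_full, 0.0)   # exclude the j == i term
    degree = x.sum(axis=1)          # assumes x has a zero diagonal
    return np.sqrt(p * degree / (r_full @ p))


# Simulated example: 3 clusters of 10 nodes each (all values are made up).
rng = np.random.default_rng(0)
n = 30
clusters = np.repeat([0, 1, 2], 10)
r = np.array([[1.0, 0.2, 0.1],
              [0.2, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
p_true = rng.uniform(1.0, 3.0, size=n)

# Draw symmetric Poisson edge counts with mean given by the assumed CPBA form.
upper = np.triu(rng.poisson(cpba_matrix(p_true, clusters, r)), k=1)
x = upper + upper.T

# Run the MM iterations from a flat starting point and track the log-likelihood.
p = np.ones(n)
history = [poisson_loglik(x, p, clusters, r)]
for _ in range(50):
    p = mm_update_propensities(x, p, clusters, r)
    history.append(poisson_loglik(x, p, clusters, r))
```

Each step maximizes a surrogate that lies below the log-likelihood and touches it at the current iterate, so the values in `history` should never decrease. That ascent property is what makes MM schemes attractive for models of this kind.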
