Model based clustering for tandem mass spectrum quality assessment.

Authors
  • Ding, Jiarui
  • Shi, Jinhong
  • Wu, Fang-Xiang
Type
Published Article
Journal
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Publication Date
Jan 01, 2009
Volume
2009
Pages
6747–6750
Identifiers
DOI: 10.1109/IEMBS.2009.5332499
PMID: 19963684
Source
Medline
License
Unknown

Abstract

Several computational methods have been proposed to assess the quality of tandem mass spectra. These methods range from supervised to unsupervised algorithms and from discriminative to generative models. Existing unsupervised learning algorithms for tandem mass spectra are not based on probabilistic models, so they do not provide probabilities for spectrum quality assessment. In this study, the distributions of high-quality and poor-quality spectra are modeled by a mixture of Gaussian distributions. The Expectation Maximization (EM) algorithm is used to estimate the parameters of the Gaussian mixture model, and each spectrum is assigned to the high-quality or poor-quality cluster according to its posterior probability. Experiments are conducted on two datasets, ISB and TOV. The results show that about 57.64% and 66.38% of poor-quality spectra can be removed from the ISB and TOV datasets, respectively, without losing more than 10% of high-quality spectra. This indicates that clustering, as an exploratory data analysis tool, is valuable for the quality assessment of tandem mass spectra without a pre-labeled training dataset.
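
The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each spectrum has already been reduced to a single scalar quality score (the abstract does not say which features the paper uses), it fits only two one-dimensional Gaussian components, and it assumes the component with the larger mean corresponds to high-quality spectra. The function names `fit_two_component_gmm` and `posterior_high_quality` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(scores, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with EM.

    Assumption: `scores` holds one scalar feature per spectrum.
    Returns (weights, means, variances), each a list of length 2.
    """
    n = len(scores)
    overall_mean = sum(scores) / n
    overall_var = sum((x - overall_mean) ** 2 for x in scores) / n
    # Simple initialisation: one component at each extreme of the data.
    means = [min(scores), max(scores)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        resp = []
        for x in scores:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow to zero
            resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate the parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk
            variances[k] = max(var_k, 1e-6)  # keep the variance strictly positive

    return weights, means, variances


def posterior_high_quality(x, weights, means, variances):
    """Posterior probability that score x belongs to the high-quality cluster.

    Assumption: the component with the larger mean is the high-quality one.
    """
    high = 0 if means[0] > means[1] else 1
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    total = (p[0] + p[1]) or 1e-300
    return p[high] / total
```

In such a sketch, a spectrum would be discarded as poor quality when `posterior_high_quality` falls below a chosen threshold. Raising or lowering that threshold trades the fraction of poor-quality spectra removed against the fraction of high-quality spectra lost.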
