Boosting with missing predictors.

Authors
  • Wang, C Y
  • Feng, Ziding
Type
Published Article
Journal
Biostatistics (Oxford, England)
Publication Date
Apr 01, 2010
Volume
11
Issue
2
Pages
195–212
Identifiers
DOI: 10.1093/biostatistics/kxp052
PMID: 19948743
Source
Medline
License
Unknown

Abstract

Boosting is an important tool in classification methodology. It combines the outputs of many weak classifiers to produce a powerful committee, and its validity can be explained by additive modeling and maximum likelihood. The method has very general applications, especially for high-dimensional predictors. For example, it can be applied to distinguish cancer samples from healthy control samples by using antibody microarray data. Microarray data are often high-dimensional, and many such data sets are incomplete. One natural idea is to impute a missing variable based on the observed predictors. However, computing imputations for high-dimensional predictors with missing data can be rather tedious. In this paper, we propose two conditional mean imputation methods. They can be applied even when a complete-case subset does not exist. Simulation results indicate that the proposed methods are superior to naive methods. We apply the methods to a pancreatic cancer study in which serum protein microarrays are used for classification.
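
The general idea of imputing a missing predictor from observed ones and then boosting weak classifiers can be illustrated with a small sketch. This is not the authors' procedure: the abstract does not specify the two proposed methods, so the code below assumes a simple pairwise linear regression as a stand-in for the conditional mean, and a textbook AdaBoost over decision stumps. All function names (`impute_conditional_mean`, `fit_stump`, `adaboost`, `predict`) are hypothetical, missing values are assumed to be encoded as `None`, labels are assumed to be -1 or +1, and each column is assumed to have at least one observed value.

```python
import math

def impute_conditional_mean(X):
    """Fill each None in column j with a fitted value from a simple linear
    regression of column j on the most correlated column observed in that row.
    Assumption: this pairwise regression stands in for a conditional mean."""
    p = len(X[0])
    filled = [row[:] for row in X]
    for i, row in enumerate(X):
        for j in range(p):
            if row[j] is not None:
                continue
            best = None  # (absolute correlation, predicted value)
            for k in range(p):
                if k == j or row[k] is None:
                    continue
                # Rows where both column k and column j are observed.
                pairs = [(r[k], r[j]) for r in X
                         if r[k] is not None and r[j] is not None]
                if len(pairs) < 3:
                    continue
                mx = sum(a for a, _ in pairs) / len(pairs)
                my = sum(b for _, b in pairs) / len(pairs)
                sxx = sum((a - mx) ** 2 for a, _ in pairs)
                syy = sum((b - my) ** 2 for _, b in pairs)
                sxy = sum((a - mx) * (b - my) for a, b in pairs)
                if sxx == 0 or syy == 0:
                    continue
                corr = abs(sxy) / math.sqrt(sxx * syy)
                pred = my + (sxy / sxx) * (row[k] - mx)
                if best is None or corr > best[0]:
                    best = (corr, pred)
            if best is None:
                # Fallback: marginal mean of the observed values in column j.
                obs = [r[j] for r in X if r[j] is not None]
                filled[i][j] = sum(obs) / len(obs)
            else:
                filled[i][j] = best[1]
    return filled

def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def adaboost(X, y, rounds=10):
    """Textbook AdaBoost: reweight samples so later stumps focus on errors."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, sign))
        w = [wi * math.exp(-alpha * yi * (sign if row[j] > t else -sign))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Weighted vote of the stumps (the 'committee')."""
    score = sum(a * (s if row[j] > t else -s) for a, j, t, s in model)
    return 1 if score >= 0 else -1
```

A typical use would be `model = adaboost(impute_conditional_mean(X), y)` followed by `predict(model, new_row)` for a fully observed `new_row`. Imputing pairwise, one regression per pair of columns, only needs rows where those two columns are both observed, which is one simple way to avoid relying on fully observed rows.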
