Bias in misspecified mixtures.

Authors
Type
Published Article
Journal
Biometrics
Publication Date
Volume
50
Issue
2
Pages
457–470
Identifiers
PMID: 8068845
Source
Medline
License
Unknown

Abstract

A finite mixture is a distribution where a given observation can come from any of a finite set of components. That is, the density of the random variable X is of the form f(x) = π_1 f_1(x) + π_2 f_2(x) + ... + π_k f_k(x), where the π_i are the mixing proportions and the f_i are the component densities. Mixture models are common in many areas of biology; the most commonly applied is a mixture of normal densities. Many of the problems with inference in the mixture setting are well known. Not so well documented, however, are the extreme biases that can occur in the maximum likelihood estimators (MLEs) when there is model misspecification. This paper shows that even the seemingly innocuous assumption of equal variances for the components of the mixture can lead to surprisingly large asymptotic biases in the MLEs of the parameters. Assuming normality when the underlying distributions are skewed can also lead to strong biases. We explicitly calculate the asymptotic biases when maximum likelihood is carried out assuming normality for several types of true underlying distribution. If the true distribution is a mixture of skewed components, then an application of the Box-Cox power transformation can reduce the asymptotic bias substantially. The power λ in the Box-Cox transformation is in this case treated as an additional parameter to be estimated. In many cases the bias can be reduced to acceptable levels, thus leading to meaningful inference. A modest Monte Carlo study gives an indication of the small-sample performance of inference procedures (including the power and level of likelihood ratio tests) based on a likelihood that incorporates estimation of λ. A real data example illustrates the method.
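The approach described in the abstract (fitting a normal mixture to Box-Cox transformed data, with the power λ estimated jointly with the mixture parameters) could be sketched as follows. This is an illustrative reconstruction, not the authors' code: the two-component setup, the simulated lognormal data, the starting values, and the choice of optimizer are all assumptions. Note that the log-likelihood on the transformed scale must include the Jacobian term (λ - 1) Σ log x_i.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_lik(theta, x):
    """Negative log-likelihood of a 2-component normal mixture fitted to
    Box-Cox transformed data, with the power lambda as a free parameter."""
    lam, p_logit, mu1, mu2, ls1, ls2 = theta
    p = 1.0 / (1.0 + np.exp(-p_logit))       # mixing proportion kept in (0, 1)
    if abs(lam) > 1e-8:
        y = (x**lam - 1.0) / lam             # Box-Cox transform
    else:
        y = np.log(x)                        # limit of the transform as lambda -> 0
    dens = (p * norm.pdf(y, mu1, np.exp(ls1))
            + (1.0 - p) * norm.pdf(y, mu2, np.exp(ls2)))
    # Jacobian of the transform contributes (lambda - 1) * sum(log x)
    return -(np.sum(np.log(dens + 1e-300)) + (lam - 1.0) * np.sum(np.log(x)))

# Simulated skewed mixture: two lognormal components (hypothetical data,
# chosen only so that lambda near 0 should restore approximate normality)
rng = np.random.default_rng(0)
n = 400
z = rng.random(n) < 0.6
x = np.where(z, rng.lognormal(0.0, 0.3, n), rng.lognormal(1.2, 0.3, n))

# Crude starting values from the log scale
lx = np.log(x)
theta0 = [0.5, 0.0, np.quantile(lx, 0.25), np.quantile(lx, 0.75),
          np.log(lx.std()), np.log(lx.std())]

res = minimize(neg_log_lik, theta0, args=(x,), method="Nelder-Mead",
               options={"maxiter": 5000})
lam_hat = res.x[0]
```

Parameterizing the mixing proportion through a logit and the standard deviations through their logs keeps the optimization unconstrained; a likelihood ratio test of a fixed λ (e.g. λ = 1, no transformation) against the estimated λ follows by comparing the corresponding maximized log-likelihoods.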