Artificial neural networks (ANNs) are being used increasingly for the prediction of clinical outcomes and classification of disease phenotypes. A lack of understanding of the statistical principles underlying ANNs has led to widespread misuse of these tools in the biomedical arena. In this paper, the authors compare the performance of ANNs with that of conventional linear logistic regression models in an epidemiological study of infant wheeze. Data on the putative risk factors for infant wheeze have been obtained from a sample of 7318 infants taking part in the Avon Longitudinal Study of Parents and Children (ALSPAC). The data were analysed using logistic regression models and ANNs, and performance based on misclassification rates of a validation data set were compared. Misclassification rates in the training data set decreased as the complexity of the ANN increased: h = 0: 17.9%; h = 2: 16.2%; h = 5: 14.9%, and h = 10: 9.2%. However, the more complex models did not generalise well to new data sets drawn from the same population: validation data set misclassification rates: h = 0: 17.9%; h = 2: 19.6%; h = 5: 20.2% and h = 10: 22.9%. There is no evidence from this study that ANNs outperform conventional methods of analysing epidemiological data. Increasing the complexity of the models serves only to overfit the model to the data. It is important that a validation or test data set is used to assess the performance of highly complex ANNs to avoid overfitting.