Background: The gold standard for sleep classification is manual scoring of polysomnography, despite criticisms such as oversimplification, low inter-rater reliability, and the standard having been designed on young and healthy subjects. New method: To address these criticisms and reveal the latent sleep states, this study developed a general, automatic sleep classifier using a data-driven approach. Spectral EEG and EOG measures and eye correlation were calculated in 1 s windows, and each sleep epoch was expressed as a mixture of probabilities of latent sleep states using the topic model Latent Dirichlet Allocation. The model was tested on control subjects and patients with periodic leg movements (PLM), representing a non-neurodegenerative group, and on patients with idiopathic REM sleep behavior disorder (iRBD) and Parkinson's disease (PD), representing a neurodegenerative group. The model was optimized using 50 subjects and validated on 76 subjects. Results: The optimized sleep model used six topics, and the topic probabilities changed smoothly during transitions. Measured against the manual scorings, the model achieved an overall subject-specific accuracy of 68.3 ± 7.44% (μ ± σ) and group-specific accuracies of 69.0 ± 4.62% (control), 70.1 ± 5.10% (PLM), 67.2 ± 8.30% (iRBD) and 67.7 ± 9.07% (PD). Comparison with existing method: Statistics of the latent sleep state content showed correspondence with the sleep stages defined in the gold standard. However, this study indicates that sleep contains six diverse latent sleep states and that state transitions are continuous processes. Conclusions: The model is generally applicable and may contribute to research in neurodegenerative diseases and sleep disorders. (C) 2014 Elsevier B.V. All rights reserved.
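
To illustrate what expressing an epoch "as a mixture of probabilities of latent sleep states" can look like, the following is a minimal sketch and not the authors' implementation. It assumes each 1 s window has already been assigned to one of six topics, that an epoch spans 30 such windows (the abstract does not state the epoch length), and that the topic proportions have a symmetric Dirichlet prior with a hypothetical concentration `alpha`. Under those assumptions, the posterior mean of the topic proportions is the smoothed count ratio computed below. The function names and the example data are invented for illustration.

```python
from collections import Counter

N_TOPICS = 6  # the abstract reports six topics in the optimized model


def epoch_topic_mixture(window_topics, n_topics=N_TOPICS, alpha=0.1):
    """Posterior-mean topic proportions for one epoch.

    window_topics: topic index (0..n_topics-1) assigned to each 1 s window.
    alpha: symmetric Dirichlet prior concentration (hypothetical value,
           not taken from the paper).
    """
    counts = Counter(window_topics)
    total = len(window_topics) + n_topics * alpha
    return [(counts.get(k, 0) + alpha) / total for k in range(n_topics)]


def dominant_topic(mixture):
    """Index of the most probable topic in a mixture."""
    return max(range(len(mixture)), key=mixture.__getitem__)


# Hypothetical epoch: 30 one-second windows drifting from topic 2 to topic 4.
epoch = [2] * 18 + [4] * 12
mixture = epoch_topic_mixture(epoch)
```

Here `mixture` keeps the full six-way split, so an epoch that sits between two states is not forced into a single label; `dominant_topic(mixture)` collapses it to one index only when a hard label is needed, for example to compare against manual scoring.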