Abstract

The use of vibrational spectroscopy for diagnosis and staging of cancer is extremely attractive, promising many benefits over the histopathology methods currently in use. The hypothesis underlying this approach is that cancers have characteristic biochemical fingerprints that can be captured using spectroscopy. To relate complex multivariate spectra to disease state, machine-learning methods are typically used to recognize diagnostic spectral patterns. This article provides an extensive review of this field. The average diagnostic performance reported in the reviewed studies is impressive (>90% sensitivity and specificity), but most studies were small (<40 samples). Furthermore, diagnostic performance has often been calculated using methods now known to give overoptimistic estimates. We conclude that, if the combination of spectroscopy and machine learning is to translate into clinical practice, larger studies are needed, and researchers should routinely provide spectral data in support of their publications so that the data can be reanalyzed by other groups.
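To make the concern about small studies concrete, the following is a minimal sketch of how sensitivity and specificity are computed from predicted and true labels, together with a Wilson score interval for a proportion. It is an illustration only, not a method taken from any reviewed study: the function names, the label coding (1 = cancer, 0 = non-cancer), and the example data of 20 samples per class are all assumptions made for this sketch.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical): 1 = cancer, 0 = non-cancer.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical example: 20 cancer and 20 non-cancer samples,
# with one misclassification in each class.
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 19 + [0] + [0] * 19 + [1]

sens, spec = sensitivity_specificity(y_true, y_pred)
low, high = wilson_interval(19, 20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"95% Wilson interval for sensitivity: {low:.2f} to {high:.2f}")
```

In this hypothetical case the point estimate of sensitivity is 95%, yet the interval spans roughly 0.76 to 0.99, which illustrates why a headline figure above 90% from a few dozen samples should be read with caution.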