BACKGROUND: Prostate cancer (PC) is the most frequently diagnosed cancer in North American men. Pathologists are in critical need of accurate biomarkers to characterize PC, particularly to confirm the presence of intraductal carcinoma of the prostate (IDC-P), an aggressive histopathological variant for which therapeutic options are now available. Our aim was to identify IDC-P with Raman micro-spectroscopy (RμS) and machine learning technology following a protocol suitable for routine clinical histopathology laboratories. METHODS AND FINDINGS: We used RμS to differentiate IDC-P from PC, as well as PC and IDC-P from benign tissue on formalin-fixed paraffin-embedded first-line radical prostatectomy specimens (embedded in tissue microarrays [TMAs]) from 483 patients treated in 3 Canadian institutions between 1993 and 2013. The main measures were the presence or absence of IDC-P and of PC, regardless of the clinical outcomes. The median age at radical prostatectomy was 62 years. Most of the specimens from the first cohort (Centre hospitalier de l'Université de Montréal) were of Gleason score 3 + 3 = 6 (51%) while most of the specimens from the 2 other cohorts (University Health Network and Centre hospitalier universitaire de Québec-Université Laval) were of Gleason score 3 + 4 = 7 (51% and 52%, respectively). Most of the 483 patients were pT2 stage (44%-69%), and pT3a (22%-49%) was more frequent than pT3b (9%-12%). To investigate the prostate tissue of each patient, 2 consecutive sections of each TMA block were cut. The first section was transferred onto a glass slide to perform immunohistochemistry with H&E counterstaining for cell identification. The second section was placed on an aluminum slide, dewaxed, and then used to acquire an average of 7 Raman spectra per specimen (between 4 and 24 Raman spectra, 4 acquisitions/TMA core).
Raman spectra of each cell type were then analyzed to retrieve tissue-specific molecular information and to generate classification models using machine learning technology. Models were trained and cross-validated using data from 1 institution. Accuracy, sensitivity, and specificity were 87% ± 5%, 86% ± 6%, and 89% ± 8%, respectively, to differentiate PC from benign tissue, and 95% ± 2%, 96% ± 4%, and 94% ± 2%, respectively, to differentiate IDC-P from PC. The trained models were then tested on Raman spectra from 2 independent institutions, reaching accuracies of 84% and 86%, sensitivities of 84% and 87%, and specificities of 81% and 82% to diagnose PC, and accuracies of 85% and 91%, sensitivities of 85% and 88%, and specificities of 86% and 93% to identify IDC-P. IDC-P could further be differentiated from high-grade prostatic intraepithelial neoplasia (HGPIN), a pre-malignant intraductal proliferation that can be mistaken for IDC-P, with accuracies, sensitivities, and specificities > 95% in both training and testing cohorts. As we used stringent criteria to diagnose IDC-P, the main limitation of our study is the exclusion of borderline, difficult-to-classify lesions from our datasets. CONCLUSIONS: In this study, we developed classification models for the analysis of RμS data to differentiate IDC-P, PC, and benign tissue, including HGPIN. RμS could be a next-generation histopathological technique used to reinforce the identification of high-risk PC patients and lead to more precise diagnosis of IDC-P.
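
For readers less familiar with the three performance measures reported above, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are conventionally computed for a binary classifier. It is an illustration only and not the study's analysis code: the function name `binary_metrics` and the toy labels are hypothetical, and the coding 1 = positive class (e.g. IDC-P) and 0 = negative class (e.g. PC) is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity, and specificity for binary labels.

    Assumed coding (hypothetical): 1 = positive class, 0 = negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels, invented for illustration (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

In a design like the one described, such a function would be applied once per cross-validation fold on the training institution, and once per independent institution for external testing.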