BackgroundSkin fibrosis is the clinical hallmark of systemic sclerosis (SSc), where collagen deposition and remodeling of the dermis occur over time. The most widely used outcome measure in SSc clinical trials is the modified Rodnan skin score (mRSS), which is a semi-quantitative assessment of skin stiffness at seventeen body sites. However, the mRSS is confounded by obesity, edema, and high inter-rater variability. In order to develop a new histopathological outcome measure for SSc, we applied a computer vision technology called a deep neural network (DNN) to stained sections of SSc skin. We tested the hypotheses that DNN analysis could reliably assess mRSS and discriminate SSc from normal skin.MethodsWe analyzed biopsies from two independent (primary and secondary) cohorts. One investigator performed mRSS assessments and forearm biopsies, and trichrome-stained biopsy sections were photomicrographed. We used the AlexNet DNN to generate a numerical signature of 4096 quantitative image features (QIFs) for 100 randomly selected dermal image patches/biopsy. In the primary cohort, we used principal components analysis (PCA) to summarize the QIFs into a Biopsy Score for comparison with mRSS. In the secondary cohort, using QIF signatures as the input, we fit a logistic regression model to discriminate between SSc vs. control biopsy, and a linear regression model to estimate mRSS, yielding Diagnostic Scores and Fibrosis Scores, respectively. We determined the correlation between Fibrosis Scores and the published Scleroderma Skin Severity Score (4S) and between Fibrosis Scores and longitudinal changes in mRSS on a per patient basis.ResultsIn the primary cohort (n = 6, 26 SSc biopsies), Biopsy Scores significantly correlated with mRSS (R = 0.55, p = 0.01). In the secondary cohort (n = 60 SSc and 16 controls, 164 biopsies; divided into 70% training and 30% test sets), the Diagnostic Score was significantly associated with SSc-status (misclassification rate = 1.9% [training], 6.6% [test]), and the Fibrosis Score significantly correlated with mRSS (R = 0.70 [training], 0.55 [test]). The DNN-derived Fibrosis Score significantly correlated with 4S (R = 0.69, p = 3 × 10− 17).ConclusionsDNN analysis of SSc biopsies is an unbiased, quantitative, and reproducible outcome that is associated with validated SSc outcomes.