BACKGROUND: Surgical mortality indicators should be risk-adjusted when they are used to evaluate the performance of organisations. This study evaluated the performance of risk-adjustment models for 30-day mortality after neurosurgery that used English hospital administrative data.
METHODS: This retrospective cohort study used Hospital Episode Statistics (HES) data from 1 April 2013 to 31 March 2018. Organisational-level 30-day mortality was calculated for selected subspecialties (neuro-oncology, neurovascular and trauma neurosurgery) and for the overall cohort. Risk-adjustment models were developed using multivariable logistic regression and incorporated various patient variables: age, sex, admission method, social deprivation, and comorbidity and frailty indices. Performance was assessed in terms of discrimination and calibration.
RESULTS: The cohort included 49,044 patients. The overall 30-day mortality rate was 4.9%, with unadjusted organisational rates ranging from 3.2% to 9.3%. The variables in the best-performing models varied between subspecialties: for trauma neurosurgery, a model that included deprivation and frailty had the best calibration, while for neuro-oncology a model with these variables plus comorbidity performed best. For neurovascular surgery, a simple model of age, sex and admission method performed best. Levels of discrimination varied between subspecialties (range: 0.583 for trauma to 0.740 for neurovascular). The models were generally well calibrated. Applying the models to the organisational figures produced a median absolute change in mortality of 0.33% (interquartile range (IQR) 0.15-0.72) for the overall cohort model. Median changes for the subspecialty models were 0.29% (neuro-oncology, IQR 0.15-0.42), 0.40% (neurovascular, IQR 0.24-0.78) and 0.49% (trauma neurosurgery, IQR 0.23-1.68).
CONCLUSIONS: Reasonable risk-adjustment models for 30-day mortality after neurosurgery procedures were possible using variables from HES, although the models for trauma neurosurgery performed less well. Including a measure of frailty often improved model performance.
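ILLUSTRATION: The abstract does not state how the adjusted organisational rates were derived or which discrimination measure was reported. The sketch below therefore assumes one common approach: indirect standardisation (observed deaths divided by model-expected deaths, multiplied by the overall rate), with discrimination summarised as a c-statistic. The function names, the toy data and the predicted risks are hypothetical; in practice the risks would come from a fitted logistic regression model.

```python
from collections import defaultdict


def c_statistic(died, risk):
    """Share of (death, survivor) pairs in which the death had the higher
    predicted risk; ties count as half. Assumed discrimination measure."""
    deaths = [p for y, p in zip(died, risk) if y == 1]
    survivors = [p for y, p in zip(died, risk) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = sum(
        1.0 if d > s else 0.5 if d == s else 0.0
        for d in deaths
        for s in survivors
    )
    return score / (len(deaths) * len(survivors))


def risk_adjusted_rates(orgs, died, risk):
    """Indirect standardisation (an assumption, not the study's stated method):
    adjusted rate = overall rate * observed deaths / expected deaths."""
    overall_rate = sum(died) / len(died)
    observed = defaultdict(int)
    expected = defaultdict(float)
    for org, y, p in zip(orgs, died, risk):
        observed[org] += y   # deaths actually recorded at this organisation
        expected[org] += p   # deaths the model predicts for its case mix
    return {org: overall_rate * observed[org] / expected[org] for org in observed}


# Hypothetical toy data: two organisations, four patients each.
orgs = ["A", "A", "A", "A", "B", "B", "B", "B"]
died = [1, 0, 0, 0, 1, 1, 0, 0]
risk = [0.5, 0.25, 0.125, 0.125, 0.5, 0.5, 0.25, 0.25]

adjusted = risk_adjusted_rates(orgs, died, risk)
auc = c_statistic(died, risk)
```

In this toy example organisation B has a crude rate of 50% against 25% for A, but part of that gap reflects its higher-risk case mix; the adjusted rates show how far each organisation departs from what the model expects.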