Affordable Access

A comparison of several regression models for analysing cost of CABG surgery.

Authors
  • Austin, Peter C
  • Ghali, William A
  • Tu, Jack V
Type
Published Article
Journal
Statistics in medicine
Publication Date
Sep 15, 2003
Volume
22
Issue
17
Pages
2799–2815
Identifiers
PMID: 12939787
Source
Medline
License
Unknown

Abstract

Investigators in clinical research are often interested in determining the association between patient characteristics and cost of medical or surgical treatment. However, there is no uniformly agreed upon regression model with which to analyse cost data. The objective of the current study was to compare the performance of linear regression, linear regression with log-transformed cost, generalized linear models with Poisson, negative binomial and gamma distributions, median regression, and proportional hazards models for analysing costs in a cohort of patients undergoing CABG surgery. The study was performed on data comprising 1959 patients who underwent CABG surgery in Calgary, Alberta, between June 1994 and March 1998. Ten of 21 patient characteristics were significantly associated with cost of surgery in all seven models. Eight variables were not significantly associated with cost of surgery in all seven models. Using mean squared prediction error as a loss function, proportional hazards regression and the three generalized linear models were best able to predict cost in independent validation data. Using mean absolute error, linear regression with log-transformed cost, proportional hazards regression, and median regression to predict median cost, were best able to predict cost in independent validation data. Since the models demonstrated good consistency in identifying factors associated with increased cost of CABG surgery, any of the seven models can be used for identifying factors associated with increased cost of surgery. However, the magnitude of, and the interpretation of, the coefficients vary across models. Researchers are encouraged to consider a variety of candidate models, including those better known in the econometrics literature, rather than begin data analysis with one regression model selected a priori. The final choice of regression model should be made after a careful assessment of how best to assess predictive ability and should be tailored to the particular data in question.

Report this publication

Statistics

Seen <100 times