Application of a model-based recursive partitioning algorithm to predict crash frequency.

Authors
  • Tang, Houjun (1)
  • Donnell, Eric T. (2)
  • (1) Department of Civil and Environmental Engineering, The Pennsylvania State University, 212 Sackett Building, University Park, PA 16802, United States.
  • (2) Department of Civil and Environmental Engineering, The Pennsylvania State University, 212 Sackett Building, University Park, PA 16802, United States.
Type
Published Article
Journal
Accident; analysis and prevention
Publication Date
Aug 22, 2019
Volume
132
Pages
105274
Identifiers
DOI: 10.1016/j.aap.2019.105274
PMID: 31446099
Source
Medline
Language
English

Abstract

Count regression models have been applied widely in traffic safety research to estimate expected crash frequencies on road segments. Data mining algorithms, such as classification and regression trees, have recently been introduced into the field to overcome some of the assumptions associated with statistical models. However, these data-driven algorithms usually provide non-parametric output, making it difficult to draw statistical inferences or to evaluate how independent variables are associated with expected crash frequencies. In this paper, the model-based recursive partitioning (MOB) algorithm is applied to crash frequency modeling. The algorithm combines the recursive data partitioning used in tree models with user-defined statistical models fitted in each resulting subgroup. The objective of this paper is to explore the potential of the MOB algorithm as a methodological alternative to parametric modeling methods in crash frequency analysis. To accomplish the objective, a standard negative binomial (NB) regression model, an NB model developed using the MOB algorithm (MOB-NB), adjusted NB models that incorporate variables identified by the MOB algorithm, and a random parameters NB (RPNB) model are compared using 8 years of data collected from two-lane rural highways in Pennsylvania. The models are compared in terms of goodness of fit, the sign and magnitude of the statistical association between the independent and dependent variables, and predictive power. The results show that the MOB-NB model fits the data better than the other NB models and performs similarly to the RPNB model, suggesting that the MOB-NB model may be capturing unobserved heterogeneity by dividing the data into subgroups. The presence of a passing zone and the posted speed limit are two covariates identified by the MOB algorithm that differentiate variable effects among subgroups. In addition, the MOB-NB model provides the highest prediction accuracy on both the training and test data sets, although the difference among models is small. The comparison results reveal that the MOB algorithm is a promising alternative for identifying covariates, evaluating variable associations and instability, and making predictions in a crash frequency context.

Copyright © 2019. Published by Elsevier Ltd.
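
The partition-then-fit idea described in the abstract can be illustrated with a minimal sketch. It is not the MOB algorithm itself: MOB selects split variables using parameter instability tests and, in this paper, fits NB models in each subgroup. The sketch below instead assumes a single pre-chosen binary split (passing zone present or absent), uses intercept-only Poisson models for brevity, and compares log-likelihoods. The segment data and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical (passing_zone_present, crash_count) pairs for road segments.
# These values are invented for illustration; they are NOT from the paper.
SEGMENTS = [
    (0, 0), (0, 1), (0, 0), (0, 2), (0, 1), (0, 0),
    (1, 3), (1, 2), (1, 4), (1, 1), (1, 3), (1, 5),
]


def poisson_loglik(counts, rate):
    """Log-likelihood of independent Poisson counts sharing one rate."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def fit_constant_rate(counts):
    """Intercept-only Poisson fit: the MLE of the rate is the sample mean."""
    rate = sum(counts) / len(counts)
    return rate, poisson_loglik(counts, rate)


def counts_where(segments, flag_value):
    """Crash counts of the segments whose split covariate equals flag_value."""
    return [y for flag, y in segments if flag == flag_value]


# One model for all segments (analogous to the root node of a tree).
pooled_rate, pooled_ll = fit_constant_rate([y for _, y in SEGMENTS])

# One model per subgroup (analogous to two terminal nodes).
node_fits = {flag: fit_constant_rate(counts_where(SEGMENTS, flag)) for flag in (0, 1)}
split_ll = sum(ll for _, ll in node_fits.values())

# Likelihood-ratio statistic comparing the split model with the pooled model.
lr_stat = 2.0 * (split_ll - pooled_ll)

print(f"pooled rate: {pooled_rate:.3f}, log-likelihood: {pooled_ll:.3f}")
for flag, (rate, ll) in node_fits.items():
    print(f"passing_zone={flag}: rate {rate:.3f}, log-likelihood {ll:.3f}")
print(f"likelihood-ratio statistic: {lr_stat:.3f}")
```

Because the split model nests the pooled one, its log-likelihood can never be lower. A real analysis would still need a formal test, such as the instability tests MOB uses, to decide whether the improvement justifies the split.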
