Advantages and Disadvantages of Different Crash Modeling Techniques
* Corresponding author. Tel.: +1 850 410 6233; fax: +1 850 410 6142. E-mail addresses: [email protected] (T. Sando), [email protected] (R. Mussa), [email protected] (J. Sobanjo), [email protected] (L. Spainhour).
1 Tel.: +1 850 410 6191; fax: +1 850 410 6142.
2 Tel.: +1 850 410 6153; fax: +1 850 410 6142.
3 Tel.: +1 850 410 6123; fax: +1 850 410 6142.
0022-4375/$ - see front matter © 2005 National Safety Council and Elsevier Ltd. All rights reserved.
doi:10.1016/j.jsr.2005.10.006
T. Sando et al. / Journal of Safety Research - Traffic Records Forum proceedings 36 (2005) 485–487

An artificial neural network (ANN) consists of a network of many simple processors, called units, nodes, or neurons. A few previous research studies have used artificial neural networks in crash prediction. Vogt and Bared (1998) presented an artificial neural network (ANN) concept in crash modeling. According to the study, the most delicate
part of neural network modeling is generalization, that is, the development of a model that is reliable in predicting future crashes. The study also suggests that overfitting (i.e., obtaining weights for which the error on the training set is so small that even random variation is accounted for) can be minimized by having two validation samples in addition to the training sample.

Musone, Ferrari, and Oneta (1999) used ANN to analyze urban crashes in the city of Milan, Italy. The study applied feed-forward neural networks with a back-propagation learning paradigm. Abdelwahab and Abdel-Aty (2001) developed ANN models to predict driver injury severity in traffic accidents at signalized intersections. The study investigated the use of two well-known neural network paradigms, the multilayer perceptron (MLP) and fuzzy adaptive resonance theory (ART) neural networks. The MLP neural network had the better generalization performance, 65.6% and 60.4% for the training and testing phases, respectively. The performance of the MLP was also compared with that of an ordered logit model, which correctly classified only 58.9% and 57.1% of cases in the training and testing phases, respectively.

The artificial neural network (ANN) approach has the following advantages: (a) there is no need to assume an underlying data distribution; (b) neural networks are applicable to multivariate non-linear problems; and (c) the transformations of the variables are automated in the computational process. However, the ANN technique has several disadvantages, including: (a) minimizing overfitting requires a great deal of computational effort; and (b) the individual relations between the input variables and the output variables are not developed by engineering judgment, so the model tends to be a black box without an analytical basis.

3. Pattern recognition methods

3.1. Nearest neighbor rule

Nearest neighbor analysis is a classification method in which the class of an unknown record is assigned after the unknown record has been compared with all known records (training data) in the data repository. The degree of similarity between records is determined by a function called the distance function. Nukoolkit and Chen (2001) used two different distance functions, Euclidean Distance (ED) and Value Difference Metric (VDM) distance, both combined with k-mode clustering, to predict whether a car crash would have an injury or a non-injury outcome, using a subset of year 2000 Alabama interstate alcohol-related crashes. Prediction errors of 33% and 45% were observed using the ED and VDM methods, respectively. The study further proposed an improved technique that combines the distance function with decision tree clustering, which reduced the prediction error to 19%.

Hattori and Takahashi (1999) reported that the k-nearest neighbor (k-NN) rule is effective when the probability distributions of the feature variables are not known and the Bayes decision rule therefore cannot be used. It should be noted, however, that defining the distance measurement for crash variables is difficult and subjective. The existence of variables that vary in form and magnitude makes it difficult to establish the distance function. While some variables are continuous, others are discrete. In addition, even within the continuous and discrete variable groups, the range of magnitudes and the number of categories differ from variable to variable. This lessens the appropriateness of the nearest neighbor technique in crash prediction.

3.2. Bayesian belief networks technique

Most of the techniques used for modeling crashes require prior knowledge of the distribution of crash parameters. Sometimes the distribution is not directly known, but the statistical dependencies or independencies among the variables are known. For example, intuition suggests dependencies between sideswipe crashes and lane width, between vehicle speed and crash severity, and between traffic volume and crash rate, to name a few. The dependency between crash occurrence and traffic factors such as AADT, geometric factors such as number of lanes, and design factors such as speed could be established. The internal dependencies could then be represented by conditional probabilities that could be used to determine the likelihood of the magnitude of crashes on a particular roadway segment given certain conditions.

The Bayesian Belief Networks technique is fairly new. The technique is being researched in areas that have complex dependencies among variables, such as medical diagnostic systems, real-time weapons scheduling, computer processor fault diagnosis, generator monitoring expert systems, and software troubleshooting. Bayesian Belief Networks are appropriate for modeling crashes because dependencies between factors are known and can be used to construct the belief network structure.

4. Summary

The review of several studies on models used to analyze highway safety data indicates that each method has its advantages and disadvantages. The advantages and disadvantages of several methods have been discussed in this paper. The suitability of a method depends upon the desired output and the model inputs. The output of the model could include the type of crash (fatal, injury, or property damage only), number of crashes, crash rate, segment/intersection severity rating, and so forth; the inputs may include different environmental, traffic, and roadway variables. Further review and analysis of different models is
underway. Documentation of the limitations of each modeling technique will help analysts decide the best method to use for each particular modeling problem.

References

Abdelwahab, H. T., & Abdel-Aty, M. A. (2001). Development of artificial neural network models to predict driver injury severity in traffic accidents at signalized intersections. Journal of the Transportation Research Board, 1746, 6–13.
Fridstrom, L., Ifver, J., Ingebrightsen, S., Kulmala, R., & Thomsen, L. (1995). Measuring the contribution of randomness, exposure, weather, and daylight to the variation in road accident counts. Accident Analysis and Prevention, 27(1), 1–20.
Hattori, K., & Takahashi, M. (1999). A new nearest-neighbor rule in the pattern classification problem. Pattern Recognition, 32, 425–432.
Jovanis, P., & Chang, H. (1986). Modeling the relationship of accidents to miles traveled. Journal of the Transportation Research Board, 1068, 42–51.
Miaou, S., Hu, P., Wright, T., & Davis, S. (1992). Relationship between truck accidents and highway geometric design: A Poisson regression approach. Journal of the Transportation Research Board, 1376, 10–18.
Musone, L., Ferrari, A., & Oneta, M. (1999). An analysis of urban collisions using an artificial intelligence model. Accident Analysis and Prevention, 31, 705–718.
Nukoolkit, C., & Chen, H. C. (2001). Improving accuracy of nearest neighbor algorithm in highway accident prediction. ANNIE '2001 for the proceedings of smart engineering system design (vol. 11) (pp. 763–768).
Vogt, A., & Bared, J. G. (1998). Accident models for two-lane rural roads: Segments and intersections. Publication FHWA-RD-98-133. FHWA, U.S. Department of Transportation.

Thobias Sando is a Ph.D. candidate at Florida State University. He is involved in a research project entitled "Implementation of GIS for Crash Data Management." His research interests include the use of Pattern Recognition Techniques and GIS in crash modeling. He also performs research on the usability of GPS receivers for collecting field crash data.

Renatus Mussa is an Associate Professor and Director of the Traffic Engineering Laboratory at the FAMU-FSU College of Engineering. Dr. Mussa has been teaching with the Department of Civil and Environmental Engineering of the FAMU-FSU College of Engineering since 1998. He has excelled in research in different areas of transportation engineering, including intelligent transportation systems, highway safety, and traffic studies.

John Sobanjo is an Associate Professor at the FAMU-FSU College of Engineering. His research areas of interest include infrastructure management, implementation of new technology in highway applications, construction management, and highway safety. Apart from academics, Dr. Sobanjo has vast industrial experience gained from working with the California Department of Transportation (CALTRANS) and the Texas State Department of Highways and Public Transportation (TxDOT).

Lisa Spainhour is an Associate Professor at the FAMU-FSU College of Engineering. Her research interests include field performance of roadside barriers, civil engineering applications of composite materials, and engineering data modeling and management.
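
As an illustration of the nearest neighbor rule discussed in Section 3.1, the following minimal Python sketch classifies a record by majority vote among its k closest known records. It assumes one possible distance for mixed variables (range-scaled differences for continuous variables, simple mismatch for discrete ones). This is an assumption made here for illustration and is not the distance function used by Nukoolkit and Chen (2001) or by Hattori and Takahashi (1999). The records, variable names, and function names are hypothetical and are not taken from any crash database.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration (not real crash data):
# (speed_mph, lane_width_ft, lighting, outcome)
TRAINING = [
    (70.0, 11.0, "dark", "injury"),
    (65.0, 12.0, "dark", "injury"),
    (68.0, 11.0, "daylight", "injury"),
    (45.0, 12.0, "daylight", "non-injury"),
    (50.0, 12.0, "daylight", "non-injury"),
    (40.0, 11.0, "dark", "non-injury"),
]
NUMERIC = (0, 1)      # positions of the continuous variables
CATEGORICAL = (2,)    # positions of the discrete variables


def numeric_ranges(records):
    """Range (max - min) of each continuous variable, used for scaling."""
    return {
        i: (max(r[i] for r in records) - min(r[i] for r in records)) or 1.0
        for i in NUMERIC
    }


def mixed_distance(a, b, ranges):
    """Mean per-variable dissimilarity.

    Continuous terms are absolute differences divided by the training range;
    discrete terms are 0 for a match and 1 for a mismatch.
    """
    total = sum(abs(a[i] - b[i]) / ranges[i] for i in NUMERIC)
    total += sum(0.0 if a[i] == b[i] else 1.0 for i in CATEGORICAL)
    return total / (len(NUMERIC) + len(CATEGORICAL))


def knn_predict(query, records, k=3):
    """Assign the majority class among the k nearest known records."""
    ranges = numeric_ranges(records)
    nearest = sorted(records, key=lambda r: mixed_distance(query, r, ranges))[:k]
    votes = Counter(r[-1] for r in nearest)
    return votes.most_common(1)[0][0]


# Classify an unknown record described by (speed, lane width, lighting).
prediction = knn_predict((66.0, 11.0, "dark"), TRAINING, k=3)
print(prediction)
```

The scaling step reflects the difficulty noted in Section 3.1: with a lane-width range of only 1 ft in this toy data, a 1 ft difference counts as much as a 30 mph difference in speed, so the choice of scaling is itself a subjective modeling decision.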