This is to certify that the dissertation entitled EVALUATION OF SAFETY AT FREEWAY INTERCHANGES presented by Nakmoon Sung has been accepted towards fulfillment of the requirements for the Ph.D. degree in Civil Engineering.

EVALUATION OF SAFETY AT FREEWAY INTERCHANGES
By Nakmoon Sung
A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY
Department of Civil and Environmental Engineering
2000

ABSTRACT

This research focused on several issues that arise when the Negative Binomial distribution, rather than the Poisson distribution (which has been the commonly accepted assumption in analyzing traffic accidents), is found to better fit the accident data. On the basis of the Negative Binomial distribution, the framework of the rate quality control method was redefined as a basis for the identification of hazardous sites. This produced conceptually more reasonable results than existing approaches such as the Poisson-based rate quality control method or the Bayes approach.
However, it is sometimes not efficient for traffic engineers to apply this approach, since the parameters of the Negative Binomial distribution cannot be easily estimated. Therefore, a Normal approximation method was developed to overcome this issue. Applied to two common interchange types found on several freeways in Michigan, the Normal approximation method identified the same hazardous sites as the Negative Binomial based method.

Although the rate quality control method based on the Negative Binomial distribution is an effective technique for the identification of hazardous sites, it has two limitations. First, the selection of reference sites is a matter of judgement. Second, a sufficient number of reference sites with similar characteristics is not always available to assure statistical accuracy. As an alternative, a prediction model method was developed. This method produced results similar to those from the rate quality control method. By using the prediction model method, the conceptual and practical problems associated with the identification of hazardous sites can be reduced. The Generalized Linear Model concept was used to calibrate the accident prediction models.

To Haein and Jinmo

ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to my major professor, Dr. William C. Taylor, for his invaluable wisdom and kindness during the journey of this research. I would like to express my gratitude to the other members of the committee, Dr. Thomas L. Maleck, Dr. Virginia P. Sisiopiku and Dr. Vincent Melfi, for their suggestions and interest in this work.

A particular note of thanks is extended to Dr. Robert Maki and Mr. Robert Rios of the Michigan Department of Transportation for helping me use several databases, and to Mr. Oleg V. Makhnin, Mr. Linyuan Li and Mr. Jan Hannig of the Statistical Consultant Service (SCS) of the Department of Statistics and Probability for advice in understanding various statistical concepts.
Finally, I would like to express my thanks to my wife, Byungcheon, for her love and patience, and my parents for their encouragement through the years.

TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Background and problem identification
1.2 Proposed research objectives
1.3 Structure of this dissertation
CHAPTER 2 THE PROBABILITY DISTRIBUTION OF TRAFFIC CRASHES AT FREEWAY INTERCHANGES
2.1 General
2.2 Concept of the Poisson distribution and Negative Binomial distribution
2.3 Phenomena of over-dispersion over time
2.3.1 Analysis by crash types
2.3.2 Analysis by annual crash frequency per interchange (Diamond interchanges)
2.3.3 The results
CHAPTER 3 A TRAFFIC CRASH PREDICTION MODEL FOR FREEWAY INTERCHANGES
3.1 General
3.2 Dependent variable description
3.2.1 Classification by interchange type
3.2.2 Crash data summary
3.3 Independent variable description
3.4 Preliminary analyses of variables
3.4.1 Correlation analysis
3.4.2 Analysis of variance (ANOVA)
3.5 Model structure
3.6 Model calibration and analysis
3.6.1 Link functions for the Generalized linear model (GLIM) approach
3.6.2 Assessing the goodness of fit of the model
3.6.3 The procedures used in parameter calibration
3.6.4 Results of the model calibration
3.6.5 Pearson Residuals
3.7 A comparison of model calibration results according to the error structure (Normal versus Negative Binomial assumption)
3.8 Sensitivity analysis of the accident prediction model
CHAPTER 4 THE DEVELOPMENT OF A METHOD TO IDENTIFY HAZARDOUS SITES BASED ON THE NEGATIVE BINOMIAL DISTRIBUTION
4.1 General
4.2 A review of the statistical methods for identifying hazardous sites
4.2.1 The rate quality control method
4.2.2 The Bayes approach
4.2.3 The problems resulting from the use of the Poisson distribution
4.3 The identification of hazardous sites based on the Negative Binomial distribution
4.3.1 Concept
4.3.2 Estimation of the parameters
4.3.3 Application and validation of the Negative Binomial approach
CHAPTER 5 A SIMPLIFIED APPROACH FOR OVER-DISPERSION (NORMAL APPROXIMATION METHOD)
5.1 General
5.2 Concept
5.3 Application and validation of the Normal approximation method
5.3.1 Estimation of parameters
5.3.2 Validation of the Normal approximation approach
5.4 An examination of the assumptions
5.4.1 Goodness of fit of the Normal distribution
5.4.2 Large values of u
CHAPTER 6 IDENTIFICATION OF HAZARDOUS SITES USING A TRAFFIC CRASH PREDICTION MODEL (PREDICTION MODEL METHOD)
6.1 The limitation of the rate quality control method (or upper control limit method)
6.2 The concept of the prediction model method
6.3 Application and validation of the prediction model method
6.3.1 Illustration of the prediction model method
6.3.2 Validation of the prediction model method
6.4 Evaluation of Michigan freeway interchanges on the basis of the prediction model
CHAPTER 7 SUMMARY AND CONCLUSIONS
7.1 Summary
7.1.1 Traffic crash distribution
7.1.2 Traffic crash prediction model
7.1.3 The rate quality control method based on the Negative Binomial distribution
7.1.4 The Normal approximation method
7.1.5 The prediction model method
7.2 Conclusions
BIBLIOGRAPHY
APPENDICES
A. The results of accident prediction model calibration
B. The results of evaluating of freeway interchanges

LIST OF TABLES
Table 2.1 The correlation and residual values according to the distribution
Table 3.1 Interchange classification
Table 3.2 Summary of crashes per interchange (1996 ~ 1998)
Table 3.3 Injury percent by interchange type (1996 ~ 1998)
Table 3.4 Summary of mainline and ramp crashes (1996 ~ 1998)
Table 3.5 Summary of the crash types (1996 ~ 1998)
Table 3.6 Classification of independent variables
Table 3.7 Correlations between independent variables
Table 3.8 ANOVA for traffic effects
Table 3.9 ANOVA for geometric effects
Table 3.10 Cross tabulation of crashes by mainline volume and ramp volume
Table 3.11 The results of crash prediction model calibration (Interchange type 11)
Table 3.12 A comparison of model calibration results according to error structure (Normal and Negative Binomial assumption)
Table 3.13 Sensitivity analysis (effect of main geometric variables)
Table 4.1 The estimation of parameters
Table 4.2 A comparison of hazardous sites according to the methods (Diamond interchanges)
Table 4.3 A comparison of hazardous sites according to the methods (Par Clo A or B 4 Q Interchanges)
Table 5.1 Estimation of parameters
Table 5.2 A comparison with the results using the Negative Binomial distribution
Table 6.1 A comparison of results
Table 6.2 The abnormal sites by the prediction model method
Table A.1 The results of accident prediction model calibration (Interchange type 11)
Table A.2 The results of accident prediction model calibration (Interchange type 12)
Table A.3 The results of accident prediction model calibration (Interchange type 13)
Table A.4 The results of accident prediction model calibration (Interchange type 14)
Table A.5 The results of accident prediction model calibration (Interchange type 21)
Table A.6 The results of accident prediction model calibration (Interchange type 31)
Table A.7 The results of accident prediction model calibration (Interchange type 33)
Table A.8 The results of accident prediction model calibration (Interchange type 35)
Table A.9 The results of accident prediction model calibration (Interchange type 41)
Table A.10 The results of accident prediction model calibration (Interchange type 51)
Table B.1 The results of evaluating of freeway interchanges by the prediction model method

LIST OF FIGURES
Figure 2.1 Observed and theory variances by crash types (Poisson distribution)
Figure 2.2 Observed and theory variances by crash types (Negative Binomial distribution)
Figure 2.3 Observed and theory variances of annual crash frequency (Poisson distribution)
Figure 2.4 Observed and theory variances by annual crash frequency (Negative Binomial distribution)
Figure 3.1 Boundary of the interchange
Figure 3.2 Total crashes and injury crash percent
Figure 3.3 Mainline traffic volume and crashes
Figure 3.4 Ramp traffic volume and crashes
Figure 3.5 The process to calibrate coefficients & k parameter
Figure 3.6 Pearson Residuals and E(x)
Figure 4.1 The concept of rate quality control method under the Poisson distribution
Figure 4.2 Identification of hazardous sites by Bayes approach
Figure 4.3 A comparison of the Negative Binomial and Poisson assumption in applying rate quality control method
Figure 5.1 A comparison of the results of the Negative Binomial and Normal approximation approaches
Figure 5.2 The difference between the true and approximate upper control limits
Figure 6.1 Basic geometric classification tree for reference sites
Figure 6.2 Concept of prediction model method
Figure 6.3 An example of application of prediction model method (95 %)
Figure 6.4 A comparison of results of the rate quality control method and prediction model method

Chapter 1 INTRODUCTION

1.1 Background and problem identification

In response to limited budgets, it has become very important to ensure that the funding available for road improvements is used efficiently. A typical safety program includes the identification, diagnosis, and remediation of hazardous locations; hence the success of a safety program can be enhanced by identifying hazardous locations efficiently.

A hazardous location is defined as a site where the observed number of crashes is larger than a specific norm (the crash record at locations with similar characteristics). That is, a site is deemed hazardous if its crash history over a given period exceeds a predetermined level, which is based on the concept of confidence levels within the context of classical statistics (Witkowski 1988).

The observed number of crashes over a specific period at a specific site can usually be obtained from a traffic crash database. However, several difficulties arise in determining a base for comparing this number with the expected number of crashes at reference sites, which are defined as sites with similar geometric and traffic characteristics. Hauer (1992) recognized that identifying hazardous sites by means of reference sites raises both a conceptual and a practical problem. The main conceptual problem is that of choosing suitable reference sites, which is a matter of judgement.
The practical problem is that if very similar sites are chosen in order to reduce the variation arising from the conceptual difficulty, the number of reference sites will usually be too small to allow an accurate estimate of the hazard at a given site. The same questions were also raised by Mahalel (1982), Hauer and Persaud (1987), and Mountain and Fawaz (1989).

There are 397 interchanges along the four main Interstates (I-69, I-75, I-94 and I-96) in Michigan. In order to define reference sites for the evaluation of a given interchange in Michigan, the interchanges were classified first according to their geometry (interchange type, number of ramps, shoulder width, number of lanes, ramp length, etc.) and second according to traffic conditions. However, with this level of stratification, it was not possible to obtain enough reference sites to guarantee a significant level of accuracy for each type of interchange.

To overcome these difficulties, a method that uses a crash prediction model to identify hazardous sites was examined in this study. The basic concept of the prediction model method is that the expected number of crashes at the reference sites, E(θ), can be obtained from a crash prediction model rather than from the reference sites themselves. A specific site is deemed to be hazardous if the probability of the observed number of crashes occurring at the site is smaller than some predetermined value (e.g., 0.05). That is, a hazardous location is one at which the deviation from the expected crash frequency E(θ) is large.

The prediction model method is a technique for identifying hazardous sites on the basis of an expected value calculated by accident prediction models. Thus, if this method is to be accurate, it is important to develop the traffic crash prediction models under an appropriate rationale. There are generally two kinds of crash prediction models, which differ according to the assumed error structure.
One is the conventional linear regression model with a constant Normal error structure; the other is a regression model with a non-Normal, heterogeneous error structure (e.g., the Poisson or Negative Binomial distribution). In this research, we examined the error structure of crash occurrences in various respects on the basis of the observed data, and verified that crashes at freeway interchanges follow the Negative Binomial distribution rather than a Normal or Poisson distribution. Accordingly, the model parameters should be calibrated under the assumption of a Negative Binomial error structure.

The classical rate quality control method has been used by many transportation agencies to identify hazardous sites since it was first proposed in the transportation field in 1956 (Stokes and Mutabazi 1996). This method uses a statistical test to determine whether the crash rate of a site is abnormally high compared with that of reference sites. If crashes follow the Negative Binomial distribution, the rate quality control method should be reexamined, because it is based on the assumption that the probability of traffic crash occurrences can be approximated by the Poisson distribution (Norden et al. 1956, Morin 1967, and Stokes and Mutabazi 1996).

1.2 Proposed research objectives

The four major objectives of this research are:
1) to verify that freeway traffic crashes follow the Negative Binomial distribution rather than the Poisson distribution,
2) to develop crash prediction models for freeway interchanges using the Negative Binomial distribution,
3) to provide a new framework for the rate quality control method for identifying hazardous sites on the basis of the Negative Binomial distribution, and
4) to propose a method for the identification of hazardous sites using a traffic crash model calibrated on the basis of the Negative Binomial distribution.
Even though there are several objectives for this research, each is based on the assumption that the error structure follows a Negative Binomial distribution. First, this research describes how traffic engineers can apply the rate quality control method based on the Negative Binomial distribution. However, there are interchanges where this method cannot be applied because too few reference sites are available to allow an accurate evaluation. To solve this kind of problem, a method for identifying hazardous sites using a crash prediction model is proposed. The prediction model method can be used to evaluate a freeway interchange without reference sites, and to determine the sites in need of remedial action.

1.3 Structure of this dissertation

The background and objectives of this research have been discussed in the first chapter. The issues related to the distribution of crash occurrences are analyzed in Chapter 2.

In Chapter 3, the effort is focused on parameter calibration of the crash prediction models for freeway interchanges based on the Negative Binomial error structure. This chapter includes a description of the independent variables, such as traffic and geometric features, the model structures, methods for achieving convergence of the nonlinear regression models, and measures of model accuracy. In addition, sensitivity analyses of the models are discussed in this chapter.

Chapter 4 presents the problem that results from applying the Poisson error assumption in the existing rate quality control method, and develops a new framework for the rate quality control method on the basis of the Negative Binomial error assumption.

Chapter 5 focuses on how the rate quality control method based on the Negative Binomial distribution can be simplified through a Normal approximation for the convenience of the user. This chapter also demonstrates that the Normal approximation method produces the same results as the proposed rate quality control method.
Chapter 6 describes how the prediction model method can be used as an alternative for the identification of hazardous sites when the number of reference sites is insufficient to allow for significant results. Based on the prediction model method, about 200 interchanges along Michigan freeways are evaluated. A summary and conclusions occupy the last chapter of this dissertation.

Chapter 2 THE PROBABILITY DISTRIBUTION OF TRAFFIC CRASHES AT FREEWAY INTERCHANGES

2.1 General

The most appropriate distribution of crash occurrences is a fundamental question that often arises in the traffic safety field. For example, the Poisson distribution frequently appears in articles that identify hazardous locations using control limit charts, because of the simplicity that follows from the assumption that the variance is the same as the mean (Norden et al. 1956, Hauer 1996). It has also been recognized that the Poisson distribution provides a better fit to traffic crash data than the Normal distribution (Miaou et al. 1992, Jovanis and Chang 1993).

However, in studying injury severity to the front seat occupants of vehicles in crashes, Hutchinson and Mayne (1977) found that the numbers of injuries at different severity levels varied more from year to year than would be expected under the hypothesis of the Poisson distribution. When the variability is greater than expected under the Poisson law, the phenomenon is called over-dispersion. Issues related to over-dispersion are also implicit in the works of earlier researchers (Benneson and McCoy 1997, Vogt and Bared 1999).

Consequently, two distributions (Poisson and Negative Binomial) have been assumed for traffic crash occurrences. However, no researcher has yet provided a full discussion of the issue, even though the assumed probability distribution of crash occurrences is very important both for the identification of hazardous sites required by highway safety programs and for the calibration of crash prediction models.
For example, with the rate quality control method, a site is identified as hazardous if its observed crash rate exceeds the upper control limit, which is the mean crash rate of the reference sites plus a multiple of the standard deviation of the site crash rates (Stokes and Mutabazi 1996). Here, the standard deviation is equal to the square root of the mean for the Poisson distribution and to the square root of (mean + mean²/k) for the Negative Binomial distribution (Rice 1997).

Three distributions have generally been assumed for the calibration of traffic crash prediction models (constant Normal, Poisson and Negative Binomial). Recently, however, there has been an implicit agreement among traffic engineers that the Poisson or Negative Binomial distribution is a more desirable assumption than the constant Normal distribution. Crash prediction models with a heterogeneous error structure, such as the Poisson or Negative Binomial distribution, are generally calibrated using weighted least squares (Seber and Wild 1989). In weighted least squares regression, data points are weighted by the reciprocal of their variances. Thus, in calibrating traffic crash models, the assumed error structure is a critical issue in determining the accuracy of the coefficients.

Because of the importance of the distribution of crash occurrences, the year to year variability in the number of crashes is examined and discussed in this chapter.

2.2 Concept of the Poisson distribution and Negative Binomial distribution

The Poisson distribution is often the first option considered for random counts; it has the property that the mean of the distribution is equal to the variance (Rice 1997), and it has the following frequency function:

$$ p(X = x) = \frac{e^{-m}\, m^{x}}{x!} \tag{2.1} $$
where m = mean.

However, when the variance of the counts is substantially larger than the mean, consideration is given to the Negative Binomial distribution, which is a discrete distribution with the following frequency function (Rice 1997):

$$ f(x \mid m, k) = \left(1 + \frac{m}{k}\right)^{-k} \frac{\Gamma(k + x)}{x!\,\Gamma(k)} \left(\frac{m}{m + k}\right)^{x} \tag{2.2} $$

where m = mean and k = negative binomial parameter.

2.3 Phenomena of over-dispersion over time

In examining the freeway interchange crash data over time, there appeared to be more variability than would be expected under the hypothesis of the Poisson distribution. Large variability could be expected because many factors cause the annual crash frequency to vary, including maintenance activities, the weather and traffic changes. The Negative Binomial distribution might be considered as a model for situations in which the rate varies over time or space (Rice 1997). It has been used to explain various physical phenomena, for example the distribution of insect counts when the insects hatch from depositions of larvae (Rice 1997). Thus, it is not unusual to apply the Negative Binomial distribution in analyzing discrete random counts.

Two kinds of data sets were used to test for over-dispersion. One is the number of crashes classified by type; the other is the number of crashes per interchange per year across 84 interchanges. The analyses of over-dispersion were performed for crashes during the five-year period 1994-1998.

2.3.1 Analysis by crash types

To test for over-dispersion of the crashes that occurred at freeway interchanges, the crash frequencies of each of 24 types of crashes were obtained separately for each of the five years from 1994 to 1998. For each crash type, the mean and the variance of the annual number of crashes were calculated from the five annual counts. To test whether the crash occurrences follow the Poisson distribution, the observed variances of the annual number of crashes were plotted against the annual mean values.
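As a minimal illustration (not part of the original analysis), the frequency functions (2.1) and (2.2) and the mean-versus-variance check described above can be sketched in Python. The function names, the truncation points of the numerical sums, the values m = 5 and k = 2, and the five annual counts are all hypothetical and chosen only for illustration; they are not data from this research.

```python
import math
import statistics


def poisson_pmf(x, m):
    """Equation (2.1), exp(-m) * m**x / x!, evaluated in log space."""
    return math.exp(-m + x * math.log(m) - math.lgamma(x + 1))


def negbin_pmf(x, m, k):
    """Equation (2.2) with mean m and parameter k, evaluated in log space."""
    log_p = (-k * math.log1p(m / k)
             + math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
             + x * math.log(m / (m + k)))
    return math.exp(log_p)


def mean_and_variance(pmf, upper):
    """Mean and variance of a frequency function summed over 0..upper-1."""
    mean = sum(x * pmf(x) for x in range(upper))
    var = sum((x - mean) ** 2 * pmf(x) for x in range(upper))
    return mean, var


# Numerical check of the variance properties (m and k are illustrative):
# Poisson variance equals m; Negative Binomial variance equals m + m**2 / k.
pois_mean, pois_var = mean_and_variance(lambda x: poisson_pmf(x, 5.0), 200)
nb_mean, nb_var = mean_and_variance(lambda x: negbin_pmf(x, 5.0, 2.0), 500)

# Hypothetical annual counts of one crash type over five years.
annual_counts = [120, 95, 140, 110, 160]
m_hat = statistics.mean(annual_counts)
s2 = statistics.variance(annual_counts)  # sample variance, n - 1 divisor
overdispersed = s2 > m_hat               # Poisson would need s2 close to m_hat
```

Repeating the last three lines for every crash type would give one (mean, variance) point per type, which is the kind of point plotted against the theoretical variance line.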
Therefore, there are 24 points corresponding to the 24 types of crashes. In Figure 2.1, the solid line is the variance that would be expected on the hypothesis of the Poisson distribution. If the Poisson distribution is a good fit, the observed variances should lie along the solid line. However, the figure shows that there is larger variability than would be expected under the Poisson distribution. There is a much larger variability in the most common types of crashes (rear end, sideswipe) than in the less common types of crashes (backing, fixed object). This phenomenon was discussed in previous research (Hutchinson and Mayne 1977).

Noting that the Negative Binomial distribution is an alternative to reflect the phenomenon of over-dispersion, the maximum likelihood estimate of k was determined to be about 71 by fitting the data to the Negative Binomial distribution. In Figure 2.2, the solid line is the variance that would be expected on the hypothesis of the Negative Binomial distribution. This figure shows that the Negative Binomial distribution fits the data much better than the Poisson distribution shown in Figure 2.1.

[Figure 2.1: observed variance plotted against the mean annual number of crashes for the 24 crash types, with the variance expected under the Poisson distribution as a solid line.]

[Figure 2.2: observed variance plotted against the mean annual number of crashes for the 24 crash types, with the variance expected under the Negative Binomial distribution as a solid line.]

2.3.2 Analysis by annual crash frequency per interchange (Diamond interchanges)

To see how widely this relationship applies, a similar approach was used for Diamond interchanges, which are the most common type of freeway interchange in Michigan. The variance and the mean annual number of crashes were calculated from the total number of crashes that occurred on the same 84 interchanges from 1994 through 1998.
The observed variances in the annual numbers of crashes were also plotted against the mean annual numbers, with a data point corresponding to each of the 84 interchanges. In Figure 2.3, the solid line is the variance that would be expected on the hypothesis of the Poisson distribution, and, as in the previous case, there is greater variability than expected under the Poisson distribution. When the data were fit to the Negative Binomial distribution, the maximum likelihood estimate for k was found to be about 21. Figure 2.4 shows that the Negative Binomial distribution fits the data much better than the Poisson distribution.

[Figure 2.3: observed variance plotted against the mean annual number of crashes for the 84 interchanges, with the variance expected under the Poisson distribution as a solid line.]

[Figure 2.4: observed variance plotted against the mean annual number of crashes for the 84 interchanges, with the variance expected under the Negative Binomial distribution as a solid line.]

2.3.3 The results

As quantitative support for these results, correlation coefficients and squared residuals were calculated for the data in Figure 2.1 through Figure 2.4. As shown in Table 2.1, the correlation coefficients between the observed and the expected variances increased from 0.91 to 0.97 in the analysis of the 24 crash types and from 0.84 to 0.90 in the analysis of annual total crashes when the Negative Binomial distribution was assumed. Squared residuals were calculated using the observed variances and expected variances. The residuals were reduced by more than 80 % when the Negative Binomial distribution was assumed, as shown in Table 2.1. Thus, we can conclude that the Negative Binomial distribution is a more reasonable assumption for the distribution of freeway interchange crashes than the Poisson distribution.
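The comparison described above can be sketched as follows. The variance formulas come from the text (the mean for the Poisson distribution, mean + mean^2/k for the Negative Binomial distribution), and k = 71 is the estimate reported for the crash-type data. The `means` and `observed` values are hypothetical placeholders, since the underlying data points are not listed in this section, so the printed numbers do not reproduce Table 2.1.

```python
import math


def expected_variance(mean, k=None):
    """Poisson: variance = mean. Negative Binomial: mean + mean**2 / k."""
    if k is None:
        return mean
    return mean + mean ** 2 / k


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sum_squared_residuals(observed, expected):
    """Sum of squared differences between observed and expected variances."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


# Hypothetical mean annual crashes and observed variances (not study data).
means = [10.0, 50.0, 200.0, 1000.0]
observed = [12.0, 90.0, 800.0, 15000.0]

poisson_fit = [expected_variance(m) for m in means]
negbin_fit = [expected_variance(m, k=71.0) for m in means]

ssr_poisson = sum_squared_residuals(observed, poisson_fit)
ssr_negbin = sum_squared_residuals(observed, negbin_fit)
reduction = 1.0 - ssr_negbin / ssr_poisson

print(correlation(observed, poisson_fit), correlation(observed, negbin_fit))
print(f"squared residuals reduced by {reduction:.0%}")
```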
Table 2.1 The correlation and residual values according to the distribution

                           Poisson        Negative Binomial
                           Correlation    Correlation    Squared
                           coefficient    coefficient    residual
Accident type              0.91           0.97           87 % lower
Annual crash frequency     0.84           0.90           84 % lower

Chapter 3

A TRAFFIC CRASH PREDICTION MODEL FOR FREEWAY INTERCHANGES

3.1 General

There have been several studies whose purpose was to develop crash prediction models using the relationship between traffic crashes and various independent variables. In all such studies, the first issue is the selection of the independent variables.

Using characteristics of a county, Maleck (1980) and Tarko et al. (1996) developed models for predicting the expected annual crashes for a county. Independent variables in these models consist of a subset of the following factors: the number of licensed drivers, the number of registered vehicles, population, median family income, road mileage, and the percentage of all roads that are state roads.

McGuigan (1981), Maher and Summersgill (1996), Persaud and Nguyen (1998), Rodriguez and Sayed (1999), Bonneson and McCoy (1997), Lau and May (1988), and Belanger (1994) developed crash prediction models for signalized or unsignalized intersections. These models include one or more of the following independent variables: major road traffic volume, minor road traffic volume, pedestrian volume and channelization on the main road. The main road traffic and minor road traffic have been found to be the most significant variables.

Hauer and Griffith (1994), Vogt and Bared (1999), Seder and Livneh (1981), and Mountain et al. (1996) developed crash prediction models for road sections using only the traffic volume. In addition, Hauer and Persaud (1987) used traffic volume and train volume for crash models of rail-highway grade crossings, and Miaou et al. (1992) modeled truck crashes using geometric characteristics and truck ADT.

A few researchers modeled the effects of independent variables on traffic crashes on freeways.
Kim (1989) used interchange types, traffic volume, population and the number of ramps to develop a crash prediction model for freeway interchanges. All of these models would be classified as macroscopic models because they use average daily traffic (ADT), rather than the traffic volume at the time of the crash. Persaud and Dzbik (1993) developed a microscopic model to estimate crashes on freeway sections. Microscopic models relate crash occurrences to the specific flow at the time of the crash rather than to the average daily traffic (ADT). Hence a freeway with intense flow during rush hour periods would have a higher crash potential than a freeway with the same ADT, but with flow more evenly distributed during the day.

As noted above, traffic volume is considered the main contributing factor in predicting traffic crashes in most of the models, with additional geometric variables chosen based on the objective of modeling.

The second issue in the development of an accident prediction model is how to calibrate the model parameters, which usually depend on the error structure. There are two approaches that are often used when calibrating model parameters. One is a conventional linear regression approach, with its assumption of a normally distributed and homogeneous error structure. The linear regression approach has been recognized to be lacking the distribution properties to adequately describe the discrete, nonnegative, and sporadic traffic crash events with a low mean value (Mahalel 1986, Miaou and Lum 1993). Before the Poisson approach was introduced, most models were developed on the basis of multiple linear regression, with the assumption of a normal distribution. For example, McGuigan (1981), Kim (1989), and Lau and May (1988) used the normal error structure to calibrate their crash prediction models.

The other approach is the use of a regression model with a non-normal and heterogeneous error structure.
These include the Poisson, Negative Binomial and Gamma distributions. It has been generally recognized that crash frequencies better fit a model using the assumption of a Poisson distribution rather than a Normal distribution. For example, Miaou et al. (1992, 1994) proposed the Poisson model to develop the relationship between truck crashes and geometric design. Jovanis and Chang (1993) also used the Poisson model to relate crashes to mileage and environmental variables.

However, the Poisson model also has its weakness. For example, the Poisson model assumes that the variance is the same as the expected number, and hence it cannot reflect the phenomenon of "over-dispersion" which often occurs in traffic crashes. In order to overcome this problem, Persaud and Nguyen (1998), and Rodriguez and Sayed (1999) have proposed regression models with the Negative Binomial error structure to predict signalized intersection crashes.

The phenomenon of over-dispersion in freeway crashes has been verified and discussed in chapter 2. In this chapter, a crash prediction model for freeway interchanges will be developed under the assumption of a Negative Binomial error structure.

3.2 Dependent variable description

The focus on freeway interchange crashes requires a working definition of the boundary of an interchange. In this study, the interchange is composed of ramps and mainlines. The ramps include on-ramps and off-ramps, and the mainlines are defined as the section within 500 feet from the beginning of the off-ramp to 500 feet from the end of the on-ramp, as shown in Figure 3.1. This definition is the same as that of the Michigan DOT interchange inventory file.

The crashes on cross roads are not included in this study because of the practical barrier that traffic volume for the cross road is not available, and the engineering intuition that the crashes on the cross road may have very different characteristics (i.e., low severity, high percentage of angle crashes).
May (1964) found that there is little to be gained by using a study period longer than three years. Subsequently, many previous researchers have used three years of crash data in developing crash prediction models (Miaou and Lum 1993, Bonneson and McCoy 1993, Persaud and Nguyen 1998). Noting that data older than three years may not reflect the current situation, the number of crashes that occurred in the past 3 years (1996 through 1998) is used as the dependent variable for this study. The accident rate will not be used as the dependent variable since accurate volume data for each element of the interchange are not available.

The original source of the crash data is the "Official Michigan Traffic Accident Report" (form UD-10). The crash data are summarized in section 3.2.2.

3.2.1 Classification by Interchange type

A lack of homogeneity refers to the understanding that different relationships may hold between variables on the basis of the values of various characteristics (i.e., geometry, control, traffic, and so on). In many cases, tree structures, which are easily understood and interpreted, are built describing the main factors and the interactions between factors (Lau and May 1988). However, tree structures can be used only in the case of large samples, and hence this method may be inadequate in developing crash prediction models for freeway interchanges, even though it is a conceptually powerful and systematic tool.

In this study, a total of 199 interchanges are grouped into 10 categories as shown in Table 3.1. We cannot classify the interchanges more specifically because of the limitation of sample sizes, even though the Michigan interchange inspection file includes 22 categories of interchanges. In the approach to grouping interchanges, the independent variables (i.e., traffic volume, ramp length, etc.) were explicitly excluded from the features which were used in the classification of interchange types.
As shown in Table 3.1, the number of type 11 and type 31 interchanges is relatively large compared with those of other types.

[Figure 3.1: boundary of the interchange, showing the mainline section from 500 feet before the beginning of the off-ramp to 500 feet beyond the end of the on-ramp.]

Table 3.1 Interchange classification

CLASSIFICATION            TYPE      INTERCHANGE TYPE                               SAMPLE SIZE
1. DIAMOND INTERCHANGE    Type 11   Diamond                                        34
                          Type 12   Tight Diamond; Modified Tight Diamond          19
                          Type 13   Partial Diamond; Partial Tight Diamond         24
                          Type 14   Split Diamond; Modified Diamond                14
2. T-INTERCHANGES         Type 21   Trumpet-A; Trumpet-B                           9
3. CLOVER LEAFS           Type 31   Partial Clover A; Partial Clover B;            41
                                    Partial Clover A 4 Quadrant;
                                    Partial Clover B 4 Quadrant
                          Type 33   Partial Clover AB;                             21
                                    Partial Clover AB 4 Quadrant
                          Type 35   Clover; Clover with CD                         8
4. DIRECTIONAL            Type 41   Full Directional; Partial Directional;         21
                                    Directional Y; Partial Directional Y
5. OTHERS                 Type 51   Others                                         8
TOTAL                                                                              199

3.2.2 Crash data summary

3.2.2.1 Summary of crashes per interchange

The summary statistics describing the crashes that have occurred over 3 years in each interchange are provided in Table 3.2. As listed in the table, an average of 126 crashes occurred in each interchange, 28 % of which were injury crashes. The average number of crashes is highest in Directional interchanges, and lowest in T-interchanges.

Table 3.2 Summary of crashes per interchange (1996~1998)

                             Total crashes            Injury crashes (include fatal crashes)
Interchange type             Max    Min    Average    Max    Min    Average
Diamond         Type 11      321    24     132        93     6      39
                Type 12      492    42     123        156    6      33
                Type 13      252    18     120        84     3      33
                Type 14      393    24     99         135    3      27
T-interchange   Type 21      156    21     75         69     6      24
Clover-leaf     Type 31      402    33     135        99     6      33
                Type 33      237    24     84         54     3      21
                Type 35      405    51     168        138    12     48
Directional     Type 41      408    21     186        111    3      54
Others          Type 51      408    21     180        45     6      21
Total                        492    18     126        156    3      36

3.2.2.2 Summary of injury data

Figure 3.2 shows the relationship between total crashes and injury crash percentage.
As shown in the figure, the smaller the total number of crashes, the greater the scatter of injury crash percentage. Therefore, total crashes are a more reliable dependent variable than injury crashes, because there is always implicit variability in injury crashes. In the case of interchanges with a small number of crashes, this variability may distort the modeled effects of the independent variables on crashes.

Table 3.3 contains summary statistics of injury crashes that occurred in the past 3 years. It is not surprising that the percent of injury crashes is relatively high for T-interchanges and Directional interchanges (30.8 % and 29.2 % respectively), considering that the vehicle operating speeds on these types of interchanges are high compared with those on other types of interchanges.

The coefficient of variation V(x) is a stable measure of the variability of a random variable x, which is defined as the ratio of its standard deviation to its mean (Harr 1996):

\[ V(x) = \frac{\sigma_x}{\bar{x}} \times 100 \ (\%) \]

The higher the coefficient of variation V(x), the greater will be the scatter. As a rule of thumb, coefficients of variation below 15 % are thought to be low, between 15 and 30 % moderate, and greater than 30 % high (Harr 1996).

As shown in the last row of Table 3.3, the coefficient of variation of injury percent across the interchange types is 10.8 %, which is low. This implies that interchange types are related to the number of crashes, but not to the severity of the crashes. Thus, for this study, the total number of crashes is used as the dependent variable for the development of traffic crash prediction models.

[Figure 3.2: injury crash percentage plotted against total crashes per interchange.]
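The coefficient of variation and the rule of thumb quoted from Harr (1996) can be sketched as below. This is a minimal illustration: the use of the sample standard deviation is an assumption (the text does not say which estimator was used), the treatment of values exactly at 15 % and 30 % is also an assumption, and the input values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """V(x) = standard deviation / mean * 100, in percent.

    Assumption: the sample standard deviation (n - 1 divisor) is used.
    """
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def classify_scatter(cv_percent):
    """Rule of thumb: below 15 % low, 15 to 30 % moderate, above 30 % high."""
    if cv_percent < 15.0:
        return "low"
    if cv_percent <= 30.0:
        return "moderate"
    return "high"


# Hypothetical percentages, used only to exercise the functions.
example = [10.0, 12.0, 14.0]
cv = coefficient_of_variation(example)
print(round(cv, 1), classify_scatter(cv))
```

Applied to the values reported in this chapter, `classify_scatter(10.8)` falls in the low band, consistent with the interpretation given for injury percent.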
Table 3.3 Injury percent by interchange type (1996~1998)

Interchange type             Total crashes    Injury crashes    Injury (%)
Diamond         Type 11      4479             1272              28.4
                Type 12      2211             600               27.1
                Type 13      2886             822               28.5
                Type 14      1380             393               28.5
T-interchange   Type 21      681              210               30.8
Clover-leaf     Type 31      5388             1380              25.6
                Type 33      1779             453               25.5
                Type 35      1347             381               28.3
Directional     Type 41      4074             1188              29.2
Others          Type 51      699              177               25.3
Total                        24924            6876              27.6
V(x)                         -                -                 10.8

3.2.2.3 Summary of mainline and ramp crashes

Table 3.4 presents a statistical summary of mainline and ramp crashes that occurred from 1996 to 1998. Ramp crashes account for about 4300 of the total of about 25000 crashes, or about 17 %. There is a large variability in the percent of ramp crashes across the interchange types, as shown in the table. That is, the coefficient of variation is 344 %, which is extremely high. This implies that different explanatory variables are needed when developing crash prediction models by interchange type.

Table 3.5 presents data on the crash type according to the interchange type. In our sample sites, rear end crashes account for 39.7 % of total crashes. Rear end crashes are especially high in Type 11 (Diamond) and Type 35 (Clover leaf) interchanges, and low in Type 33 (Partial Clover AB or Partial Clover AB 4 Quadrant). Fixed object and sideswipe crashes are 20.9 % and 14.1 %, respectively, as shown in the table. The coefficients of variation of the percentage of each crash type across interchange types range from 53 % to 172 %, which are high. Accordingly, one recognizes that the different types of interchanges are associated with different types of crashes.

It is very important to analyze crash type by interchange type because the crash type provides clues for the treatment of a hazardous site. For example, if there were a high percent of sideswipe crashes in an interchange, traffic engineers would analyze the merge section in detail to find the solution.
If there were many rear end crashes at an interchange, one possibility is that the ramp length is too short for vehicles to accelerate to freeway speeds before they enter the mainline. If there are many overturn crashes at an interchange, one possibility is that there may be an imbalance between the radius of the ramp curve and the exit speed limit onto the ramp. Thus it is valuable to classify crashes according to the crash type.

Table 3.4 Summary of mainline and ramp crashes (1996~1998)

                             Total      Mainline            Ramp
Interchange type             crashes    Crashes    %        Crashes    %
Diamond         Type 11      4479       3780       84.4     699        15.6
                Type 12      2211       1872       84.7     339        15.3
                Type 13      2886       2634       91.3     252        8.7
                Type 14      1380       1329       96.3     51         3.7
T-interchange   Type 21      681        486        71.4     195        28.6
Clover-leaf     Type 31      5388       4323       80.2     1065       19.8
                Type 33      1779       1470       82.6     309        17.4
                Type 35      1347       960        71.3     387        28.7
Directional     Type 41      4074       3135       77.0     939        23.0
Others          Type 51      699        642        91.8     57         8.2
Total                        24924      20631      82.8     4293       17.2
V(x)                         -          -          -        -          344

Table 3.5 Summary of the crash types (1996~1998)

                             Total      Rear end        Fixed obj       Sideswipe       Others
                             crashes                    (overturn)
Interchange type                        #       %       #       %       #       %       #       %
Diamond         Type 11      4479       2247    50.2    908     20.3    463     10.3    861     19.2
                Type 12      2211       910     41.2    450     20.3    360     16.3    492     22.2
                Type 13      2886       963     33.4    615     21.3    467     16.2    842     29.2
                Type 14      1380       604     43.7    321     23.3    102     7.4     353     25.6
T-interchange   Type 21      681        205     30.0    184     27.1    88      12.9    205     30.0
Clover-leaf     Type 31      5388       2122    39.4    1275    23.7    816     15.1    1175    21.8
                Type 33      1779       400     22.5    434     24.4    374     21.0    571     32.1
                Type 35      1347       694     51.5    234     17.4    130     9.7     289     21.5
Directional     Type 41      4074       1527    37.5    625     15.3    590     14.5    1332    32.7
Others          Type 51      699        233     33.3    153     21.9    117     16.8    196     28.0
Total                        24924      9905    39.7    5199    20.9    3509    14.1    6314    25.3
V(x)                                            172             53              118             85

3.3 Independent variable description

Independent variables used for this study consist of traffic data and geometric data. The traffic data are: 1) Mainline traffic volume, 2) Ramp traffic volume, and 3) Truck traffic volume and truck percent.
Average daily traffic (ADT) on mainlines of freeways has been shown to be an important contributing factor in predicting interchange traffic crashes. The Michigan Department of Transportation (MDOT) maintains about 100 permanent traffic recorders located at various sites throughout the state. The traffic volume data at these counter locations are used to estimate the ADT on all highway segments each year.

Ramp ADT is also considered to be an important independent variable for model development. The ramp ADT are traffic volumes on every on and off ramp (including loops) within the Ramp Counting program jurisdiction (Detroit Metropolitan area, Flint, Lansing, Grand Rapids, Jackson, etc.). Any missing ramp data are estimated by reviewing previous years' traffic volumes and adjacent ramps. This adjustment implies an assumption that if traffic exits a freeway, it will return through the same intersection, going the opposite way.

Truck percent was also included, based on engineering intuition that truck ADT and mainline ADT, or truck ADT and ramp ADT, may have the same mechanistic origin, which causes multicollinearity in crash prediction models.

Geometric data were obtained from the sufficiency rating file (1994) and the freeway interchange inventory file (1997), which are maintained by the Michigan DOT. Table 3.6 presents all variables that are intuitively thought to affect crash frequency and that could be obtained. An analysis of variance (ANOVA) of all independent variables was performed to determine which variables have a significant effect on the dependent variable (i.e., crash frequency). The results of this preliminary analysis are discussed in detail in section 3.4.
Table 3.6 Classification of independent variables

                     Independent variables
                     Variable type 1                         Variable type 2
Traffic effects      Mainline traffic (ADT)
                     On ramp traffic (ADT)
                     On and off ramp traffic (ADT)
                     Truck traffic (truck ADT)
                     Truck percent (%)
Geometric effects    Interchange length (miles)              Number of lanes
                     Average spread-ramp length (miles)      Number of on ramps
                     Average loop-ramp length (miles)        Total number of ramps
                                                             Shoulder width (feet)
                                                             Lighting condition

3.4 Preliminary analyses of variables

3.4.1 Correlation analysis

There is an implicit assumption in statistical model development that the independent variables are mutually independent. It is generally accepted that multicollinearity exists when a linear combination of independent variables is highly correlated, and that it then becomes difficult to identify the effects of individual independent variables on the dependent variable (Neter et al. 1992, Seber and Wild 1989). Therefore, explanatory variables with low collinearity should be selected in the process of modeling.

To evaluate the mutual independence between variables, a correlation table was produced. As shown in Table 3.7, some of the independent variables are identified as relatively highly correlated. For example, the correlation between the ramp traffic volume and the interchange size is 0.454, and the correlation between the mainline traffic volume and shoulder width is -0.411. These correlations are not high enough for the variables to be excluded in the first stage of model development. However, these variables are dealt with carefully in the detailed process of modeling.

[Table 3.7: correlation coefficients among the independent variables.]

3.4.2 Analysis of variance (ANOVA)

Analysis of variance (ANOVA) techniques are useful tools for analyzing the statistical relationship between a dependent variable and independent variables. In fact, these may be considered as a special case of linear regression. However, ANOVA models allow analyses of statistical relations from a different perspective than linear regression, and therefore are widely used. In this section ANOVA is used for the preliminary analyses of the relationship between the independent variables and the dependent variable.

The independent variables are categorized into several groups before the ANOVA models are applied (e.g., for mainline ADT, 1: under 10000, 2: 10000~15000, 3: 15000~20000, 4: over 20000). The next step is to test whether or not the category means \mu_j are equal. The hypotheses for this test are the following (Neter et al. 1992):

\[ H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_r \]
\[ H_1: \text{not all } \mu_j \text{ are equal} \]

Here, H0 implies that all of the probability distributions have the same mean, and thus there are no factor effects. The alternative H1 implies that the means are not equal, and hence that there are factor effects. The F-test statistic and p-value are used as a decision rule for this test, and the statistical package SPSS (version 9.0) is used to carry out the ANOVA.

3.4.2.1 ANOVA for traffic effects

When \alpha = 0.05, F(0.95; 3, 195) is equal to 2.65. For mainline ADT from Table 3.8, the F-test statistic = 17.578 > 2.65. Thus we conclude H1, that the mean crash frequency is not the same for the different mainline ADT categories.
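The F-test statistic used in this decision rule is the ratio of the between-category mean square to the within-category (error) mean square. The sketch below computes it for a one-way layout; the function name and the small data set are hypothetical and serve only to illustrate the arithmetic. The study itself used SPSS.

```python
def one_way_anova(groups):
    """Return (F statistic, hypothesis d.o.f., error d.o.f.) for a one-way layout."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-category and within-category sums of squares.
    ss_between = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )
    ss_within = sum(
        (y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical crash counts in three ADT categories (not study data).
f_stat, df_h, df_e = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_h, df_e)

# The mean squares reported for mainline ADT give the tabulated statistic.
print(round(12887 / 733, 2))
```

With 4 mainline ADT categories and 199 interchanges, the degrees of freedom are 4 - 1 = 3 and 199 - 4 = 195, which matches the F(0.95; 3, 195) critical value quoted above.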
Similarly, the ANOVA of ramp ADT and truck percent results in the same interpretation as that of mainline ADT. However, for truck ADT, the F-test statistic 0.244 is less than the critical value of 3.04, and hence we conclude H0, that the mean crash frequencies are the same for different truck ADT. The large p-value of the test in this table provides strong evidence that the sample data are in accord with equal mean frequencies for the different truck ADT. Mainline ADT, ramp ADT, and truck percent are thus expected to be contributing factors in the crash prediction models.

Table 3.8 ANOVA for traffic effects

Source of variance             d.o.f    Mean      F-test                              P-value
                                        square    Statistic    Critical value
                                                               (\alpha = 0.05)
Mainline ADT     Hypothesis    3        12887     17.578       2.65                   0.000
                 Error         195      733
Ramp ADT         Hypothesis    2        28635     45.134       3.04                   0.000
                 Error         196      634
Truck ADT        Hypothesis    2        225       0.244        3.04                   0.784
                 Error         196      924
Truck percent    Hypothesis    2        10434     12.722       3.04                   0.000
                 Error         196      820

3.4.2.2 ANOVA for geometric effects

Table 3.9 presents the results of ANOVA for geometric effects. For the variables of interchange size and average spread ramp length, the F-test statistics are 6.760 and 3.901, respectively, which exceed the critical value of 3.04. This implies that the mean crashes are not the same for the different lengths of interchange, or the different lengths of spread ramps. However, for average loop ramp length, the F-test statistic 0.146 is very small compared to the critical value of 3.11, and hence we conclude H0, that the mean crashes are the same for the different lengths of loop ramps. The large P-value of the test in this table (0.703) supports this conclusion.

On the other hand, the number of lanes and shoulder width are expected to be important independent variables for the prediction models, based on F-test statistics that exceed the critical values at \alpha = 0.05.
However, for lighting, the F-test statistic (1.953) is less than the critical value of 3.04, and hence we cannot conclude that mean accident frequencies are not the same for the different lighting conditions. In addition, the F-test statistic for the number of on and off ramps is 1.818, which is close to the critical value of 1.93.

Thus, the number of on and off ramps, the number of lanes, shoulder width, interchange length and average spread ramp length are expected to be contributing factors. However, there are no factor effects caused by lighting condition and average loop ramp length, and thus no further analysis which includes these variables is required.

Table 3.9 ANOVA for geometric effects

A. Variable type 1

Source of variance                        d.o.f    Mean      F-test                              P-value
                                                   square    Statistic    Critical value
                                                                          (\alpha = 0.05)
Interchange length          Hypothesis    2        5860      6.760        3.04                   0.001
                            Error         196      866
Average loop ramp length    Hypothesis    2        115       0.146        3.11                   0.703
                            Error         83       782
Average spread ramp length  Hypothesis    2        3565      3.901        3.04                   0.021
                            Error         193      9112

B. Variable type 2

Number of on and off ramps  Hypothesis    9        1608      1.818        1.93                   0.067
                            Error         189      884
Number of lanes             Hypothesis    4        2206      2.477        2.42                   0.046
                            Error         194      890
Shoulder width              Hypothesis    1        17458     20.950       3.89                   0.000
                            Error         197      833
Lighting                    Hypothesis    2        1703      1.953        3.04                   0.144
                            Error         196      872

3.5 Model structure

Model structure is another issue in building an accident prediction model. However, it is very difficult to choose the form of model equations because modeling remains, partly at least, an art (McCullagh and Nelder 1989). There are, however, some principles related to model structures, which are summarized as follows (McCullagh and Nelder 1989):

• A good model is one that fits the observed data very well.
• Simplicity is a desirable feature of any model; we should not include parameters that we do not need.
• Models should make sense intuitively.
• If main effects are found from several studies bearing on the same phenomenon, the main effects should usually be included whether significant or not.

The above principles were used in the process of choosing model structures for this study.

There are a few research papers on freeway interchanges, as mentioned in section 3.1. But these may not be appropriate guides for this study, since the models are based on a normally distributed and homogeneous error structure. For this reason, the findings from studies related to traffic crash estimation at intersections have been reviewed, based on the engineering intuition that the crash patterns at interchanges would be similar to those at intersections. Several studies (Maher and Summersgill 1996, Persaud and Nguyen 1998, Bonneson and McCoy 1997, Vogt and Bared 1999) mainly proposed nonlinear relations, with traffic volume belonging to the main effect group among the various variables.

To confirm the model structure, the cross tabulation between crash frequency and traffic volume was produced as shown in Table 3.10. This approach was performed in a similar manner by Bonneson and McCoy (1993), and Hauer et al. (1988). In Table 3.10, the traffic ranges were selected such that the same traffic ranges are located in each row, or each column, in order to obtain equal weight in calculating the average number of crashes per interchange. Therefore, 52 interchanges with traffic volumes that exceed these ranges were excluded in building the table. The cells give the average number of crashes that have occurred over 3 years at interchanges with the mainline volume and ramp volume given in the left-most column and the upper row.

A brief examination of the row and column summaries indicates a positive relation between crashes and both mainline volume and ramp volume, as shown in Figure 3.3 and Figure 3.4. However, the rate of increase may be different, depending on the traffic volume.
For example, while crashes increase over all ranges of mainline ADT, the increase is very small between mainline ADT 10000~15000 and 15000~20000, compared with other ranges of mainline ADT. This implies that the increase of crashes with mainline ADT is nonlinear, and the increase can be captured by a function such as V^B, where V is mainline ADT and B is a coefficient larger than 0.0.

Table 3.10 Cross tabulation of crashes by mainline volume and ramp volume

                   Ramp volume                                                          Summary
Mainline volume    5000~15000       15000~25000    25000~35000    35000~45000          Row
5000~10000         50 1)            88             66             62                   66
                   903 2)/18 3)     1233/14        132/2          186/3                2454/37
10000~15000        55               100            108            148                  98
                   721/13           2091/21        1624/15        1038/7               5474/56
15000~20000        57               122            90             133                  103
                   454/8            1095/9         270/3          1065/8               2884/28
20000~25000        116              170            178            175                  159
                   815/7            851/5          1420/8         1049/6               4135/26
Summary Column     63               108            123            139                  102
                   2893/46          5270/49        3446/28        3338/24              14947/147

1): Average number of crashes per interchange
2): Total crashes
3): The number of interchanges

[Figure 3.3 Mainline traffic volume and crashes: crashes plotted against ADT/lane for the ranges 5000~10000, 10000~15000, 15000~20000 and 20000~25000.]

[Figure 3.4 Ramp traffic volume and crashes: crashes plotted against ramp ADT for the ranges 5000~15000, 15000~25000, 25000~35000 and 35000~45000.]

We can also determine from Table 3.10 that there is a nonlinear relationship between crash frequency and traffic volume. For example, in the first column, crash frequencies increase sharply from 57 to 116 when the mainline volumes are changed from 15000~20000 to 20000~25000, whereas the crash frequencies increase only slightly (from 50 to 57) when the mainline volumes are changed from 5000~10000 to 15000~20000.
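Each cell average in Table 3.10 is the total number of crashes divided by the number of interchanges in that cell, and the comparison in the first column can be reproduced directly. In the sketch below, the totals and counts are read from the table; the first cell total is taken as 903, which is the reading consistent with the row and column totals, and the ramp volume range 5000~15000 is taken from the axis of Figure 3.4. The variable names are illustrative.

```python
# (total crashes, number of interchanges) for the first ramp-volume column
# of Table 3.10, keyed by mainline volume range.
first_column = {
    "5000~10000": (903, 18),
    "10000~15000": (721, 13),
    "15000~20000": (454, 8),
    "20000~25000": (815, 7),
}


def cell_average(total_crashes, n_interchanges):
    """Average number of crashes per interchange over the 3-year period."""
    return total_crashes / n_interchanges


averages = {
    rng: cell_average(total, n) for rng, (total, n) in first_column.items()
}
for rng, avg in averages.items():
    print(rng, round(avg))

# The increase across the two lowest steps is small compared with the last step.
low_step = averages["15000~20000"] - averages["5000~10000"]
high_step = averages["20000~25000"] - averages["15000~20000"]
print(round(low_step, 1), round(high_step, 1))
```

The unequal steps for equal increments of mainline volume are what motivates a power function of volume rather than a linear term.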
Similar patterns can be found in other cells in Table 3.10, which is conceptually consistent with the nonlinear product of flows raised to powers, formulated as follows:

\[ E(\theta) = A \times V_1^{B_1} \times V_2^{B_2} \qquad (3.1) \]

where, E(\theta) : Expected number of crashes
V_1 : Mainline volume
V_2 : Ramp volume
A, B_1, B_2 : Parameters

In principle, one should seek a model structure that best fits each interchange type. However, in this case, the model structure would be based on too small a sample size to allow for finesse. Therefore, we regard equation (3.1) as the basic model structure describing the main effects of traffic variables on the interchange crash frequency.

The range of geometric variables is also an issue in choosing the appropriate model structure. Previous research found that the expected number of crashes can be represented by a product of geometric variables raised to various powers (Mountain et al. 1996), or by an exponential applied to a linear function of the geometric variables (Vogt and Bared 1999, Maher and Summersgill 1996). The effect of the range of possible geometric variables cannot be evaluated efficiently, and hence iterative tests of the model structures were performed. The results showed that a product of variables raised to various powers is appropriate for variables of type 1 (such as the size of interchanges), whereas an exponential applied to a linear function is appropriate for variables of type 2 (such as the number of on and off ramps).
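A minimal sketch of the basic structure in equation (3.1) is given below. The parameter values are hypothetical and chosen only to show the behavior of the power form; they are not calibrated coefficients from this study.

```python
def expected_crashes(v1, v2, a, b1, b2):
    """Equation (3.1): E = A * V1**B1 * V2**B2.

    v1 is the mainline volume and v2 the ramp volume.
    """
    return a * v1 ** b1 * v2 ** b2


# Hypothetical parameters (assumptions for illustration only).
A, B1, B2 = 0.001, 0.6, 0.4

base = expected_crashes(10000.0, 20000.0, A, B1, B2)
doubled = expected_crashes(20000.0, 20000.0, A, B1, B2)

# Doubling the mainline volume multiplies the expected crashes by 2**B1,
# regardless of the starting volume or the ramp volume.
print(doubled / base, 2 ** B1)
```

With an exponent between 0 and 1 the expected crashes rise less than proportionally with volume, and with an exponent above 1 they rise more than proportionally, so the same form can describe either kind of nonlinearity.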
On the basis of the literature review, the principles of model structures, and the results of the analyses, the general model structure for this study was finally determined to be of the following form:

\[ E(\theta) = A \times V_i^{B_i} \times G_j^{C_j} \times \exp\left(\sum C_k G_k\right) \tag{3.2} \]

where,
$E(\theta)$ : Expected number of crashes
$V_i$ : Traffic variables
$G_j$ : Geometric variables (type 1)
$G_k$ : Geometric variables (type 2)
$A, B_i, C_j, C_k$ : Parameters

3.6 Model calibration and analysis

In section 3.4, the results of a preliminary analysis used to determine which variables have a significant effect on crash occurrences were discussed. The basic model structure proposed in section 3.5 includes the independent variables that were significant in the ANOVA. However, a variable can become insignificant when it is put into a nonlinear model structure stratified by interchange type, even though it was evaluated to be significant in the preliminary analysis, because the preliminary ANOVA was performed independent of the interchange type.

This issue is related to the simplicity of the model. Simplicity is a desirable feature of any model, as noted by McCullagh and Nelder (1989). This means that we should not include insignificant parameters in a model: not only does a simple model enable researchers to think about their data, but a model that involves only the correct variables also gives better predictions than one that includes unnecessary variables. In this stage, the irrelevant terms are excluded from the general model structure, and the models are calibrated through checks on the fit of a model to the data, for example by residuals and other statistics.

A nonlinear regression model was proposed in the preceding section, and it was verified in chapter 2 that the crash occurrences follow a Negative Binomial distribution. Therefore, it is necessary to calibrate the coefficients of the crash prediction models and the Negative Binomial distribution parameter k simultaneously.
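As a minimal sketch of how the general structure in equation (3.2) is evaluated, the following Python function multiplies the power terms and applies the exponential to the type 2 terms. The function name, the argument layout, and all numerical values are hypothetical and chosen only for illustration; they are not calibrated results.

```python
import math


def expected_crashes(A, traffic, geom_type1, geom_type2):
    """Evaluate E(theta) = A * prod(V_i^B_i) * prod(G_j^C_j) * exp(sum(C_k * G_k)).

    traffic, geom_type1: lists of (value, exponent) pairs
    geom_type2: list of (value, coefficient) pairs
    """
    power_terms = math.prod(v ** b for v, b in traffic + geom_type1)
    exp_term = math.exp(sum(c * g for g, c in geom_type2))
    return A * power_terms * exp_term


# Hypothetical inputs (not taken from the calibration results)
traffic = [(10.0, 1.2), (20.0, 0.3)]
base = expected_crashes(2.0, traffic, [(0.6, 0.7)], [(0.2, -1.0)])
doubled_type1 = expected_crashes(2.0, traffic, [(1.2, 0.7)], [(0.2, -1.0)])
shifted_type2 = expected_crashes(2.0, traffic, [(0.6, 0.7)], [(0.3, -1.0)])

# Doubling a type 1 variable scales E(theta) by 2**C_j;
# adding 0.1 to a type 2 variable scales E(theta) by exp(0.1 * C_k).
print(doubled_type1 / base, 2 ** 0.7)
print(shifted_type2 / base, math.exp(-0.1))
```

The two printed pairs illustrate the practical difference between the variable types: a type 1 variable acts through its relative change, whereas a type 2 variable acts through its absolute change.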
There are two methods used to calibrate nonlinear regression models with a heterogeneous error structure (such as the Negative Binomial distribution): transformation of the model and generalized linear models (GLIM). However, the transformation of models causes a change of scale in the data (Sever and Wild 1987, and McCullagh and Nelder 1989), which results in a violation of the Negative Binomial error assumption. Therefore, the analyses that follow are performed on the original scale of the data. This feature is a characteristic of generalized linear models (McCullagh and Nelder 1989). Previous researchers have suggested that generalized linear models can overcome the shortcomings of the conventional normally distributed error assumption in describing random, discrete and non-negative events, which often occur in the traffic crash field (Rodriguez and Sayed 1999).

3.6.1 Link functions for the Generalized Linear Model (GLIM) approach

Recognizing that traffic crashes follow the Negative Binomial distribution as mentioned in chapter 2, the GLIM approach is utilized for model calibration. The GLIM approach used herein is based on the work of McCullagh and Nelder (1989), and Lawless (1987). The generalized linear modeling technique introduces a link function $\eta$ that relates the linear equation to the expected value of an observation. This link function converts the nonlinear relationship into a linear one. On the other hand, there is a specific link function that is associated with the error structure of a distribution. This is defined as the natural link function.
For example, natural link functions can be described for the Normal, Poisson and Negative Binomial distributions as follows (McCullagh and Nelder 1989):

Normal : $\eta = E(\theta)$
Poisson : $\eta = \ln[E(\theta)]$
Negative Binomial : $\eta = \ln\left[\frac{E(\theta)}{K + E(\theta)}\right]$

In order to describe the use of the Poisson link function, equation (3.2) in section 3.5 can be changed into a linear predictor as follows:

\[ \eta = \ln[E(\theta)] = \ln\left[A \times V_i^{B_i} \times G_j^{C_j} \times \exp\sum(C_k \times G_k)\right] = \ln A + B_i \ln V_i + C_j \ln G_j + \sum(C_k G_k) \]

This is a linear predictive equation after applying the Poisson link function. However, our models for crash occurrence are based on the Negative Binomial distribution, and it is much harder to calculate a linear predictor from the natural link function for the Negative Binomial distribution. In fact, it is not algebraically possible to derive the linear predictor using the natural link function for the Negative Binomial distribution (Bonneson and McCoy 1997). Therefore, the Poisson link function is utilized instead, recognizing that the use of a natural link function is not a requirement for the GLIM approach (McCullagh and Nelder 1989).

In order to calibrate the prediction model, a dispersion parameter (Dp) will be utilized. That is, if Dp is greater than 1.0, then the data have a greater dispersion than is explained by the Poisson error assumption, and further analysis using the Negative Binomial error structure is required. In this case, the parameters are estimated in an iterative process using the maximum likelihood method. The model calibration procedures are explained in section 3.6.3.

3.6.2 Assessing the goodness of fit of the model

This section describes the measures used to assess the significance of the model.
For ease of understanding, the following notation is used:

$y_i$ : the observed number of crashes at a site i
$E(\theta)_i$ : the expected number of crashes at a site i
$\bar{E}(\theta)$ : the average expected number of crashes
$var(y_i)$ : estimated variance in crashes at a site i
$n$ : sample size
$p$ : the number of parameters

Several measures are used to assess the model fit and the significance of the model parameters, based on the studies of McCullagh and Nelder (1989), and Bonneson and McCoy (1997). One such measure is the generalized Pearson $\chi^2$ statistic, which is calculated as:

\[ \text{Pearson } \chi^2 = \sum_{i=1}^{n} \frac{\left(y_i - E(\theta)_i\right)^2}{var(y_i)} \]

where $var(y_i)$ is estimated from the variance equation of the Negative Binomial distribution shown in equation (2.2). McCullagh and Nelder (1989) indicate that the generalized Pearson $\chi^2$ statistic has the exact $\chi^2$ distribution for a Normal linear model, while asymptotic results are available for other distributions. The asymptotic results may not be relevant to statistics calculated from a small sample. Therefore, it sometimes cannot be used as an absolute measure for assessing the fit of a model.

A second measure of model fit is the Dispersion parameter (Dp), which can be calculated as:

\[ \text{Dispersion parameter } (D_p) = \frac{\text{Pearson } \chi^2}{n - p} \]

As shown in the above formula, Dp is obtained by dividing the Pearson $\chi^2$ by n - p. McCullagh and Nelder (1989) indicated that it is a useful measure for assessing the fit of a model. A Dp value near 1.0 means that the error assumption of the model is equivalent to that found in the observed data. If Dp is greater than 1.0, then the observed data have greater dispersion than is assumed in the model. This concept will be utilized in estimating the "k parameter" in the Negative Binomial distribution and the coefficients of the accident prediction models. This will be described in detail in the following section.
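The Pearson $\chi^2$ statistic and the dispersion parameter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Negative Binomial variance is taken to have the form mean + mean²/k, the observed and expected counts are made-up numbers, and the helper `find_k` is a hypothetical simplification that searches for the k giving Dp = 1 while holding the expected values fixed (the full calibration procedure also re-estimates the model coefficients).

```python
def nb_variance(mu, k):
    """Negative Binomial variance, assumed here to be mean + mean^2 / k."""
    return mu + mu ** 2 / k


def pearson_chi2(y, mu, k):
    """Generalized Pearson chi-square with Negative Binomial variances."""
    return sum((yi - mi) ** 2 / nb_variance(mi, k) for yi, mi in zip(y, mu))


def dispersion_parameter(y, mu, k, p):
    """Dp = Pearson chi-square / (n - p)."""
    return pearson_chi2(y, mu, k) / (len(y) - p)


def find_k(y, mu, p, lo=0.01, hi=1.0e6, iterations=100):
    """Bisection on k so that Dp = 1 (expected values held fixed).

    Dp increases with k because the assumed variance shrinks as k grows.
    """
    if not (dispersion_parameter(y, mu, lo, p) < 1.0 < dispersion_parameter(y, mu, hi, p)):
        raise ValueError("Dp = 1 is not bracketed by [lo, hi]")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dispersion_parameter(y, mu, mid, p) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical observed and expected crash counts (illustration only)
y = [30, 80, 55, 140, 60, 100, 20, 95]
mu = [50.0, 60.0, 70.0, 100.0, 80.0, 90.0, 40.0, 70.0]
p = 2

# k = infinity reduces the variance to the mean, i.e. the Poisson assumption
dp_poisson = dispersion_parameter(y, mu, float("inf"), p)
k_hat = find_k(y, mu, p)
dp_nb = dispersion_parameter(y, mu, k_hat, p)
print(dp_poisson, k_hat, dp_nb)
```

For these made-up data the Poisson assumption gives a Dp well above 1.0, which is the situation in which the text calls for the Negative Binomial error structure.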
The third measure of model fit is the coefficient of determination $R^2$, which can be calculated as:

\[ R^2 = 1 - \frac{SSE}{SST} \]

where

\[ SSE = \sum_{i=1}^{n} \left[E(\theta)_i - y_i\right]^2, \qquad SST = \sum_{i=1}^{n} \left[y_i - \bar{E}(\theta)\right]^2 \]

This measure is commonly used for the fit of a linear regression model based on the normally distributed error assumption. Nevertheless, this statistic can still be useful in assessing the model fit, recognizing the finding that the coefficient of determination $R^2$ is still efficient in assessing a model calibrated under a non-normal error structure (Kvalseth 1985).

The fourth measure of model fit is the Pearson Residual, which can be calculated as:

\[ \text{Pearson Residual } (PR_i) = \frac{E(\theta)_i - y_i}{\sqrt{var(y_i)}} \]

As shown in this formula, the Pearson Residual is defined as the difference between the predicted and observed data divided by the standard deviation. The Pearson Residual will be discussed again in section 3.6.5.

In addition to these measures, the standard error and t-value are used for assessing the significance of variable coefficients. The t-value is the ratio between the variable coefficient and its standard error. Detailed descriptions of these statistics are not presented here since the concepts are commonly applied in measuring the fit of linear regression models.

3.6.3 The procedures used in parameter calibration

The calibration of model parameters was performed based on the work of Lawless (1987). The calibration for this research is a multi-step process as shown in Figure 3.5.

First, the model parameters are estimated based on the Poisson error structure, in which the variance equals the expected value. Using the expected numbers calculated in the first step, the second step is to estimate the "k" parameter. If 1/k is not greater than 0.0, then there is no over-dispersion in the observed data and the procedure stops. If 1/k is greater than 0.0, then the third step is to calculate new model coefficients under the Negative Binomial error structure using the k from the second step.
In this step, the maximum likelihood estimates of the model coefficients are obtained by iterative weighted least squares. The final step is to calculate the Dispersion parameter (Dp). If Dp does not equal 1.0, the k parameter is increased (or decreased) and the procedure loops back to the third step. The analysis is repeated in an iterative manner until the Dispersion parameter (Dp) converges to 1.0.

Models with Negative Binomial errors cannot be calibrated using conventional statistical packages (e.g., SPSS, SYSTAT), and thus a statistical package for Generalized Linear Interactive Modeling (GLIM), which is specially designed to calibrate models with special types of errors (e.g., Negative Binomial, Poisson and Gamma), was used. Rodriguez and Sayed (1999) used a similar process in calibrating the traffic crash prediction models for urban unsignalized intersections.

Figure 3.5 The process to calibrate coefficients & k parameter
[Flowchart: (1) Poisson model calibration (variance = mean), giving the initial model coefficients; (2) residual analysis, giving the initial 1/k; if 1/k is not greater than 0, stop; otherwise (3) NB model calibration by maximum likelihood estimation using iterative weighted least squares, giving new model coefficients; (4) calculation of the dispersion parameter $D_p = \text{Pearson } \chi^2 / (n - p)$, where $\text{Pearson } \chi^2 = \sum_{i=1}^{n} (y_i - \mu_i)^2 / [\mu_i (1 + \mu_i / k)]$; if Dp has not converged to 1, 1/k is adjusted and step 3 is repeated.]

3.6.4 Results of the model calibration

On the basis of the procedures for assessing the model fit explained in the previous sections, the crash prediction models have been calibrated. The logarithmic link function has the following basic form, as mentioned in section 3.6.1:

\[ \ln[E(\theta)] = \ln A + B_i \ln V_i + C_j \ln G_j + \sum(C_k G_k) \tag{3.3} \]

This equation can be rewritten in a more useful form as:

\[ E(\theta) = A \times V_i^{B_i} \times G_j^{C_j} \times \exp\left(\sum C_k G_k\right) \tag{3.4} \]

where,
$E(\theta)$ : Expected number of crashes
$V_i$ : Traffic variables
$G_j$ : Geometric variables (type 1)
$G_k$ : Geometric variables (type 2)
$A, B_i, C_j, C_k$ : Parameters

The model calibration process starts with individual models according to the interchange types classified in section 3.2. Table 3.11 presents several statistics relating to the calibrated crash prediction model for interchange type 11. In determining the significance of the variable coefficients, the 95 percent confidence level is used with a few exceptions. In the second row of the table, the statistic for the constant term does not have any meaning since the logarithm results in a change of scale.

The table indicates that several variables have a significant effect on the frequency of interchange crashes. These variables are mainline traffic, ramp traffic, truck percent, interchange size, spread-ramp length, and shoulder width. However, the number of lanes and the number of total ramps are not included in this model because the effect of these variables is not significant. The calibrated coefficients can be applied to equation (3.4), the basic model structure, in order to predict the number of traffic crashes that would be expected over 3 years at interchanges of type 11.
The resulting model can be written as follows:

\[ E(\theta) = 3.448 \, V_1^{1.401} \, V_2^{0.186} \, V_3^{0.620} \, G_1^{0.738} \exp(-1.267 \, G_2 - 0.156 \, G_5) \]

where,
$V_1$ : Mainline traffic volume per lane
$V_2$ : Ramp traffic volume
$V_3$ : Truck percent
$G_1$ : Interchange length
$G_2$ : Average spread-ramp length
$G_5$ : Shoulder width

A k parameter of 8.05 is found to yield a dispersion parameter of 1.0. The Pearson $\chi^2$ is 28.84, and the degrees of freedom are 27 (n - p - 1 = 34 - 6 - 1). This statistic is less than $\chi^2_{0.05,\,27} = 40.11$, and hence the hypothesis that the model fits the data cannot be rejected. This implies that the model is consistent with the observed data. Several statistics associated with the calibrated crash prediction models for the other interchange types are included as Appendix 1.

Table 3.11 The results of crash prediction model calibration (Interchange type 11)

Coefficient   Variable definition                         Unit         Estimate   Std error   t-statistic
A             Constant                                    -            3.448
Log(A)                                                                 (1.238)    (0.67)      (1.85)
B1            V1 : Mainline traffic volume per lane       (ADT/1000)   1.401      0.30        4.66
B2            V2 : Ramp traffic volume                    (ADT/1000)   0.186      0.12        1.55
B3            V3 : Truck percent                          (%)          0.620      0.19        3.26
C1            G1 : Interchange length                     (Mile)       0.738      0.15        4.92
C2            G2 : Average spread-ramp length             (Mile)       -1.267     0.97        -1.31
C3            G3 : The number of lanes                    -            -          -           -
C4            G4 : The number of total ramps              -            -          -           -
C5            G5 : Shoulder width                         (Feet)       -0.156     0.12        -1.30

Model statistic
Dp            Dispersion parameter                        1.0
χ²            Pearson chi-square                          28.84 (χ² 0.95, 27 = 40.11)
R²            Coefficient of determination                0.60
K             Negative Binomial parameter                 8.05

3.6.5 Pearson Residuals

A useful subjective measure of the model fit is the Pearson Residuals (PR), which are normalized residuals: the difference between the predicted and observed data divided by the standard deviation, as described in section 3.6.2. One can visually assess the goodness of model fit by plotting the Pearson Residuals versus the estimates of the expected number of crashes.
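To make the use of the calibrated type 11 model concrete, the sketch below evaluates the equation above and the Pearson Residual of section 3.6.2 for a single site. The coefficients and k = 8.05 are those reported for interchange type 11; the input values and the observed count are hypothetical, the units are assumed to follow Table 3.11 (volumes in ADT/1000, truck share in percent, lengths in miles, shoulder width in feet), and the Negative Binomial variance is assumed to be mean + mean²/k.

```python
import math

K_TYPE11 = 8.05  # Negative Binomial parameter reported for interchange type 11


def expected_crashes_type11(v1, v2, v3, g1, g2, g5):
    """3-year expected crashes from the calibrated type 11 model."""
    return (3.448 * v1 ** 1.401 * v2 ** 0.186 * v3 ** 0.620 * g1 ** 0.738
            * math.exp(-1.267 * g2 - 0.156 * g5))


def pearson_residual(expected, observed, k=K_TYPE11):
    """(predicted - observed) / standard deviation under the NB variance."""
    variance = expected + expected ** 2 / k
    return (expected - observed) / math.sqrt(variance)


# Hypothetical site: 15,000 ADT per lane, 20,000 ramp ADT, 10 % trucks,
# 0.634 mile interchange length, 0.23 mile spread-ramp length, 10 ft shoulder
mu = expected_crashes_type11(v1=15.0, v2=20.0, v3=10.0, g1=0.634, g2=0.23, g5=10.0)
pr = pearson_residual(mu, observed=150)  # observed count is made up
print(round(mu, 1), round(pr, 2))
```

A negative residual here means that the hypothetical site recorded more crashes than the model predicts, following the sign convention (predicted minus observed) used in the text.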
A good model will have the Pearson Residuals centered around 0.0. Pearson Residuals are plotted against the expected frequency for the 199 interchanges in Figure 3.6. As shown in the figure, Pearson Residuals are centered around 0.0 for the entire range of expected frequency, which indicates that the calibrated models fit the observed data well. The advantage of the Negative Binomial error assumption in crash model development will be examined again in section 3.7.

[Figure 3.6 Pearson Residuals and E(x): Pearson Residuals plotted against E(x), with E(x) ranging from 0 to 500]

3.7 A comparison of model calibration results according to the error structure (Normal versus Negative Binomial assumption)

In section 3.1, it was noted that there are two error structures used in calibrating traffic crash prediction models. One is a normally distributed and homogeneous error structure; the other is a non-normal and heterogeneous error structure. Recently, Poisson or Negative Binomial error structures have most often been assumed in modeling traffic crashes. Nevertheless, the literature reviewed did not contain a full description of the advantages of this approach.

In order to examine the advantages of the Negative Binomial error assumption, the results of model calibrations for interchange type 11 and type 12 are compared in Table 3.12. The parameter estimates and their standard errors are very sensitive to the error structure assumed. An extremely useful relative measure of the scatter of a random variable is its coefficient of variation V(x) (Harr 1996), computed here as the standard error of a coefficient divided by its estimate. The coefficient of variation is therefore a measure of the reliability of the calibrated model coefficients. There were large reductions in the coefficient of variation, as shown in the table, when the models were calibrated using the Negative Binomial distribution instead of the Normal distribution.
This reduction in the coefficient of variation occurs in all of the calibrated coefficients shown in the table except one, for which it is unchanged, with a maximum reduction of 80 % and an average reduction of 36 %. These results support the hypothesis that the Negative Binomial distribution is a desirable assumption in calibrating crash prediction models relating to freeway interchanges.

Table 3.12 A comparison of model calibration results according to error structure (Normal and Negative Binomial assumption)

              Normal                                   Negative Binomial                        Reduction
Parameter     Estimate   Std.err   V(x) (%)            Estimate   Std.err   V(x) (%)            of V(x) (%)
              (A)        (B)       (B/A x 100)         (C)        (D)       (D/C x 100)
Type 11
B1            1.118      0.37      33                  1.401      0.30      21                  36
B2            0.108      0.15      142                 0.186      0.12      64                  55
B3            0.425      0.24      55                  0.620      0.19      30                  45
G1            0.558      0.20      36                  0.738      0.15      20                  44
G2            -0.600     1.13      189                 -1.267     0.97      77                  50
G5            -0.297     0.26      87                  -0.156     0.12      77                  11
Type 12
B1            1.003      0.26      26                  0.946      0.24      25                  4
G1            0.570      0.22      39                  0.933      0.36      39                  0
G2            -0.705     1.18      167                 -3.842     1.31      34                  80

Max : 80   Min : 0   Avg : 36

3.8 Sensitivity analysis of the crash prediction model

There are two objectives associated with a sensitivity analysis. One is to examine the possibility that the crash prediction model violates conceptual rules. For example, if a model were designed such that its predicted crashes would decrease with an increase in ramp volume, the model should be rejected because it violates a conceptual rule. The other objective is to determine the effects of individual variables on the crash frequency at freeway interchanges.

The sensitivity analysis is performed for the major geometric variables, but not for the traffic variables, because it is possible to change the geometry whereas changing traffic is difficult. During the sensitivity analysis of a specific variable, the other design parameters are assumed to be constant.
For this analysis, an experimental matrix was established, which includes 3 experiments (A: 0.1 mile shorter than mean, B: mean, C: 0.1 mile longer than mean) for interchange length, 3 experiments (A: 0.1 mile longer than mean, B: mean, C: 0.1 mile shorter than mean) for spread-ramp length, and 2 experiments (A: 12 feet and B: 10 feet) for shoulder width. Table 3.13 illustrates the results of the sensitivity analyses.

In the sensitivity analysis of interchange size, when the interchange size is increased by 0.1 mile, traffic crashes increase in all interchange types that use this variable as a model component. The average increase is 14 %.

In the sensitivity analysis of the spread-ramp length, traffic crashes increase by an average of 26 % when the spread-ramp length is decreased by 0.1 mile. The increase is largest for interchange type 12 (Tight diamond interchanges), at 47 %.

The crash frequency is very sensitive to shoulder width for both interchange types that include this variable, and especially for type 41 (Directional interchanges). In the sensitivity analyses, no violation of the conceptual rules of traffic crashes was found.
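The sensitivity experiments for interchange type 11 can be reproduced approximately from the coefficients calibrated in section 3.6.4, because each geometric variable enters the model as a separate multiplicative factor. The sketch below is an illustration rather than the procedure actually used: the function names are hypothetical, the experiment values are those listed in Table 3.13, and small differences from the tabulated figures are to be expected because the published coefficients are rounded.

```python
import math


def length_factor(g1, c1=0.738):
    """Type 1 variable: interchange length (miles) raised to its power."""
    return g1 ** c1


def spread_ramp_factor(g2, c2=-1.267):
    """Type 2 variable: average spread-ramp length (miles)."""
    return math.exp(c2 * g2)


def shoulder_factor(g5, c5=-0.156):
    """Type 2 variable: shoulder width (feet)."""
    return math.exp(c5 * g5)


def mean_step_ratio(values):
    """Average ratio between consecutive experiments (A -> B -> C)."""
    ratios = [b / a for a, b in zip(values, values[1:])]
    return sum(ratios) / len(ratios)


length = [length_factor(g) for g in (0.534, 0.634, 0.734)]
spread = [spread_ramp_factor(g) for g in (0.33, 0.23, 0.13)]
shoulder = [shoulder_factor(g) for g in (12.0, 10.0)]

for name, values in (("length", length), ("spread-ramp", spread), ("shoulder", shoulder)):
    print(name, [round(v, 3) for v in values], round(mean_step_ratio(values), 2))
```

Because the other variables are held constant, the ratio between two experiments equals the ratio of expected crash frequencies, which is how the "Effects" column of Table 3.13 can be read.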
Table 3.13 Sensitivity analysis (effect of main geometric variables)

Parameter            Interchange type   Experiment (A)   Experiment (B)   Experiment (C)   Effects
Interchange length   Length             0.534 mile       0.634 mile       0.734 mile       0.1 mile (↑)
                     Type 11            0.629            0.714            0.796            1.12
                     Type 12            0.557            0.654            0.749            1.16
                     Type 13            0.599            0.689            0.777            1.14
                     Type 14            0.438            0.549            0.666            1.23
                     Type 31            0.819            0.865            0.906            1.05
                     Type 33            0.549            0.647            0.744            1.16
                     Mean               0.599            0.686            0.773            1.14
Spread-ramp length   Length             0.33 mile        0.23 mile        0.13 mile        0.1 mile (↓)
                     Type 11            0.658            0.747            0.848            1.14
                     Type 12            0.281            0.413            0.607            1.47
                     Type 14            0.472            0.592            0.744            1.26
                     Type 33            0.438            0.563            0.723            1.28
                     Mean               0.462            0.579            0.730            1.26
Shoulder width       Width              12 ft            10 ft            -                2.0 feet (↓)
                     Type 11            0.154            0.211            -                1.37
                     Type 41            0.057            0.093            -                1.63
                     Mean               0.106            0.152            -                1.50

Chapter 4

THE DEVELOPMENT OF A METHOD TO IDENTIFY HAZARDOUS SITES BASED ON THE NEGATIVE BINOMIAL DISTRIBUTION

4.1 General

The traditional rate quality control method is based on the assumption that the probability of crash occurrences follows a Poisson distribution (Zegeer and Deen 1977), in which the mean and the variance are equal. The Normal approximation to the Poisson provides a control chart without tedious interpolation from the table of the Poisson distribution (Orlansky and Jacobs 1956), and this chart has been commonly used for the identification of hazardous locations.

In chapter 2, it was shown from the observed data that the variance of crash occurrences at freeway interchanges is substantially larger than the mean. This over-dispersion can be better explained by using the Negative Binomial distribution. A control chart constructed under the assumption of the Poisson distribution cannot reflect the phenomenon of over-dispersion in identifying hazardous locations.

The purpose of this chapter is to describe a technique to overcome the limitation of the rate quality control method based on the Poisson distribution.
One statistician (Rice 1997) suggests that the Negative Binomial distribution might be considered as a model for situations in which the rate varies over time and over space. Thus, in this chapter, the rate quality control method will be developed under the assumption of the Negative Binomial distribution.

4.2 A review of the statistical methods identifying hazardous sites

4.2.1 The rate quality control method

The rate quality control method is one of the most common methods used to identify hazardous sites. This method was originally developed as a means to control the quality of industrial production (Norden et al. 1956). This approach uses a statistical test to determine whether the traffic accident rate for a particular location is abnormally high compared with the rate of reference sites with similar properties. The statistical test is based on the assumption that traffic crashes are rare, and hence that the probability of their occurrence follows a Poisson distribution (Zegeer and Deen 1977).

The original equations have been changed on the basis of a comparison of the errors between real values and the values estimated from the rate quality control method formula. The following is a brief description of these changes. The rate quality control method was proposed as a way to analyze crash data on highway sections in 1957 using the following formulas:

\[ UCL = \lambda + 2.576\sqrt{\frac{\lambda}{V_i}} + \frac{0.829}{V_i} + \frac{1}{2V_i} \tag{4.1} \]

\[ LCL = \lambda - 2.576\sqrt{\frac{\lambda}{V_i}} + \frac{0.829}{V_i} - \frac{1}{2V_i} \tag{4.2} \]

where
UCL : upper control limit
LCL : lower control limit
$\lambda$ : average accident rate of reference sites ($\sum N_i / \sum V_i$)
$N_i$ : the number of accidents at site i
$V_i$ : the number of vehicles at site i

A decade later, it was recommended that the correction term ($0.829/V_i$) be eliminated to improve the validity of the equations (Morin 1967). Thus, the following equations are currently in use to calculate the upper and lower limits for the rate quality control method:

\[ UCL = \lambda + z\sqrt{\frac{\lambda}{V_i}} + \frac{1}{2V_i} \tag{4.3} \]

\[ LCL = \lambda - z\sqrt{\frac{\lambda}{V_i}} - \frac{1}{2V_i} \tag{4.4} \]

where
z : standard Normal deviate corresponding to the predetermined significance level

With the rate quality control method, a site is identified as hazardous if its observed accident rate exceeds the critical accident rate, that is, the mean accident rate of similar sites plus a multiple of the standard deviation of the site accident rate. The critical accident rate can be calculated for each site by applying the following equation:

\[ RC_i = \lambda + z\sqrt{\frac{\lambda}{V_i}} + \frac{1}{2V_i} \tag{4.5} \]

where
$RC_i$ : Critical rate for site i

In the above equation, the first two terms result from the Normal approximation to the Poisson distribution, and the third term is a correction factor that is necessary because only integer values are possible for the observed number of accidents. The coefficient of the second term is a probability factor determined by the level of statistical significance desired for $RC_i$. The FHWA, however, proposes the following equation for calculating the critical accident rate (Stokes and Mutabazi 1996):

\[ RC_i = \lambda + z\sqrt{\frac{\lambda}{V_i}} - \frac{1}{2V_i} \tag{4.6} \]

where
$RC_i$ : Critical rate for site i

Equation (4.6) differs from equation (4.5) in that the sign of the last term is negative. The difference results from whether a rate exactly equal to the critical rate should be included in or excluded from the probability.

The identification of hazardous sites by the rate quality control method can be explained visually by Figure 4.1, where filled stars correspond to the hazardous sites chosen under the rate quality control method.

[Figure 4.1: control chart illustrating the identification of hazardous sites by the rate quality control method]

4.2.2 The Bayes approach

Higle and Witkowski (1988) introduced and illustrated how Bayesian theory can be used to identify the hazardous intersections from among many signalized intersections.
Hauer (1986) then proposed the application of Empirical Bayesian (EB) theory in traffic safety problems, based on Robbin's work (Robbin 1977, 1979, 1980), and this method has subsequently been used by many researchers (Maher and Summersgill 1996, Persaud 1993, Belanger 1994). Both the Bayesian approach and the EB method described above rest on the following assumptions (Higle and Witkowski 1988).

Assumption 1: At a given site, when the average accident rate of reference sites ($\lambda$) is known, the count of accidents (N) obeys the Poisson probability law with expected value ($\lambda V_i$).

\[ P(N = N_i \mid \lambda, V_i) = \frac{\exp(-\lambda V_i)\,(\lambda V_i)^{N_i}}{N_i!} \tag{4.7} \]

where,
$\lambda$ : average accident rate of reference sites ($\sum N_i / \sum V_i$)
$\lambda V_i$ : the expected value at site i

Assumption 2: The accident rate of the reference sites (to which the given site belongs) can be described by a Gamma probability density function such as:

\[ f(\lambda) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda^{\alpha - 1} e^{-\beta\lambda} \tag{4.8} \]

where,
$f(\lambda)$ : gamma probability density function of reference sites
$\alpha, \beta$ : parameters

Equation (4.8) is also denoted as the probability density function of the prior distribution in terms of Bayes theory. Here, the parameters $\alpha$ and $\beta$ can be estimated through the method of moment estimates (MME) or the maximum likelihood estimates (MLE). Given calibrated values of $\alpha$ and $\beta$, if the observed data at a given site i are $N_i$ and $V_i$, the probability density function of the posterior distribution can be described as:

\[ f(\lambda \mid N_i, V_i) = \frac{(\beta + V_i)^{\alpha + N_i}\,\lambda^{\alpha + N_i - 1}\,e^{-(\beta + V_i)\lambda}}{\Gamma(\alpha + N_i)} \tag{4.9} \]

Equation (4.9) is the posterior distribution using Bayesian theory in its original meaning, and we can evaluate the hazard of a given site using this equation. That is, the probability that the Bayes accident rate at site i, $\lambda_i$, exceeds the average accident rate of reference sites, $\lambda_{avg}$, is:

\[ P(\lambda_i \ge \lambda_{avg}) = 1 - P(\lambda_i < \lambda_{avg}) = 1 - \int_0^{\lambda_{avg}} \frac{(\beta + V_i)^{\alpha + N_i}\,\lambda^{\alpha + N_i - 1}\,e^{-(\beta + V_i)\lambda}}{\Gamma(\alpha + N_i)}\,d\lambda \tag{4.10} \]

Figure 4.2 shows a graphical representation of equation (4.10).

[Figure 4.2: graphical representation of equation (4.10)]

4.2.3 The problems resulting from the use of the Poisson distribution

The rate quality control method as used in practice identifies a site as hazardous if its observed accident rate over a given period exceeds its critical accident rate, which is the average accident rate over reference sites plus a multiple of the standard deviation of the accident rate of the site over the same period. This rate quality control method is based on the Poisson distribution, as mentioned in section 4.1. However, the Negative Binomial distribution fits the freeway interchange crash data much better than the Poisson distribution. Thus, this identification of hazardous sites may not be valid. Recognizing that the variance of the Poisson distribution equals the mean, whereas the variance of the Negative Binomial distribution is (mean + mean²/k), the existing approach under the Poisson assumption will identify more sites as hazardous than would be expected under the Negative Binomial assumption. For example, in Figure 4.3, the solid line is the upper control limit chart based on the Poisson distribution, while the dotted line is the upper control limit chart based on the Negative Binomial distribution. The stars correspond to the hazardous sites chosen under the Poisson or Negative Binomial distribution.

[Figure 4.3: a comparison of the upper control limits based on the Poisson distribution (solid line) and the Negative Binomial distribution (dotted line)]

4.3 The identification of hazardous sites based on the Negative Binomial distribution

4.3.1 Concept

In equation (4.
1) and (4.3), the true value of the upper control limit for the Normal approximation to the Poisson distribution on traffic crash frequency can be computed from the following equation:

\[ P = \sum_{y=0}^{U-1} \frac{e^{-m}\,m^{y}}{y!} \tag{4.11} \]

where,
P : predetermined probability level
U : true upper control limit
m : expected value

In equation (4.11), $m = \lambda V_i$ with $\lambda = \sum N_i / \sum V_i$, where
$N_i$ = the accident frequency at site i
$V_i$ = the number of vehicles at site i

Under the condition that the average accident rate of reference sites, $\lambda$, is known, the true control limit for a site i can be calculated by selecting the value that meets the predetermined probability level (e.g., 0.9, 0.95) from equation (4.11).

The above concept can be utilized for the Negative Binomial distribution. That is, when crash occurrences follow the Negative Binomial distribution, the formula for the true upper control limit is as follows:

\[ P = \sum_{y=0}^{U-1} \frac{\Gamma(y + k)}{y!\,\Gamma(k)} \left(1 + \frac{m}{k}\right)^{-k} \left(\frac{m}{m + k}\right)^{y} \tag{4.12} \]

[...]

For c > 0, $\Gamma(y + c)/\Gamma(c) = c(c+1)(c+2)\cdots(c+y-1)$, so that

\[ \frac{\Gamma(y_i + a^{-1})}{\Gamma(a^{-1})} = \frac{1}{a}\left(\frac{1}{a} + 1\right)\left(\frac{1}{a} + 2\right)\cdots\left(\frac{1}{a} + y_i - 1\right) \]

Now, we can write the log likelihood function, $\log L(y_i, a, m_0, V_i)$, as

\[ l(y_i, a, m_0, V_i) = \sum_{i=1}^{n} \log\left[\frac{\Gamma(y_i + 1/a)}{\Gamma(1/a)} \left(\frac{a m_0 V_i}{1 + a m_0 V_i}\right)^{y_i} \left(\frac{1}{1 + a m_0 V_i}\right)^{a^{-1}}\right] \]

\[ = \sum_{i=1}^{n} \left[\log\frac{\Gamma(y_i + 1/a)}{\Gamma(1/a)} + \log\left(\frac{a m_0 V_i}{1 + a m_0 V_i}\right)^{y_i} + \log\left(\frac{1}{1 + a m_0 V_i}\right)^{a^{-1}}\right] \tag{4.16} \]

In equation (4.16), the three terms can be expanded for each site i as follows.

The first term:

\[ \log\frac{\Gamma(y_i + 1/a)}{\Gamma(1/a)} = \log\left[\frac{1}{a}\left(\frac{1 + a}{a}\right)\left(\frac{1 + 2a}{a}\right)\cdots\left(\frac{1 + (y_i - 1)a}{a}\right)\right] \]

\[ = \log 1 + \log(1 + a) + \log(1 + 2a) + \cdots + \log(1 + (y_i - 1)a) - y_i \log a = \sum_{j=0}^{y_i - 1} \log(1 + aj) - y_i \log a \tag{4.17} \]

The second term:

\[ \log\left(\frac{a m_0 V_i}{1 + a m_0 V_i}\right)^{y_i} = y_i \log(a m_0 V_i) - y_i \log(1 + a m_0 V_i) = y_i \log a + y_i \log m_0 + y_i \log V_i - y_i \log(1 + a m_0 V_i) \tag{4.18} \]

The third term:

\[ \log\left(\frac{1}{1 + a m_0 V_i}\right)^{a^{-1}} = -\frac{1}{a}\log(1 + a m_0 V_i) \tag{4.19} \]

Thus, from equations (4.17), (4.18) and (4.19), the $y_i \log a$ terms cancel, and omitting the term $y_i \log V_i$, which does not depend on the parameters, the log likelihood function can be summarized as follows:

\[ l(y_i, a, m_0, V_i) = \sum_{i=1}^{n} \left[\sum_{j=0}^{y_i - 1} \log(1 + aj) + y_i \log m_0 - y_i \log(1 + a m_0 V_i) - \frac{1}{a}\log(1 + a m_0 V_i)\right] \tag{4.20} \]

The simplest way to obtain $(m_0, a)$ is to maximize $l(y_i, a, m_0, V_i)$ with respect to $(m_0, a)$, and thus we need to set the partial derivatives of $l(y_i, a, m_0, V_i)$ equal to zero. That is,

\[ \frac{\partial l(y_i, a, m_0, V_i)}{\partial m_0} = 0 \tag{4.21} \]

\[ \frac{\partial l(y_i, a, m_0, V_i)}{\partial a} = 0 \tag{4.22} \]

\[ \frac{\partial l(y_i, a, m_0, V_i)}{\partial m_0} = \sum_{i=1}^{n} \left[\frac{y_i}{m_0} - \frac{y_i a V_i}{1 + a m_0 V_i} - \frac{V_i}{1 + a m_0 V_i}\right] = \sum_{i=1}^{n} \frac{y_i - m_0 V_i}{m_0 (1 + a m_0 V_i)} \tag{4.23} \]

\[ \frac{\partial l(y_i, a, m_0, V_i)}{\partial a} = \sum_{i=1}^{n} \left[\sum_{j=0}^{y_i - 1} \frac{j}{1 + aj} - \frac{y_i m_0 V_i}{1 + a m_0 V_i} + \frac{1}{a^2}\log(1 + a m_0 V_i) - \frac{1}{a}\,\frac{m_0 V_i}{1 + a m_0 V_i}\right] \]

\[ = \sum_{i=1}^{n} \left[\sum_{j=0}^{y_i - 1} \frac{j}{1 + aj} + \frac{1}{a^2}\log(1 + a m_0 V_i) - \frac{m_0 V_i (y_i + a^{-1})}{1 + a m_0 V_i}\right] \tag{4.24} \]

From equation (4.23),

\[ m_0 = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} V_i} \tag{4.25} \]

From equation (4.24),

\[ \sum_{i=1}^{n} \left[\sum_{j=0}^{y_i - 1} \frac{j}{1 + aj} + \frac{1}{a^2}\log(1 + a m_0 V_i) - \frac{m_0 V_i (y_i + a^{-1})}{1 + a m_0 V_i}\right] = 0 \tag{4.26} \]

The maximum likelihood estimate of "$m_0$" can be easily calculated using equation (4.25), but that of "a" is not simply obtained because equation (4.26) has no closed-form solution. Therefore, a numerical approach is used to solve equation (4.26).

4.3.2.2 Parameter estimation

It is not easy to verify that one method is superior to another in the absence of perfect information, which cannot be obtained in the traffic safety field. Therefore, in this section the results of the Negative Binomial approach are compared with those of the two existing methods that are commonly used in the traffic safety field: the rate quality control method and the Bayes identification method. Because the two methods have already been reviewed at length in section 4.2 and 4.3, the parameters are estimated and the results obtained without expanding on them here.

There is a limitation in choosing the sample sites to examine the Negative Binomial approach because sites with similar geometry should be used for reference sites.
Therefore, two data sets are selected for this study. The first data set includes 16 diamond interchanges with similar geometric properties (i.e., Diamond type, 6 lanes, 10 ft shoulder width, 4 ramps). The second data set includes 14 partial clover A or B 4 Quadrant interchanges which have similar geometric characteristics (i.e., 6 lanes, 10 ft shoulder width, 6 ramps). It is not possible to get a data set with exactly the same geometric conditions in practice, and the more classification variables are used, the smaller the sample size becomes.

With the above data sets, the new approach has been tested and compared with the results of the two existing methods. The parameters for the Bayes approach have been estimated through the method of moments estimates (MME), and the parameters for the rate quality control method based on the Negative Binomial distribution have been estimated through the procedures described in the preceding sections. The estimates of the parameters are summarized in Table 4.1. Note that the k parameter for Diamond interchanges (9.52) is smaller than that of Partial Clover 4 Quadrant interchanges (10.52). Because the variance of the Negative Binomial distribution is m + m²/k, this implies that, for a given mean, the variance for Diamond interchanges is larger than that for Partial Clover 4 Quadrant interchanges.

Table 4.1 The estimation of parameters

| Method | Parameter | Diamond | Par Clo A or B 4 Q |
|---|---|---|---|
| Negative Binomial | m0 | 0.0010 | 0.00117 |
| | a | 0.105 | 0.095 |
| | k (= 1/a) | 9.52 | 10.52 |
| Poisson | λ | 0.0010 | 0.00117 |
| Bayes approach | α | 6.51 | 10.98 |
| | β | 6160.40 | 10194.61 |

4.3.3 Application and validation of the Negative Binomial approach

In order to choose hazardous sites based on the estimates of the parameters, the following scenarios are developed.

Scenario 1: a site i is hazardous if the observed accident rate (N_i/V_i) exceeds the upper control limit, which is a function of the average accident rate of reference sites and a desired probability level. This scenario will be used to test the rate quality control method based on the Negative Binomial distribution.
Scenario 2: a site i is hazardous if the observed accident rate (N_i/V_i) exceeds the upper control limit, which is a function of the average accident rate of reference sites and a desired probability level. This scenario will be used to test the rate quality control method based on the Poisson distribution.

Scenario 3: a site i is hazardous if the probability that its Bayes accident rate exceeds the average accident rate of reference sites is greater than a predetermined probability level.

For this study, a 95 % probability level is applied for all scenarios. Table 4.2 presents the identification of hazardous sites for Diamond interchanges. An asterisk (*) corresponds to the sites that have been identified as hazardous on the basis of the above scenarios. For example, under the existing methods (rate quality control method based on the Poisson distribution, and Bayes approach), 7 sites (i.e., sites 1, 2, 4, 7, 8, 9 and 11) out of 16 are identified as hazardous, whereas under the new method (rate quality control method based on the Negative Binomial distribution), 2 sites (i.e., sites 2 and 4) are identified as hazardous.

In each of these scenarios (1, 2 and 3), a 95 % probability level was used, which implies that approximately 1 out of 20 random sites would be flagged as abnormal or hazardous in the statistical sense. Therefore, 7 sites out of 16 is an unreasonably high percentage in the context of the 95 % probability level. Thus, the new method is conceptually more persuasive in identifying hazardous locations than the existing methods.

For Par Clo A or B 4 Q interchanges, 4 of 14 sites (i.e., sites 7, 12, 13 and 14) were chosen with the existing methods, whereas only 1 site was identified as hazardous when assuming the Negative Binomial error, as shown in Table 4.3. Thus similar conclusions can also be reached.

The disagreement between the existing methods and the new method is probably best explained in the context of the underlying assumptions.
The existing methods are both based on the widely accepted assumption that crashes occur according to the Poisson distribution, whereas the new method is based on the assumption that the occurrence of accidents follows a Negative Binomial distribution. The upper control limits, which are functions of the average accident rate of the reference sites and the variance in the accident rate at the given site, are lower when a Poisson distribution is assumed instead of a Negative Binomial distribution. This causes the procedure to identify more hazardous sites than are expected, as was shown in Figure 4.3.

Chapter 2 verified and discussed at length that the variance of crashes is substantially larger than the mean, and hence the Negative Binomial distribution is an appropriate assumption for the occurrence of crashes at freeway interchanges. Thus, the rate quality control method based on the Negative Binomial distribution would be an effective measure for the identification of hazardous sites in those cases where the variance of accidents exceeds the mean (over-dispersion).
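The two-step estimation behind Table 4.1 (equation (4.25) for m0, then a numerical solution of equation (4.26) for a) can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the bisection solver, its bracket and the function names are assumptions, and the data are the Diamond interchange counts listed in Table 4.2. Table 4.1 reports m0 = 0.0010 and a = 0.105 for this data set; the sketch is not guaranteed to reproduce the second figure exactly, since the numerical scheme used in the study is not specified.

```python
import math

# Diamond interchange data (accidents N_i, vehicles V_i) as listed in Table 4.2.
ACCIDENTS = [213, 322, 137, 196, 193, 194, 247, 164,
             207, 160, 242, 102, 111, 158, 161, 121]
VEHICLES = [150529, 166103, 123131, 128354, 240902, 227992, 171559, 137889,
            161545, 183034, 179213, 207928, 203816, 207045, 217719, 209255]


def estimate_m0(y, v):
    """Equation (4.25): total accidents divided by total vehicles."""
    return sum(y) / sum(v)


def score_a(a, m0, y, v):
    """Left-hand side of equation (4.26), summed over all sites."""
    total = 0.0
    for yi, vi in zip(y, v):
        mu = m0 * vi
        total += sum(j / (1.0 + a * j) for j in range(yi))
        total += math.log1p(a * mu) / a ** 2
        total -= mu * (yi + 1.0 / a) / (1.0 + a * mu)
    return total


def solve_a(m0, y, v, lo=1e-4, hi=10.0, tol=1e-10):
    """Bisection on score_a (an assumed solver; needs a sign change on [lo, hi])."""
    f_lo = score_a(lo, m0, y, v)
    f_hi = score_a(hi, m0, y, v)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in bracket; data may not be over-dispersed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score_a(mid, m0, y, v)
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


m0_hat = estimate_m0(ACCIDENTS, VEHICLES)
a_hat = solve_a(m0_hat, ACCIDENTS, VEHICLES)
print(f"m0 = {m0_hat:.5f}, a = {a_hat:.3f}, k = 1/a = {1 / a_hat:.2f}")
```

Bisection is used only because it needs no derivative of equation (4.26) and no third-party library; any one-dimensional root finder would serve the same purpose.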
Table 4.2 A comparison of hazardous sites according to the methods (Diamond interchanges)

| Site (i) | Accidents (N_i) | Vehicles (V_i) | Observed rate (N_i/V_i) | Upper control limit, Negative Binomial | Upper control limit, Poisson | Bayesian approach |
|---|---|---|---|---|---|---|
| 1 | 213 | 150529 | 0.00142 | 0.00153 | 0.00114 * | 1.000 * |
| 2 | 322 | 166103 | 0.00194 | 0.00154 * | 0.00114 * | 1.000 * |
| 3 | 137 | 123131 | 0.00111 | 0.00154 | 0.00116 | 0.875 |
| 4 | 196 | 128354 | 0.00153 | 0.00153 * | 0.00115 * | 1.000 * |
| 5 | 193 | 240902 | 0.00080 | 0.00153 | 0.00111 | 0.001 |
| 6 | 194 | 227992 | 0.00085 | 0.00153 | 0.00112 | 0.010 |
| 7 | 247 | 171559 | 0.00144 | 0.00153 | 0.00113 * | 1.000 * |
| 8 | 164 | 137889 | 0.00119 | 0.00154 | 0.00115 * | 0.981 * |
| 9 | 207 | 161545 | 0.00128 | 0.00154 | 0.00114 * | 1.000 * |
| 10 | 160 | 183034 | 0.00087 | 0.00154 | 0.00113 | 0.039 |
| 11 | 242 | 179213 | 0.00135 | 0.00153 | 0.00113 * | 1.000 * |
| 12 | 102 | 207928 | 0.00049 | 0.00153 | 0.00112 | 0.000 |
| 13 | 111 | 203816 | 0.00054 | 0.00153 | 0.00112 | 0.000 |
| 14 | 158 | 207045 | 0.00076 | 0.00153 | 0.00112 | 0.000 |
| 15 | 161 | 217719 | 0.00074 | 0.00153 | 0.00112 | 0.000 |
| 16 | 121 | 209255 | 0.00058 | 0.00153 | 0.00112 | 0.000 |

Table 4.3 A comparison of hazardous sites according to the methods (Par Clo A or B 4 Q)

| Site (i) | Accidents (N_i) | Vehicles (V_i) | Observed rate (N_i/V_i) | Upper control limit, Negative Binomial | Upper control limit, Poisson | Bayes approach |
|---|---|---|---|---|---|---|
| 1 | 39 | 39434 | 0.00099 | 0.00183 | 0.00147 | 0.127 |
| 2 | 45 | 65947 | 0.00068 | 0.00179 | 0.00140 | 0.000 |
| 3 | 53 | 86205 | 0.00061 | 0.00177 | 0.00137 | 0.000 |
| 4 | 62 | 94429 | 0.00066 | 0.00178 | 0.00136 | 0.000 |
| 5 | 127 | 100303 | 0.00127 | 0.00177 | 0.00135 | 0.764 |
| 6 | 120 | 103676 | 0.00116 | 0.00177 | 0.00135 | 0.407 |
| 7 | 157 | 108898 | 0.00144 | 0.00177 | 0.00135 * | 0.990 * |
| 8 | 111 | 110946 | 0.00100 | 0.00177 | 0.00134 | 0.041 |
| 9 | 103 | 112020 | 0.00092 | 0.00177 | 0.00134 | 0.005 |
| 10 | 117 | 127060 | 0.00092 | 0.00176 | 0.00133 | 0.003 |
| 11 | 131 | 133995 | 0.00098 | 0.00177 | 0.00133 | 0.016 |
| 12 | 226 | 169887 | 0.00133 | 0.00176 | 0.00131 * | 0.959 * |
| 13 | 286 | 202817 | 0.00141 | 0.00176 | 0.00130 * | 0.998 * |
| 14 | 403 | 235275 | 0.00171 | 0.00170 * | 0.00129 * | 1.000 * |

Chapter 5
A SIMPLIFIED APPROACH FOR OVER-DISPERSION (NORMAL APPROXIMATION METHOD)

5.1 General

The rate quality control method based on the Negative Binomial distribution was discussed
in the preceding chapter, and it was found that this method produces reasonable results in the statistical sense. However, it is not easy for traffic engineers to apply this technique in the safety field because the parameters cannot be estimated as simply as those of the Poisson distribution. This chapter provides a simple approach for the identification of hazardous sites when the Negative Binomial distribution should be assumed because of the phenomenon of over-dispersion.

5.2 Concept

The Negative Binomial approach can be simplified using the Normal approximation as:

$$N_i \sim N(\mu_i,\; d\mu_i) \qquad (5.1)$$

where,
$\mu_i = \lambda V_i$
$N_i$ = the number of accidents at site i
$V_i$ = the number of vehicles at site i
$d\mu_i$ = variance ($d \ge 1$)

In equation (5.1), the variance is larger than the mean, which is conceptually consistent with the error structure of the Negative Binomial distribution. On the other hand, under ideal conditions, the Poisson distribution can be approximated by the Normal distribution for large values of $\mu_i$, because the probability mass function of the Poisson distribution becomes more symmetric and bell-shaped as $\mu_i$ increases (Rice 1997).

Let $N_i$ be a sequence of Poisson random variables with the corresponding parameters $\mu_i$. Then $E(N_i) = Var(N_i) = \mu_i$. If we wish to approximate the Poisson distribution by a Normal distribution, the Normal distribution should have the same mean and variance as the Poisson, and hence the random variables can be standardized by letting

$$X_i = \frac{N_i - \mu_i}{\sqrt{\mu_i}}$$

then,

$$E(X_i) = 0,\quad Var(X_i) = 1. \quad \text{That is, } X_i \sim N(0, 1) \qquad (5.2)$$

Under over-dispersion, however, the assumption is:

$$E(X_i) = 0,\quad Var(X_i) = d. \quad \text{That is, } X_i \sim N(0, d) \qquad (5.3)$$

Therefore, the random variables with over-dispersion can be standardized as follows:

$$Z_i = \frac{N_i - \mu_i}{\sqrt{d\mu_i}} \qquad (5.4)$$

Then

$$E(Z_i) = 0,\quad Var(Z_i) = 1. \quad \text{That is, } Z_i \sim N(0, 1) \qquad (5.5)$$

Thus, $Z_i$ can be applied to the identification of hazardous sites based on traffic crash data with over-dispersion.

5.3 Application and validation of the Normal approximation method

The Normal approximation method is an alternative that avoids the difficulties associated with the estimation of parameters in applying the rate quality control method based on the Negative Binomial distribution. Therefore, the results should be analogous to those of that rate quality control method. The validity of the Normal approximation method was tested using the two data sets (16 Diamond interchanges and 14 partial clover A or B 4 Quadrant interchanges) which were used in chapter 4.

5.3.1 Estimation of parameters

In order to apply the Normal approximation method for over-dispersion, two parameters need to be estimated. One is the average accident rate, λ. The other is a parameter relating to the over-dispersion, d. Here, the average accident rate is the same as that of the Poisson or Negative Binomial distributions described in chapter 4, whereas d is the variance of the random variables that have been standardized by formula (5.2). Table 5.1 shows the results of the estimated parameters. The d estimates are 30.39 and 11.96 for Diamond and Par Clo A or B 4 Q interchanges, respectively.

Table 5.1 Estimation of parameters

| Method | Parameter | Diamond | Par Clo A or B 4 Q |
|---|---|---|---|
| Normal approximation method | λ | 0.0010 | 0.00117 |
| | d | 30.39 | 11.96 |

5.3.2 Validation of the Normal approximation approach

Based on the estimates of parameters in Table 5.1, the Normal approximation method was tested for the 2 data sets. Table 5.2 shows that the use of the Normal approximation method produced results similar to those of the approach using the Negative Binomial distribution. In this table, asterisks (*) correspond to the sites that have been identified as hazardous on the basis of the two approaches.
For example, under a 95 % probability level, the Normal approximation method identifies 2 sites as hazardous, whereas the rate quality control method under the Negative Binomial distribution identifies 3 sites as hazardous out of 30 sites. That is, Diamond site 4 is not identified as a hazardous site by the Normal approximation method, whereas it is hazardous based on the Negative Binomial approach. However, this difference is not substantial, with the probabilities being 0.94 and 0.95 respectively under the two approaches.

Figure 5.1 presents a comparison of the probabilities with which a site is identified as hazardous by the two methods. As shown in the figure, the results of both methods are consistent, even though there are a few sites that disagree slightly. It is expected that the differences would be reduced even further with larger data sets.

Thus, for the identification of hazardous sites, the Normal approximation method can be used as an alternative that avoids the difficulties associated with the estimation of the parameters for the rate quality control method. There is only a slight loss of accuracy as discussed here.
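The Normal approximation steps of section 5.2 can be sketched for the 16 Diamond interchanges as below. The sketch assumes that d is the sample variance (divisor n-1) of the standardized values X_i, which is an interpretation of section 5.3.1 rather than something the text states outright, and all function and variable names are illustrative. The data are the accident and vehicle counts listed in Table 4.2.

```python
import math
from statistics import NormalDist, variance

# Diamond interchange data (accidents N_i, vehicles V_i) as listed in Table 4.2.
ACCIDENTS = [213, 322, 137, 196, 193, 194, 247, 164,
             207, 160, 242, 102, 111, 158, 161, 121]
VEHICLES = [150529, 166103, 123131, 128354, 240902, 227992, 171559, 137889,
            161545, 183034, 179213, 207928, 203816, 207045, 217719, 209255]


def normal_approximation(y, v):
    """Return (lambda, d, X, Z, probabilities) following equations (5.1)-(5.5)."""
    lam = sum(y) / sum(v)                        # average accident rate
    mu = [lam * vi for vi in v]                  # mu_i = lambda * V_i
    x = [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]  # as in (5.2)
    d = variance(x)                              # assumed: sample variance of X_i
    z = [xi / math.sqrt(d) for xi in x]          # equation (5.4)
    prob = [NormalDist().cdf(zi) for zi in z]    # P(Z <= z_i) under N(0, 1)
    return lam, d, x, z, prob


lam, d, x, z, prob = normal_approximation(ACCIDENTS, VEHICLES)
hazardous = [i + 1 for i, p in enumerate(prob) if p >= 0.95]
print(f"lambda = {lam:.5f}, d = {d:.2f}, hazardous sites at 95 %: {hazardous}")
```

Because only totals, square roots and a standard Normal table are involved, the same calculation can be carried out in a spreadsheet, which is the practical appeal of the method.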
Table 5.2 A comparison with the results using the Negative Binomial distribution

| Site (i) | Interchange type | μ_i | X_i | Z_i | Probability, Normal approximation | Probability, Negative Binomial |
|---|---|---|---|---|---|---|
| 1 | Diamond | 151 | 5.03 | 0.91 | 0.82 | 0.91 |
| 2 | Diamond | 167 | 12.02 | 2.18 | 0.99* | 0.99* |
| 3 | Diamond | 124 | 1.20 | 0.22 | 0.58 | 0.72 |
| 4 | Diamond | 129 | 5.91 | 1.53 | 0.94 | 0.95* |
| 5 | Diamond | 242 | -3.14 | -0.57 | 0.38 | 0.36 |
| 6 | Diamond | 229 | -2.31 | -0.42 | 0.33 | 0.42 |
| 7 | Diamond | 172 | 5.69 | 1.03 | 0.85 | 0.92 |
| 8 | Diamond | 138 | 2.17 | 0.39 | 0.65 | 0.79 |
| 9 | Diamond | 162 | 3.52 | 0.64 | 0.74 | 0.85 |
| 10 | Diamond | 184 | -1.75 | -0.32 | 0.38 | 0.45 |
| 11 | Diamond | 180 | 4.63 | 0.84 | 0.80 | 0.90 |
| 12 | Diamond | 209 | -7.39 | -1.34 | 0.09 | 0.05 |
| 13 | Diamond | 205 | -6.55 | -1.19 | 0.12 | 0.09 |
| 14 | Diamond | 208 | -3.46 | -0.63 | 0.26 | 0.31 |
| 15 | Diamond | 219 | -3.90 | -0.71 | 0.24 | 0.28 |
| 16 | Diamond | 210 | -6.15 | -1.12 | 0.13 | 0.11 |
| 1 | Par Clo A 4 Q | 46 | -1.06 | -0.31 | 0.39 | 0.43 |
| 2 | Par Clo A 4 Q | 77 | -3.67 | -1.06 | 0.14 | 0.12 |
| 3 | Par Clo B 4 Q | 101 | -4.77 | -1.38 | 0.08 | 0.07 |
| 4 | Par Clo A 4 Q | 111 | -4.62 | -1.34 | 0.09 | 0.09 |
| 5 | Par Clo A 4 Q | 117 | 0.88 | 0.25 | 0.61 | 0.77 |
| 6 | Par Clo B 4 Q | 121 | -0.13 | -0.04 | 0.48 | 0.59 |
| 7 | Par Clo B 4 Q | 128 | 2.61 | 0.76 | 0.78 | 0.83 |
| 8 | Par Clo A 4 Q | 130 | -1.66 | -0.48 | 0.32 | 0.55 |
| 9 | Par Clo A 4 Q | 131 | -2.46 | -0.71 | 0.34 | 0.33 |
| 10 | Par Clo B 4 Q | 149 | -2.61 | -0.75 | 0.33 | 0.33 |
| 11 | Par Clo A 4 Q | 157 | -2.07 | -0.60 | 0.38 | 0.39 |
| 12 | Par Clo A 4 Q | 199 | 1.92 | 0.55 | 0.71 | 0.75 |
| 13 | Par Clo A 4 Q | 237 | 3.15 | 0.91 | 0.82 | 0.81 |
| 14 | Par Clo B 4 Q | 276 | 7.68 | 2.22 | 0.99* | 0.95* |

[Figure 5.1: comparison of the probabilities obtained with the Normal approximation method and with the Negative Binomial distribution]

5.4 An examination of the assumptions

In the preceding sections, the following assumptions were made: 1) the standardized random variables X_i follow a Normal distribution, and 2) the expected values μ_i are large enough for the distribution to be approximated by the Normal distribution. Therefore, these assumptions need to be examined.

5.4.1 Goodness of fit of the Normal distribution

In section 5.2, we assumed that X_i follows a Normal distribution without any verification.
In order to enhance the credibility of this method, we need to test the goodness of fit of the random variable X_i to a Normal distribution. The Chi-square test was used to conduct this test after partitioning a Normal distribution into eight intervals of equal probability (Neter et al. 1992). Thus, if H0 holds (that is, X_i is Normally distributed), then X² follows an approximate χ² distribution with n-p-1 = 8-2-1 = 5 degrees of freedom. For α = 0.05, we require χ²(0.95; 5) = 11.07. Hence, the decision rule is as follows:

If X² ≤ 11.07, conclude H0
If X² > 11.07, conclude H1

The analysis of Diamond interchanges and Par Clo A or B 4 Q interchanges, which are the same data sets as used in the previous section, found X² values of 6.00 and 5.43, respectively. Thus, a Normal distribution is a reasonable assumption for X_i.

5.4.2 Large values of μ_i

In section 5.2, we assumed that μ_i is large enough for the distribution to be approximated by the Normal distribution. This approximation method cannot be used if μ_i is a small number. Accordingly, the effects of various values of μ_i were tested to determine the limits of the approximation method. The analysis focuses on the calculation of the difference between the true and approximate upper control limits over the values of μ_i, as computed from the following equations.

$$P = \sum_{y=0}^{U-1} \frac{e^{-\mu_i}\,\mu_i^{y}}{y!} \qquad (5.6)$$

$$U_a = \mu_i + k\sqrt{\mu_i} \qquad (5.7)$$

where,
P : predetermined probability level
U : true upper control limit
U_a : approximated upper control limit
k : standard normal variate corresponding to the predetermined probability level

Figure 5.2 shows the difference between the true and approximate upper control limits for a range of expected frequencies (μ_i) from 0 to 60 crashes using the 95 % probability level (k = 1.645). Note that the difference is very large when the expected values are less than 5, and that the curve then flattens for expected values in the range from 5 to 15. The difference is very small when the expected values are larger than 40.
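The comparison between equations (5.6) and (5.7) can be sketched as follows. The sketch reads the true limit U as the smallest integer whose cumulative Poisson probability up to U-1 reaches P, and expresses the difference as a percentage of the true limit; both choices are assumptions, since the text does not spell out how the difference plotted in Figure 5.2 was defined. Function names are illustrative.

```python
import math


def poisson_true_limit(mu, p=0.95):
    """Smallest U with sum_{y=0}^{U-1} e^{-mu} mu^y / y! >= p (equation 5.6).

    Suitable for moderate mu only: exp(-mu) underflows for mu above roughly 700.
    """
    term = math.exp(-mu)   # P(Y = 0)
    cdf = term
    y = 0
    while cdf < p:
        y += 1
        term *= mu / y     # P(Y = y) from P(Y = y - 1)
        cdf += term
    return y + 1


def approximate_limit(mu, k=1.645):
    """Equation (5.7): Normal approximation to the upper control limit."""
    return mu + k * math.sqrt(mu)


for mu in (1, 5, 15, 40, 60):
    u_true = poisson_true_limit(mu)
    u_approx = approximate_limit(mu)
    diff_pct = 100.0 * (u_true - u_approx) / u_true   # assumed definition
    print(f"mu = {mu:3d}  U = {u_true:3d}  Ua = {u_approx:6.2f}  difference = {diff_pct:5.1f} %")
```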
For this research, μ_i is large enough to be approximated, recognizing that the minimum values of μ_i are 151 and 46 for the Diamond and Par Clo A or B 4 Q data sets, respectively, as shown in Table 5.2.

[Figure 5.2: the difference between the true and approximate upper control limits (95 %); horizontal axis: expected frequency; vertical axis: difference (%)]

Chapter 6
IDENTIFICATION OF HAZARDOUS SITES USING A TRAFFIC CRASH PREDICTION MODEL (PREDICTION MODEL METHOD)

6.1 The limitation of the rate quality control method (or upper control limit)

As mentioned before, a rate quality control method is commonly used for identification of hazardous sites. In order to overcome the problem caused by over-dispersion, the rate quality control method based on the Negative Binomial distribution rather than the Poisson distribution has been examined and proposed as an alternative. Nevertheless, in identifying the hazardous sites using reference sites, we recognize that there are still limitations as defined by others (Elvik 1988, Mountain and Fawaz 1989, Hauer 1992): First, the selection of reference sites is a matter of judgement, and hence the same site can be evaluated differently, depending on the researchers. Second, the number of reference sites will likely not be large enough to permit the accurate identification of hazardous sites in practice.

For example, suppose that the objective was to evaluate the safety of all interchanges in Michigan using the rate quality control method. The first step is to classify all interchanges to find the reference sites with similar properties. Figure 6.1 presents classification trees considering only the basic contributing factors to traffic crashes, and 1056 groups (= 22 x 2 x 2 x 3 x 4) are produced, even though other contributing factors (e.g., ramp length, interchange size) are not considered in grouping the interchanges.
This implies that we need reference sites for 1056 groups to evaluate all the interchanges using the rate quality control method described in chapter 4. Thus, it is not possible to identify hazardous interchanges using reference sites, recognizing that there are a total of only 397 interchanges along the four main freeways (I-69, I-75, I-94, I-96) in Michigan. There are many interchange types for which a sizeable number of reference sites does not exist.

An alternative to the use of reference sites would be to use data from other states for the evaluation of freeway interchanges in Michigan. However, this approach causes several linked difficulties. For example, the definition of traffic crashes is different across states (i.e., total damage of $400 in Michigan, $500 in New Mexico and $1000 in Wisconsin: Michigan, New Mexico and Wisconsin traffic crash facts (1998)), and interchange crashes are sensitive to weather conditions (i.e., in Michigan, winter crashes are approximately 15 % higher than in other seasons). In addition, it is not easy to obtain well defined geometric and traffic data from other states. Thus it is not a good approach to use data from other states for the identification of hazardous sites.

In this chapter, a method to search for hazardous sites using an accident prediction model is examined.

Classification of geometry
T1: Interchange types (22 classes)
 - 22 interchange types
T2: Area (2 classes)
 - Urban or rural
T3: Shoulder width (2 classes)
 - 12 ft or 10 ft
T4: The number of lanes (3 classes)
 - 4 lanes
 - 6 lanes
 - 8 lanes
T5: The number of ramps (4 classes)
 - 2 ramps
 - 4 ramps
 - 6 ramps
 - 8 ramps

Figure 6.1 Basic geometric classification tree for reference sites

6.2 The concept of the prediction model method

In section 6.1, several limitations to the use of the rate quality control method for the identification of hazardous interchanges were discussed.
The conceptual problem can be solved by diminishing the scope of individual judgement through logical procedures, whereas the practical problem can be treated by estimating the effects of special contributing factors to traffic crashes at a given site through an analysis of relevant traits at other sites.

Previous researchers (Jorgensen 1972, Flak and Barbaresso 1982) have recommended that hazardous sites be estimated by the difference between the observed accident frequency (B) of a site and the expected frequency (A) as predicted by an accident prediction model, as shown in Figure 6.2. McGuian (1981) noted that this difference represents the size of the potential crash reduction when a safety improvement project is implemented at the site. These ideas can be updated to solve both the conceptual problem and the practical problem which have been identified.

Suppose that the goal is to estimate the hazardousness of site i using a statistical concept like the rate quality control method. In order to evaluate site i using the rate quality control method, reference sites with similar properties should be selected, and the accident rate of site i compared with that of the reference sites. However, in the strict sense, there are no reference sites which exactly reflect site i. Thus, the idea of the prediction model method is that the value of E(θ) obtained from the crash prediction model can be used instead of the average crashes of the reference sites to which site i belongs. Using this approach, the reference sites match exactly the traits of site i (these are imaginary reference sites as denoted by Hauer (1992)).

[Figure 6.2: concept of the prediction model method, showing the accident prediction line, the observed frequency (B) and expected frequency (A) of a site, and the upper control limits under the Negative Binomial (thick dotted line) and constant Normal (thin dotted line) error structures]

This approach is similar to the rate quality control method in the sense that both use the mean and standard deviation for identification of hazardous sites.
However, the difference is that the mean is the expected value E(θ), based on a calibrated model, for the prediction model method, whereas the mean is the average of the reference sites for the rate quality control method. This is why "E(θ)" instead of "m" is used in formula (6.1). Therefore, the calibration of the crash prediction model based on the correct error structure is extremely important to the identification of hazardous sites. It has already been shown that the desirable assumption for freeway crash models is the Negative Binomial rather than the Normal or Poisson error structure.

In order to illustrate the prediction model method for the identification of hazardous sites, the Negative Binomial distribution function is restated as equation (6.1).

$$P = \sum_{x=0}^{U-1} \left(1+\frac{E(\theta)}{k}\right)^{-k} \frac{\Gamma(k+x)}{x!\,\Gamma(k)} \left(\frac{E(\theta)}{E(\theta)+k}\right)^{x} \qquad (6.1)$$

where,
U : the true upper control limit
E(θ) : expected value
k : parameter

In equation (6.1), E(θ) would be obtained from the crash prediction model and the parameter k would be estimated in the process of calibrating the coefficients of the crash prediction model, which was discussed in detail in chapter 3. From equation (6.1), therefore, the upper control limit for identification of hazardous sites at a desired probability level can be computed.

The variance of the Negative Binomial distribution is E(θ) + E(θ)²/k, as discussed in chapter 2 and chapter 3, and hence the upper control limit will increase sharply with E(θ), as shown by the thick dotted line in Figure 6.2. However, if an accident prediction model is developed under a constant Normal error structure, the upper control limits would be a constant distance from the accident prediction line, as shown by the thin dotted line in the figure. This approach is similar to that of previous research (Jorgensen 1972, Flak and Barbaresso 1982).

6.3 Application and validation of the prediction model method

6.3.1 Illustration of the prediction model method

Suppose that the goal is to estimate the safety of a specific site using a crash prediction model that has been calibrated under the Negative Binomial error structure. Again, k can be estimated by the parameter calibration procedure described in chapter 3, and E(θ) can be computed from the crash prediction model using several independent variables of the site. Thus, the true upper control limit U can be found from equation (6.1) for a given site under the desired probability level.

For example, consider site 1 in Table 6.1. Using the crash prediction model developed in chapter 3, the expected value at site 1, E(θ), is

$$E(\theta) = 3.448\, V_1^{1.401}\, V_2^{0.186}\, V_3^{0.620}\, G_1^{0.738} \exp(-1.267\, G_2 - 0.156\, G_5) = 141.6 \text{ accidents/3 years} \qquad (6.2)$$

The standard deviation at site 1 is

$$\sqrt{E(\theta) + E(\theta)^{2}/k} = 51.3 \text{ accidents/3 years}$$

The parameter k was determined to be equal to 8.05 in chapter 3. From equations (6.1) and (6.2), the upper control limit U is 233 crashes for 3 years under the 95 percent probability level, as follows:

$$0.95 = \sum_{x=0}^{U-1} \left(1+\frac{141.6}{8.05}\right)^{-8.05} \frac{\Gamma(8.05+x)}{x!\,\Gamma(8.05)} \left(\frac{141.6}{141.6+8.05}\right)^{x} \qquad (6.3)$$

However, there were only 213 crashes over 3 years at the given site. Thus this site is not identified as hazardous under the 95 percent probability level, as shown in Figure 6.3. In the same way, we can test the hazardousness of each site on the basis of various desired probability levels.

[Figure 6.3: an example of the application of the prediction model method (site 1: observed crashes = 213, upper control limit = 233 at the 0.95 probability level)]

6.3.2 Validation of the prediction model method

The conceptual foundation for identifying hazardous sites using an accident prediction model is straightforward, as discussed in the previous section. There are two main advantages of this method over the rate quality control method. First, it diminishes the scope of individual judgement through a logical procedure. Second, a large number of reference sites for any particular site is not required.
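The site 1 illustration of section 6.3.1 can be sketched numerically as below. The sketch evaluates the Negative Binomial probabilities of equation (6.1) through log-gamma functions and takes U as the smallest integer whose cumulative probability up to U-1 reaches the chosen level; that cut-off convention and the function names are assumptions. The text reports U = 233 for E(θ) = 141.6 and k = 8.05, and a sketch like this may differ from that figure by a few counts depending on how the cut-off is rounded.

```python
import math


def nb_pmf(x, mean, k):
    """One term of equation (6.1): P(X = x) for mean E(theta) and parameter k."""
    log_p = (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
             - k * math.log1p(mean / k)          # (1 + E/k)^(-k)
             + x * math.log(mean / (mean + k)))  # (E / (E + k))^x
    return math.exp(log_p)


def nb_upper_control_limit(mean, k, p=0.95):
    """Smallest U with sum_{x=0}^{U-1} P(X = x) >= p (assumed cut-off convention)."""
    cdf = 0.0
    x = 0
    while True:
        cdf += nb_pmf(x, mean, k)
        if cdf >= p:
            return x + 1
        x += 1


def nb_std(mean, k):
    """Standard deviation sqrt(E(theta) + E(theta)^2 / k)."""
    return math.sqrt(mean + mean ** 2 / k)


expected, k_param, observed = 141.6, 8.05, 213   # site 1 values quoted in the text
limit = nb_upper_control_limit(expected, k_param)
print(f"std = {nb_std(expected, k_param):.1f}, U = {limit}, hazardous: {observed >= limit}")
```

Working with log-gamma values avoids overflow in Γ(k+x) and x!, which would otherwise occur for crash counts of this size.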
Despite its advantages, the prediction model method can produce unreasonable results, since there may be significant errors in choosing the model structure and calibrating the model parameters. For these reasons, it is important to illustrate empirically that the prediction model method and the reference site method (the rate quality control method) produce similar results. However, we cannot expect the results of both approaches to coincide, because in the strict sense, the imaginary reference sites for the prediction model method are a subset of the reference sites for the rate quality control method.

To demonstrate the results of both approaches, the two data sets that were analyzed in chapter 4 were used. In Table 6.1, the 5th column presents the probability that observed crashes exceed the expected crashes at a given site under the prediction model method. The 6th column represents the probability that the observed accident rate exceeds the reference accident rate under the rate quality control method. There is some disagreement between the methods, as expected. When these sites are identified at a high probability level (i.e., 0.95), 3 sites out of 30 are identified by the rate quality control method (marked by a "*" in the table), whereas there are no sites identified when using the prediction model method. At a lower probability level (i.e., 0.90), 6 and 4 sites out of 30 are identified using the rate quality control method and the prediction model method, respectively (marked by an "o" in the table).

In the prediction model method, the model parameters are calibrated through a minimization of the sum of squared residuals, and hence the variances may be underestimated for the special sites which have larger values than the average sites, as shown in Table 6.1.
Moreover, not all geometric elements (e.g., interchange size, ramp length) and traffic elements (e.g., mainline traffic, on and off ramp traffic, truck traffic) were used in classifying the reference sites to design the upper control limit, whereas the imaginary reference sites for the prediction model method match exactly the characteristics of a special site. Accordingly, it can be expected that the results of both approaches will be similar, but will not coincide in every cell in Table 6.1.

To test the similarity of the results of the rate quality control method and the prediction model method, the percentiles of sites were calculated and plotted in Figure 6.4. The results of both approaches are highly correlated (correlation coefficient = 0.96). All the sites were ranked by the probability, and the top 10 sites were chosen from the two data sets (5 sites at Diamond interchanges, and 5 sites at Par Clo A or B 4 Q interchanges). As shown in Table 6.1 (marked by a "v" in the table), the prediction model method identifies the same sites as the rate quality control method for the Diamond interchanges. It also identifies 4 sites out of the 5 identified by the rate quality control method for the Par Clo A or B 4 Q interchanges. It is surprising that there is so little difference between the rate quality control method and the prediction model method in terms of determining the hazard ranking of several sites.

A practical application of the above results is that if the goal is to prioritize several sites for a highway safety program, the prediction model method can be used as a tool that produces ranks very similar to those of the rate quality control method. If the goal is to evaluate a specific site, we can approximately evaluate the hazardousness of the site under the desired probability level through the prediction model method.
These advantages imply that we can overcome the conceptual and practical problems associated with the identification of sites where the crashes exceed the expected number of crashes, as discussed in the previous sections, through the use of the prediction model method. The accuracy of this method depends on having the crash prediction model calibrated under the appropriate error structure.

Table 6.1 A comparison of results

| Site (i) | Interchange type | Observed crashes (3 years) | Estimated crashes (3 years) | Probability by upper control limit | Probability by prediction model |
|---|---|---|---|---|---|
| 1 | Diamond | 213 | 141.6 | 0.91 o v | 0.91 o v |
| 2 | Diamond | 322 | 204.2 | 0.99 * o v | 0.93 o v |
| 3 | Diamond | 137 | 113.8 | 0.72 | 0.75 |
| 4 | Diamond | 196 | 139.9 | 0.95 * o v | 0.87 v |
| 5 | Diamond | 193 | 237.7 | 0.36 | 0.34 |
| 6 | Diamond | 194 | 251.5 | 0.42 | 0.29 |
| 7 | Diamond | 247 | 163.7 | 0.92 o v | 0.92 o v |
| 8 | Diamond | 164 | 138.7 | 0.79 | 0.73 |
| 9 | Diamond | 207 | 169.3 | 0.85 | 0.76 |
| 10 | Diamond | 160 | 166.5 | 0.45 | 0.51 |
| 11 | Diamond | 242 | 157.8 | 0.90 o v | 0.92 o v |
| 12 | Diamond | 102 | 177.7 | 0.05 | 0.10 |
| 13 | Diamond | 111 | 182.7 | 0.09 | 0.13 |
| 14 | Diamond | 158 | 179.5 | 0.31 | 0.42 |
| 15 | Diamond | 161 | 198.5 | 0.28 | 0.34 |
| 16 | Diamond | 121 | 188.4 | 0.11 | 0.16 |
| 1 | Par Clo A 4 Q | 39 | 44.8 | 0.43 | 0.43 |
| 2 | Par Clo A 4 Q | 45 | 69.1 | 0.12 | 0.20 |
| 3 | Par Clo B 4 Q | 53 | 87.1 | 0.07 | 0.15 |
| 4 | Par Clo A 4 Q | 62 | 93.3 | 0.09 | 0.21 |
| 5 | Par Clo A 4 Q | 127 | 85.4 | 0.77 v | 0.89 v |
| 6 | Par Clo B 4 Q | 120 | 135.1 | 0.59 | 0.44 |
| 7 | Par Clo B 4 Q | 157 | 127.8 | 0.83 v | 0.76 v |
| 8 | Par Clo A 4 Q | 111 | 102.5 | 0.55 | 0.64 v |
| 9 | Par Clo A 4 Q | 103 | 125.7 | 0.33 | 0.36 |
| 10 | Par Clo B 4 Q | 117 | 134.8 | 0.33 | 0.42 |
| 11 | Par Clo A 4 Q | 131 | 166.1 | 0.39 | 0.33 |
| 12 | Par Clo A 4 Q | 226 | 221.5 | 0.75 v | 0.58 |
| 13 | Par Clo A 4 Q | 286 | 275.9 | 0.81 v | 0.59 v |
| 14 | Par Clo B 4 Q | 403 | 285.3 | 0.95 * o v | 0.86 v |

* : Identified sites under 95 percent probability level
o : Identified sites under 90 percent probability level
v : Top 10 rankings (5 for Diamond, and 5 for Par Clo 4 Q)

[Figure 6.4: a comparison of the percentiles of sites under the rate quality control method and the prediction model method]

6.4 Evaluation of Michigan freeway interchanges on the basis of the prediction model

As noted in the preceding section, the prediction model method can be used to identify hazardous sites without the use of reference sites. Using this approach, the 199 interchanges which were utilized in the crash prediction model development were assessed using the coefficients and k parameters estimated according to the interchange type in chapter 3.

The sites which exceed the thick dotted line in Figure 6.2 are summarized in Table 6.2. Under the 95 % upper control limit, there is one such site out of the 10 interchanges on I-69, 4 sites out of 65 on I-75, 6 sites out of 90 on I-94, and 1 site out of 34 on I-96. Therefore, a total of 12 sites are identified out of 199. These results are approximately consistent with the statistical concept that there may be 10 abnormal sites out of 200 random sites using the 95 % upper control limit. Under the 90 % upper control limit, 22 sites are chosen, which also supports the preceding conclusion. The results of evaluating all interchanges are presented in detail in the Appendix.

The identified sites are candidates for improvement under a highway safety improvement program for freeway interchanges. These results could not be obtained through the existing rate quality control method because there are not enough reference sites, as discussed at length in section 6.1.

[Table 6.2: the sites identified as hazardous by the prediction model method]

Chapter 7
SUMMARY AND CONCLUSIONS

7.1 Summary

The Poisson distribution is a commonly accepted assumption in analyzing traffic crashes. When freeway interchange crash data were examined, it was found that there is substantially larger variability than would be expected if the distribution followed Poisson's law, and that the Negative Binomial distribution provides a better fit. This research focused on several linked issues which arise under the assumption that traffic crashes follow the Negative Binomial distribution rather than the Poisson distribution.

7.1.1 Traffic crash distribution

To test the distribution of freeway interchange crashes, the year to year variances were calculated for crashes that occurred during the 5-year period 1994-1998. Throughout this study, it was found that there is greater variability than would be expected under the assumption of the Poisson distribution, and the Negative Binomial distribution fits the data much better than the Poisson distribution. That is,

• The correlation coefficients between observed and expected variances increased from 0.91 to 0.97 and from 0.84 to 0.90 in the analysis of 24 crash types and 84 Diamond interchanges, respectively, when the data were fitted to the Negative Binomial instead of the Poisson distribution.
• The squared residuals between observed and expected variances were reduced by more than 80 % when the Negative Binomial distribution was assumed.

7.1.2 Traffic crash prediction model

One objective of the research was to develop crash prediction models for freeway interchanges using the Negative Binomial error structure.

• Based on the results of ANOVA and correlation analysis, mainline ADT, ramp ADT, and truck percent were selected as the traffic variables that affect freeway interchange crashes. The number of on and off ramps, the number of lanes, shoulder width, interchange length, and average spread-ramp length were determined to be the geometric variables that affect crashes.

• A non-linear regression model was selected as the structure of the crash prediction model developed in this study:

$$E(\theta) = A \times \prod_i V_i^{B_i} \times \prod_j G_j^{C_j} \times \exp\left(\sum_k C_k \times G_k\right)$$

where
$E(\theta)$ : Expected number of crashes
$V_i$ : Traffic variables
$G_j$ : Geometric variables (type 1)
$G_k$ : Geometric variables (type 2)
$A, B_i, C_j, C_k$ : Parameters

• To calibrate this model, the Generalized Linear Model (GLIM) approach was used so that the Negative Binomial error assumption is not violated.

• Using several measures of goodness of fit, including the Pearson chi-square (χ²), the dispersion parameter (Dp), the coefficient of determination (R²), and Pearson residuals (PR), 10 crash prediction models were developed, one for each of the most common interchange types in Michigan.

• Large reductions in the coefficient of variation of the parameter estimates were found when the traffic crash prediction models were calibrated under the Negative Binomial error assumption. For example, the coefficients of variation of the parameter estimates in the models for interchange type 11 and type 12 were reduced by an average of 36 percent when the models were calibrated under the Negative Binomial error assumption rather than the Normal one.
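The model form summarized in section 7.1.2, E(θ) = A × ΠVi^Bi × ΠGj^Cj × exp(ΣCk × Gk), can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `expected_crashes` and `log_expected_crashes`, the (value, coefficient) argument layout, and the input numbers are hypothetical and are not taken from the study. The sketch also shows that the logarithm of the expected crash count is linear in the parameters, which is the structure a Generalized Linear Model with a log link fits.

```python
import math


def expected_crashes(a, traffic, geom_power=(), geom_exp=()):
    """Evaluate A * prod(V_i**B_i) * prod(G_j**C_j) * exp(sum(C_k * G_k)).

    traffic, geom_power and geom_exp are iterables of (value, coefficient)
    pairs. Which geometric variables enter as power terms and which enter
    as exponential terms is an assumption left to the caller.
    """
    result = a
    for v, b in traffic:
        result *= v ** b          # traffic terms V_i ** B_i
    for g, c in geom_power:
        result *= g ** c          # type 1 geometric terms G_j ** C_j
    # type 2 geometric terms enter through the exponential
    result *= math.exp(sum(c * g for g, c in geom_exp))
    return result


def log_expected_crashes(a, traffic, geom_power=(), geom_exp=()):
    """Same model on the log scale; linear in log(A), B_i, C_j and C_k."""
    return (
        math.log(a)
        + sum(b * math.log(v) for v, b in traffic)
        + sum(c * math.log(g) for g, c in geom_power)
        + sum(c * g for g, c in geom_exp)
    )


# Hypothetical inputs, for illustration only (not study data).
example = expected_crashes(
    a=3.0,
    traffic=[(12.0, 0.9), (4.0, 0.2)],   # e.g. mainline and ramp ADT/1000
    geom_power=[(1.5, 0.8)],             # e.g. a length in miles
    geom_exp=[(0.4, -1.2)],              # e.g. another geometric variable
)
print(round(example, 3))
```

Calling both functions with the same arguments and comparing `math.log(expected_crashes(...))` with `log_expected_crashes(...)` is a quick way to check that the multiplicative and log-linear forms agree.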
7.1.3 The rate quality control method based on the Negative Binomial distribution

Since the accidents follow the Negative Binomial distribution rather than the Poisson distribution, the rate quality control method needed to be reexamined because it is based on the Poisson distribution. The findings can be summarized as follows:

• The rate quality control method under the Poisson assumption identifies more sites as hazardous than should theoretically be expected, because the variance of the Poisson distribution is equal to the mean, whereas the variance of the Negative Binomial distribution (and of the observed data) equals mean + mean²/k. The Poisson-based control limits are therefore too narrow for the observed data.

• The Negative Binomial distribution parameters that were necessary for the identification of hazardous sites were calculated using the maximum likelihood method of estimation.

• On the basis of the Negative Binomial error structure, the framework of a rate quality control method was proposed for the identification of hazardous sites. This framework produced more reasonable results than the existing approaches, such as the rate quality control method assuming the Poisson error structure, or the Bayes approach.

7.1.4 The Normal approximation method

Even though the rate quality control method based on the Negative Binomial distribution produced conceptually more reasonable results than the existing approaches, the application of this method may not be efficient because the parameters cannot be easily estimated.

• In order to overcome the difficulties associated with estimating the parameters of the Negative Binomial distribution based rate quality control method, a Normal approximation method was proposed, and it was shown to produce good results when identifying hazardous locations.

• The Normal approximation method identified hazardous sites with no loss of accuracy, even though it is a relatively simple method based on the Negative Binomial distribution.
• The validity of the Normal approximation method was shown to be contingent on two assumptions: 1) the standardized random variables Xi follow the Normal distribution, and 2) the expected mean μi is large enough.

• The testing of the two assumptions showed that Xi does follow the Normal distribution, and that μi is large enough to ensure the accuracy of the Normal approximation method.

7.1.5 The prediction model method

In applying the rate quality control method to the identification of hazardous sites, two limitations were identified in this study. The conceptual problem is that the selection of reference sites is a matter of judgement. The practical problem is that a site cannot be efficiently evaluated unless there is a sufficient number of reference sites to assure the accuracy of the results.

• To overcome the limitations of the rate quality control method, the prediction model method was tested, and it was found that there is little difference between the rate quality control method and the prediction model method in identifying hazardous sites. This implies that the safety of the sites can be evaluated in a statistical sense without reference sites.

• Recognizing the accuracy and the applicability of the prediction model method, about 200 freeway interchanges in Michigan were evaluated, and 12 sites were identified at the 95 % probability level.

7.2 Conclusions

This research focused on the issues that arise when traffic crashes are assumed to follow the Negative Binomial distribution rather than the Poisson distribution. The following conclusions were reached in this study.

• Crash prediction models for freeway interchanges can be efficiently calibrated under the assumption of the Negative Binomial error structure.
• The rate quality control method using the Negative Binomial distribution identified a more reasonable set of abnormal sites than the existing methods, such as the Poisson based rate quality control method or the Bayes approach.

• The Normal approximation method proposed for user convenience identified hazardous sites without loss of accuracy, even though it is relatively simple compared to the Negative Binomial based rate quality control method.

• The prediction model method accurately identified the safety of sites in the statistical sense.

APPENDICES

Appendix A - The results of accident prediction model calibration
Appendix B - The results of evaluating freeway interchanges

APPENDIX A

THE RESULTS OF ACCIDENT PREDICTION MODEL CALIBRATION

Table A.1 The results of accident prediction model calibration (Interchange type 11)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           3.448
Log(A)                                                           (1.238)   (0.67)     (1.85)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  1.401     0.30       4.66
B2           V2: Ramp traffic volume                 (ADT/1000)  0.186     0.12       1.55
B3           V3: Truck percent                       (%)         0.620     0.19       3.26
C1           G1: Interchange length                  (Mile)      0.733     0.15       4.92
C2           G2: Average spread ramp length          (Mile)      -1.267    0.97       -1.31
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)      -0.156    0.12       -1.30
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  23.34 (χ² 0.95, 27 = 40.11)
R²           Coefficient of determination                        0.60
K            Negative Binomial parameter                         3.05

Table A.2 The results of accident prediction model calibration (Interchange type 12)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           31.343
Log(A)                                                           (3.445)   (0.73)     (4.72)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.946     0.24       3.94
B2           V2: Ramp traffic volume                 (ADT/1000)
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)      0.933     0.36       2.59
C2           G2: Average spread ramp length          (Mile)      -3.842    1.31       -2.93
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  14.66 (χ² 0.95, 14 = 23.68)
R²           Coefficient of determination                        0.33
K            Negative Binomial parameter                         10.74

Table A.3 The results of accident prediction model calibration (Interchange type 13)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           3.614
Log(A)                                                           (1.285)   (1.07)     (1.20)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.947     0.47       2.01
B2           V2: Ramp traffic volume                 (ADT/1000)  0.187     0.16       1.17
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)      0.816     0.22       3.71
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -           0.136     0.10       1.36
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  19.32 (χ² 0.95, 19 = 30.14)
R²           Coefficient of determination                        0.47
K            Negative Binomial parameter                         5.43

Table A.4 The results of accident prediction model calibration (Interchange type 14)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           17.531
Log(A)                                                           (2.864)   (1.25)     (2.29)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.911     0.43       2.12
B2           V2: Ramp traffic volume                 (ADT/1000)  0.142     0.14       1.00
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)      1.315     0.33       3.93
C2           G2: Average spread ramp length          (Mile)      -2.278    1.984      -1.15
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  9.07 (χ² 0.95, 9 = 16.92)
R²           Coefficient of determination                        0.65
K            Negative Binomial parameter                         6.38

Table A.5 The results of accident prediction model calibration (Interchange type 21)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           5.479
Log(A)                                                           (1.701)   (1.02)     (1.67)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.467     0.43       1.09
B2           V2: Ramp traffic volume                 (ADT/1000)  0.470     0.18       2.61
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  6.35 (χ² 0.95, 6 = 12.19)
R²           Coefficient of determination                        0.68
K            Negative Binomial parameter                         6.73

Table A.6 The results of accident prediction model calibration (Interchange type 31)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           3.494
Log(A)                                                           (1.251)   (0.83)     (1.52)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  1.144     0.24       4.77
B2           V2: Ramp traffic volume                 (ADT/1000)  0.128     0.11       1.16
B3           V3: Truck percent                       (%)         0.138     0.12       1.15
C1           G1: Interchange length                  (Mile)      0.319     0.19       1.68
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  37.68 (χ² 0.95, 35 = 51.00)
R²           Coefficient of determination                        0.72
K            Negative Binomial parameter                         7.02

Table A.7 The results of accident prediction model calibration (Interchange type 33)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           44.124
Log(A)                                                           (3.787)   (0.87)     (1.20)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.515     0.24       2.15
B2           V2: Ramp traffic volume                 (ADT/1000)  0.244     0.12       2.03
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)      0.956     0.24       3.98
C2           G2: Average spread ramp length          (Mile)      -2.500    0.98       -2.55
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  16.23 (χ² 0.95, 16 = 26.30)
R²           Coefficient of determination                        0.82
K            Negative Binomial parameter                         13.35

Table A.8 The results of accident prediction model calibration (Interchange type 35)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           8.619
Log(A)                                                           (2.154)   (1.19)     (1.81)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.736     0.32       0.90
B2           V2: Ramp traffic volume                 (ADT/1000)  0.270     0.41       0.66
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  5.36 (χ² 0.95, 5 = 11.07)
R²           Coefficient of determination                        0.37
K            Negative Binomial parameter                         4.35

Table A.9 The results of accident prediction model calibration (Interchange type 41)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           28.247
Log(A)                                                           (3.341)   (2.344)    (1.43)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.839     0.29       2.89
B2           V2: Ramp traffic volume                 (ADT/1000)  0.215     0.15       1.43
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -           0.182     0.06       3.03
C5           G5: Shoulder width                      (Feet)      -0.238    0.18       -1.32
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  17.99 (χ² 0.95, 17 = 27.59)
R²           Coefficient of determination                        0.64
K            Negative Binomial parameter                         6.37

Table A.10 The results of accident prediction model calibration (Interchange type 51)

Coefficient  Variable definition                     Unit        Estimate  Std error  t-statistic
A            Constant                                -           3.658
Log(A)                                                           (1.297)   (1.23)     (1.05)
B1           V1: Mainline traffic volume per lane    (ADT/1000)  0.478     0.65       0.73
B2           V2: Ramp traffic volume                 (ADT/1000)  0.506     0.33       1.53
B3           V3: Truck percent                       (%)
C1           G1: Interchange length                  (Mile)
C2           G2: Average spread ramp length          (Mile)
C3           G3: The number of lanes                 -
C4           G4: The number of total ramps           -
C5           G5: Shoulder width                      (Feet)
Model statistic
Dp           Dispersion parameter                                1.0
χ²           Pearson chi-square                                  5.19 (χ² 0.95, 5 = 11.07)
R²           Coefficient of determination                        0.47
K            Negative Binomial parameter                         4.86

APPENDIX B

THE RESULTS OF EVALUATING FREEWAY INTERCHANGES

[Table B.1 The results of evaluating freeway interchanges by the prediction model method: table body illegible in the source]
00.0 00 .0 00 0.0.. 9. 0.0.00 0000... 00-. 00.0 0.0 .00 0m 0.0. 0 o 0 < 0.0.00 .000... 0..00 00-. 00.0 00 00 0m ...02 c 0 < 0.0.00 .00..... 00.00 00-. 0.0 .0 00 00 00.0.. .000. o 0 0 0.0.00 .000.. 00.00 00-. 00.0 .0. 00. 0.. 0.0....2 < 0.0.00 .00..... 00.00 00-. 00.0 0.. 00 0.. 00.00.0000. 0 0.0.0.. .000... .0.00 00-. 00.0 00. 00. 0.. 00030.. 50005.. .00..... 000.00 00-. 00.0 .. 00 00 000.00.... ..00.-0000.0... 0.00... 00.00 00-. 0.0 .00 00. 0.. 00.0. 000.00... 000 0.00... 00..00 00-. .00 .0. 00. 000 000.0 2. 000.00... .00.. ..00. 0 .00... 00.00 00-. 00.0 .0 .0 0m .0.. 0000.0... .00.. 000 0.00... 000.00 00-. 0.0 0.0 000 0>< 0000.5 000.00... ..00 0 .00... 00.00 00-. 00.0 00. 0.. 02.. 0.0.02 0000.0... ..00. 0.00... .0.00 00-. ...0 .0 0. 0m ..0__0> .00000... 000.00... 000 0.00... 00.00 00-. 00.0 00. 00. 00. 0.000090 0000.0... .00.. 0.00... 00.00 00-. 00.0 00. 0.. 0.00.00. 0000.0... .00.. 0.00... 000.00 00-. 00.0 ... 00. 00. .00.....) 0000.0... .00.. 00.2 0.00... 00.00 00-. ...0 0.. 00. 00. 00.00.0000. 0000.0... ..00.. 0..00 00-. 00.0 00. .0. 00. 0.00.0. 0000.0... ..00.. ...00 00-. 0.0 00. ... 00 00.00.05. 0000.0... ..00.. 0..00 00-. 00.0 00. 00. 0.. 0000.00... 000.00... ..00.. 0..00 00-. 0.0 00. .0. .00.. 0000.. 0000.0... ..00.. 0..00 00-. .00 00. 00 ... .0.00 000.00... ..00.. 00.00 00-. 00.0 00 00 00. .0.00000 000.00... ..00.. 00-. 00.0 .0 .0 0.2 .00.. 000.0 000.00... ..00.. 00.00 00-. 00.0 00 00 020 00.0.0.2. 0000.0... ..00.. 00.00 00-. .0.. 00 .... 00 ... 0000 005.000. 30.0000... 00...“. 0020000 000. 000.0 0...: 09.05.20. 0..... 00005.2... 0.33. 30:50:00. 6050:. .0608 00:050.... 0.: .3 nouns—.28.: .0300... .«o Minus—«>0 .... 3.500.. 0.; —.m 030,—. 152 00.0 00. 00. 00 0..... 0 .006 .00.... 00-. mm... mom on. No.42 0. .050 .00.....- 00-. 00.0 .0 00 0.. 0.0.00.0. .006 .000.. 0..00 00-. 0\0 mm 0\0 cm a. 00...... 005.002 30.0000... 00...“. 0020000 000. 000.0 0...: 09.05.25 0%. 000000.20. 0.03. 30:00:03 0050.: _000E 00:300.... 05 .... mow—5:38.: 00300.... .... 
”5.03.30 .3 03:00.. 0.:- —.m 030,—- 153