RE-CALIBRATION OF RIGID PAVEMENT PERFORMANCE MODELS AND DEVELOPMENT OF TRAFFIC INPUTS FOR PAVEMENT-ME DESIGN IN MICHIGAN

By

Gopi Krishna Musunuru

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Civil Engineering – Doctor of Philosophy

2019

ABSTRACT

The Mechanistic-Empirical Pavement Design Guide software (AASHTOWare Pavement-ME) incorporates mechanistic models to estimate stresses, strains, and deformations in pavement layers using site-specific climatic, material, and traffic characteristics. These structural responses are used to predict pavement performance through empirical models (i.e., transfer functions). The transfer functions need to be calibrated so that the performance predictions are more accurate and reflect the unique field conditions and design practices. The existing local calibrations of the performance models were performed using version 2.0 of the Pavement-ME software. Since the completion of the last study, AASHTO has released versions 2.2 and 2.3 of the software, in which several bugs were fixed and, consequently, some performance models were modified. As a result, the concrete pavement IRI predictions and the resulting PCC slab thicknesses have been impacted. The performance predictions varied significantly from the observed structural and functional distresses; hence, the performance models were recalibrated to enhance confidence in pavement designs. Linear and nonlinear mixed-effects models were used for calibration to account for the non-independence among the data measured on the same sections over time.
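The non-independence mentioned above can be illustrated with a short sketch. It assumes made-up IRI values for three hypothetical sections ("A", "B", "C") and uses a single pooled least-squares line rather than the study's actual models. If repeated measurements on a section were independent of the section, the mean residual per section would be close to zero; a consistent offset per section is the pattern that a section-level random effect in a mixed-effects model is meant to absorb.

```python
from statistics import mean

# Hypothetical IRI (in/mile) measured at several ages (years) on three sections.
# These values are invented for illustration only.
observations = {
    "A": [(1, 80.5), (3, 83.8), (5, 88.1), (7, 92.3)],
    "B": [(1, 72.2), (3, 75.9), (5, 80.3), (7, 83.6)],
    "C": [(1, 63.9), (3, 68.4), (5, 71.8), (7, 76.1)],
}

def fit_line(points):
    """Ordinary least-squares fit y = a + b*x for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Single fit that ignores which section each measurement came from.
pooled = [pt for pts in observations.values() for pt in pts]
intercept, slope = fit_line(pooled)

# Mean residual per section: systematic offsets reveal within-section correlation.
mean_residual = {
    section: mean([y - (intercept + slope * x) for x, y in pts])
    for section, pts in observations.items()
}
for section, r in mean_residual.items():
    print(f"section {section}: mean residual = {r:+.2f}")
```

In this invented dataset the pooled line over-predicts one section and under-predicts another at every age, which is why treating all measurements as independent would understate the uncertainty of the calibrated coefficients.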
Also, climate data, material properties, and design parameters were used to develop a model for predicting permanent curl at each location, addressing some limitations of the Pavement-ME. This model can be used at the design stage to estimate permanent curl for a given location in Michigan.

Pavement-ME also requires specific types of traffic data to design new or rehabilitated pavement structures. The traffic inputs include monthly adjustment factors (MAF), hourly distribution factors (HDF), vehicle class distributions (VCD), axle groups per vehicle (AGPV), and axle load distributions for different axle configurations. During the last seven years, new traffic data were collected that reflect the recent economic growth as well as additional and downgraded WIM sites. Hence, it was appropriate to re-evaluate the current traffic inputs and incorporate any changes. Weight and classification data were obtained from 41 Weigh-in-Motion (WIM) sites located throughout the State of Michigan to develop Level 1 (site-specific) traffic inputs. Cluster analyses were conducted to group sites for the development of Level 2A inputs. Classification models such as decision trees, random forests, and naïve Bayes classifiers were developed to assign a new site to these clusters; however, this proved difficult. An alternative, simplified method that develops Level 2B inputs by grouping sites with similar attributes was also adopted. The optimal set of attributes for developing these Level 2B inputs was identified using an algorithm developed in this study. The effects of the developed hierarchical traffic inputs on the predicted performance of rigid and flexible pavements were investigated using the Pavement-ME. Based on the statistical and practical significance of the life differences, appropriate levels were established for each traffic input. The methodology for developing traffic inputs is intuitive and practical for future updates.
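The idea of grouping sites by cluster analysis can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the site names and VCD values are hypothetical, only three truck classes are shown for brevity, and average linkage with Euclidean distance is assumed as one of several possible linkage choices.

```python
from itertools import combinations
from math import dist

# Hypothetical VCDs (percent of trucks); three classes only, for brevity.
vcd = {
    "site_1": [60.0, 30.0, 10.0],
    "site_2": [58.0, 33.0, 9.0],
    "site_3": [20.0, 70.0, 10.0],
    "site_4": [22.0, 66.0, 12.0],
    "site_5": [25.0, 45.0, 30.0],
}

def average_linkage(cluster_a, cluster_b):
    """Mean Euclidean distance over all cross-cluster site pairs."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(vcd[a], vcd[b]) for a, b in pairs) / len(pairs)

def agglomerate(sites, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[s] for s in sites]
    while len(clusters) > n_clusters:
        # combinations() yields i < j, so deleting index j keeps index i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

clusters = agglomerate(list(vcd), n_clusters=3)

# A cluster-level input in the spirit of Level 2A: class-wise average VCD.
cluster_averages = [
    [sum(col) / len(members) for col in zip(*(vcd[m] for m in members))]
    for members in clusters
]
for members, avg in zip(clusters, cluster_averages):
    print(members, [round(v, 1) for v in avg])
```

A new site with no weight data would then need to be assigned to one of these clusters, which is the step the abstract reports as difficult and which motivates the attribute-based Level 2B grouping.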
Also, there is a need to identify changes in traffic patterns and update the traffic inputs so that pavement sections are neither overdesigned nor under-designed. Models were developed in which short-term counts from the PTR sites can be used as inputs to check whether new traffic patterns cause any substantial differences in design life predictions.

TO MOM, DAD, & MONA

ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Syed Waqar Haider, for giving me the opportunity to pursue my Ph.D. back in 2016. I am grateful for his guidance throughout my research and the writing of this thesis. He has been a tremendous mentor, and his advice on academics and my career has been invaluable. I would also like to thank my advisory committee, Dr. Neeraj Buch, Dr. Karim Chatti, Dr. Muhammed Emin Kutay, and Dr. Gustavo de los Campos, for their suggestions and support. I would also like to thank the Michigan Department of Transportation (MDOT) for sponsoring and funding the studies that made this thesis possible. Special thanks to Michael Eacker and Justin Schenkel of MDOT for providing valuable input throughout the research. Special thanks to our graduate secretary, Laura Post, for helping me with all the paperwork over the years. The technical, personal, and financial support of those mentioned above, as well as my family, has made this all possible. Thank you to everyone who has made this possible.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
KEY TO ABBREVIATIONS
CHAPTER 1 - INTRODUCTION
  1.1 PROBLEM STATEMENT AND BACKGROUND
  1.2 OUTLINE OF REPORT
  REFERENCES
CHAPTER 2 - LITERATURE REVIEW
  2.1 RIGID PAVEMENT PERFORMANCE MODELS IN PAVEMENT-ME
    2.1.1 Transverse Cracking Model
    2.1.2 IRI Model
  2.2 REQUIRED INPUTS FOR LOCAL CALIBRATION
    2.2.1 Concepts of Curling, Warping and Zero Stress Temperature
  2.3 METHODOLOGY FOR CALIBRATION
    2.3.1 Longitudinal Data Analyses
    2.3.2 Transverse cracking
    2.3.3 IRI
    2.3.4 Reliability Models
  2.4 PAVEMENT-ME TRAFFIC INPUTS
    2.4.1 Directional distribution factor (DDF)
    2.4.2 Lane distribution factor (LDF)
    2.4.3 Axles per truck class
    2.4.4 Axle and tire spacing
    2.4.5 Tire pressure
    2.4.6 Traffic growth
    2.4.7 Operational speed
    2.4.8 Lateral Wander
    2.4.9 Monthly adjustment factor (MAF)
    2.4.10 Hourly distribution factor (HDF)
    2.4.11 Vehicle class distribution (VCD)
    2.4.12 Axle load spectra (ALS)
  2.5 REVIEW OF PREVIOUS STUDIES ON TRAFFIC CHARACTERIZATION
    2.5.1 National Studies
    2.5.2 Other States
  2.6 REVIEW OF EXISTING PRACTICES IN MICHIGAN
    2.6.1 Improved Existing Methodology (Level 2A Inputs)
    2.6.2 Alternative Simplified Methodology (Level 2B Inputs)
  2.7 SUMMARY
  REFERENCES
CHAPTER 3 - EVALUATION OF THE PERFORMANCE PREDICTION MODELS
  3.1 INTRODUCTION
  3.2 EVALUATION OF PERFORMANCE MODELS
    3.2.1 Rigid Pavements
    3.2.2 Flexible Pavements
  3.3 COMPARISONS WITH MEASURED PERFORMANCE
    3.3.1 Rigid Pavements
    3.3.2 Flexible Pavements
  3.4 AVAILABLE PMS CONDITION DATA
  3.5 RE-CALIBRATION OF RIGID PAVEMENT MODELS
    3.5.1 Transverse Cracking Model
    3.5.2 IRI Model
  3.6 IMPLEMENTATION CHALLENGES
    3.6.1 Impact of Re-calibration on Pavement Designs
    3.6.2 Lessons Learned
  3.7 PERMANENT CURL/WARP MODEL
  3.8 RE-CALIBRATION BASED ON PERMANENT CURL MODEL
    3.8.1 Impact of Re-Calibration on Pavement Designs
  3.9 SUMMARY
  REFERENCES
CHAPTER 4 - DEVELOPMENT OF TRAFFIC INPUTS
  4.1 DATA COLLECTION AND PROCESSING
    4.1.1 Review of Existing Data Collection Sites
  4.2 GENERATION OF MULTIPLE TRAFFIC INPUT LEVELS
    4.2.1 Level 2A Inputs
    4.2.2 Level 2B Inputs
  4.2 SUMMARY
  REFERENCES
CHAPTER 5 - SIGNIFICANT TRAFFIC INPUTS
  5.1 SENSITIVITY ANALYSES
    5.1.1 Level 2A Sensitivity Analyses
    5.1.2 Level 2B Sensitivity Analyses
    5.1.3 Level 3A Sensitivity Analyses
    5.1.4 Choosing the Appropriate Traffic Input Level
    5.1.5 Identifying the Changes in Traffic Patterns
  5.2 SUMMARY
  REFERENCES
CHAPTER 6 - CONCLUSIONS, RECOMMENDATIONS
  6.1 CONCLUSIONS
    6.1.1 Findings based on the Recalibration
    6.1.2 Findings based on the Cluster Analysis and Traditional Approaches
    6.1.3 Significant Traffic Inputs
    6.1.4 Assigning a Site to a Cluster or a Group
    6.1.5 General Findings
  6.2 RECOMMENDATIONS

LIST OF TABLES

Table 2-1 Local calibration coefficients for the rigid transverse cracking and IRI models
Table 2-2 Impact of Pavement-ME inputs on rigid pavement performance predictions
Table 2-3 Hypothesis tests for statistical significance
Table 2-4 Traffic data required for the three Pavement-ME input levels
Table 2-5 The Pavement-ME default hourly distribution factors
Table 2-6 FHWA Vehicle Classes
Table 2-7 NCHRP 1-37A Truck traffic classification (TTC) groups (44)
Table 2-8 NCHRP 1-37A guide for selecting appropriate TTC groups (44)
Table 2-9 Minimum number of data collection days per season to estimate TTC (19)
Table 2-10 Minimum number of data collection days per season to estimate ALS (19)
Table 2-11 LTPP WIM system performance requirements
Table 2-12 Summary of NALS categories by weight for different axle group types
Table 2-13 Summary of clustering methodologies used to generate level 2 inputs
Table 3-1 Standard errors and biases for transverse cracking (reconstructs and UCO)
Table 3-2 Standard errors and biases for faulting (reconstructs and UCO)
Table 3-3 Standard errors and biases for IRI (reconstructs and UCO)
Table 3-4 Standard errors and biases for longitudinal cracking
Table 3-5 Standard errors and biases for fatigue cracking
Table 3-6 Standard errors and biases for surface rutting
Table 3-7 Standard errors and biases for surface roughness (IRI)
Table 3-8 Standard errors and biases between measured and predicted transverse cracking (reconstructs and UCO)
Table 3-9 Standard errors and biases between measured and predicted faulting (reconstructs and UCO)
Table 3-10 Standard errors and biases between measured and predicted IRI (reconstructs and UCO)
Table 3-11 Standard errors and biases between measured and predicted longitudinal cracking
Table 3-12 Standard errors and biases between measured and predicted fatigue cracking
Table 3-13 Standard errors and biases between measured and predicted rutting
Table 3-14 Standard errors and biases between measured and predicted IRI
Table 3-15 Local calibration summary for transverse cracking (Option 1)
Table 3-16 Local calibration summary for transverse cracking (Option 2)
Table 3-17 Local calibration summary for IRI (Option 1)
Table 3-18 Local calibration summary for IRI (Option 2)
Table 3-19 Summary of attributes used in the model
Table 3-20 Local calibration summary for transverse cracking – Option 1
Table 3-21 Local calibration summary for transverse cracking – Option 2
Table 3-22 Local calibration summary for IRI – Option 1
Table 3-23 Local calibration summary for IRI – Option 2
Table 3-24 Local calibration summary for transverse cracking
Table 4-1 List of PTR sites with classification data only (Non-WIM)
Table 4-2 List of PTR sites with WIM and classification data
Table 4-3 Number of sites with available WIM and classification data
Table 4-4 Number of clusters for each traffic input
Table 4-5 Comparison of clustering techniques using various internal indices
Table 4-6 Confusion matrix for a dataset with two class labels (clusters)
Table 4-7 Training and testing losses for single decision trees
Table 4-8 Training and testing losses for random forests
Table 4-9 Training and testing losses for naïve Bayes classifier
Table 4-10 Possible combination of attributes when chosen two at a time
Table 4-11 Possible combination of attributes when chosen three at a time
Table 4-12 Possible combination of attributes when chosen four at a time
Table 4-13 VCD traffic inputs for the combination of functional class and development type
Table 4-14 Pairwise Euclidean distances between the sublevel combinations
Table 4-15 Number of PTR locations in each sublevel combination (2-way) for road type/VCD level combination
Table 4-16 Number of PTR locations in each sublevel combination (3-way) for road type/development type/VCD level combination
Table 4-17 Number of PTR locations in each sublevel combination (4-way) for road type/number of lanes/development type/VCD level combination
Table 4-18 Descriptive Statistics of the pairwise distances between the sublevel combinations for various attribute combinations (VCD)
Table 4-19 Number of clusters and road groups formed for Level 2 inputs
Table 5-1 Baseline designs for flexible pavements
Table 5-2 Baseline designs for rigid pavements
Table 5-3 Impact designation on predicted pavement performance
Table 5-4 ANOVA results for Level 2A VCD clusters for flexible pavements (bottom-up fatigue cracking)
Table 5-5 Sensitivity of rigid and flexible pavements to statistical significance – Level 2A
Table 5-6 Sensitivity of rigid and flexible pavements to moderate MLD criteria – Level 2A
Table 5-7 ANOVA results for Level 2B groups
Table 5-8 Sensitivity of rigid and flexible pavements to statistical significance – Level 2B
Table 5-9 Sensitivity of rigid and flexible pavements to moderate MLD criteria – Level 2B
Table 5-10 ANOVA results for Level 3A VCD clusters or groups
Table 5-11 Sensitivity of rigid and flexible pavements to statistical significance – Level 3A
Table 5-12 Sensitivity of rigid and flexible pavements to MLD criteria – Level 3A
Table 5-13 Summary of statistical significance – Levels 2A vs. 2B
Table 5-14 Paired t-test results between Levels 2A and 2B for rutting
Table 5-15 Summary of statistical significance – Levels 2A vs. 3A
Table 5-16 Summary of statistical significance – Levels 2B vs. 3A
Table 5-17 Paired t-test results between Levels 2A and 2B for HDF (IRI and transverse cracking)
Table 5-18 Summary of statistical significance – Levels 3A vs. 3B
Table 5-19 Paired t-test results between Levels 2A and 2B for TALS (IRI)
Table 5-20 Paired t-test results between Levels 2A and 2B for TALS (Faulting)
Table 5-21 Recommended traffic input levels
Table 6-1 Recommended traffic input levels
Table 6-2 Summary of rigid pavement performance models with local coefficients (Initial recalibration)
Table 6-3 Summary of rigid pavement performance models with local coefficients (Permanent curl model)

LIST OF FIGURES

Figure 1-1 Temporal changes in traffic characteristics at the selected sites
Figure 1-2 Comparison of designs between different versions of Pavement-ME
Figure 2-1 Demonstration of different levels of standard error and bias
Figure 2-2 Default HDFs in the MEPDG
Figure 3-1 Comparison of predicted transverse cracking (reconstructs and UCO)
Figure 3-2 Comparison of predicted faulting (reconstructs and UCO)
Figure 3-3 Comparison of predicted IRI (reconstructs and UCO)
Figure 3-4 Comparison of predicted longitudinal cracking
Figure 3-5 Comparison of predicted fatigue cracking
Figure 3-6 Comparison of predicted surface rutting
Figure 3-7 Comparison of predicted surface roughness (IRI)
Figure 3-8 Predicted vs. measured transverse cracking (reconstructs and UCO)
Figure 3-9 Predicted vs. measured faulting (reconstructs and UCO)
Figure 3-10 Predicted vs. measured IRI (reconstructs and UCO)
Figure 3-11 Predicted vs. measured fatigue cracking
Figure 3-12 Predicted vs. measured fatigue cracking
Figure 3-13 Predicted vs. measured rutting
Figure 3-14 Predicted vs. measured IRI
Figure 3-15 Measured transverse cracking performance
Figure 3-16 Measured IRI
Figure 3-17 Single fit for the entire data
Figure 3-18 Boxplot residuals for each test section for a single fit
Figure 3-19 Individual models for each test section
Figure 3-20 Boxplot residuals for each test section for individual models
Figure 3-21 Boxplot residuals for each test section for mixed-effects model
Figure 3-22 Local calibration results for transverse cracking using entire dataset – Option 1
Figure 3-23 Bootstrap sampling calibration results – Option 1 (1000 bootstraps)
Figure 3-24 Bootstrap sampling validation results – Option 1 (1000 bootstraps)
Figure 3-25 Reliability plot for transverse cracking
Figure 3-26 Local calibration results for transverse cracking using entire dataset – Option 2
Figure 3-27 Bootstrap sampling calibration results – Option 2 (1000 bootstraps)
Figure 3-28 Bootstrap sampling validation results – Option 2 (1000 bootstraps)
Figure 3-29 Local calibration results for IRI using entire dataset – Option 1
Figure 3-30 Bootstrap sampling calibration results – Option 1 (1000 bootstraps)
Figure 3-31 Bootstrap sampling validation results – Option 1 (1000 bootstraps)
Figure 3-32 Local calibration results for IRI using entire dataset – Option 2
Figure 3-33 Bootstrap sampling calibration results – Option 2 (1000 bootstraps)
Figure 3-34 Bootstrap sampling validation results – Option 2 (1000 bootstraps)
Figure 3-35 Design thickness comparisons for mainline roads
Figure 3-36 Relationship between damage and percent slab cracked
Figure 3-37 Predicted versus actual permanent curl in Michigan
Figure 3-38 Local calibration results for transverse cracking using entire dataset – Option 1
Figure 3-39 Bootstrap sampling calibration results (1000 bootstraps) – Option 1
Figure 3-40 Bootstrap sampling validation results (1000 bootstraps) – Option 1
Figure 3-41 Local calibration results for transverse cracking using entire dataset – Option 2
Figure 3-42 Bootstrap sampling calibration results (1000 bootstraps) – Option 2
Figure 3-43 Bootstrap sampling validation results (1000 bootstraps) – Option 2
Figure 3-44 Local calibration results for IRI using entire dataset – Option 1
Figure 3-45 Bootstrap sampling calibration results (1000 bootstraps) – Option 1
Figure 3-46 Bootstrap sampling validation results (1000 bootstraps) – Option 1
Figure 3-47 Local calibration results for IRI using entire dataset – Option 2
Figure 3-48 Bootstrap sampling calibration results (1000 bootstraps) – Option 2
Figure 3-49 Bootstrap sampling validation results (1000 bootstraps) – Option 2
Figure 3-50 Design thickness comparisons for mainline roads
Figure 4-1 An example of the dendrogram
Figure 4-2 Single linkage clustering
Figure 4-3 Complete linkage clustering
Figure 4-4 Group average clustering
Figure 4-5 Ward's method of clustering
Figure 4-6 A hexagonal arrangement of a 10-by-10 set of neurons
Figure 4-7 Number of PTR sites in each node
Figure 4-8 VCDs of PTR sites in various neurons
Figure 4-9 SOM weight planes for VCD data
Figure 4-10 Optimum number of clusters for HDF
Figure 4-11 Cluster dendrograms for various traffic inputs – Michigan PTR sites
Figure 4-12 Cluster dendrograms for various traffic inputs – Michigan PTR sites
Figure 4-13 Cluster averages (Level 2A) for various traffic inputs
Figure 4-14 Cluster averages (Level 2A) for various traffic inputs
Figure 4-15 Geographical distributions for PTRs by clusters for traffic inputs
Figure 4-16 Geographical distributions for PTRs by clusters for all traffic inputs
Figure 4-17 Example of a decision tree
Figure 4-18 Single decision tree for HDF using entire data
Figure 4-19 Single decision tree for TALS using entire data
Figure 4-20 Random forests (100th) tree for HDF using entire data
Figure 4-21 Random forests (180th) tree for TALS using entire data
Figure 4-22 Group averages (Level 2B) for various traffic inputs
Figure 4-23 Group averages (Level 2B) for various traffic inputs
Figure 5-1 Mean design life comparisons between different Level 2A VCD clusters for flexible pavements (bottom-up fatigue cracking)
Figure 5-2 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 2A VCD clusters
Figure 5-3 Mean design life comparisons between different clusters for VCD
Figure 5-4 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 2B VCD road groups
Figure 5-5 Mean design life comparisons between different clusters for VCD
Figure 5-6 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 3A VCD groups
Figure 5-7 Number of over- and under-designed PTR locations – Levels 2A and 2B (VCD)
Figure 5-8 Number of over- and under-designed PTR locations – Levels 2A and 2B (HDF)
Figure 5-9 Number of over- and under-designed PTR locations – Levels 2A and 2B (MAF)
Figure 5-10 Number of over- and under-designed PTR locations – Levels 2A and 2B (SALS)
Figure 5-11 Number of over- and under-designed PTR locations – Levels 2A and 2B (TALS)
Figure 5-12 Predicted vs. measured life differences for flexible pavements
Figure 5-13 Predicted vs. measured life differences for rigid pavements
KEY TO ABBREVIATIONS
AADT: Annual Average Daily Traffic
AADTT: Annual Average Daily Truck Traffic
AASHO: American Association of State Highway Officials
AASHTO: American Association of State Highway and Transportation Officials
AGPV: Axle Groups Per Vehicle
ANOVA: Analysis of Variance
API: Application Programming Interface
ATR: Automatic Traffic Recorder
AVC: Automatic Vehicle Classification
COHS: Corridors of Highest Significance
CI: Confidence Interval
CLA: Classification
DDF: Directional Distribution Factor
DOT: Department of Transportation
ESAL: Equivalent Single Axle Load
FHWA: Federal Highway Administration
GIS: Geographic Information System
GVW: Gross Vehicle Weight
HDF: Hourly Distribution Factor
HPMS: Highway Performance Monitoring System
IRI: International Roughness Index
JPCP: Jointed Plain Concrete Pavements
LDF: Lane Distribution Factor
LTPP: Long-term Pavement Performance
MAF: Monthly Adjustment Factor
ME: Mechanistic-empirical
MEPDG: Mechanistic-empirical Pavement Design Guide
MLD: Maximum Life Difference
NALS: Normalized Axle Load Spectra
NCHRP: National Cooperative Highway Research Program
PTR: Permanent Traffic Recorder
QALS: Quad Axle Load Spectra
QC: Quality Control
SALS: Single Axle Load Spectra
SEE: Standard Error of Estimate
SHA: State Highway Agency
SQL: Structured Query Language
SSE: Sum of Squared Error
TALS: Tandem Axle Load Spectra
TMG: Traffic Monitoring Guide
TMAS: Travel Monitoring Analysis System
TRALS: Tridem Axle Load Spectra
TTC: Truck Traffic Classification
TWRG: Truck Weight Road Group
UPGMA: Unweighted Pair Group Method with Arithmetic Mean
VC: Vehicle Class
VCD: Vehicle Class Distribution
VRC: Variance Ratio Criterion
WGT: Weight
WIM: Weigh-in-Motion
XML: Extensible Markup Language

CHAPTER 1 - INTRODUCTION
1.1 PROBLEM STATEMENT AND BACKGROUND
In the AASHTO 93 pavement design procedure, the truck traffic is converted to an equivalent number of 18-kip single-axle loads (ESALs) using the load
equivalency factors (LEFs) developed based on the Present Serviceability Index (PSI) concept. Several studies have found that the complex failure modes of pavement structures cannot be explained by this single value (1; 2). The mechanistic-empirical pavement design guide (AASHTOWare Pavement-ME) addresses these limitations by incorporating mechanistic models to estimate stresses, strains, and deformations in pavement layers using site-specific climatic, material, and traffic characteristics (3). These structural responses are used to predict different performance parameters for each pavement type using empirical models (i.e., transfer functions). Therefore, the use of ESALs to characterize traffic loadings is not compatible with the Pavement-ME. This new analysis and design approach requires specific types of traffic data to design new or rehabilitated pavement structures. These traffic inputs include:
Annual average daily truck traffic (AADTT),
Vehicle class distribution (VCD),
Monthly adjustment factors by vehicle class (MAF),
Hourly truck volume distribution factors (HDF),
Number of axle groups per vehicle (AGPV), and
Axle load distributions by vehicle class and axle group.
Pavement-ME accommodates the fact that detailed traffic data are not always available. Hierarchical input levels are used depending on the level of detail of the available traffic data (3-5). These input levels range from site-specific input values to "best-estimate" or default values and are classified as follows:
Level 1 – There is a very good knowledge of past and future traffic characteristics. At this level, it is assumed that the past traffic volume and weight data have been collected along or near the roadway segment to be designed.
Level 2 – There is a modest knowledge of past and future traffic characteristics. At this level, only regional truck volume and weight data may be available for the roadway in question.
Level 3 – There is poor knowledge of past and future traffic characteristics. At this level, the designer will have little truck volume information. In this case, a statewide or some other default value must be used.
Traffic patterns in terms of truck volumes, vehicle class distributions, and axle loads vary considerably along various roads and locations, even along the same route. The designer's ability to assess the current and future traffic patterns is considered significant if WIM sites are present in proximity to the design project. In the event inputs are available only at a regional or a network level (Level 2), the designer's ability to evaluate current and future traffic patterns is reasonable. Finally, if the designer must rely on default inputs based on national or state traffic patterns, the designer has insufficient knowledge (Level 3) of the current and future traffic characteristics. An improved understanding of the significance of the traffic inputs and their impact on performance predictions makes the transition from a purely empirical to a mechanistic-empirical (ME) design procedure smoother.
To address the needs mentioned above, a study completed in 2009 (4) analyzed permanent traffic recorder (PTR) traffic volumes and WIM axle load data in Michigan to evaluate and characterize traffic-related inputs for the Pavement-ME. The traffic characteristics included MAF, HDF, VCD, AGPV, and axle load distributions for different axle configurations. Axle weight and vehicle classification data were obtained from 44 WIM and classification stations located throughout the State of Michigan to develop Level 1 (site-specific) traffic inputs. Cluster analyses were conducted to group sites with similar characteristics to develop Level 2 (regional) inputs. Finally, data from all sites were averaged to establish the statewide Level 3 inputs.
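The relationship between the three input levels can be illustrated with a minimal Python sketch. It assumes that cluster membership has already been determined by a separate cluster analysis and that a Level 2 input is the element-wise average of its member sites, with Level 3 the average over all sites. The site names, cluster labels, and three-class distributions are hypothetical and are not Michigan data; the function names are introduced here for illustration only.

```python
from statistics import mean


def average_distribution(distributions):
    """Element-wise average of equally long distributions (lists of percentages)."""
    return [mean(values) for values in zip(*distributions)]


def level2_inputs(site_inputs, cluster_of_site):
    """Average the site-specific (Level 1) inputs within each cluster."""
    grouped = {}
    for site, distribution in site_inputs.items():
        grouped.setdefault(cluster_of_site[site], []).append(distribution)
    return {cluster: average_distribution(members)
            for cluster, members in grouped.items()}


def level3_input(site_inputs):
    """Average the site-specific inputs over all sites (statewide default)."""
    return average_distribution(list(site_inputs.values()))


# Hypothetical three-class vehicle class distributions (percent) for three sites.
site_vcd = {
    "site_A": [60.0, 30.0, 10.0],
    "site_B": [50.0, 30.0, 20.0],
    "site_C": [20.0, 30.0, 50.0],
}
# Hypothetical cluster labels, assumed to come from a prior cluster analysis.
cluster_of_site = {"site_A": 1, "site_B": 1, "site_C": 2}

level2 = level2_inputs(site_vcd, cluster_of_site)
level3 = level3_input(site_vcd)
```

Because each site distribution sums to 100 percent, the cluster and statewide averages also sum to 100 percent, so they can be used directly as distribution inputs.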
While the traffic characterization was based on data collected from 2005 to 2007, the same study recommended that traffic inputs, especially Level 2 clusters, should be re-evaluated every five years for the following reasons (4; 6):
a. Addition of new classification and WIM sites at different geographical locations or a change in the status of an existing site (e.g., down- or up-grading from WIM to classification or vice versa).
b. Significant changes in land use in the vicinity of the existing WIM locations.
c. Changes in the WIM technology for some locations, for example, if less accurate piezo-polymer sensors are replaced with more accurate piezo-quartz or bending plate sensors.
During the last seven (7) years, new traffic data were collected, reflecting the recent economic growth and the addition and downgrading of WIM sites. Figure 1-1 shows the changes in traffic data between the years 2009 and 2016 for selected sites in Michigan. Consequently, the current traffic inputs should be re-evaluated and developed with the latest traffic data collected at all the PTR locations. Also, the following significant developments related to the Pavement-ME analysis and design method in the State of Michigan during the last few years further necessitate the re-evaluation of the current traffic inputs:
TrafLoad software was used in the previous traffic study to extract the traffic volumes (by class) and axle load data and to ascertain the quality of the data (4). TrafLoad has since lost endorsement nationally and is no longer supported. However, the PrepME software was recently developed through the Transportation Pooled-Fund Study TPF-5(242), "Traffic and Data Preparation for AASHTO Pavement-ME Analysis and Design." This software is capable of preprocessing, importing, and checking the quality of raw WIM traffic data, and of generating three levels of traffic data inputs with built-in clustering methods for the ME design.
Therefore, there is a need to employ such tools to improve the quality of traffic data in the re-evaluation of traffic inputs.
A study was completed in 2014 to locally calibrate the performance prediction models (7-10). The local calibrations of the performance models were performed by using version 2.0 of the Pavement-ME software. However, AASHTO has released versions 2.2 and 2.3 of the software since the completion of the last study. In the revised versions of the software, several bugs were fixed. Consequently, some of the performance models were modified in the newer software versions. As a result, the concrete pavement IRI predictions have been impacted and have raised some concern regarding the resulting PCC slab thicknesses.
[Figure 1-1 Temporal changes in traffic characteristics at the selected sites, 2009 versus 2016: (a) VCD at site 4129, (b) VCD at site 9759 (percentage versus vehicle class); (c) HDF at site 8729, (d) HDF at site 9759 (HDF versus hour); (e) Tandem ALS at site 7159, (f) Tandem ALS at site 9189 (frequency versus kips)]
For example, for some JPCP pavements in Michigan, the slab thicknesses decreased significantly between versions 2.0 and 2.3 when the same local calibration coefficients were used, as shown in Figure 1-2. Consequently, MDOT decided to use the AASHTO 1993 thickness design in the interim since version 2.3 may provide under-designed pavements with the previous local calibration coefficients. Because version 2.3 of the Pavement-ME corrected several coding errors in version 2.0 that resulted in the incorrect calculation of rigid pavement IRI, it was deemed inappropriate to use version 2.0. Thus, there is a need to verify the performance predictions for rigid pavements in the State of Michigan for the Pavement-ME versions 2.2 and 2.3.
If the performance predictions vary significantly from the observed structural and functional distresses, the models must be re-calibrated to enhance MDOT's confidence in pavement designs. The calibration studies done in the past have used traditional regression analyses. Since the performance data are measured on the same sections over time, the data are expected to be correlated. Therefore, in addition to the traditional regression analyses done in the past, longitudinal data analyses can be conducted in which random effects are incorporated to account for the among-unit variation. The recalibrated coefficients can be used in the re-evaluation of traffic inputs while conducting their sensitivity analyses to identify the most important ones.
[Figure 1-2 Comparison of designs between different versions of Pavement-ME: thickness (inch) versus project number for AASHTO93, ME_2.0_OC, and ME_2.3_OC; OC = old calibration]
Lastly, to reduce the frequency of future new traffic studies and streamline the process of generating ME traffic inputs, there is a need to re-evaluate the current methodology and provide enhancements where necessary. Also, there is a need to document a step-by-step procedure that would allow MDOT to analyze future traffic data and create traffic clusters for ME use. Based on the above discussion, it is likely that the new traffic data, changes in the Pavement-ME software, and performance model calibrations will affect the existing clustering methodology and the cluster characteristics. Thus, there is a need to re-evaluate the traffic inputs for the ME analysis and design procedures in the State of Michigan.
1.2 OUTLINE OF REPORT
The report consists of the following six chapters:
1. Introduction
2. Literature review
3. Recalibration of rigid pavement performance models
4. Development of traffic inputs
5. Significant traffic inputs for pavement design
6.
Conclusions, recommendations, and future work
Chapter 1 outlines the problem statement and provides an outline of the report. Chapter 2 documents the recalibration and traffic characterization in the Pavement-ME. It also includes findings from past studies at the national and state levels, clustering techniques, and a review of existing practices in Michigan. Chapter 3 documents the evaluations of performance prediction models in different versions of the Pavement-ME software. The chapter also includes the comparison of predicted and measured performance data for rigid and flexible pavements. It discusses the PMS data, calibration techniques, the results of re-calibration, and the impact of re-calibration on the pavement design practice in Michigan. Chapter 4 covers the traffic data collection and processing in Michigan. This chapter also reviews the clustering techniques, cluster assignment methodologies, and the procedures used for developing Level 2 inputs. Chapter 5 documents the impact of the developed Level 2 traffic inputs on pavement designs. Also, the chapter includes the findings for appropriate traffic input levels (Level 2 or 3) in Michigan. Chapter 6 summarizes the conclusions and recommendations for the implementation of modified traffic inputs in Michigan.

REFERENCES
[1] Zhang, Z., J. Leidy, I. Kawa, and W. Hudson. Impact of changing traffic characteristics and environmental conditions on flexible pavements. Transportation Research Record: Journal of the Transportation Research Board, No. 1730, 2000, pp. 125-131.
[2] Carvalho, R., and C. Schwartz. Comparisons of flexible pavement designs: AASHTO empirical versus NCHRP Project 1-37A mechanistic-empirical. Transportation Research Record: Journal of the Transportation Research Board, No. 1947, 2006, pp. 167-174.
[3] NCHRP. Guide for Mechanistic-Empirical Design of New and Rehabilitated Pavement Structures. Washington, D.C., 2004.
[4] Buch, N., S. W. Haider, J. Brown, and K. Chatti. Characterization of Truck Traffic in Michigan for the New Mechanistic Empirical Pavement Design Guide. Final Report, Michigan Department of Transportation, Lansing, MI, 2009.
[5] NCHRP Project 1-37A. Appendix AA: Traffic Loading. ARA, Inc., ERES Division, 505 West University Avenue, Champaign, Illinois 61820, 2004.
[6] Haider, S. W., N. Buch, K. Chatti, and J. Brown. Development of Traffic Inputs for Mechanistic-Empirical Pavement Design Guide in Michigan. Transportation Research Record, Vol. 2256, 2011, pp. 179-190.
[7] Haider, S. W., N. Buch, W. Brink, K. Chatti, and G. Y. Baladi. Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan - Part 3: Local Calibration and Validation of the Pavement-ME Performance Models. Final Report RC-1595, Michigan Department of Transportation, 2014.
[8] Haider, S. W., W. Brink, and N. Buch. Local Calibration of Rigid Pavement Cracking Model in the New Mechanistic-Empirical Pavement Design Guide using Bootstrapping. In ASCE T&DI Congress, Orlando, Florida, 2014, pp. 100-110.
[9] Haider, S. W., W. Brink, and N. Buch. Use of Statistical Resampling Methods for Calibrating the Rigid Pavement Performance Models in Michigan. In the 94th Annual Transportation Research Board Annual Meeting, Washington, DC, 2015.
[10] Haider, S. W., W. Brink, N. Buch, and K. Chatti. Process and Data Needs for Local Calibration of Performance Models in the Pavement-ME. In the 94th Annual Transportation Research Board Annual Meeting, Washington, DC, 2015.

CHAPTER 2 - LITERATURE REVIEW
This chapter presents a review of literature and state-of-the-practice related to performance models and traffic inputs in the Pavement-ME.
For ease of understanding, the review is further divided into the following topics:
Pavement-ME rigid pavement performance models
Pavement-ME traffic inputs
National studies for traffic characterization
Traffic studies in other states
Review of existing practices in Michigan
The traffic inputs needed for pavement analysis and design by the Pavement-ME are briefly discussed below.
2.1 RIGID PAVEMENT PERFORMANCE MODELS IN PAVEMENT-ME
The Pavement-ME Design Guide combines the responses from mechanistic models and empirical transfer functions to predict pavement performance. Pavement structural responses (stress, strain, or deflection) for a given cross-section are calculated based on traffic, climate, and material properties using mechanistic models. These structural responses are used to predict pavement performance using empirical models (i.e., transfer functions). Transverse cracking, joint faulting, and IRI are the main performance measures predicted by the performance models in JPCP. The coefficients in these models should, therefore, be calibrated so that the predicted performance matches the observed field performance. To this effect, three major national calibrations have been undertaken in the past using the field performance data from the Long Term Pavement Performance (LTPP) test sections (1). The first national calibration was performed for the Mechanistic-Empirical Pavement Design Guide (MEPDG) software Version 0.7 and is no longer in use. The second national calibration was deemed necessary to keep up with the updated mechanistic models and was in use until the release of the Pavement-ME software Version 2.2. Subsequently, the third calibration for Version 2.2 was necessary because of the change in the testing procedure for the Coefficient of Thermal Expansion (CTE) of concrete (2; 3). In addition, a few other issues, such as calculation errors in JPCP IRI, freezing index, etc., were fixed.
Generally, the nationally calibrated models may not perform well for a State if the inputs and performance data used for national calibration do not represent the local conditions. Therefore, it is recommended that each State Highway Agency (SHA) evaluate the national models to determine if the predicted performance matches the field performance (4; 5). If the predictions are not reasonable, local calibration of the Pavement-ME performance models is recommended to improve the accuracy of the performance predictions, reflecting the unique field conditions and design practices (5-8). SHAs have been actively pursuing local calibration efforts since the first release of the MEPDG for its implementation. The performance models of rigid pavements (transverse cracking and IRI) and the recalibration efforts of various states are briefly discussed below.
2.1.1 Transverse Cracking Model
Transverse cracking in JPCP is a load-related distress caused by repeated axle loads. It can initiate either at the top or bottom of the concrete slab because of curling and warping. However, a given slab can crack either from the bottom or from the top, but not both. Therefore, the predicted bottom-up and top-down crack types are not particularly meaningful individually. Hence, both types of cracking are combined, excluding the possibility of both occurring at the same time. The percentage of slabs cracked (including all severity levels) in a given traffic lane is predicted using Equation (1):
\[ CRK = \frac{100}{1 + C_4 \left( DI_F \right)^{C_5}} \qquad (1) \]
$DI_F$ is the fatigue damage accumulated considering all critical factors (i.e., age, month, axle type, load level, temperature gradient, axle wander, and hourly traffic). $C_4$ and $C_5$ are the calibration coefficients. The top-down and bottom-up fatigue cracking are calculated based on the respective damages predicted by the mechanistic models. The fatigue damage is computed by using Equation (2).
\[ DI_F = \sum \frac{n_{i,j,k,l,m,n,o}}{N_{i,j,k,l,m,n,o}} \qquad (2) \]
where:
$n_{i,j,k,l,m,n,o}$ = Applied number of load applications at condition i, j, k, l, m, n, o
$N_{i,j,k,l,m,n,o}$ = Allowable number of load applications at condition i, j, k, l, m, n, o
\[ \log \left( N_{i,j,k,l,m,n,o} \right) = C_1 \left( \frac{MR_i}{\sigma_{i,j,k,l,m,n,o}} \right)^{C_2} \qquad (3) \]
where $MR_i$ is the PCC modulus of rupture at age i and $\sigma_{i,j,k,l,m,n,o}$ is the applied stress at condition i, j, k, l, m, n, o.
The conditions i, j, k, l, m, n, and o refer to the age (change in PCC modulus of rupture, slab-base contact friction, shoulder LTE, etc.), month of year (base elastic modulus, effective dynamic modulus of subgrade reaction), axle type, load level, equivalent temperature difference between top and bottom PCC surfaces, traffic path, and hourly truck traffic fraction, respectively. The applied number of load applications ($n_{i,j,k,l,m,n}$) refers to the actual number of axle types k of load level l that passed through traffic path n under each condition (age, season, and temperature difference). The allowable number of load applications is the number of load cycles at which fatigue failure is expected (corresponding to 50 percent slab cracking) and is a function of the applied stress and PCC strength (see Equation 3).
Since it is difficult to distinguish between top-down and bottom-up cracking, Pavement-ME combines the two and defines the total transverse cracking as follows:
\[ TCRACK = \left( CRK_{Bottom\text{-}up} + CRK_{Top\text{-}down} - CRK_{Bottom\text{-}up} \cdot CRK_{Top\text{-}down} \right) \cdot 100\% \qquad (4) \]
where:
TCRACK = Total transverse cracking (percent, all severities)
$CRK_{Bottom\text{-}up}$ = Predicted amount of bottom-up transverse cracking (fraction)
$CRK_{Top\text{-}down}$ = Predicted amount of top-down transverse cracking (fraction)
Table 2-1 shows the history of national calibration coefficients in addition to the local calibration coefficients used by other states for transverse cracking and IRI models.
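The chain from Equations (1) through (4) can be illustrated with a minimal Python sketch. Several points are assumptions rather than statements from this chapter: the default fatigue coefficients C1 = 2.0 and C2 = 1.22 are taken to be the national values, the logarithm in Equation (3) is taken as base 10, and the function names are introduced here for illustration only. The default C4 and C5 are the new national coefficients listed in Table 2-1.

```python
def allowable_load_repetitions(mr_psi, stress_psi, c1=2.0, c2=1.22):
    """Equation (3): log10(N) = C1 * (MR / sigma) ** C2 (base 10 is assumed)."""
    return 10.0 ** (c1 * (mr_psi / stress_psi) ** c2)


def fatigue_damage(applied, allowable):
    """Equation (2): sum of n / N over all loading conditions."""
    return sum(n / big_n for n, big_n in zip(applied, allowable))


def percent_slabs_cracked(damage, c4=0.52, c5=-2.17):
    """Equation (1): CRK = 100 / (1 + C4 * DI_F ** C5), in percent."""
    if damage <= 0.0:
        return 0.0  # no accumulated damage, so no cracking is predicted
    return 100.0 / (1.0 + c4 * damage ** c5)


def total_transverse_cracking(bottom_up_pct, top_down_pct):
    """Equation (4): combine both modes, excluding simultaneous occurrence."""
    bu = bottom_up_pct / 100.0
    td = top_down_pct / 100.0
    return (bu + td - bu * td) * 100.0


# Hypothetical damage values for one section (not measured data).
bottom_up = percent_slabs_cracked(0.30)
top_down = percent_slabs_cracked(0.10)
total = total_transverse_cracking(bottom_up, top_down)
```

With the old national coefficients (C4 = 1, C5 = -1.98), a damage of 1.0 gives exactly 50 percent slabs cracked, which agrees with the definition of the allowable number of load applications given above.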
Table 2-1 Local calibration coefficients for the rigid transverse cracking and IRI models
Calibration coefficient          Transverse cracking     IRI
                                 C4       C5             C1       C2       C3       C4
New national coefficients (1)    0.52     -2.17          0.820    0.442    1.493    25.24
Old national coefficients (2)    1        -1.98          0.820    0.442    1.493    25.24
Arizona                          0.6      -2.067         0.60     3.48     1.22     45.20
Colorado                         1        -1.98          0.82     0.44     1.49     25.24
Iowa (3)                         1.08     -1.81          0.04     0.02     0.07     1.17
Louisiana                        1.16     -1.73          0.82     0.44     1.49     25.24
Minnesota                        0.9      -2.64          -        -        -        -
Missouri                         1        -1.98          0.82     1.17     1.43     66.80
Ohio                             1        -1.98          0.82     3.70     1.71     5.70
Washington                       1        -1.98          0.820    0.442    1.493    25.24
*Note: (1) from Version 2.2 onwards, (2) before Version 2.2, (3) the coefficients for Iowa predict IRI in SI units.
2.1.2 IRI Model
The long-term IRI in the Pavement-ME is a function of the as-constructed initial smoothness of the pavement surface and any change in the longitudinal profile over time due to distresses and foundation movements. The global IRI model was calibrated and validated using LTPP field data to assure that it would produce valid results under a variety of climatic and field conditions. The IRI regression model is shown below:
\[ IRI = IRI_o + C_1 \cdot CRK + C_2 \cdot SPALL + C_3 \cdot TFAULT + C_4 \cdot SF \qquad (5) \]
where:
IRI = Predicted IRI, in/mile
$IRI_o$ = Initial smoothness measured as IRI, in/mile
CRK = Percent slabs with transverse cracks (all severities)
SPALL = Percentage of joints with spalling (medium and high severities)
TFAULT = Total joint faulting cumulated per mile, inch
SF = Site factor
$C_{1,2,3,4}$ = Global calibration coefficients; $C_1$ = 0.8203; $C_2$ = 0.4417; $C_3$ = 1.4929; $C_4$ = 25.24
\[ SF = AGE \left( 1 + 0.5556 \cdot FI \right) \left( 1 + P_{200} \right) \cdot 10^{-6} \qquad (6) \]
where:
AGE = Pavement age, yr.
FI = Freezing index, °F-days.
$P_{200}$ = Percent subgrade material passing No. 200 sieve.
\[ SPALL = \left[ \frac{AGE}{AGE + 0.01} \right] \left[ \frac{100}{1 + 1.005^{(-12 \cdot AGE + SCF)}} \right] \qquad (7) \]
where:
SPALL = Percentage of joints with spalling (medium and high severities)
AGE = Pavement age, yr.
SCF = Scaling factor based on site, design, and climate.
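Equations (5) through (7) can be illustrated with a minimal Python sketch. The function names and the example section are hypothetical and are introduced here for illustration only; the scaling factor SCF is treated as a given input. The default coefficients are the global values, with C3 taken as 1.4929, in line with the 1.493 listed in Table 2-1.

```python
def site_factor(age_yr, freezing_index, p200_pct):
    """Equation (6): SF = AGE * (1 + 0.5556 * FI) * (1 + P200) * 1e-6."""
    return age_yr * (1.0 + 0.5556 * freezing_index) * (1.0 + p200_pct) * 1e-6


def joint_spalling(age_yr, scf):
    """Equation (7): percentage of joints with medium/high severity spalling."""
    age_term = age_yr / (age_yr + 0.01)
    return age_term * 100.0 / (1.0 + 1.005 ** (-12.0 * age_yr + scf))


def predict_iri(iri_initial, crk_pct, spall_pct, tfault_in_per_mile, sf,
                coeffs=(0.8203, 0.4417, 1.4929, 25.24)):
    """Equation (5): IRI = IRI_o + C1*CRK + C2*SPALL + C3*TFAULT + C4*SF."""
    c1, c2, c3, c4 = coeffs
    return (iri_initial + c1 * crk_pct + c2 * spall_pct
            + c3 * tfault_in_per_mile + c4 * sf)


# Hypothetical 20-year-old section (illustrative inputs, not measured data).
age = 20.0
sf = site_factor(age, freezing_index=1000.0, p200_pct=60.0)
spall = joint_spalling(age, scf=-200.0)
iri = predict_iri(iri_initial=70.0, crk_pct=5.0, spall_pct=spall,
                  tfault_in_per_mile=10.0, sf=sf)
```

Passing a different `coeffs` tuple, for example a row of Table 2-1, shows how a local calibration changes the predicted IRI for the same distress inputs.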
\[ SCF = -1400 + 350 \cdot \%AIR \cdot (0.5 + PREFORM) + 3.4 \left( f'_c \right)^{0.4} - 0.2 \left( FTCYC \cdot AGE \right) + 43 \, h_{PCC} - 536 \cdot WC\_Ratio \qquad (8) \]
where:
%AIR = PCC air content, percent
AGE = Time since construction, years.
PREFORM = 1 if preformed sealant is present; 0 if not.
$f'_c$ = PCC compressive strength, psi.
FTCYC = Average annual number of freeze-thaw cycles.
$h_{PCC}$ = PCC slab thickness, in.
WC_Ratio = PCC water/cement ratio
2.2 REQUIRED INPUTS FOR LOCAL CALIBRATION
Generally, the nationally calibrated models do not predict performance accurately for a state if the inputs and performance data used for the national calibration do not represent the local conditions. Therefore, it is recommended that each SHA evaluate the national models to determine if the predicted performance matches the field performance. If the predictions are not reasonable, local calibration of the Pavement-ME performance models is recommended to improve the accuracy of the performance predictions, reflecting the unique field conditions and design practices. To this effect, all the required inputs into the Pavement-ME should reflect the design, material, and construction properties of the test sections whose data are used to calibrate the performance models. Most inputs can be easily obtained from design specifications or from material testing results via ASTM or AASHTO standards. In cases where inputs are difficult to determine, the defaults provided by the software can be used. Table 2-2 lists some of the inputs required for rigid pavement designs and their impact on distress/IRI predictions (9). Some of the inputs, such as permanent curl/warp and PCC zero-stress temperature, are difficult to measure. It can be seen from the table that these inputs have a significant impact on the predicted pavement performance. The concepts of these inputs, their significance, and the available methods of quantifying them are discussed in the following sections.
Table 2-2 Impact of Pavement-ME inputs on rigid pavement performance predictions
Design/material input                   Impact on model prediction
                                        Faulting    Cracking    IRI
PCC thickness                           High        High        High
PCC modulus of rupture                  None        High        Low
PCC coefficient of thermal expansion    High        High        High
Joint spacing                           Moderate    High        Moderate
Joint load transfer efficiency          High        None        High
PCC slab width                          Low         Moderate    Low
Shoulder type                           Low         Moderate    Low
Permanent curl/warp                     High        High        High
Base type                               Moderate    Moderate    Low
Climate                                 Moderate    Moderate    Moderate
Subgrade type/modulus                   Low         Low         Low
Truck composition                       Moderate    Moderate    Moderate
Truck volume                            High        High        High
Initial IRI                             NA          NA          High
PCC zero-stress temperature             High        Low         High
2.2.1 Concepts of Curling, Warping and Zero Stress Temperature
Concrete slabs are subjected to daily temperature and moisture changes, which in turn affect the volume of the concrete slabs. Differential volume change throughout the slab depth can induce significant deformation and cause the slab to curl upward or downward, which may lead to separation or liftoff from the underlying layers. When any portion of the pavement becomes unsupported, the stiffness of the entire pavement system is degraded (10). Differential volume changes in a concrete slab can be attributed to any one of (or a combination of) five factors:
(a) Temperature difference – The top of the slab is exposed to the environment and is typically warmer than the bottom of the slab during the daytime, which results in downward curling.
(b) Reversible differential drying shrinkage – Water loss from the exposed concrete surface due to evaporation occurs when the ambient relative humidity (RH) is less than the moisture content of the concrete, which results in shrinkage. Part of this shrinkage is reversible, causing moisture warping. Most of this moisture loss occurs within the top two inches of the slab.
(c) Irreversible differential drying shrinkage – The irreversible part of the shrinkage is referred to as the differential drying shrinkage, which causes permanent warping of the slabs.
(d) Built-in curl – Built-in curl is the result of the presence of a temperature gradient at the time of concrete setting. If the transient temperature gradient differs from this built-in temperature gradient, the slab will deform.
(e) Creep – Creep is defined as the deformation due to sustained loading. It counteracts the deformations due to curling, warping, and shrinkage and hence is subtracted from the effects of curling/warping produced by other factors.
The total temperature gradient due to differential volume changes is as follows:
\[ \Delta T_{total} = \Delta T_{temp} + \Delta T_{mois} + \Delta T_{shrink} + \Delta T_{built\_curl} - \Delta T_{creep} \]
For design purposes, differential volume changes are represented by the temperature gradients required to deform a similar but theoretically flat slab to the same shape as the actual slab. The measured temperature gradients are usually non-linear throughout the slab depth but may be converted to equivalent linear temperature gradients (ELTG) for ease of use in design. As this quantity cannot be measured directly, any technique used to characterize the amount of differential volume change in a slab must include both actual measurements from the slab and a theoretical model or algorithm used to correlate the actual measurements to an equivalent linear temperature difference. The actual measurements generally are slab profile measurements or falling weight deflectometer (FWD) drops, while the correlation is typically made with either a model, such as a finite element model, or an ANN (11). The effective built-in temperature difference (EBITD) is the cumulative effect of the built-in temperature gradient, irreversible (permanent) drying shrinkage, transient moisture gradient, and creep (12).
The EBITD is the linear temperature difference between the top and bottom of a concrete slab that produces the same slab shape as the cumulative effects of the nonlinear built-in temperature gradient, nonlinear moisture gradient, and nonlinear drying shrinkage gradient, reduced over time by creep. The four components of EBITD are typically grouped because they change slowly relative to $\Delta T_{temp}$, which changes intraday to a considerably greater extent. Researchers have traditionally reported the EBITD as "locked-in curvature" (13), "zero-stress temperature" (14), "equivalent temperature gradient" (15; 16), and "built-in curl" (17; 18).

Pavement-ME accounts for differential volume changes when computing stresses in a concrete pavement in two ways:

(a) The equivalent linear temperature gradients for daily or seasonal moisture and temperature variations are estimated using climatic data and prediction models, one each for temperature and moisture gradients. Unlike temperature gradients, which generally follow a set pattern of diurnal variation, moisture gradients are quite variable, depending on the ambient relative humidity and rain events. For this reason, moisture gradients are generally considered monthly (19). When the top surface of the pavement is dry, the concrete shrinks, causing the slab to warp upwards. When the top surface becomes saturated, the shrinkage is at least partially reversed, which reverses the warping.
(b) The permanent curl/warp input in the Pavement-ME software is the sum of the cumulative effects of the nonlinear built-in temperature gradient (built-in curl) and the nonlinear shrinkage gradient, reduced over time by creep (20). There is no equation to estimate the permanent curl/warp value; rather, it is a user-defined input with a default value of -10°F.
Additionally, the software does not provide any guidance on how much of the permanent curl/warp value is attributable to differential drying shrinkage, construction curl, or creep, though it is acknowledged that -10°F is not an accurate estimate in all cases (19). Built-in curl can be measured in the field by subtracting the effective temperature difference from the total curl obtained from backcalculation.

Very little information is readily available on the topic of reversible shrinkage in concrete. A highly idealized curve suggests that about 50% of all shrinkage is reversible, although the source does not state a numeric value for reversible shrinkage (21). The default value provided in Pavement-ME is 50%, which may not be correct in all cases. However, due to the lack of information, most designers use the value of 50%. In some cases, the equivalent temperature difference due to moisture warping is of the same magnitude as that due to temperature curling, as stated in the literature review of a study (11). The review also stated that this phenomenon had been noted by other researchers (11) (23-27). The amount of shrinkage that is reversible depends on the properties of the concrete and the degree and duration of wetting (9; 11; 22).

The current equation in the Pavement-ME software for the monthly equivalent temperature difference for the moisture gradient, $ETG_{SHi}$, is shown below. A review of the model revealed several shortcomings, including terms of unknown origin and a propensity to predict physically impossible behavior (11).

$$ ETG_{SHi} = \frac{3\,(\varphi\,\varepsilon_{su})\,(S_{hi} - S_{h\,ave})\,h_s\left(\dfrac{h}{2} - \dfrac{h_s}{3}\right)}{\alpha\,h^{2}\cdot 100} \tag{9} $$

where:
$ETG_{SHi}$ = Equivalent temperature difference due to the deviation of moisture warping in month i from the annual average, °F
$\varphi$ = Reversible shrinkage factor (fraction of total shrinkage that is reversible; a default value of 0.5 is used unless more accurate information is available)
$\varepsilon_{su}$ = Ultimate shrinkage strain, ×10^-6
$S_{hi}$ = Relative humidity factor for month i:
    $S_{hi} = 1.1$ for $RH_a < 30\%$
    $S_{hi} = 1.4 - 0.01\,RH_a$ for $30\% < RH_a < 80\%$
    $S_{hi} = 3.0 - 0.03\,RH_a$ for $RH_a > 80\%$
$RH_a$ = Ambient relative humidity, percent
$S_{h\,ave}$ = Annual average relative humidity factor (annual average of $S_{hi}$)
$h_s$ = Depth of the shrinkage zone, inches (typically taken as 2 inches)
$h$ = Thickness of the concrete slab, inches
$\alpha$ = Coefficient of thermal expansion

The original ACI approach for determining shrinkage does not separate the effects of autogenous and drying shrinkage (21); the reversible shrinkage factor is included in the above equation to ensure that only the reversible portion of the shrinkage is used in the ETG term. Equation (9) was rederived in a study, and it was observed that the Pavement-ME model contains terms that were not found in the rederivation. The rederived model is shown in the equation below (11).

$$ ETG = \frac{6\,\varepsilon\,h_s\left(\dfrac{h}{2} - \dfrac{h_s}{3}\right)}{\alpha\,h^{2}} \tag{10} $$

Here $\varepsilon$ is the applicable shrinkage strain, and the remaining symbols are as defined for Equation (9). When this equation is compared to the equation used in Pavement-ME, the two differ by a factor of 1/200 (the leading coefficient is 6 rather than 3/100), which substantially changes any numerical values obtained using the Pavement-ME equation. No indication is given in Pavement-ME as to the source of this term. Also, one assumption made during the derivation of the current Pavement-ME warping model was that shrinkage varies linearly through the depth of the shrinkage zone. While this approximation is an improvement over the rectangular approximation used by (28), a nonlinear approximation would be more representative of actual drying (29). In addition, when the current Pavement-ME warping model is used to compute the equivalent temperature difference, the values of $ETG_{SHi}$ are on the order of ±0.001 to ±0.01°F, which are exceedingly small compared to the temperature gradients (e.g., in Sacramento, 50 percent of the observed temperature gradients are on the order of ±17°F).
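The behavior of the Pavement-ME moisture warping model can be illustrated with a short numerical sketch. The code below is only an illustration of the form of Equation (9): the function names, the monthly humidity values, and the material values (ultimate shrinkage of 700 microstrain, CTE of 5.5 microstrain per °F, 10-inch slab) are hypothetical, and the humidity-factor branch above 80% RH is assumed to follow the published MEPDG form.

```python
def humidity_factor(rh_percent):
    """Relative humidity factor S_h for one month (RH given in percent)."""
    if rh_percent < 30.0:
        return 1.1
    if rh_percent <= 80.0:
        return 1.4 - 0.01 * rh_percent
    # Assumed high-humidity branch; it is continuous with the middle branch at 80%.
    return 3.0 - 0.03 * rh_percent


def etg_moisture(rh_by_month, phi=0.5, eps_su=700.0, h_s=2.0, h=10.0, alpha=5.5):
    """Monthly ETG_SHi (deg F) using the form of Equation (9).

    eps_su is in microstrain and alpha in microstrain per deg F, so the
    1e-6 factors cancel. All default values are hypothetical.
    """
    s = [humidity_factor(rh) for rh in rh_by_month]
    s_ave = sum(s) / len(s)  # annual average humidity factor
    geometry = h_s * (h / 2.0 - h_s / 3.0) / (alpha * h ** 2 * 100.0)
    return [3.0 * phi * eps_su * (s_i - s_ave) * geometry for s_i in s]


# Hypothetical monthly ambient relative humidity, percent (January to December)
rh = [75, 72, 68, 62, 55, 48, 40, 42, 50, 60, 70, 76]
etg = etg_moisture(rh)
print([round(v, 4) for v in etg])
print("sum over the year:", round(sum(etg), 12))
```

Because each month is expressed as a deviation from the annual average, the twelve monthly values sum to zero by construction, and each value is a small fraction of a degree for these inputs.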
It was found that the amount of total shrinkage and the reversible shrinkage factor had an insignificant effect on pavement designs (differences of less than 1% slabs cracked were observed) when a factorial of standard PCC pavements (different thicknesses and bases) in different locations was run using Pavement-ME (11). Additional analyses on six- and eight-inch-thick concrete pavements, and at higher traffic levels, revealed the same insensitivity. Also, the use of the ($S_{hi} - S_{h\,ave}$) term to account for cyclic changes in relative humidity presents a fundamental problem. This term relates the relative humidity in any given month to the annual average, essentially saying that dry months cancel out wet months and, therefore, that the net annual warp in a slab is zero. An analysis of the implications of this assumption quickly reveals that this is not the case (11). Hence, there is a need to modify the existing model or develop a new model to accurately predict moisture warping. Some models that depend on the variation of relative humidity with depth (30) and predictive models that require finite element simulations (31) have been developed but would be difficult to introduce into the current Pavement-ME program.

Another important input in Pavement-ME is the zero-stress temperature ($T_z$), which has substantial effects on the behavior and performance of PCC pavements. Pavement-ME defines $T_z$ as the temperature at the time of set. Although $T_z$ varies with depth, Pavement-ME considers it as the mean slab temperature. The user can either enter the value or allow the software to calculate it using a built-in $T_z$ prediction model. This model is a function of two input variables, the cementitious content and the mean monthly temperature for the month of construction, as shown below (32).

$$ T_z = 0.202255 \times CC \times H + MMT \tag{11} $$

where:
$T_z$ = Zero-stress temperature, °F (minimum 70°F)
$CC$ = Cementitious content, lb/yd^3
$H = -0.0787 + 0.007\,MMT - 0.00003\,MMT^{2}$
$MMT$ = Mean monthly temperature for the month of construction, °F

The Pavement-ME documentation notes that the equation above has many limitations, as it does not consider the effect of many factors on the heat of hydration, including mineral admixtures, cement composition and fineness, chemical admixtures, and others. Yeon et al. (2012) concluded that the differences between the predicted and measured $T_z$ values ranged from 4.5 to 7.7°C (32).

2.2.1.1 Quantifying Built-in Temperature Gradient

Built-in curl is generally quantified by the equivalent temperature gradient needed to deform a flat slab to the same shape as the curled slab. There are two main schools of thought regarding the ideal way to measure the amount of curl built into a concrete slab.

A surface profiler may be used to measure the deflections along the length of the slab. This can be accomplished through a variety of methods, including dipsticks, on-site profilometers, and high-speed profilometers. These surface profiles can then be plotted and an equation fitted to the shape of the surface. This equation is compared to curves generated from a finite element method (FEM) program, which represent the deflected shapes of theoretical flat slabs (of the same geometry and material properties) that are exposed to various temperature gradients. The difference between the temperature gradient used to produce the same deflected shape as that of the observed slab and the actual temperature gradient across the slab at the time of profiling is the built-in temperature gradient. This brute-force procedure is time-consuming and error-prone but does not require specialized techniques or equipment.

Alternatively, an FWD can be used to measure slab response to various applied loads.
The data obtained from this test can be run through an ANN that has been designed and trained to back-estimate the built-in curl. This method is much more automated, and therefore the potential to introduce human error is greatly reduced. However, it requires that the user have an appropriately trained ANN at their disposal, as well as access to FWD data collected at specific times of the day (11).

Another method to measure the built-in curl uses static (environmental) strain data from static strain sensors embedded in the pavement sections at the time of construction. It consists of the following steps:

(a) The measured strain variations in the fresh concrete with respect to the temperature changes can be used to establish the zero-stress time (TZ) for each concrete slab. A transition can be identified from negligible strain measurements to smooth changes in strain with respect to temperature. This transition point is recognized as TZ in the slabs.
(b) Once TZ is established, the temperature measured across the depth of each slab at that time is used to establish the equivalent linear temperature gradient.

The zero-stress time, TZ, can be established by following an approach developed in a research study (33), which is based on the changes seen in the measured strain with respect to the temperature variation in the slab. The effects of creep on the permanent curl/warp, if any, are masked by the ever-increasing drying shrinkage and therefore cannot be quantified based on the behavior of the slab. Creep does not occur instantaneously but is instead due to sustained loading over time. When considering curl due to daily fluctuations in temperature, there is generally not sufficient time for creep to take effect, and therefore it can generally be neglected. However, effects due to moisture vary seasonally, which allows creep to counteract moisture-induced deformations. In this case, creep must be considered in computations (31).
The effects of creep are much more prominent early in the life of the pavement.

The zero-stress temperature ($T_z$) in the field can be measured in two ways, as follows:

(a) Estimate stress-dependent strains from the stress-independent strains measured using non-stress cylinders (NC), which contain vibrating wire strain gages (VWSGs). The temperature at which zero stress is observed is identified as the zero-stress temperature (32).
(b) Estimate the zero-stress time and the corresponding temperature based on total strain measurements from static sensor data. Note that although Pavement-ME defines $T_z$ as the temperature at the time of set, the final set time and the zero-stress time are different (33).

Either of the above methods can be followed to determine the zero-stress temperature. Once all the necessary inputs are determined, calibration of the performance models can be performed.

2.3 METHODOLOGY FOR CALIBRATION

The local calibration of the pavement performance prediction models is a challenging task that requires a significant amount of preparation. The effectiveness of local calibration depends on the input values and the measured pavement distress and roughness. The local calibration process consists of the following steps (9):

1. Execute the Pavement-ME software using the global calibration factors to predict the pavement performance for each selected test section.
2. Extract the predicted distresses and compare them with the measured distresses of the corresponding test section.
3. Test the accuracy of the global model predictions and determine if local calibration is required.
4. If local calibration is required, divide the available test sections into two sets (calibration and validation sets). Adjust the local calibration coefficients to eliminate bias and reduce standard error using the data from the test sections in the calibration set.
5. Validate the adjusted coefficients with the data from the test sections in the validation set.
6. Adjust the reliability equations for each model.

The latest available Pavement-ME software version should be executed using the as-constructed inputs and the actual traffic for all the selected pavement sections, and the predicted performance (fatigue damage, faulting, spalling, etc.) should be extracted from the output files. Generally, the predicted and measured performance should have a one-to-one relationship (45-degree line of equality) in the case of a good match. Otherwise, bias and/or prediction error may exist, depending on the spread of the data around the line of equality. The global model prediction accuracy can be evaluated using the approaches in Table 2-3. The hypothesis tests indicate model bias. Bias is defined as the consistent under- or over-prediction of distress or IRI. The bias between measured and predicted distress/IRI is determined by performing linear regression hypothesis tests and a paired t-test using a significance level of 0.05. Figure 2-1 shows a representation of model bias and standard error for various conditions. The three hypothesis tests are summarized in Table 2-3. If any of these null hypotheses is rejected (i.e., the p-value is less than 0.05) for a performance model, then local calibration is recommended.

Table 2-3 Hypothesis tests for statistical significance

Hypothesis test     Null hypothesis (H0)            Alternative hypothesis (H1)
Mean difference     (predicted - measured) = 0      (predicted - measured) ≠ 0
Intercept           intercept = 0                   intercept ≠ 0
Slope               slope = 1                       slope ≠ 1

If local calibration is necessary, the models should be calibrated by minimizing the sum of squared errors between the measured and predicted distresses, which is the domain of traditional regression analyses. The following section discusses the basics of regression, covariance among data, and methods for the analysis of longitudinal data.
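The three bias checks described above can be sketched in a few lines of code. The example below is a minimal illustration, not the procedure used in this study: the function name and the data are hypothetical, the regression is assumed to be measured (response) on predicted (regressor), and the function returns t-statistics and degrees of freedom so that p-values can be obtained from a t-table or a statistics package.

```python
import math


def bias_statistics(measured, predicted):
    """t-statistics for the mean difference, intercept = 0, and slope = 1 tests."""
    n = len(measured)

    # Paired t-test on (predicted - measured), n - 1 degrees of freedom
    diffs = [p - m for p, m in zip(predicted, measured)]
    d_bar = sum(diffs) / n
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    t_mean = d_bar / (s_d / math.sqrt(n))

    # Simple linear regression: measured = b0 + b1 * predicted (assumed orientation)
    x_bar = sum(predicted) / n
    y_bar = sum(measured) / n
    sxx = sum((x - x_bar) ** 2 for x in predicted)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predicted, measured))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual variance with n - 2 degrees of freedom
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(predicted, measured))
    s2 = sse / (n - 2)
    se_b1 = math.sqrt(s2 / sxx)
    se_b0 = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))

    return {
        "bias": d_bar,
        "standard_error": math.sqrt(s2),
        "intercept": b0,
        "slope": b1,
        "t_mean_difference": t_mean,
        "t_intercept": b0 / se_b0,
        "t_slope": (b1 - 1.0) / se_b1,
        "df_paired": n - 1,
        "df_regression": n - 2,
    }


# Hypothetical measured and predicted distress values for six sections
measured = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [1.6, 2.4, 3.7, 4.4, 5.6, 6.5]
stats = bias_statistics(measured, predicted)
for name, value in stats.items():
    print(name, round(value, 4))
```

In this made-up data set the model over-predicts by a nearly constant amount, so the paired test flags bias even though the slope is close to one.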
[Figure 2-1 panels: (a) High bias and low standard error; (b) High bias and high standard error; (c) Low bias and low standard error; (d) Low bias and high standard error]

Figure 2-1 Demonstration of different levels of standard error and bias

2.3.1 Longitudinal Data Analyses

Consider a random variable $Y$ chosen from a population, representing a characteristic of the population. $Y$ can take on many values, and the way in which it takes these values can be represented by a probability distribution. The mean of the population is the average of all possible values that $Y$ could take on. The population variance is the spread of all possible values that may be observed about the center of the distribution. If all the units in the population were the same, the variance would be equal to zero. However, since there are variations in a population, the value of the random variable is not equal to the population mean but rather deviates from the mean by a certain positive or negative amount. For example, consider a random vector consisting of a set of random variables, as shown below.

$$ Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} \tag{12} $$

Each element $Y_j$ of the random vector $Y$, where $j = 1, 2, 3, \ldots, n$, is a random variable that has its own mean and variance as follows (34):

$$ E(Y_j) = \mu_j \quad \text{and} \quad \mathrm{var}(Y_j) = \sigma_j^2 $$

If, for example, it is assumed that $Y_j$ is normally distributed, it can be represented as:

$$ Y_j \sim N(\mu_j, \sigma_j^2) \tag{13} $$

Consider the case of a simple linear regression where, at each fixed value $x_1, \ldots, x_n$, a corresponding random variable $Y_j$, $j = 1, 2, 3, \ldots, n$, is observed. The form of the model is assumed to be as follows:

$$ Y_j = \beta_0 + \beta_1 x_j + \varepsilon_j \tag{14} $$

where $\varepsilon_j$ is a random variable with mean 0 and variance $\sigma^2$; that is, $E(\varepsilon_j) = 0$ and $\mathrm{var}(\varepsilon_j) = \sigma^2$. Thus, $E(Y_j) = \beta_0 + \beta_1 x_j$ and $\mathrm{var}(Y_j) = \sigma^2$.

An essential assumption of the simple regression model above is that the random variables $Y_j$, or equivalently the $\varepsilon_j$, are independent. That means the value of $Y_j$ at $x_j$ is completely unrelated to the way in which $Y_{j'}$, observed at another position $x_{j'}$, takes on its value.
Also, the variances are assumed to be equal at different values of $j$. Assuming a normal distribution, the observed random variable $Y$ has the probability density function (34):

$$ f(y) = \frac{1}{(2\pi)^{1/2}\,\sigma}\,\exp\left\{-\frac{(y-\mu)^2}{2\sigma^2}\right\} \tag{15} $$

The shape of the distribution depends on the term $(y-\mu)^2/(2\sigma^2)$. If the $Y_j$ are assumed to be independent, the method of least squares minimizes the term $\sum_{j=1}^{n}(Y_j-\mu_j)^2/\sigma^2$. Writing the regression in matrix form, $Y = X\beta + \varepsilon$, the term to be minimized becomes

$$ (Y - X\beta)'\,I\,(Y - X\beta)/\sigma^2 \tag{16} $$

Minimizing this expression by differentiation, the least squares estimates of $\beta$ are as follows:

$$ \hat{\beta} = (X'X)^{-1}X'Y \tag{17} $$

The elements of the vector $Y$ can be assumed to be independent, or they can be assumed to have some relationships among them. Specifically, if $Y$ contains observations on the same unit at times indexed by $j$, then they may not be legitimately viewed as independent; rather, they vary together, or covary, and have a joint probability distribution. Each of the random variables has its own probability distribution, with means $\mu_j$ and $\mu_k$. The covariance between $Y_j$ and $Y_k$ is:

$$ \mathrm{cov}(Y_j, Y_k) = E\{(Y_j-\mu_j)(Y_k-\mu_k)\} = \sigma_{jk} \tag{18} $$

The covariance matrix for a vector of random variables is shown in Equation (19). If the $Y_j$ are independent, the covariance between them is zero, and hence the covariance matrix becomes a diagonal matrix.

$$ \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix} \tag{19} $$

Therefore, in the case of a random vector whose components are normally distributed and possibly associated, the joint probability density is given by

$$ f(y) = \frac{1}{(2\pi)^{n/2}}\,|\Sigma|^{-1/2}\exp\left\{-\frac{(y-\mu)'\,\Sigma^{-1}(y-\mu)}{2}\right\} \tag{20} $$

As in Equation (15), the form of $f(y)$ depends on $(y-\mu)'\,\Sigma^{-1}(y-\mu)$. The corresponding least squares estimator of $\beta$, often called the weighted least squares estimator, is as follows.

$$ \hat{\beta} = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}Y \tag{21} $$

The data available for the calibration should be considered longitudinal data because they are measured on the same sections over time.
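As a quick consistency check on Equations (17) and (21), the weighted estimator reduces to the ordinary least squares estimator when the observations are independent with a common variance. The short derivation below assumes only that $\Sigma = \sigma^2 I$:

```latex
\begin{align*}
\hat{\beta}
  &= \left(X'\Sigma^{-1}X\right)^{-1}X'\Sigma^{-1}Y
     && \text{Equation (21)} \\
  &= \left(X'(\sigma^{2}I)^{-1}X\right)^{-1}X'(\sigma^{2}I)^{-1}Y
     && \text{substituting } \Sigma=\sigma^{2}I \\
  &= \left(\sigma^{-2}X'X\right)^{-1}\sigma^{-2}X'Y \\
  &= \sigma^{2}\left(X'X\right)^{-1}\sigma^{-2}X'Y \\
  &= \left(X'X\right)^{-1}X'Y
     && \text{Equation (17)}
\end{align*}
```

The two estimators therefore differ only when $\Sigma$ has unequal variances or nonzero off-diagonal terms, which is the situation expected for repeated measurements on the same section.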
Each section can be considered a unit, and the measured IRI and cracking values on these sections can be considered repeated measurements. Suppose the response of interest is measured on every test section at $n$ times $t_1 < t_2 < \cdots < t_n$ and there are $m$ test sections. The responses from these test sections can be represented by $m$ random vectors of dimension $(n \times 1)$, $Y_1, Y_2, \ldots, Y_m$, one for each of the $m$ test sections. For the $i$th such vector,

$$ Y_i = \begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in} \end{pmatrix} \tag{22} $$

$E(Y_i) = \mu$ and $\mathrm{var}(Y_i) = \Sigma$. It is natural to expect that the $Y_{ij}$, $j = 1, 2, 3, \ldots, n$, are correlated; that is, $\mathrm{cov}(Y_{ij}, Y_{ik}) \neq 0$ for any $j \neq k = 1, 2, 3, \ldots, n$ in general. Therefore, $\Sigma$ is unlikely to be a diagonal matrix. It can also be assumed that the random vectors $Y_1, Y_2, \ldots, Y_m$ are all mutually independent, because the observations for test section $i$ are unrelated to the way observations may turn out for another test section $l \neq i$.

The variation in longitudinal data can be considered to come from two sources: (a) among-unit variation and (b) within-unit variation. The sources of variation are explained in terms of the model below. The $j$th element of $Y_i$, $Y_{ij}$, may be thought of as the sum of several components, each corresponding to a different source of variation; i.e.,

$$ Y_{ij} = \mu_j + b_{ij} + e_{ij} = \mu_j + b_{ij} + e_{1ij} + e_{2ij} \tag{23} $$

where $E(b_{ij}) = 0$, $E(e_{1ij}) = 0$, and $E(e_{2ij}) = 0$. $b_{ij}$ is a deviation representing among-unit variation at time $t_j$ due to inherent variation; $b_{ij}$ may be thought of as dictating the "inherent trend" for unit $i$ at $t_j$. $e_{1ij}$ represents the additional deviation due to within-unit fluctuations about the trend. $e_{2ij}$ is the deviation due to measurement error (within-unit). The sum $e_{ij} = e_{1ij} + e_{2ij}$ denotes the aggregate deviation due to all within-unit sources. The sum $b_{ij} + e_{1ij} + e_{2ij}$ thus represents the aggregate deviation from $\mu_j$ due to all sources. Stacking the $\mu_j$, $b_{ij}$, and $e_{ij}$ gives $Y_i = \mu + b_i + e_i = \mu + b_i + e_{1i} + e_{2i}$. The overall pattern of correlation for the aggregate deviation $b_i + e_i$ (and hence $Y_i$) results from the combined effects of these two sources (among- and within-unit).
Considering the above model, each section can be viewed as having its own underlying inherent trend. Specifically, the unique intercept and slope, $\beta_{0i}$ and $\beta_{1i}$, determine the trend for the $i$th test section. The few test sections for which data are available for calibration may be thought of as arising from a population of such test sections. Each test section can be assumed to have its own intercept and slope. These intercept/slope vectors $\beta_i$ vary about the mean intercept and slope of the population of all such $\beta_i$ vectors. Let $\beta_0$ and $\beta_1$ represent the mean values of the intercept and slope over all the test sections, defined as follows:

$$ \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \tag{24} $$

Thus $\beta$ is the mean vector of the population of all $\beta_i$. $\beta_i$ can therefore be written as $\beta_i = \beta + b_i$, with $b_i = (b_{0i}, b_{1i})'$, which is a shorthand way of saying $\beta_{0i} = \beta_0 + b_{0i}$ and $\beta_{1i} = \beta_1 + b_{1i}$. Here, $b_i$ is a vector of random effects describing how the intercept and slope for the $i$th test section deviate from the mean values. The vectors $b_i$ are assumed to have mean zero and a covariance matrix that describes the nature of the variation of intercepts and slopes among test sections and the degree of their covariance. Substituting the expressions for $\beta_{0i}$ and $\beta_{1i}$ into the straight-line form of Equation (23), $Y_{ij} = \beta_{0i} + \beta_{1i} t_{ij} + e_{ij}$, gives:

$$ Y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,t_{ij} + e_{ij} \tag{25} $$

From the above equation, it can be clearly seen that each test section is assumed to have its own intercept and slope, which vary about the mean intercept and slope $\beta_0$ and $\beta_1$. The within-unit random vector $e_i$ has mean zero and represents the deviations within a test section. $e_i$ can be decomposed as $e_i = e_{1i} + e_{2i}$, where $e_{1i}$ represents the deviations due to within-subject fluctuations and $e_{2i}$ those due to measurement error. To sum up, the model can be written in matrix form as follows:

$$ Y_i = X_i\beta + Z_i b_i + e_i \tag{26} $$

where:
$X_i$ = $(n_i \times p)$ design matrix that characterizes the systematic part of the response, e.g., depending on covariates and time
$\beta$ = $(p \times 1)$ vector of parameters, usually referred to as fixed effects, characterizing the systematic part of the response
$Z_i$ = $(n_i \times k)$ design matrix that characterizes the random variation in the response attributable to among-unit sources
$b_i$ = $(k \times 1)$ vector of random effects that completes the characterization of among-unit variation
$e_i$ = $(n_i \times 1)$ vector of within-unit deviations characterizing variation due to sources like within-unit fluctuations and measurement error

The model components $b_i$ and $e_i$ characterize the two sources of variation, among- and within-unit. It is assumed that $e_i \sim N_{n_i}(0, R_i)$, where $R_i$ is an $(n_i \times n_i)$ covariance matrix that characterizes variance and correlation due to within-unit sources, and that $b_i \sim N_k(0, D)$, where $D$ is a $(k \times k)$ covariance matrix that characterizes variation due to among-unit sources, assumed to be the same for all units. The dimension of $D$ corresponds to the number of among-unit random effects in the model. Therefore,

$$ Y_i \sim N_{n_i}(X_i\beta, \Sigma_i) \tag{27} $$

That is, the model with the above assumptions on $e_i$ and $b_i$ implies that the $Y_i$ are multivariate normal random vectors of dimension $n_i$ with a covariance matrix of some form. The form of $\Sigma_i$ implied by the model has two distinct components, the first having to do with variation solely from among-unit sources and the second having to do with variation solely from within-unit sources.

To characterize the among- and within-subject variation and correlation, it is necessary to specify a covariance structure model for $\mathrm{var}(e_i)$ and $\mathrm{var}(b_i)$. In general, $\mathrm{var}(e_i)$ is written as $R_i$, which is an $(n_i \times n_i)$ covariance matrix. The key is to identify an appropriate structure for $R_i$. In general, if the observation times are sufficiently far apart, the correlations among the $Y_{ij}$ due to within-subject sources may be regarded as negligible. In this case, it is reasonable to assume that $\mathrm{var}(e_{1i})$ is a diagonal matrix.
Furthermore, if it is believed that the magnitude of fluctuations is similar across time and units, $\mathrm{var}(e_{1ij})$ can be assumed to be equal to $\sigma_1^2$ for all $i$ and $j$, so that $\mathrm{var}(e_{1i}) = \sigma_1^2 I_{n_i}$. This assumption is acceptable because of the belief that the $e_{1ij}$ are independent of $\beta_i$, and hence $b_i$, which dictate the magnitude of the inherent trend of the specific unit, so that the magnitude of fluctuations is unrelated to any unit-specific responses. Also, it is reasonable to assume that errors in measurement are uncorrelated over time; thus, taking $\mathrm{var}(e_{2i})$ to be a diagonal matrix would be appropriate. Also, if the measurement errors are believed to be of similar magnitude for all units, it is reasonable to assume that $\mathrm{var}(e_{2ij}) = \sigma_2^2$, say, for all $j$, so that $\mathrm{var}(e_{2i}) = \sigma_2^2 I_{n_i}$. Therefore, the structure of $R_i$ would be as follows:

$$ R_i = \mathrm{var}(e_{1i}) + \mathrm{var}(e_{2i}) = \sigma_1^2 I_{n_i} + \sigma_2^2 I_{n_i} = \sigma^2 I_{n_i} \tag{28} $$

with $\sigma^2 = \sigma_1^2 + \sigma_2^2$. The assumption that $e_{1i}$ and $e_{2i}$ are independent is standard, as is the assumption that $e_{1i}$ and $e_{2i}$ (and hence $e_i$) are independent of $b_i$. However, if the times of observation are sufficiently close, then the correlations due to within-unit sources cannot be assumed to be negligible, and hence $\mathrm{var}(e_{1i})$ cannot be assumed to be diagonal.

The random effects $b_i$ have mean 0 and represent variation resulting from the fact that individual units differ, i.e., exhibit biological or other variation. $\mathrm{var}(b_i)$ characterizes this variation, which causes the individual units to have different trajectories. The intercepts and slopes may tend to be large or small together, or large intercepts may tend to occur with small slopes and vice versa. Thus, $\mathrm{var}(b_i)$ does not have to be a diagonal matrix, as some correlation between intercepts and slopes is expected. Generally, $\mathrm{var}(b_i)$ is represented by some covariance matrix $D$. For the model being used in this section, $D$ would be a (2 × 2) unstructured covariance matrix.

$$ D = \begin{pmatrix} D_{11} & D_{12} \\ D_{12} & D_{22} \end{pmatrix} \tag{29} $$

where $\mathrm{var}(\beta_{0i}) = \mathrm{var}(b_{0i}) = D_{11}$, $\mathrm{var}(\beta_{1i}) = \mathrm{var}(b_{1i}) = D_{22}$, and $\mathrm{cov}(\beta_{0i}, \beta_{1i}) = \mathrm{cov}(b_{0i}, b_{1i}) = D_{12}$, with $D_{12} \neq 0$ in general.
Also, it is not common for $D_{11}$ to be equal to $D_{22}$. While the intercept is on the same scale of measurement as the response, the slope is on the scale "response scale per unit time," and hence the corresponding variances would be different. $\mathrm{var}(b_i)$ reflects solely the variation among units due to biology or other features. This is formally represented through the $b_i$. It is often reasonable to assume that populations of intercepts and slopes are approximately normally distributed. Thus, a standard assumption is that the $b_i$ have a multivariate normal distribution; e.g., in the case where the covariance matrix is $D$, the assumption would be $b_i \sim N_k(0, D)$, where $k$ is the dimension of $b_i$ ($k = 2$ here). With these assumptions,

$$ E(Y_i) = X_i\beta \tag{30} $$

$$ \mathrm{var}(Y_i) = Z_i D Z_i' + R_i = \Sigma_i \tag{31} $$

Although more complicated structures could be used to model $R_i$ and $\mathrm{var}(b_i)$, doing so could lead to a model that is too complicated to be estimated given the available data. Even if the two components $D$ and $R_i$ are not chosen correctly, the model $\Sigma_i = Z_i D Z_i' + R_i$ would result in a matrix that is close enough to the matrix that would be generated if they were chosen correctly. If the aim is only to estimate $\beta$, this is not a problem. However, if $\mathrm{var}(b_i)$ and $R_i$ themselves are of interest, then all possibilities should be investigated, and the structures that result in a better model should be selected. Model selection with different random effect structures is discussed in the following sections. Note that fitting overly complicated models may lead to difficulties and "over-fitting."

Once the model form and the covariance structures are chosen, the next step is to estimate the regression parameters $\beta$ and the parameters that characterize $\Sigma_i$, denoted here by $\omega$. Because the $Y_i$ are independent, the joint density function $f(y)$ for $Y$ is the product of the $m$ individual joint densities:

$$ f(y) = \prod_{i=1}^{m} f_i(y_i) = \prod_{i=1}^{m} \frac{1}{(2\pi)^{n_i/2}}\,|\Sigma_i|^{-1/2}\exp\left\{-\frac{(y_i - X_i\beta)'\,\Sigma_i^{-1}(y_i - X_i\beta)}{2}\right\} \tag{32} $$

Similar to the above, the goal is to maximize $f(y)$ with respect to the unknown parameters $\beta$ and $\omega$.
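The covariance structure implied by Equation (31) can be made concrete with a small sketch. The code below assumes a random intercept and slope, so each row of $Z_i$ is $(1, t_j)$, and the simple within-unit structure $R_i = \sigma^2 I$; the function name and all numerical values are hypothetical.

```python
def marginal_covariance(times, d11, d12, d22, sigma2):
    """Sigma_i = Z_i D Z_i' + sigma2 * I for a random intercept and slope.

    Row j of Z_i is (1, t_j), so entry (j, k) of Z_i D Z_i' equals
    D11 + D12 * (t_j + t_k) + D22 * t_j * t_k.
    """
    n = len(times)
    sigma = [[0.0] * n for _ in range(n)]
    for j, t_j in enumerate(times):
        for k, t_k in enumerate(times):
            among = d11 + d12 * (t_j + t_k) + d22 * t_j * t_k
            within = sigma2 if j == k else 0.0  # R_i is diagonal here
            sigma[j][k] = among + within
    return sigma


# Hypothetical survey ages (years) and variance components for one section
cov = marginal_covariance(times=[0.0, 1.0, 2.0], d11=4.0, d12=0.5, d22=0.25, sigma2=1.0)
for row in cov:
    print(row)
```

Even though the within-unit errors are taken as independent, the off-diagonal entries are nonzero because every observation on a section shares the same random intercept and slope, and the variance grows with time whenever the slope variance is positive.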
The maximizing values will be functions of $y$. These functions, applied to the random vector $Y$, yield the maximum likelihood (ML) estimators. The equation above is a complicated function of $\beta$ and of the covariance parameters that characterize $\Sigma_i$, denoted $\omega$. In practice, both $\beta$ and $\omega$ are unknown. Although the ML estimator $\hat{\beta}$ can be written as in Equation (33), it is not possible to write down an expression for the estimator of $\omega$, $\hat{\omega}$; thus, the expression for $\hat{\beta}$ is not really a closed-form expression either, despite its form shown below. Finding the values that maximize the likelihood for a given set of data therefore cannot be done in closed form in general.

$$ \hat{\beta} = \left(\sum_{i=1}^{m} X_i'\,\hat{\Sigma}_i^{-1} X_i\right)^{-1} \sum_{i=1}^{m} X_i'\,\hat{\Sigma}_i^{-1} Y_i \tag{33} $$

However, there are some disadvantages to maximum likelihood estimation. While the ML estimates of $\beta$ for a particular model are approximately unbiased, the estimators for $\omega$ have been observed to be biased when $m$ is not too large. The ML estimator for $\omega$ in the longitudinal data regression model has the form it would have if $\beta$ were known. Thus, it does not acknowledge the fact that $\beta$ must be estimated along with $\omega$. The result is the biased estimation mentioned above. The "adjustment" involves replacing the usual likelihood with the following objective function (ref).

$$ \prod_{i=1}^{m} \frac{1}{(2\pi)^{n_i/2}}\,|\Sigma_i|^{-1/2}\,|X_i'\,\Sigma_i^{-1} X_i|^{-1/2}\exp\left\{-\frac{(Y_i - X_i\beta)'\,\Sigma_i^{-1}(Y_i - X_i\beta)}{2}\right\} \tag{34} $$

The resulting estimator for $\omega$ has been observed to be less biased for finite values of $m$ than the ML estimator. The objective function above and the resulting estimation method are known as restricted maximum likelihood, or REML.

Once models are built with different covariance structures or different estimation techniques, there is a need to compare them to pick the better model given the data. The commonly used approaches compare Akaike's information criterion (AIC) and Schwarz's Bayesian information criterion (BIC) of these models.
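A comparison of this kind can be sketched as follows, using the conventional definitions AIC = 2k - 2 ln L and BIC = k ln N - 2 ln L. The candidate models, their log-likelihoods, and their parameter counts are hypothetical values chosen only for illustration; in practice they would come from the fitted mixed-effects models.

```python
import math


def aic(log_likelihood, k):
    """Akaike's information criterion for a model with k fitted parameters."""
    return 2.0 * k - 2.0 * log_likelihood


def bic(log_likelihood, k, n_obs):
    """Schwarz's Bayesian information criterion for n_obs observations."""
    return k * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fitted models: (name, maximized log-likelihood, number of parameters)
candidates = [
    ("random intercept", -120.0, 4),
    ("random intercept and slope", -118.5, 6),
]
n_obs = 60  # hypothetical total number of observations

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, ll, k in candidates}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores)
print("preferred by AIC:", best_by_aic, "| preferred by BIC:", best_by_bic)
```

In this made-up case the two extra covariance parameters raise the log-likelihood by only 1.5, which is not enough to offset either penalty, so both criteria favor the simpler structure. Likelihoods obtained by REML are generally comparable only between models that share the same fixed effects.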
These approaches are based on comparing the penalized versions of the logarithm of the likelihoods obtained under 0H and 1H, where that 39 fipenaltyfl adjusts each log -likelihood according to the number of parameters that must be fitted. The more parameters are added to a model, the larger the (log) likelihood becomes. Thus, if two models are to be compared, then the numbers of parameters in each of the models should be considered. Based on how these fipenal izedfl versions are defined, the model that gives either the smaller or larger value is preferred . Let log bL denote a log -likelihood for a fitted model. AIC and BIC criteria are defined below. 2.3.1.1 Akaike™s information criterion (AIC). The penalty is to subtract the number of parameters fitted for each model. That is, if k is the number of parameters in the model, AIC is defined as follows (35; 36 ): AIC2k2lnL (35) The model with the smaller AIC value id preferred. 2.3.1.2 Schwarz™s Bayesian information criterion (BIC). The penalty is to subtract the numbe r of parameters fitted further adjusted for the number of observations. If N is the total number of observations, BICklnN2lnL (36) The model with the smaller BIC value id preferred. 2.3.1.3 Estimation of Random Effects Once the regression parameters and the parameters of the covariance structures are estimated, the next step is to estimate the random effects ib for each subject. The values of ibare estimated using condition estimations and multivariate normality. It may be shown that iiii2inDEb|Y YnD (37) 40 where iY is the mean of the in ij Y values in iY A larger (positive) ib value leads to a iY that is large r than , and similarly, if ib is small (negative), it leads to a iY that is fismallfl (smaller than the mean ). iiii2inDEEb|Y EY0 nD (38) However, since the values of D, 2 and are not known, they are replaced by their estimates. 
Hence,

$$\hat{b}_i = \frac{n_i \hat{D}}{n_i \hat{D} + \hat{\sigma}^2} \left( \bar{Y}_i - \hat{\mu} \right) \tag{39}$$

and

$$\hat{\mu} = \left( \sum_{i=1}^{m} \frac{n_i}{n_i \hat{D} + \hat{\sigma}^2} \right)^{-1} \sum_{i=1}^{m} \frac{n_i \bar{Y}_i}{n_i \hat{D} + \hat{\sigma}^2} \tag{40}$$

Once the random effects that account for the subject-specific effects are obtained, the individual models can be used to make predictions on test sections of interest. While the individual models successfully account for variations due to the specific subjects in the study, the subjects are not considered representative of a larger population in such analyses. The sampling distribution from which the subjects are drawn is of more interest than the sample itself. Therefore, the purpose of mixed-effects models is to account for subject-specific variations more broadly, as random effects varying around population means (37). Thus, if a mixed-effects model is fit to the available performance data, each test section will have its own set of parameters that deviate from the population means because of the inherent behavior of that test section. If one is interested in predicting the future performance of an existing test section that is part of the sample, the subject-specific parameters can be used. However, if the interest is in predicting the performance of a test section in the population, the mean population parameters can be used.

Since the available number of test sections is limited, statistical sampling techniques such as bootstrapping can be used, in addition to the no-sampling technique, for a more robust quantification of the model standard error and bias. In the no-sampling technique, the entire dataset is used only once to obtain one set of calibration coefficients that minimizes the model standard error. The bootstrap technique involves sampling repeatedly, with replacement, from the dataset to form a new sample of the same size as the original dataset for local calibration.

2.3.2 Transverse cracking

For the cracking model, the typically calibrated coefficients are C4 and C5. Pavement-ME developers chose ISLAB 2000 among other models for use in Pavement-ME.
Even though FEMs have been proven to be accurate for predicting pavement responses, they are quite inefficient for analyzing damage accumulation, which may require the prediction of PCC tensile stresses for many combinations of loading and site conditions. Therefore, it was decided to use artificial neural networks (ANN) to predict the critical bending stresses. To reduce the number of runs, equivalency concepts were used to reduce the many input parameters to a few parameters. Once the stresses are predicted, the allowable number of repetitions is estimated using the fatigue damage model. The current fatigue damage model in Pavement-ME is based on the fatigue damage relationship developed using the 1999 Corps of Engineers (COE) field aircraft data, with failure defined as 50 percent slab cracking (see Equation 3).

To accurately calibrate C1 and C2, as seen in the PCC fatigue model, the peak tensile stresses need to be calculated for each combination of i, j, k, l, m, n, o using FEM software (or possibly from strain data in the field) until the first crack is observed. Obtaining stresses from strain data could be very complicated because of the variability in the traffic (different classes of trucks, axles, weights, etc.) and the need to know the exact loading and wander for which the strains were obtained. Therefore, C1 and C2 cannot be calibrated unless data such as the number of repetitions to failure under different conditions are available. Further, Miner's hypothesis has many limiting assumptions but is used because of its simplicity. It is believed that the variability due to the drawbacks of Miner's hypothesis, the simplifying assumptions in the structural models, and the variances in the fatigue equations is absorbed by the transfer functions. The transfer function coefficients (C4 and C5) can be calibrated to relate the computed damage to the field performance in terms of cracking.
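As a minimal sketch of what calibrating the transfer function involves, the Python snippet below assumes the commonly published Pavement-ME cracking form, CRK = 100 / (1 + C4 * DF^C5), which is not restated in this section, and searches a coarse grid of C4 and C5 values for the pair that minimizes the sum of squared errors (SSE) against measured cracking. The function names, the grid limits, the damage values, and the "measured" cracking are all hypothetical; the cracking values are generated synthetically only to show the mechanics. An actual calibration would use Pavement-ME damage output and field cracking data, and would more likely rely on a numerical optimizer than on a grid.

```python
def cracking_percent(damage, c4, c5):
    """Assumed transfer function: percent slabs cracked from fatigue damage."""
    return 100.0 / (1.0 + c4 * damage ** c5)


def sse(damages, measured, c4, c5):
    """Sum of squared errors between measured and predicted cracking."""
    return sum((m - cracking_percent(d, c4, c5)) ** 2
               for d, m in zip(damages, measured))


def calibrate(damages, measured):
    """Coarse grid search for the (C4, C5) pair with the smallest SSE."""
    candidates = ((sse(damages, measured, i / 10, -j / 10), i / 10, -j / 10)
                  for i in range(1, 21)      # C4 from 0.1 to 2.0
                  for j in range(10, 31))    # C5 from -1.0 to -3.0
    best_sse, c4, c5 = min(candidates)
    return c4, c5, best_sse


# Hypothetical data: fatigue damage per section and synthetic "measured"
# cracking generated from C4 = 0.5, C5 = -2.0 (illustrative values only).
damages = [0.05, 0.1, 0.3, 0.6, 1.0, 2.0]
measured = [cracking_percent(d, 0.5, -2.0) for d in damages]

c4, c5, best_sse = calibrate(damages, measured)
print(c4, c5, best_sse)
```

Because the synthetic data were generated from a grid point, the search recovers the generating coefficients exactly. Repeating `calibrate` on samples drawn with replacement from the sections would give a bootstrap distribution of C4 and C5 in the sense described in Section 2.3.1.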
2.3.3 IRI

Calibration of the IRI model is relatively straightforward because it is simply a linear regression model. The IRI model will be calibrated after the cracking and faulting models are locally calibrated. The predictions from the locally calibrated cracking and faulting models will be used, and all the coefficients of the IRI model will be calibrated by minimizing the model SSE.

2.3.4 Reliability Models

Reliability has been incorporated into the Pavement-ME for all distresses and pavement types and can be specified by the designer. Design reliability (R) is defined as the probability (P) that the predicted distress will be less than the critical distress level over the design period (ref). The design reliability for all distresses can be expressed by the following equations:

$$R = P\left[\text{Distress over design period} < \text{Critical distress level}\right]$$

$$R = P\left[\text{IRI over design period} < \text{Critical IRI level}\right]$$

This means that if 10 projects were designed and constructed using a 90 percent design reliability for any distress, one of those projects, on average, would exceed the threshold or terminal value at the end of the design period. This definition deviates from previous versions of the AASHTO 1993 Pavement Design Guide in that it considers multiple predicted distresses and IRI directly in the definition. Design reliability levels may vary by distress type and IRI or may remain constant for each. It is recommended, however, that the same reliability be used for all performance indicators (ref).

The designer inputs critical or threshold values for each predicted distress type and IRI. The Pavement-ME procedure predicts the mean distress types and smoothness over the design life of the pavement. The mean value of the predicted distresses or smoothness represents a 50 percent reliability estimate at the end of the analysis period (i.e., there is a 50 percent chance that the actual distress or IRI will be higher or lower than the mean prediction).
The reliability of a design depends on the model prediction error (standard error) of the distress prediction equations. In summary, the mean distress or IRI value (50 percent reliability) is increased by the number of standard errors that applies to the selected reliability level. For example, a 75 percent reliability uses a factor of 1.15 times the standard error, a 90 percent reliability uses a factor of 1.64, and a 95 percent reliability uses a factor of 1.96, based on the z-scores of a standard normal distribution. The calculated distresses and IRI are assumed to be approximately normally distributed over the ranges of distress and IRI that are of interest in design (9).

The standard deviation for the cracking model is a function of the error associated with the predicted cracking and the data used to calibrate the cracking model. The parameters of the error distribution can be derived using the following steps:

1. Group all the data by the level of predicted cracking.
2. Group the corresponding measured cracking data in the same distribution bins found in step 1.
3. Compute descriptive statistics for each group of data (i.e., mean and standard deviation of predicted and measured cracking).
4. Determine the relationship between the standard error of the measured cracking and the predicted cracking.

The reliability model standard error includes the variation related to the following sources:

1. Errors associated with material characterization parameters assumed or measured for design.
2. Errors related to assumed traffic and environmental conditions during the design period.
3. Model errors associated with the cracking prediction algorithms and the corresponding data used.

The method discussed above will be used to obtain the reliability equations for the following performance models:

1. Transverse cracking
2. Transverse joint faulting

For IRI, the software determines the reliability internally.
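The adjustment of the mean prediction by a number of standard errors can be illustrated with a short Python sketch. It rests on one observation that is an inference, not a statement from the source: the quoted factors of 1.15, 1.64, and 1.96 coincide with two-sided standard normal z-scores, so the helper below computes them that way. The function names, the mean cracking value, and the standard error are hypothetical, and the sketch makes no claim about how the software applies the factors internally.

```python
from statistics import NormalDist


def reliability_factor(reliability):
    """Two-sided standard normal z-score for a reliability given as a fraction."""
    return NormalDist().inv_cdf((1.0 + reliability) / 2.0)


def distress_at_reliability(mean_distress, std_error, reliability):
    """Mean (50 percent reliability) prediction increased by z standard errors."""
    return mean_distress + reliability_factor(reliability) * std_error


for r in (0.75, 0.90, 0.95):
    print(r, round(reliability_factor(r), 2))

# Hypothetical example: 8 percent slabs cracked predicted at 50 percent
# reliability, with an assumed standard error of 5 percent slabs cracked.
print(round(distress_at_reliability(8.0, 5.0, 0.90), 1))
```

Under these assumptions, the loop reproduces the three factors quoted above, and the hypothetical 8 percent mean prediction becomes about 16.2 percent at 90 percent reliability.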
2.4 PAVEMENT-ME TRAFFIC INPUTS

The Pavement-ME uses hierarchical input levels and gives the designer flexibility in obtaining the design inputs based on the importance of the project. Three input levels can be used in this hierarchical system, ranging from site-specific input values to "best-estimate" or default values, as classified below:

a) Level 1: These inputs provide the highest level of accuracy because they are site- or project-specific and are measured directly.
b) Level 2: These inputs provide an intermediate level of accuracy and are used when Level 1 inputs are unavailable. Correlation or regression equations are used to estimate these inputs.
c) Level 3: These inputs are based on global or regional averages and provide the least amount of knowledge regarding the input parameters (ideal for low-volume roads).

The Pavement-ME accepts an array of traffic inputs for use in design. Most of these inputs can be obtained through weigh-in-motion (WIM), automatic vehicle classification (AVC), vehicle counts, etc. Table 2-4 summarizes each of these traffic inputs based on the available hierarchical levels (19). Each of these traffic inputs is briefly discussed below.

2.4.1 Directional distribution factor (DDF)

The DDF is the traffic volume in the design direction expressed as a percentage of the overall volume of traffic in both directions. While a value of 50 percent is typically assumed, the actual value depends on the commodities being transported as well as other regional and local patterns. The Pavement-ME assumes it to be constant over time and across vehicle classes. These values can be obtained from AVC or traffic count data measured over time.

2.4.2 Lane distribution factor (LDF)

The LDF is the number of trucks in the design lane expressed as a percentage of the trucks in the design direction. For two-lane, two-way highways (one lane in one direction), LDF is equal to 1.
For multiple lanes in one direction, it depends on the AADTT and other geometric and site-specific conditions. LDFs can be calculated from AVC or traffic count data measured over time. They are assumed constant with time and for all truck classes.

2.4.3 Axles per truck class

Axle types per truck class represent the average number of axles for each truck class (classes 4 to 13) for each axle type (single, tandem, tridem, and quad). The default numbers of axle types per truck class in the Pavement-ME were estimated using the LTPP data. Local values can differ from the defaults, especially for unique truck class definitions not included in the Pavement-ME software. However, most studies have found the default values to be reasonable for the standard truck class definitions (38). The local defaults can be obtained from the WIM sites.

2.4.4 Axle and tire spacing

The computed pavement responses are sensitive to both wheel locations and the interaction between the various wheels on a given axle. A set of axle spacing defaults was developed from LTPP WIM data. Default axle spacings are limited to three axle types: tandem, tridem, and quad. Defaults for this input parameter can vary from state to state and depend on the truck classes (38). These values can be obtained from the truck manufacturers' specifications.

2.4.5 Tire pressure

Pavement responses depend on the tire dimensions and inflation pressures. Tire pressure is assumed constant across all truck classes and does not change over time. A default hot inflation pressure of 120 psi is used in the Pavement-ME. The reasonableness of this default value is based on a limited number of tire pressure studies conducted by different agencies (38). These values can be obtained from the tire manufacturers' specifications.
Table 2-4 Traffic data required for the three Pavement-ME input levels

| Data elements/variables | Input Level I | Input Level II | Input Level III |
|---|---|---|---|
| Truck traffic and tire factors | | | |
| Directional distribution factor (DDF) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Truck lane distribution factor (LDF) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Axle/truck class | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Axle and tire spacing | n/a | n/a | n/a |
| Tire pressure | n/a | n/a | n/a |
| Traffic growth | n/a | n/a | n/a |
| Vehicle operational speed | n/a | n/a | n/a |
| Lateral distribution (wheel wander) | n/a | n/a | n/a |
| Monthly adjustment factor (MAF) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Hourly distribution factor (HDF) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Truck traffic distribution and volume | | | |
| AADT or AADTT for the base year | n/a | n/a | n/a |
| Truck distribution/spectra by truck class (VCD) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Axle load distribution/spectra by truck class and axle type (ALS) | Site-specific WIM or AVC | Regional WIM or AVC | National WIM or AVC |
| Truck traffic classification (TTC) group for design | n/a | n/a | n/a |
| % of trucks | n/a | n/a | n/a |

n/a = hierarchical levels not applicable for these inputs.

2.4.6 Traffic growth

Nationally, there is no default value, but a linear growth of 2% to 4% is typically used. The value and growth function do not change over time for an individual truck class; however, the values and growth functions can differ between truck classes. Site-specific values can be obtained from historical AVC or truck count data (38).

2.4.7 Operational speed

There is no default value; the speed limit depends on the functional class, terrain, the percentage of trucks, etc. The value is independent of truck class.

2.4.8 Lateral wander

The lateral wander value is constant for all truck classes and does not change over time. A default value of 10 inches is recommended.
Limited data are available from the AASHO Road Test and a few other studies (38). These values can be obtained from site surveys.

2.4.9 Monthly adjustment factor (MAF)

The monthly adjustment factors (also referred to as monthly distribution factors, MDFs) convey the seasonal variations in AADTT by assigning a normalized weight factor to each month of the year. The default data in the Pavement-ME assume a seasonally independent value of 1 for each of the 12 MAFs. When site-specific factors are used, months with higher AADTT than others receive a weight factor greater than 1, while months with lower AADTT are assigned a weight factor less than 1. Other studies (19; 39-41) that evaluated MDFs found different distributions. The TMG suggests that two traffic patterns exist: a "flat urban" pattern, which is seasonally independent, and a "rural summer peak" pattern, in which the summer months experience higher AADTT than the winter months (41). The MEPDG Design Guide states that pavements may be sensitive to MAFs, which are influenced by factors such as adjacent land use, the location of industries in the area, and whether the site is rural or urban (19).

2.4.10 Hourly distribution factor (HDF)

HDFs establish the percentage of AADTT that travels on the roadway during each of the 24 hours of a day. Just as the number of cars on the roadway increases during rush hour or peak hour, trucks also exhibit time-dependent behavior. Most hourly distribution factors show a peak period between 10:00 am and 5:00 pm (42; 43). The TMG cites a study (41) in which trucking was found to exhibit two types of patterns. The first is an almost constant percentage of trucks each hour throughout the day, and the other has a single-humped peak, typically during the morning. The constant percentage of trucks throughout the day signified a greater presence of long-haul through trucks, whereas the peaked distribution was found to be consistent with local trucks (41).
The default HDFs in the Pavement-ME are shown in Figure 2-2, and the actual values by hour are shown in Table 2-5.

2.4.11 Vehicle class distribution (VCD)

The FHWA separates all traffic into 13 vehicle classes (classes 1 through 13), as shown in Table 2-6. VCD represents the percentage of each truck class (classes 4 through 13) within the AADTT for the base year. The sum of the percent AADTT of all truck classes should be 100. The MEPDG manual (19) indicates that VC 5 and VC 9 vehicles dominate the truck traffic distribution, with varying percentages of other truck classes. Vehicle class distributions are estimated from short-duration counts such as WIM and AVC sites, urban traffic centers, toll facilities, etc.

2.4.12 Axle load spectra (ALS)

The Pavement-ME establishes axle load spectra for each axle configuration within each vehicle class. The percentage of axles is distributed into the following load bins for each axle configuration and vehicle class:

- Single: 3000-41000 lb in 1000 lb increments (39 bins)
- Tandem: 6000-82000 lb in 2000 lb increments (39 bins)
- Tridem: 12000-102000 lb in 3000 lb increments (31 bins)
- Quad: 12000-102000 lb in 3000 lb increments (31 bins)

ALS can vary by season but are assumed constant over time (the values do not change from year to year over the analysis period). Many sites located on interstate and primary roadways have axle load spectra that are not likely to depend on the season.
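The bin structure listed above can be sketched in a few lines of Python. The snippet builds the four sets of bin labels and normalizes raw axle counts into a spectrum that sums to 100 percent. All names are hypothetical, and one convention is assumed rather than taken from the source: each label is treated as the lower edge of its bin, with out-of-range weights clamped to the first or last bin.

```python
def load_bins(first, last, step):
    """Bin labels in pounds, from first to last inclusive."""
    return list(range(first, last + step, step))


ALS_BINS = {
    "single": load_bins(3000, 41000, 1000),
    "tandem": load_bins(6000, 82000, 2000),
    "tridem": load_bins(12000, 102000, 3000),
    "quad": load_bins(12000, 102000, 3000),
}


def normalize(counts):
    """Convert raw counts to percentages that sum to 100."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations to normalize")
    return [100.0 * c / total for c in counts]


def axle_load_spectrum(weights_lb, bins):
    """Percent of axles per bin; assumes each label is the bin's lower edge."""
    step = bins[1] - bins[0]
    counts = [0] * len(bins)
    for w in weights_lb:
        idx = int((w - bins[0]) // step)
        idx = min(max(idx, 0), len(bins) - 1)   # clamp out-of-range weights
        counts[idx] += 1
    return normalize(counts)


# Hypothetical single-axle weights (lb) for one vehicle class at one site.
spectrum = axle_load_spectrum([3500, 3900, 10200, 45000], ALS_BINS["single"])
print({name: len(b) for name, b in ALS_BINS.items()})
print(spectrum[0], spectrum[7], spectrum[-1])
```

The same `normalize` helper applies to a vehicle class distribution, where the ten truck-class percentages must also sum to 100.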
Table 2-5 The Pavement-ME default hourly distribution factors

| Hour | HDF | Hour | HDF |
|---|---|---|---|
| 0 | 2.3 | 12 | 5.9 |
| 1 | 2.3 | 13 | 5.9 |
| 2 | 2.3 | 14 | 5.9 |
| 3 | 2.3 | 15 | 5.9 |
| 4 | 2.3 | 16 | 4.6 |
| 5 | 2.3 | 17 | 4.6 |
| 6 | 2.3 | 18 | 4.6 |
| 7 | 5 | 19 | 4.6 |
| 8 | 5 | 20 | 3.1 |
| 9 | 5 | 21 | 3.1 |
| 10 | 5 | 22 | 3.1 |
| 11 | 5.9 | 23 | 3.1 |

Figure 2-2 Default HDFs in the MEPDG (plot of HDF versus hour of day)

Table 2-6 FHWA Vehicle Classes

| FHWA vehicle class | Description |
|---|---|
| 4 | Two-Axle Buses |
| 5 | Two-Axle, Six-Tire, Single-Unit Trucks |
| 6 | Three-Axle Single-Unit Trucks |
| 7 | Four or More Axle Single-Unit Trucks |
| 8 | Four or Fewer Axle Single-Trailer Trucks |
| 9 | Five-Axle Single-Trailer Trucks |
| 10 | Six or More Axle Single-Trailer Trucks |
| 11 | Five or Fewer Axle Multi-Trailer Trucks |
| 12 | Six-Axle Multi-Trailer Trucks |
| 13 | Seven or More Axle Multi-Trailer Trucks |

NOTE: In reporting information on trucks, the following criteria should be used: Truck tractor units traveling without a trailer will be considered single-unit trucks. A truck tractor unit pulling other such units in a "saddle mount" configuration will be considered one single-unit truck and will be defined only by the axles on the pulling unit. Vehicles are defined by the number of axles in contact with the road. Therefore, "floating" axles are counted only when in the down position. The term "trailer" includes both semi- and full trailers.

2.5 REVIEW OF PREVIOUS STUDIES ON TRAFFIC CHARACTERIZATION

Several recent research studies have focused on the following areas:

- Analyzing Weigh-in-Motion (WIM), Automated Vehicle Classifier (AVC), and automated traffic recorder (ATR) data with appropriate quality checks to develop traffic inputs for the Pavement-ME.
- Evaluating the effect of traffic inputs on the Pavement-ME distress predictions and final pavement design thickness (sensitivity analysis).
- Applying statistical models and techniques such as cluster analysis to identify homogeneous traffic patterns.
- Reviewing current traffic collection infrastructure and practices to meet the traffic input requirements of the Pavement-ME.

The research team found various guidelines, statistical models, and techniques used to obtain the Level 2 and Level 3 inputs for use in the Pavement-ME. Therefore, these studies were reviewed to examine the application of different approaches to traffic characterization. A summary of the review is presented below.

2.5.1 National Studies

Results of the research studies related to loading inputs (ALS) for use in the ME design procedure are discussed in this section.

2.5.1.1 NCHRP 1-37A Study

The NCHRP 1-37A final report provides guidelines for truck traffic data collection for both axle weights and truck volumes (44). These guidelines are based on the allowable error and permissible bias for each data element in establishing the normalized truck volume distribution and normalized axle load spectra (NALS). Truck traffic classification (TTC) groups were developed based on the analysis of national WIM and AVC data collected through the LTPP program. These TTC groups are used to characterize truck volume by vehicle class rather than by vehicle weight. Each TTC group represents a traffic stream with unique truck traffic characteristics (see Table 2-7). For example, TTC 1 describes a traffic stream that is heavily populated with single-trailer trucks, and TTC 17 contains more buses. A standardized set of TTC groups that best describes the traffic stream for the different road functional classes is presented in Table 2-8. Table 2-9 presents the recommended data collection frequency for determining the TTC groups.
Table 2-7 NCHRP 1-37A Truck traffic classification (TTC) groups (44)

Vehicle/truck class distribution (%) by FHWA class 4 through 13:

| TTC group | TTC description | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Major single-trailer truck route (Type I) | 1.3 | 8.5 | 2.8 | 0.3 | 7.6 | 74.0 | 1.2 | 3.4 | 0.6 | 0.3 |
| 2 | Major single-trailer truck route (Type II) | 2.4 | 14.1 | 4.5 | 0.7 | 7.9 | 66.3 | 1.4 | 2.2 | 0.3 | 0.2 |
| 3 | Major single- and multi-trailer truck route (Type I) | 0.9 | 11.6 | 3.6 | 0.2 | 6.7 | 62.0 | 4.8 | 2.6 | 1.4 | 6.2 |
| 4 | Major single-trailer truck route (Type III) | 2.4 | 22.7 | 5.7 | 1.4 | 8.1 | 55.5 | 1.7 | 2.2 | 0.2 | 0.4 |
| 5 | Major single- and multi-trailer truck route (Type II) | 0.9 | 14.2 | 3.5 | 0.6 | 6.9 | 54.0 | 5.0 | 2.7 | 1.2 | 11.0 |
| 6 | Intermediate light and single-trailer truck route (I) | 2.8 | 31.0 | 7.3 | 0.8 | 9.3 | 44.8 | 2.3 | 1.0 | 0.4 | 0.3 |
| 7 | Major mixed truck route (Type I) | 1.0 | 23.8 | 4.2 | 0.5 | 10.2 | 42.2 | 5.8 | 2.6 | 1.3 | 8.4 |
| 8 | Major multi-trailer truck route (Type I) | 1.7 | 19.3 | 4.6 | 0.9 | 6.7 | 44.8 | 6.0 | 2.6 | 1.6 | 11.8 |
| 9 | Intermediate light and single-trailer truck route (II) | 3.3 | 34.0 | 11.7 | 1.6 | 9.9 | 36.2 | 1.0 | 1.8 | 0.2 | 0.3 |
| 10 | Major mixed truck route (Type II) | 0.8 | 30.8 | 6.9 | 0.1 | 7.8 | 37.5 | 3.7 | 1.2 | 4.5 | 6.7 |
| 11 | Major multi-trailer truck route (Type II) | 1.8 | 24.6 | 7.6 | 0.5 | 5.0 | 31.3 | 9.8 | 0.8 | 3.3 | 15.3 |
| 12 | Intermediate light and single-trailer truck route (III) | 3.9 | 40.8 | 11.7 | 1.5 | 12.2 | 25.0 | 2.7 | 0.6 | 0.3 | 1.3 |
| 13 | Major mixed truck route (Type III) | 0.8 | 33.6 | 6.2 | 0.1 | 7.9 | 26.0 | 10.5 | 1.4 | 3.2 | 10.3 |
| 14 | Major light truck route (Type I) | 2.9 | 56.9 | 10.4 | 3.7 | 9.2 | 15.3 | 0.6 | 0.3 | 0.4 | 0.3 |
| 15 | Major light truck route (Type II) | 1.8 | 56.5 | 8.5 | 1.8 | 6.2 | 14.1 | 5.4 | 0.0 | 0.0 | 5.7 |
| 16 | Major light and multi-trailer truck route | 1.3 | 48.4 | 10.8 | 1.9 | 6.7 | 13.4 | 4.3 | 0.5 | 0.1 | 12.6 |
| 17 | Major bus route | 36.2 | 14.6 | 13.4 | 0.5 | 14.6 | 17.8 | 0.5 | 0.8 | 0.1 | 1.5 |

Table 2-8 NCHRP 1-37A guide for selecting appropriate TTC groups (44)

| Highway functional classification description | Applicable TTC group numbers |
|---|---|
| Principal Arterials - Interstate and Defense Routes | 1, 2, 3, 4, 5, 8, 11, 13 |
| Principal Arterials - Intrastate Routes, including Freeways and Expressways | 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 14, 16 |
| Minor Arterials | 4, 6, 8, 9, 10, 11, 12, 15, 16, 17 |
| Major Collectors | 6, 9, 12, 14, 15, 17 |
| Minor Collectors | 9, 12, 14, 17 |
| Local Routes and Streets | 9, 12, 14, 17 |

Table 2-9 Minimum number of data collection days per season to estimate TTC (19)

| Expected error (± %) | Confidence level 80% | 90% | 95% | 97.5% | 99% |
|---|---|---|---|---|---|
| 20 | 1 | 1 | 1 | 2 | 2 |
| 10 | 1 | 2 | 3 | 5 | 6 |
| 5 | 3 | 8 | 12 | 17 | 24 |
| 2 | 20 | 45 | 74 | 105 | 148 |
| 1 | 78 | 180 | 295 | - | - |

For axle loading inputs, only one set of ALS for each truck class was determined and included in the software, because none of the ALS for the different roadway functional classes were found to be significantly different in terms of the predicted distress. One reason for this lack of significance is that most of the WIM sites were located along rural interstates and/or primary arterials, where local truck traffic may have a lesser impact on the ALS. Table 2-10 provides the frequency of truck weight data collection recommended for establishing the NALS (19; 38).

Table 2-10 Minimum number of data collection days per season to estimate ALS (19)

| Expected error (± %) | Confidence level 80% | 90% | 95% | 97.5% | 99% |
|---|---|---|---|---|---|
| 20 | 1 | 1 | 1 | 1 | 1 |
| 10 | 1 | 1 | 2 | 2 | 3 |
| 5 | 2 | 3 | 5 | 7 | 10 |
| 2 | 8 | 19 | 30 | 43 | 61 |
| 1 | 32 | 74 | 122 | 172 | 242 |

2.5.1.2 Federal Traffic Monitoring Guidelines

The 2016 FHWA Traffic Monitoring Guide (TMG) (41) provides recommendations and best practices for highway traffic monitoring, including monitoring of truck loading. The TMG recommends a relatively small truck weight program, primarily due to the cost of weight data collection and the limitations of available equipment. The following recommendations can be inferred from the TMG:

- Collect a representative sample of traffic loading data using truck weight roadway groups.
- Ensure that all roads within a roadway group have similar vehicle types and similar truck axle weight distributions.
- Collect weight data using permanently installed WIM sites, or at least permanently installed in-pavement WIM sensors, to achieve accurate data.
- Calibrate WIM equipment against systematic errors, which is critical to WIM data collection.
- Obtain data such that they account for the day-of-week and seasonal changes in vehicle weights that occur within each group.

Truck traffic may vary significantly within a state depending on the road and land use. The roadway system could be divided into roadway groups such that each road within a group experiences similar truck-loading patterns. These groups may be defined using different methods, such as statistical analysis, professional judgment based on local knowledge of loading characteristics, or a combination of both. Characteristics of the freight moved on the roads, including the type of commodities carried, the vehicles used, and the freight movement, could be used for dividing the roadway system (41).

The developed roadway groups should be simple and logical enough to discriminate between roads that are likely to have different traffic loading patterns. They should be reviewed periodically as more traffic data within the state become available over time. The accuracy of these road groups depends on the accuracy and precision of the collected weight data. Also, the more data collection sites within a roadway group, the higher the confidence in the traffic inputs generated. A minimum of six WIM sites with permanently installed WIM sensors per truck weight group is recommended (41).

2.5.1.3 NCHRP 1-39 Guidelines

The NCHRP 1-39 report (45) contains guidelines for collecting traffic data to be used in mechanistic-empirical pavement design. Three levels of axle-load distribution (or "load spectra") data are needed for the Pavement-ME: (a) site-specific, (b) truck weight roadway group (TWRG), and (c) statewide averages. Site-specific data require an adequately calibrated WIM system located near the roadway segment to be constructed or rehabilitated.
If the WIM system is unavailable or not properly calibrated (according to the ASTM requirements), Level 2 design inputs should be used to characterize traffic for design. TWRG axle-loading data are needed because most States do not have sufficient site-specific WIM data for the majority of pavements they design each year. The TWRGs are likely to be state-specific, but multiple States can create "regional" axle load distribution values if they have similar truck weight laws and enforcement programs. The intent is to group roads by their trucking characteristics so that the load spectra on all the roads in a group are similar. The challenge is to determine which roads (and, in some cases, directions of travel) to group together. The grouping process requires analysis of a State's available weight data and trucking patterns, possibly for different truck classes, and it results in the creation of appropriate TWRGs. Roadways with similar truck classes may carry different loads. For example, a single road could have loaded trucks in one direction and unloaded trucks in the other, so that two TWRGs are needed to characterize axle load distributions for that road.

Also, it was reported that the simple averages of the load distributions at all sites in a TWRG produced better results than weighted averages. This is attributed to a significant positive correlation between the volume of trucks in a particular vehicle class operating at a site and the average loads of these trucks. Because of this correlation, weighted averages produced higher estimates of average pavement load per vehicle than simple averages.

It was also recommended that the statewide axle-load distribution be used only when a highway agency has little knowledge of the loads that trucks will carry on the roadway being designed. This means that the agency has little confidence in its ability to predict the TWRG for the pavement section.
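The reported difference between simple and volume-weighted averages can be illustrated with a small Python sketch. The two sites, their truck volumes, and their average loads are hypothetical numbers chosen only so that the busier site also carries heavier trucks, which is the positive correlation described above; they are not data from the NCHRP 1-39 study.

```python
def simple_average(values):
    """Unweighted mean across sites."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Mean across sites weighted by truck volume."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical TWRG with two sites: the high-volume site has heavier trucks.
avg_load_kip = [40.0, 60.0]        # average load per truck at each site
trucks_per_day = [200, 2000]       # truck volume at each site

simple = simple_average(avg_load_kip)
weighted = weighted_average(avg_load_kip, trucks_per_day)
print(simple, round(weighted, 2))
```

With these assumed inputs, the weighted average is pulled toward the high-volume, heavily loaded site and therefore exceeds the simple average, which is the direction of the effect reported above.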
Statewide load distributions are obtained (for each vehicle class) by combining the data collected from all WIM sites in a State. These distributions then serve to represent "average conditions" that can be used whenever better data are unavailable (38; 45).

2.5.1.4 LTPP Traffic Pooled-Fund Study

The LTPP TPF-5(004) study has generated high-quality traffic loading information for 26 LTPP Special Pavement Study (SPS) sites located in 23 different states, representing moderate- and high-volume rural principal arterial interstate and non-interstate highways. LTPP defines research-quality traffic data as at least 210 days of data (in a year) collected at a calibrated WIM site conforming to LTPP's WIM performance requirements (tolerance defined as the percent error computed using the 95% confidence limit of error) for single axles, axle groups, gross vehicle weight, vehicle length (bumper-to-bumper), vehicle speed, and axle spacing, as detailed in Table 2-11 (46).

Table 2-11 LTPP WIM system performance requirements

| Pooled-fund site factor | 95 percent confidence limit of error (tolerance for % error) |
|---|---|
| Loaded single axles | ±20 percent |
| Loaded axle groups | ±15 percent |
| Gross vehicle weights | ±10 percent |
| Vehicle length | Greater of ±1.5 ft or ±3 percent |
| Vehicle speed | ±1 mph |
| Axle spacing length | ±0.5 ft [150 mm] |

The WIM data from the LTPP TPF-5(004) study were used to develop a two-tier set of ALS defaults in a new FHWA study (46):

- Tier 1: Global defaults representing average loading
- Tier 2: Defaults representing different loading patterns (clusters)

The methodology for developing the LTPP Tier 1 NALS defaults is very similar to the process used to create the original NCHRP 1-37A defaults. However, the data used to develop the LTPP defaults are of higher quality but lesser quantity (fewer WIM sites) than those used for the original NCHRP 1-37A defaults. Tier 2 defaults were developed based on hierarchical clustering of axle load data from multiple sites.
Sites that had similar loading conditions were clustered together. Clusters were differentiated based on the effect that the load spectra representing each cluster are likely to have on the Pavement-ME outcomes. The Pavement-ME thickness and design life predictions were used to determine what constitutes a practically significant difference in pavement design outcomes between load spectra clusters. As a result, clustering of load spectra was weighted heavily by the presence of heavy loads. Several alternative default axle loading categories were identified for each vehicle class and axle group, and default normalized axle load spectra (NALS) were developed to represent these loading patterns. The definitions of the different default traffic loading clusters for ALS and their attributes are provided in Table 2-12 (38; 46). In addition to the defaults, guidelines were developed showing State highway agencies how to apply the methodology from the LTPP study to develop State-specific traffic loading defaults for pavement design.

Table 2-12 Summary of NALS categories by weight for different axle group types

| Axle loading category by weight | Average RPPIF per cluster | Percent of single axles >= 15 kip | Percent of tandem axles >= 26 kip | Percent of tridem axles >= 39 kip | Percent of quad axles >= 54 kip |
|---|---|---|---|---|---|
| Very Light (VL) | <0.05 | <3% | 0% | n/a | n/a |
| Light (L) | 0.05-0.15 | <10% | <10% | n/a | n/a |
| Moderate (M) | 0.15-0.30 | 10-30% | 10-30% | n/a | n/a |
| Heavy (H)* | 0.30-0.50 | >30% | 30-50% | <50% | <30% |
| Very Heavy (VH) | >0.50 | n/a | >50% | >50% | >30% |

*For roads with a high percentage of Class 9 vehicles, the "Heavy" loading category was further subdivided into "Heavy 1" and "Heavy 2" based on the observed high sensitivity of MEPDG outcomes to Class 9 tandem axle load spectra. The "Heavy 1" category has an RPPIF of 0.3-0.4 and a percentage of heavy tandem axles between 30 and 40 percent. The "Heavy 2" category has an RPPIF of 0.4-0.5 and a percentage of heavy tandem axles between 40 and 50 percent.
RPPIF = Relative Pavement Performance Impact Factor; a summary statistic developed for the study to identify and group load spectra that are likely to have a similar effect on pavement design outcomes using the global MEPDG pavement performance prediction models.

The newly computed ALS defaults had fewer very light and very heavy loads compared to the original defaults. This is likely because the data for the new defaults were collected with more consistently calibrated and precise WIM equipment than the data set used for the development of the original NALS defaults under the NCHRP 1-37A project. The better calibration of the WIM scales used to develop the new defaults could explain why fewer very light loads (caused by under-calibrated scales observing light loads) and fewer very heavy loads (caused by over-calibrated scales observing heavy loads) are observed in the new default database. Assuming that the new LTPP defaults are more accurate, a conclusion could be drawn that pavement designs using the new defaults will be thinner than the designs using the original Pavement-ME defaults. However, from a practical perspective, the difference in the design thickness was significant only for a limited number of the pavement scenarios tested (38).

2.5.2 Other States

Many other state highway agencies (SHAs) have completed studies to determine the truck traffic weight and volume defaults to be used with the Pavement-ME. Some of these agencies include Arizona, Alabama, Arkansas, Colorado, Georgia, Idaho, Missouri, Montana, North Carolina, and Wyoming. Most studies have found that the axle load spectra deviate from the global default values currently included in the Pavement-ME software, especially for local and secondary routes. Thus, the axle load spectra or distributions can depend on the roadway use and/or geographical location.
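
As an illustration of how the weight categories of Table 2-12 (Section 2.5.1.4) might be applied when comparing a local load spectrum with the defaults, the minimal Python sketch below maps an RPPIF value to a loading category and computes the percentage of heavy axles in a spectrum. The function names, the treatment of shared boundary values, and the example tandem spectrum are assumptions made for illustration only; they are not taken from the LTPP study, and the RPPIF value itself is assumed to be supplied from elsewhere.

```python
def rppif_category(rppif, split_heavy=False):
    """Return the Table 2-12 loading category for an RPPIF value.

    Assumption: a value on a shared boundary (e.g. 0.30) goes to the
    heavier category, except 0.50, which stays in Heavy (VH is ">0.50").
    """
    if rppif < 0.05:
        return "VL"
    if rppif < 0.15:
        return "L"
    if rppif < 0.30:
        return "M"
    if rppif <= 0.50:
        if split_heavy:  # Heavy 1 / Heavy 2 split for roads with many Class 9 trucks
            return "H1" if rppif < 0.40 else "H2"
        return "H"
    return "VH"


def percent_heavy(spectrum, threshold_kip):
    """Percent of axles in bins whose lower bound is at or above the threshold.

    spectrum: list of (bin lower bound in kip, axle count) pairs.
    """
    total = sum(count for _, count in spectrum)
    heavy = sum(count for load, count in spectrum if load >= threshold_kip)
    return 100.0 * heavy / total


# Hypothetical tandem-axle spectrum (bin lower bound in kip, axle count).
tandem_spectrum = [(6, 100), (14, 300), (22, 200), (26, 250), (34, 150)]
heavy_pct = percent_heavy(tandem_spectrum, 26)  # 26 kip tandem threshold from Table 2-12

print(rppif_category(0.35), rppif_category(0.35, split_heavy=True), heavy_pct)
```

The same `percent_heavy` helper could be reused with the 15, 39, and 54 kip thresholds for single, tridem, and quad axles, respectively.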
2.5.2.1 Arizona

A research study (47) sponsored by the Arizona Department of Transportation (ADOT) addressed the collection, preparation, and use of traffic data required for pavement design. Procedures to collect Level 1 traffic inputs were documented. Recommended Levels 2 and 3 inputs and defaults were provided based on the best historical data available to date using multivariate hierarchical statistical cluster analyses (using the correlation coefficient (R²) method) (47). Although determining the optimum number of clusters within a dataset is a subjective decision, five diagnostic statistics were used to guide it: (a) cubic clustering criterion (CCC), (b) cumulative and partial squared multiple correlations (R²), (c) eigenvalue and associated variance (VAR), (d) pseudo F (PSF), and (e) pseudo t² (PST2). Based on the clustering and sensitivity analyses, two clusters for vehicle class distribution, two clusters for hourly distribution factors, one cluster for monthly distribution factors, three clusters for axle load distribution, and one cluster for axles per truck were recommended. The selection criteria for the clusters are based on the highway functional classes (47).

2.5.2.2 Alabama

A study (48) was conducted in the State of Alabama to develop traffic data clusters for use as inputs in the Pavement-ME. While the Pavement-ME uses only three input levels, the second-level inputs were further split into two subcategories in this study. The levels considered were: (a) Level 1: site- and direction-specific data, (b) Level 2A: cluster or WIM group data, (c) Level 2B: statewide data, and (d) Level 3: nationwide data. Thirteen types of traffic inputs were identified based on the Michigan study (49), and clusters were developed for those inputs.
Those 13 inputs are 1 HDF, 1 VCD, 4 AGPV (single, tandem, tridem, and quad), 3 MDF (single-unit, tractor-trailer, and multi-trailer), and 4 ALS (single, tandem, tridem, and quad). The study noted that hierarchical cluster analysis was the most popular clustering technique. Citing the disadvantages of using Euclidean distance, which is the state of the practice, the researchers used Pearson's correlation coefficient (k) for clustering purposes. Specifically, a correlation-based clustering approach that combines Pearson's correlation distance measure (to evaluate similarity) with the unweighted pair group method using arithmetic averages (UPGMA) (to cluster WIM sites) was developed in this study. Once the clusters were developed, sensitivity analyses were conducted to quantify the differences in required pavement thickness between different traffic input levels. Geographical patterns were defined to assist in selecting the appropriate clusters for new pavement design (48).

2.5.2.3 Arkansas

Another study conducted in the State of Arkansas analyzed WIM data using cluster analysis methodologies to identify groups of WIM sites with similar traffic characteristics based on the required traffic attributes (50). The research team normalized the traffic data attributes on an annual basis. Ten WIM sites located in Arkansas that passed the truck weight data quality check process were used in the analyses. Ward's minimum-variance method was used. A dissimilarity coefficient matrix based on the Euclidean distance for each pair of objects was computed for the 10 WIM sites. Three clusters were identified when the distribution of the gross weight of Class 9 trucks was used as the attribute. Two other clustering approaches, K-means and fuzzy cluster analyses, were also applied to the data for comparison purposes.
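
The Ward's minimum-variance criterion mentioned above can be illustrated with a short Python sketch. At each step, the pair of clusters whose merger gives the smallest increase in the within-cluster sum of squared Euclidean distances is joined, until the requested number of clusters remains. The site labels and the four-bin weight distributions below are hypothetical and are not the Arkansas data; the function name `ward_clusters` is likewise an assumption made for illustration.

```python
from itertools import combinations


def ward_clusters(profiles, k):
    """Merge sites with Ward's minimum-variance criterion until k clusters remain.

    profiles: dict mapping a site id to a list of floats, e.g. a normalized
    Class 9 gross-vehicle-weight distribution (hypothetical input format).
    Returns a sorted list of sorted site-id lists.
    """
    # Each cluster is stored as (member ids, centroid vector).
    clusters = [([site], list(vec)) for site, vec in profiles.items()]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            (mi, ci), (mj, cj) = clusters[i], clusters[j]
            ni, nj = len(mi), len(mj)
            sq_dist = sum((a - b) ** 2 for a, b in zip(ci, cj))
            # Increase in within-cluster sum of squares if i and j are merged.
            cost = ni * nj / (ni + nj) * sq_dist
            if best is None or cost < best[0]:
                best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        # Size-weighted mean of the two centroids.
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((mi + mj, centroid))
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical normalized weight distributions (four weight bins per site).
sites = {
    "A": [0.50, 0.30, 0.15, 0.05],
    "B": [0.48, 0.32, 0.15, 0.05],
    "C": [0.20, 0.30, 0.30, 0.20],
    "D": [0.22, 0.28, 0.30, 0.20],
    "E": [0.05, 0.15, 0.30, 0.50],
    "F": [0.06, 0.14, 0.32, 0.48],
}
groups = ward_clusters(sites, 3)
print(groups)
```

This brute-force version rescans all cluster pairs at every merge, which is adequate for a handful of WIM sites; a statistics package would normally be used for larger data sets.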
The cluster assignments differed little among the three approaches, indicating that the patterns of the traffic stream were consistent regardless of the clustering methodology. The clusters for vehicle class distribution factors (VCDs), hourly distribution factors (HDFs), and monthly adjustment factors (MAFs) were identified by using the K-means clustering procedure. Three clusters each for vehicle class distributions and monthly adjustment factors and two clusters for hourly distribution factors were observed. Grouping based on a combination of known geographic, industrial, agricultural, and commercial patterns was done using Fisher's Exact Test (51) for developing the loading groups. Categorical statistical models (multi-category logit [ML] models) were developed to assign a new pavement design project to a cluster (50).

2.5.2.4 Colorado

A study was conducted in Colorado with three main objectives: (1) determine how representative the available traffic data are for pavement design in Colorado using the Pavement-ME, (2) detect natural groupings or clusters within the available traffic data, and (3) develop defaults for Levels 2 and 3 traffic inputs for pavement design (52). Statistical analysis was conducted to determine the natural clusters within the traffic data and the optimum number of clusters. Natural clusters within the large assembled Colorado traffic data set were determined using statistical multivariate hierarchical cluster analysis similar to the analysis done in the State of Arizona (47). Clusters were formed for vehicle class distribution, hourly truck volume distribution, monthly adjustment factors, axles per truck class factors, and axle load distribution.

2.5.2.5 North Carolina

The North Carolina Department of Transportation (NCDOT) sponsored a study for the implementation of the Pavement-ME in the State of North Carolina (53).
The study included identifying the resources, procedures, and guidelines needed for the NCDOT traffic data required by the Pavement-ME. Clustering analyses were performed to develop the required traffic inputs. Initial clustering analysis of 42 WIM sites based on VCD for different months resulted in three major clusters or factor groups. Each factor group includes WIM sites that tend to remain in the same cluster over the year (from January to December). Even though the cluster analyses led to different clusters, the pavement performance was found to be insensitive to hourly distribution factors and monthly adjustment factors. Hence, statewide averages were recommended for use.

Multi-dimensional clustering was used to determine the Level 2 inputs for axle load spectra. Multi-dimensional clustering tests the similarity among WIM data based on several attributes, whereas one-dimensional clustering does so based on one attribute at a time. One-dimensional analysis provides clusters that are distinct by one axle type, but they are difficult to interpret or relate to a definite traffic pattern. Therefore, the cluster representing single axles may not contain the characteristics of roadways where tandem axles are predominant. Moreover, Class 5 (two single axles) and Class 9 (one single axle and two tandem axles) are the predominant truck classes in North Carolina, and Classes 5 and 9 best represent single and tandem axles, respectively (54). Thus, the implementation of multi-dimensional clustering (two-dimensional clustering using Ward's method) may improve the results because it considers the relationship of multiple attributes simultaneously and produces well-explained clusters. For new pavement projects, 48-hour site-specific classification counts were used to derive the traffic parameters (53).

2.5.2.6 New York

A study was performed to characterize the traffic inputs (VCD, MDF, HDF, AGPV, and axle load spectra) for the State of New York.
Data were obtained from vehicle classification and WIM sites in New York during the years 2007 to 2011. Cluster analysis was performed only for VCD, MDF, and HDF due to the unavailability of data for a sufficient number of WIM sites. MEPDG analyses were executed to study the effect of using site-specific, regional (cluster), statewide average, and MEPDG default values on the predicted performance measures for conventional new flexible and rigid pavement structures. Ward's method of cluster analysis was adopted. Semi-partial R-squared (SPR) values were used to determine the number of clusters to be selected for further analyses.

Four clusters were formed for the vehicle classification distribution (VCD). These are differentiated based on the proportions of Class 5 and Class 9 vehicles. The direction of travel has little impact on the VCD. The results of the cluster analysis are consistent for all the years. Multi-dimensional clustering was adopted for monthly distribution factors considering Class 5 and 9 vehicles simultaneously. Four clusters were formed for 2007, 2008 and 2010. However, three and five clusters were formed for 2009 and 2010, respectively. Four clusters were found for hourly distribution factors for each of the years. The results of the cluster analysis are almost consistent over the years. HDF does not show any impact on pavement performance. The study recommends statewide average values for VCD, MDF, AGPV, and ALS.

2.5.2.7 Georgia

A study was conducted to make recommendations for establishing the Georgia Department of Transportation (GDOT) traffic load spectra program and the WIM data collection plan to support the implementation of the Pavement-ME analysis and design (38). There are very few permanent WIM sites in the State of Georgia, and the data obtained from the portable WIM sites were considered inadequate as a Level 1 input.
This was mainly due to the limited equipment accuracy and the challenges with field calibration of the portable WIM system. GDOT's vehicle classification data from automated vehicle classification (AVC) sites were also reviewed and categorized by the researchers according to the MEPDG truck traffic classification (TTC) groups. Not all default MEPDG TTCs were observed in Georgia. The study recommended that the NALS defaults developed as part of the FHWA study (46) be used until more Georgia permanent WIM data become available to compute Georgia-specific loading defaults. This recommendation was based on similarities in loading characteristics and Pavement-ME outcomes between Georgia WIM data and LTPP defaults. For new alignments, it was recommended that the NALS be selected based on the type of traffic loading condition expected given the aggregated road functional classes, GA freight route designation, expected AADTT, and percent of Class 9 vehicles. A decision tree based on these factors was developed to assist pavement designers.

2.5.2.8 Idaho

Site-specific traffic inputs were developed based on the analyses of traffic data from 12 out of 25 WIM sites in Idaho as part of the State's MEPDG implementation effort. Statewide axle load spectra and average numbers of axles per truck were established. The significance of axle load spectra, vehicle class distribution, monthly adjustment factors, and the average number of axles per truck for MEPDG-predicted performance was also investigated. The results showed that the average directional distribution and lane distribution factors agree quite well with the MEPDG recommended default values. Also, in general, Class 9 followed by Class 5 trucks represented the majority of the trucks traveling on Idaho roads. The vehicle class distribution factors at 5 out of 12 investigated WIM sites did not match any of the MEPDG recommended TTC groups.
The developed MAF ranged between 0 and 4, indicating that truck volumes vary from month to month. The peak locations of the developed statewide and the MEPDG default ALS were fairly similar for the majority of the truck classes and axle types. However, the percentages of axles within these peaks were different, especially for the tridem and quad axles (55). The numbers of single, tandem, and tridem axles per truck for all truck classes based on Idaho data were found to be quite similar to the MEPDG default values. Idaho data showed small percentages of quad axles for truck classes 7, 10, 11, and 13, whereas the MEPDG default values are all zero.

The developed statewide axle load spectra yielded significantly higher longitudinal and alligator cracking compared to the MEPDG default spectra. No significant differences were observed for predicted AC rutting, total rutting, and IRI based on statewide and the MEPDG default spectra. High prediction errors were found for longitudinal cracking when statewide/national (Level 3) axle load spectra, vehicle class distribution, or monthly adjustment factors were used instead of site-specific (Level 1) data. Large prediction errors in alligator cracking were only found when the statewide default axle load spectra were used instead of site-specific spectra. Moderate errors were found when the MEPDG typical default monthly adjustment factors or vehicle class distribution were used instead of the site-specific values. The input level of the axle load spectra, monthly adjustment factors, vehicle class distribution, and number of axles per truck had very low impact on predicted AC rutting and negligible impact on total rutting and IRI. The input level of the number of axles per truck had a negligible influence on the MEPDG predicted performance (55).

2.6 REVIEW OF EXISTING PRACTICES IN MICHIGAN

A review of the existing clustering methodology was conducted to see if it is still the "best" way for MDOT to develop Levels 2 and 3 inputs for Pavement-ME use. The current Level 2 methodology has some practical limitations (freight data availability issues for cluster assignments). Hence, the main focus of this study is to develop an alternative simplified methodology for the generation of Level 2 inputs. One such method is to use available MDOT traffic inventory data (AADTT, VCD, and road information) to group the PTR sites. Once the groups are identified, the traffic inputs are established based on the averages of the sites in each group. The inputs developed using the clustering methodology will be identified as 'Level 2A', while inputs developed using the alternative simplified methodology will be called 'Level 2B' inputs. Both input levels will be tested for pavement design accuracy. The methodology that balances accuracy and practicality will be recommended for adoption and future updates. These methodologies are discussed briefly below and in greater detail in Chapter 4.

2.6.1 Improved Existing Methodology (Level 2A Inputs)

Based on the review, several improvements are recommended to enhance the existing methodology to characterize the traffic inputs based on the new (2011 to 2015) WIM and classification data. Several areas were identified during the review that would lead to improved road groupings or clusters for more representative Level 2 and Level 3 defaults. The following potential improvements to the current methodology are proposed:

1. Level 3 statewide defaults: Based on the distribution of PTR locations, it may be beneficial to have two statewide defaults: one for interstates and roads with a high volume of heavy trucks (i.e., designated state freight routes) and one for all other roads.
2. Level 2 MAF groups/clusters: In the previous study, vehicle classes were divided into three groups based on body type: VC 4-7, VC 8-10, and VC 11-13 (i.e., single-unit, tractor-trailer combination, and multi-trailer combination). Very little seasonal variation was observed. This could be because different truck classes were combined in the same group (i.e., local service trucks or resource-extraction industry trucks were grouped with long-haul trucks). Analysis of individual vehicle classes may help to better identify seasonal trends. This would be most applicable to the vehicle classes that are frequently observed on Michigan roads (typically Classes 5 and 9).
3. An easy-to-follow step-by-step procedure will be developed for future use by MDOT personnel.
4. The current methodology uses freight data and discriminant analysis for assigning a site to a cluster for significant traffic inputs. However, some routes do not have freight data available. In such cases, there is a need to develop alternative approaches to establish Level 2 traffic inputs for a particular project. For example, machine learning techniques such as decision trees based on data availability can be developed and used for cluster assignments.

2.6.2 Alternative Simplified Methodology (Level 2B Inputs)

The alternative simplified methodology will include the following steps:

1. Use available MDOT traffic data (AADTT, VCD information, and road inventory/GIS information) to identify ranges of values for different traffic input parameters for different road functional class groups, e.g.:
   Rural interstates
   Urban interstates
   Urban freeway/expressway and non-interstate principal arterials
   Rural non-interstate principal arterials
   All minor arterials and collectors
2. If needed, introduce additional parameters that are available to MDOT, such as road type or designation (Interstate, US, State, county, etc.), the direction of travel, proximity and size of metropolitan areas, and freight route designation.
3. Once the groups are identified, establish the traffic inputs based on the averages of sites in each group. Averages within these groups should be updated when a PTR site is removed or added, or when new traffic data become available.

2.7 SUMMARY

This chapter presents a review of the performance models used in the Pavement-ME software, the required inputs, and techniques for calibrating the performance models. Also presented is a review of the state of the practice related to traffic inputs in the Pavement-ME. The Pavement-ME uses hierarchical input levels and provides flexibility to the designer in obtaining the design inputs based on the project importance. Three different input levels can be used in this hierarchical system, ranging from site-specific input values to "best-estimate" or default values, as classified below:

1. Level 1: These inputs provide the highest level of accuracy because they are site/project-specific and are measured directly.
2. Level 2: These inputs provide an intermediate level of accuracy and are used when Level 1 inputs are unavailable. Correlation or regression equations are used to estimate these inputs.
3. Level 3: These inputs are based on global or regional averages and provide the least amount of knowledge regarding the input parameters.

Most of the traffic inputs can be obtained through weigh-in-motion (WIM), automatic vehicle classification (AVC), and vehicle counts. Each of these traffic inputs was briefly described in this chapter. Several guidelines used to obtain the Levels 2 and 3 inputs for use in the Pavement-ME are documented. The NCHRP 1-37A final report (44) provides guidelines for truck traffic data based on the allowable error and permissible bias for each data element in establishing the truck volume distribution and axle load spectra.
Also, one set of ALS for each truck class was determined and included in the software as defaults. The 2016 FHWA Traffic Monitoring Guide (TMG) (41) provides recommendations and best practices for highway traffic monitoring, including monitoring of truck loading. It recommends that the roadway system be divided into groups using clustering or traditional approaches such that each road within a group experiences similar truck-loading patterns. The NCHRP 1-39 report (45) also contains guidelines for collecting traffic data to be used in the MEPDG.

The LTPP TPF-5(004) study has generated high-quality traffic loading information for 26 LTPP Special Pavement Study (SPS) sites located in 23 different States representing moderate and high volume rural principal arterial interstate and non-interstate highways. The WIM data from the LTPP TPF-5(004) study were used to develop a two-tier ALS default (46): (a) Tier 1: global defaults representing average loading, and (b) Tier 2: defaults representing different loading patterns (clusters). Tier 2 defaults were developed based on hierarchical clustering of axle load data from multiple sites. The newly computed ALS defaults had fewer very light and very heavy loads compared to the original defaults. This is likely because the data for the new defaults were collected with more consistently calibrated and precise WIM equipment than the data set used for the development of the original NALS defaults under the NCHRP 1-37A project.

Many other state highway agencies (SHAs) have determined the truck traffic weight and volume defaults to be used with the Pavement-ME. It was found that the local axle load spectra for different axle configurations deviate from the global default values currently included in the Pavement-ME software, especially for local and secondary routes. Therefore, the development of regional or statewide traffic defaults is necessary to implement the Pavement-ME design approach.
Various approaches have been used to obtain the Levels 2 and 3 inputs for use in the Pavement-ME. A review of the techniques used by other States is provided in Table 2-13.

The existing practices in Michigan were reviewed to determine if the clustering methodology used in the 2009 study is still the "best" way for MDOT to develop Levels 2 and 3 inputs for Pavement-ME use. Several areas were identified during the review that would benefit from revisions and would lead to improved road groupings or clusters for determining more representative Level 2 and Level 3 traffic defaults. Potential areas of improvement in the current practices are presented in this chapter.

Table 2-13 Summary of clustering methodologies used to generate Level 2 inputs

State: Arizona
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering method using the correlation coefficient.
  Cluster assignment: Traffic patterns for each of the clusters were defined using highway functional classification.

State: Alabama
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering method using Pearson's correlation coefficient.
  Cluster assignment: Traffic patterns for each of the clusters were defined geographically.

State: Arkansas
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering method using Ward's method and Euclidean distance. Also, K-means and fuzzy methods were used.
  Cluster assignment: Multi-category logit [ML] models were developed to assign the probabilities that a site belongs to a cluster.

State: Colorado
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering method.
  Cluster assignment: Uses statewide defaults for all inputs except VCD. The assignment for VCD is based on the functional classification of highways.

State: Georgia
  Level 2 inputs: No
  Methodology: Road groups with similarities in traffic loading patterns were identified.
  Cluster assignment: Assigned LTPP defaults to GA roads based on truck volume, vehicle classification, and road type criteria. 48-hour site-specific classification counts for VCD and AADTT recommended plus portable WIM for ALS default selection.
State: Michigan
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering method.
  Cluster assignment: Freight data and discriminant analyses are used to assign sites to clusters.

State: North Carolina
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering using Ward's method and Euclidean distance.
  Cluster assignment: 48-hour site-specific classification counts for VCD and ALS. Statewide average values for all other inputs are recommended.

State: New York
  Level 2 inputs: Yes
  Methodology: Hierarchical clustering using Ward's method and Euclidean distance.
  Cluster assignment: Statewide average values for all inputs are recommended.

REFERENCES

[1] Mu, F., J. W. Mack, and R. A. Rodden. Review of National and State-level Calibrations of AASHTOWare Pavement ME Design for New Jointed Plain Concrete Pavement. International Journal of Pavement Engineering, 2016, pp. 1-7.
[2] Sachs, S., J. M. Vandenbossche, and M. B. Snyder. Calibration of the National Rigid Pavement Performance Models for the Pavement Mechanistic-Empirical Design Guide. In Transportation Research Board 94th Annual Meeting, 2015.
[3] Sachs, S., J. M. Vandenbossche, and M. B. Snyder. Developing Recalibrated Concrete Pavement Performance Models for the Mechanistic-empirical Pavement Design Guide. University of Pittsburgh, 2014.
[4] AASHTO. Guide for the Local Calibration of the Mechanistic-Empirical Design Guide. American Association of State Highway and Transportation Officials, 2010.
[5] Haider, S. W., N. Buch, W. Brink, K. Chatti, and G. Baladi. Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan Part 3: Local Calibration and Validation of the Pavement-ME Performance Models. Rep. No. RC-1595, Michigan State Univ., East Lansing, MI, 2014.
[6] Haider, S. W., W. C. Brink, N. Buch, and K. Chatti. Process and Data Needs for Local Calibration of Performance Models in the AASHTOWARE Pavement ME Software. Transportation Research Record: Journal of the Transportation Research Board, No. 2523, 2015, pp. 80-93.
[7] Haider, S. W., W. C. Brink, and N. Buch. Local calibration of flexible pavement performance models in Michigan. Canadian Journal of Civil Engineering, Vol. 43, No. 11, 2016, pp. 986-997.
[8] Haider, S. W., W. C. Brink, and N. Buch. Local calibration of rigid pavement performance models using resampling methods. International Journal of Pavement Engineering, 2015, pp. 1-13.
[9] Haider, S. W., N. Buch, W. Brink, K. Chatti, and G. Baladi. Preparation for implementation of the mechanistic-empirical pavement design guide in Michigan, part 3: local calibration and validation of the pavement-ME performance models. Michigan Dept. of Transportation, Office of Research Administration, 2014.
[10] Armaghani, J. M., T. J. Larsen, and L. L. Smith. Temperature response of concrete pavements. 1987.
[11] Lederle, R. E. Accounting for warping and differential drying shrinkage mechanisms in the design of jointed plain concrete pavements. Master of Science, Michigan Technological University, 2011.
[12] Rao, S., and J. R. Roesler. Characterizing effective built-in curling from concrete pavement field measurements. Journal of Transportation Engineering, Vol. 131, No. 4, 2005, pp. 320-327.
[13] Byrum, C. R. Analysis by high-speed profile of jointed concrete pavement slab curvatures. Transportation Research Record, Vol. 1730, No. 1, 2000, pp. 1-9.
[14] Eisenmann, J., and G. Leykauf. Effect of paving temperatures on pavement performance. In 2nd International Workshop on Theoretical Design of Concrete Pavements, Madrid, Spain, 1990.
[15] Rao, C., E. Barenberg, M. Snyder, and S. Schmidt. Effects of temperature and moisture on the response of jointed concrete pavements. In Seventh International Conference on Concrete Pavements: The Use of Concrete in Developing Long-Lasting Pavement Solutions for the 21st Century, International Society for Concrete Pavements, No. 1, 2001.
[16] Fang, Y. Environmental influences on warping and curling of PCC pavement. In Seventh International Conference on Concrete Pavements: The Use of Concrete in Developing Long-Lasting Pavement Solutions for the 21st Century, International Society for Concrete Pavements, No. 1, 2001.
[17] Yu, H., and L. Khazanovich. Effects of construction curling on concrete pavement behavior. In Seventh International Conference on Concrete Pavements: The Use of Concrete in Developing Long-Lasting Pavement Solutions for the 21st Century, International Society for Concrete Pavements, No. 1, 2001.
[18] Beckemeyer, C., L. Khazanovich, and H. Thomas Yu. Determining amount of built-in curling in jointed plain concrete pavement: Case study of Pennsylvania I-80. Transportation Research Record, Vol. 1809, No. 1, 2002, pp. 85-92.
[19] NCHRP. Guide for Mechanistic-Empirical Design of New and Rehabilitated Pavement Structures. Washington, D.C., 2004.
[20] Rao, S., and J. R. Roesler. Nondestructive testing of concrete pavements for characterization of effective built-in curling. Journal of Testing and Evaluation, Vol. 33, No. 5, 2005, pp. 356-363.
[21] Mindess, S., F. Young, and D. Darwin. Concrete, 2nd Edition. Technical Documents, 2003.
[22] Hveem, F. N., and B. Tremper. Some factors influencing shrinkage of concrete pavements. In Journal Proceedings, No. 53, 1957, pp. 781-789.
[23] L'Hermite, M. R. Le retrait des ciments, mortiers et bétons. Inst. Technique du Batiment et des Travaux Publics, 1947.
[24] L'Hermite, R., J. Chefdeville, and J. Grieu. Memoires sur la Mécanique-Physique du Beton: Nouvelle Contribution a L'Etude du Retrait des Ciments. Liants Hydrauliques. In Annales de L'Institut Technique du Batiment et des Travaux Publics, No. 106, 1949, pp. 2-28.
[25] Shacklock, B. W. The effect of mix proportions and testing conditions on drying shrinkage and moisture movement of concrete. Cement Concr. Assoc. Tech. Report TRA/266, 1957.
[26] Helmuth, R. A., and D. H. Turk. The reversible and irreversible drying shrinkage of hardened portland cement and tricalcium silicate pastes. 1967.
[27] Granger, L., and M. Diruy. Simulation numérique du retrait du béton sous hygrométrie variable. Bulletin de liaison des laboratoires des ponts et chaussées, No. 190, 1994.
[28] Eisenmann, J., and G. Leykauf. Simplified calculation method of slab curling caused by surface shrinkage. In Proceedings, 2nd International Workshop on Theoretical Design of Concrete Pavements, 1990, pp. 185-197.
[29] Janssen, D. J. Moisture in Portland cement concrete. Transportation Research Record, No. 1121, 1987.
[30] Wei, Y., W. Hansen, J. J. Biernacki, and E. Schlangen. Unified shrinkage model for concrete from autogenous shrinkage test on paste with and without ground-granulated blast-furnace slag. ACI Materials Journal, Vol. 108, No. 1, 2011, p. 13.
[31] Lee, C. J., D. A. Lange, and Y.-S. Liu. Prediction of moisture curling of concrete slab. Materials and Structures, Vol. 44, No. 4, 2011, pp. 787-803.
[32] Yeon, J. H., S. Choi, S. Ha, and M. C. Won. Effects of creep and built-in curling on stress development of Portland cement concrete pavement under environmental loadings. Journal of Transportation Engineering, Vol. 139, No. 2, 2012, pp. 147-155.
[33] Wells, S. A., B. M. Phillips, and J. M. Vandenbossche. Quantifying built-in construction gradients and early-age slab deformation caused by environmental loads in a jointed plain concrete pavement. International Journal of Pavement Engineering, Vol. 7, No. 4, 2006, pp. 275-289.
[34] Davidian, M. Chapter 3: Random Vectors and Multivariate Normal Distribution. https://www4.stat.ncsu.edu/~davidian/st732_sp2007/notes/chap3.pdf.
[35] Davidian, M. Chapter 10: Linear Mixed Effects Models for Multivariate Normal Data. https://www4.stat.ncsu.edu/~davidian/st732_sp2007/notes/chap10.pdf.
[36] Davidian, M. Chapter 9: Random Coefficient Models for Multivariate Normal Data. https://www4.stat.ncsu.edu/~davidian/st732_sp2007/notes/chap9.pdf.
[37] MATLAB. Mixed-Effects Models Using nlmefit and nlmefitsa. https://www.mathworks.com/help/stats/mixed-effects-models-using-nlmefit-and-nlmefitsa.html. Accessed 07/10/2019.
[38] Olga, S., and V. Q. Harold. Traffic Load Spectra for Implementing and Using the Mechanistic-Empirical Pavement Design Guide in Georgia. 2014.
[39] Tran, N., and K. Hall. Development and significance of statewide volume adjustment factors in mechanistic-empirical pavement design guide. Transportation Research Record: Journal of the Transportation Research Board, No. 2037, 2007, pp. 97-105.
[40] Lu, Q., Y. Zhang, and J. Harvey. Estimation of truck traffic inputs for mechanistic-empirical pavement design in California. Transportation Research Record: Journal of the Transportation Research Board, No. 2095, 2009, pp. 62-72.
[41] FHWA. Traffic Monitoring Guide. Washington, DC, 2016.
[42] Lu, Q., and J. Harvey. Characterization of truck traffic in California for mechanistic-empirical design. Transportation Research Record: Journal of the Transportation Research Board, No. 1945, 2006, pp. 61-72.
[43] On Tam, W., and H. Von Quintus. Use of long-term pavement performance data to develop traffic defaults in support of mechanistic-empirical pavement design procedures. Transportation Research Record: Journal of the Transportation Research Board, No. 1855, 2003, pp. 176-182.
[44] NCHRP Project 1-37A. Appendix AA: Traffic Loading. ARA, Inc., ERES Division, 505 West University Avenue, Champaign, Illinois 61820, 2004.
[45] NCHRP. NCHRP Report 538: Traffic Data Collection, Analysis, and Forecasting for Mechanistic Pavement Design. No. 538, Transportation Research Board, Washington, D.C., 2005.
[46] Selezneva, O., M. Ayres, M. Hallenbeck, A. Ramachandran, H. Shirazi, and H. Von Quintus. MEPDG Traffic Loading Defaults Derived from LTPP Traffic Pooled Fund Study. Federal Highway Administration, 2016.
[47] Darter, M., L. Titus-Glover, and D. Wolf. Development of a Traffic Data Input System in Arizona for the MEPDG. Arizona Department of Transportation, Phoenix, AZ, 2013.
[48] Turochy, R. E., D. H. Timm, and D. Mai. Development of Alabama Traffic Factors for use in Mechanistic-Empirical Pavement Design. 2015.
[49] Buch, N., S. W. Haider, J. Brown, and K. Chatti. Characterization of Truck Traffic in Michigan for the New Mechanistic Empirical Pavement Design Guide. Final Report, Michigan Department of Transportation, Lansing, MI, 2009.
[50] Wang, K. C., Q. Li, K. D. Hall, V. Nguyen, and D. X. Xiao. Development of truck loading groups for the mechanistic-empirical pavement design guide. Journal of Transportation Engineering, Vol. 137, No. 12, 2011, pp. 855-862.
[51] Agresti, A., and M. Kateri. Categorical data analysis. Springer, 2011.
[52] Mallela, J., L. Titus-Glover, S. Sadasivam, B. Bhattacharya, M. Darter, and H. Von Quintus. Implementation of the AASHTO Mechanistic-empirical Pavement Design Guide for Colorado. 2013.
[53] Stone, J. R., Y. R. Kim, G. F. List, W. Rasdorf, F. Sayyady, F. Jadoun, and A. N. Ramachandran. Development of Traffic Data Input Resources for the Mechanistic Empirical Pavement Design Process. 2011.
[54] Sayyady, F., J. Stone, G. List, F. Jadoun, Y. Kim, and S. Sajjadi. Axle load distribution for mechanistic-empirical pavement design in North Carolina: Multidimensional clustering approach and decision tree development. Transportation Research Record: Journal of the Transportation Research Board, No. 2256, 2011, pp. 159-168.
[55] El-Badawy, S. M., F. M. Bayomy, and S. W. Fugit. Traffic characteristics and their impact on pavement performance for the implementation of the mechanistic-empirical pavement design guide in Idaho. International Journal of Pavement Research and Technology, Vol.
5, No. 6, 2012, pp. 386-394. 79 CHAPTER 3 - EVALUATION OF THE PE RFORMANCE PREDICTION MODELS 3.1 INTRODUCTION The original local calibration of performance models was performed based on version 2.0 of the Pavement -ME software (1-3). Since then, several modifications and improvements have been incorporated in later v ersions of the software. Hence, comparisons of performance predictions among different software versions (i.e., versions 2.0, 2.2 , and 2.3) were warranted. These comparisons will highlight the modifications in the models and the need for re -calibration. Th is chapter includes the results of performance prediction comparisons among versions and measured performance data and the recalibration of the rigid pavement performance models using fixed -effects models and mixed -effects models . 3.2 EVALUATION OF PERFOR MANCE MODELS The following comparisons were made for both rigid and flexible pavements: 1. Version 2.2 using version 2.0 global calibration coefficients (V 2.2 with G 2.0) and version 2.0 using version 2.0 global calibration coefficients (V 2.0 with G 2.0) 2. V 2. 3 with G 2.0 versus V 2.2 with G 2.0 3. V 2.3 with G 2.0 versus V 2.0 with G 2.0 4. V 2.3 with Local versus V 2.3 with G 2.0 The first three comparisons will highlight the changes in the performance models among different versions (if any). The last comparison will show the impact of the previous local calibration on the performance predictions using the latest software version. 80 3.2.1 Rigid Pavements Three performance measures were compared for rigid pavements: (a) transverse cracking, (b) faulting, and (c) pavement roughness in terms of International Roughness Index (IRI). Figure 3-1 shows all the above -mentioned comparisons for transverse cracking for both reconstructs and unbonded concret e overlays (UCO) pavement sections. A total of 28 projects (20 reconstructs and 8 UCO) were used in these comparisons. 
The same pavement sections were used for the previous local calibration. Some differences in predictions were observed between versions 2.0 and 2.2 and between versions 2.0 and 2.3. However, practically no difference in cracking predictions was observed between versions 2.2 and 2.3. These results imply that the cracking prediction model changed between versions 2.0 and 2.2. The impact of the previous local calibration can be observed in Figure 3-1(d). These differences in the performance predictions are quantified in terms of the standard error of estimate (SEE) and bias in Table 3-1.

Figure 3-1 Comparison of predicted transverse cracking (reconstructs and UCO)
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-1 Standard errors and biases for transverse cracking (reconstructs and UCO)
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          1.72              0.04              1.73              11.17
Bias         0.65              0.02              0.67              5.68

Figure 3-2 shows all the comparisons for joint faulting for both reconstruct and UCO pavement sections. No appreciable differences in predictions were observed among versions 2.0, 2.2, and 2.3. These results imply that the faulting prediction model did not change. The impact of the previous local calibration can be observed in Figure 3-2(d). These differences in the performance predictions are quantified in terms of the standard error of estimate and bias in Table 3-2.

Figure 3-3 illustrates all the comparisons for IRI for both reconstruct and UCO pavement sections. Some differences in predictions were observed between versions 2.0 and 2.2 and between versions 2.0 and 2.3. However, no significant difference in predicted IRI was observed between versions 2.2 and 2.3. These results imply that the IRI prediction model was modified from version 2.2 onwards.
The impact of the previous local calibration can be observed in Figure 3-3(d). These differences in the performance predictions are quantified in terms of the standard error of estimate and bias in Table 3-3.

Trend lines fitted in Figure 3-1 (predicted cracking, %): (a) y = 1.3939x + 0.1118, R² = 0.9646; (b) y = 1.0018x + 0.0135, R² = 0.9999; (c) y = 1.3964x + 0.1255, R² = 0.9646.

Figure 3-2 Comparison of predicted faulting (reconstructs and UCO)
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-2 Standard errors and biases for faulting (reconstructs and UCO)
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          0.002             0.001             0.003             0.034
Bias         -0.002            0.000             -0.002            -0.021
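The standard error of estimate and bias reported in these comparison tables can be reproduced from paired predictions. The sketch below is only illustrative: the text does not define either statistic, so it assumes that bias is the mean difference and that SEE is the root-mean-square difference, and the function name and the sample values are hypothetical.

```python
import math


def see_and_bias(predicted, reference):
    """Return (SEE, bias) for paired values.

    Assumed definitions (not stated in the text):
      bias = mean(predicted - reference)
      SEE  = sqrt(mean((predicted - reference)^2))
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - r for p, r in zip(predicted, reference)]
    n = len(diffs)
    bias = sum(diffs) / n
    see = math.sqrt(sum(d * d for d in diffs) / n)
    return see, bias


# Hypothetical cracking predictions (%) from two software versions.
v20 = [0.0, 2.0, 5.0, 10.0]
v23 = [0.0, 3.0, 7.0, 14.0]
see, bias = see_and_bias(v23, v20)
print(f"SEE = {see:.2f}, bias = {bias:.2f}")
```

Under these assumptions a positive bias means that the first series is on average higher than the second, which matches the sign convention suggested by the tables (for example, a negative bias where the predictions fall below the measured values).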
Trend lines fitted in Figure 3-2 (predicted faulting, inch): (a) y = 0.9919x - 0.0012, R² = 0.999; (b) y = 0.9962x - 6E-05, R² = 0.9996; (c) y = 0.9881x - 0.0013, R² = 0.9986.

Figure 3-3 Comparison of predicted IRI (reconstructs and UCO)
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-3 Standard errors and biases for IRI (reconstructs and UCO)
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          4.97              0.46              4.96              17.84
Bias         -3.2              -0.1              -3.2              8.0

Trend lines fitted in Figure 3-3 (predicted IRI, inch/mile): (a) y = 0.9562x + 1.023, R² = 0.9849; (b) y = 0.9977x + 0.1641, R² = 0.9998; (c) y = 0.9544x + 1.1454, R² = 0.9855.

3.2.2 Flexible Pavements

Similar comparisons among the software versions were made for flexible pavements. These comparisons highlight the changes (if any) in the performance models among the versions.
Four performance measures were compared for flexible pavements: (a) longitudinal cracking, (b) fatigue cracking, (c) surface rutting, and (d) pavement roughness in terms of the International Roughness Index (IRI). A total of 25 pavement sections (5 crush and shape, 10 freeway, and 10 non-freeway) were randomly selected from the set of flexible pavement sections used in the previous local calibration. Figure 3-4 shows all the comparisons for longitudinal (top-down) cracking for the randomly selected flexible pavement sections. The main purpose of these comparisons is to see whether the performance models were modified in the newer versions of the software. No appreciable differences in predictions were observed among versions 2.0, 2.2, and 2.3. The impact of the previous local calibration can be observed in Figure 3-4(d). These differences in the performance predictions are quantified in terms of the standard error of estimate and bias in Table 3-4.

Figure 3-4 Comparison of predicted longitudinal cracking
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-4 Standard errors and biases for longitudinal cracking
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          2.41              10.17             10.14             314.50
Bias         -0.71             1.04              0.29              235.17

Figure 3-5 shows all the comparisons for fatigue (bottom-up) cracking for the same subset of flexible pavement sections.
No appreciable differences in predictions were observed among versions 2.0, 2.2, and 2.3. The impact of the previous local calibration can be observed in Figure 3-5(d). These differences in the performance predictions are quantified in terms of the standard error of estimate and bias in Table 3-5.

Trend lines fitted in Figure 3-4 (predicted cracking, ft/mile): (a) y = 0.9956x - 0.1309, R² = 1; (b) y = 1.0084x - 0.5451, R² = 0.9994; (c) y = 1.0038x - 0.5905, R² = 0.9992.

Figure 3-5 Comparison of predicted fatigue cracking
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-5 Standard errors and biases for fatigue cracking
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          0.00              0.03              0.03              6.41
Bias         0.00              0.01              0.01              4.46

Trend lines fitted in Figure 3-5 (predicted cracking, %): (a) y = 0.9995x - 0.0018, R² = 1; (b) y = 1.0084x - 0.0002, R² = 0.9998; (c) y = 1.0078x - 0.0016, R² = 0.9998.

Results similar to those for cracking were observed for surface rutting and IRI, as shown in Figures 3-6 and 3-7, respectively.
The differences in the performance predictions are quantified in terms of the standard error of estimate and bias in Tables 3-6 and 3-7, respectively.

Figure 3-6 Comparison of predicted surface rutting
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-6 Standard errors and biases for surface rutting
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          0.01              0.01              0.01              0.36
Bias         0.00              0.00              0.00              -0.35

Trend lines fitted in Figure 3-6 (predicted rutting, inch): (b) y = 1.016x - 0.0047, R² = 0.9973; (c) y = 1.0095x - 0.0009, R² = 0.9967.

Figure 3-7 Comparison of predicted surface roughness (IRI)
Panels: (a) V 2.2 with G 2.0 versus V 2.0 with G 2.0; (b) V 2.3 with G 2.0 versus V 2.2 with G 2.0; (c) V 2.3 with G 2.0 versus V 2.0 with G 2.0; (d) V 2.3 with Local versus V 2.3 with G 2.0

Table 3-7 Standard errors and biases for surface roughness (IRI)
Comparison   V 2.2 vs. V 2.0   V 2.3 vs. V 2.2   V 2.3 vs. V 2.0   V 2.3 Local vs. V 2.3
SEE          0.95              0.51              0.94              11.59
Bias         -0.2              0.2               0.0               -10.4

3.3 COMPARISONS WITH MEASURED PERFORMANCE

The measured performance data were compared with the performance predicted by the different software versions.
This evaluation reflects the changes in the performance prediction models (if any) and the additional measured performance data for the selected pavement sections (1 to 3 years of additional data). The primary objective of the comparison is to see whether the previous local calibration is still valid with reasonable accuracy. Also, comparing the measured performance with the performance predicted by the different versions highlights the need for local re-calibration. The following comparisons were made for rigid and flexible pavements:

1. Measured performance versus predicted performance by version 2.0 using version 2.0 global calibration coefficients (V 2.0 with G 2.0)
2. Measured performance versus V 2.2 with G 2.0
3. Measured performance versus V 2.3 with G 2.0
4. Measured performance versus V 2.3 with previous local calibration coefficients

Trend lines fitted in Figure 3-7 (predicted IRI, inch/mile): (a) y = 1.003x - 0.3958, R² = 0.9982; (b) y = 1.0006x + 0.1303, R² = 0.9994; (c) y = 1.0016x - 0.0699, R² = 0.9982.

3.3.1 Rigid Pavements

Three performance measures were compared for rigid pavements: (a) transverse cracking, (b) faulting, and (c) IRI. Figure 3-8 shows all the above-mentioned comparisons for transverse cracking for both reconstruct and unbonded overlay pavement sections. Versions 2.0, 2.2, and 2.3 with the global coefficients all under-predict transverse cracking, highlighting the need for local calibration of the model [see Figures 3-8(a), (b), and (c)].
These comparisons were warranted to verify whether the previous local calibration still holds for the additional time-series distress data. Figure 3-8(d) shows the comparison between the measured and predicted cracking using the previous locally calibrated cracking model.

Figure 3-8 Predicted vs. measured transverse cracking (reconstructs and UCO)
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

The previous locally calibrated cracking model fits the measured data reasonably well; however, the variability (SEE) is higher than in the original local calibration (see Table 3-8). This variability could be attributed to modifications in the model and to the additional measured cracking data.

Table 3-8 Standard errors and biases between measured and predicted transverse cracking (reconstructs and UCO)
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       10.19    9.06     9.04     8.74
Bias      -4.40    -3.75    -3.73    1.95

Similarly, Figure 3-9 shows the above-mentioned comparisons for joint faulting. The previous locally calibrated faulting model still predicts the measured faulting accurately. The SEE based on the measured and predicted joint faulting is comparable to that of the previous model despite the additional faulting data (see Table 3-9).

Figure 3-9 Predicted vs. measured faulting (reconstructs and UCO)
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-9 Standard errors and biases between measured and predicted faulting (reconstructs and UCO)
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       0.053    0.052    0.051    0.025
Bias      0.024    0.023    0.023    0.002

Figure 3-10 Predicted vs. measured IRI (reconstructs and UCO)
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-10 Standard errors and biases between measured and predicted IRI (reconstructs and UCO)
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       20.9     21.1     21.1     19.9
Bias      -0.6     -3.8     -3.8     4.2

Figure 3-10 presents the comparisons for IRI. Although the previously calibrated IRI model fits the observed IRI reasonably well, the IRI model needs to be re-calibrated for the following reasons:

a. The IRI model was modified in the newer software versions
b. Additional data are available in the PMS database

The SEE and bias are also higher than in the previous calibration: they increased to 19.9 inches/mile and 4.2 inches/mile (see Table 3-10) from 13.4 inches/mile and -0.38 inch/mile (2), respectively.
3.3.2 Flexible Pavements

Comparisons between measured and predicted performance were also made for flexible pavements. Four performance measures were compared: (a) longitudinal cracking, (b) fatigue cracking, (c) surface rutting, and (d) IRI. Figures 3-11 to 3-14 show all the above-mentioned comparisons for the flexible pavement sections. These comparisons were warranted to verify whether the previous local calibration still holds for the additional time-series distress data. Tables 3-11 to 3-14 present the SEE and bias for all the flexible pavement comparisons. For all the performance measures, the predictions using the previous locally calibrated coefficients show the lowest bias and the lowest or a comparable SEE. It should be noted that these comparisons were made only to verify the previous local calibration efforts, using a subset of the flexible pavement sections considered in the previous calibration study. Therefore, the current local calibration coefficients for the flexible pavement models are still valid.

Figure 3-11 Predicted vs. measured longitudinal cracking
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-11 Standard errors and biases between measured and predicted longitudinal cracking
Version   V 2.0      V 2.2      V 2.3      V 2.3 Local
SEE       1213.53    1209.94    1212.40    1021.92
Bias      -756.71    -747.22    -756.42    -441.21

Figure 3-12 Predicted vs. measured fatigue cracking
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-12 Standard errors and biases between measured and predicted fatigue cracking
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       11.79    11.92    11.79    9.72
Bias      -7.11    -7.14    -7.11    -2.12

Figure 3-13 Predicted vs. measured rutting
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-13 Standard errors and biases between measured and predicted rutting
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       0.38     0.41     0.39     0.10
Bias      0.35     0.37     0.35     0.01

Figure 3-14 Predicted vs. measured IRI
Panels: (a) V 2.0 with G 2.0; (b) V 2.2 with G 2.0; (c) V 2.3 with G 2.0; (d) V 2.3 with Local

Table 3-14 Standard errors and biases between measured and predicted IRI
Version   V 2.0    V 2.2    V 2.3    V 2.3 Local
SEE       22.6     23.4     22.6     22.8
Bias      12.2     12.2     12.2     0.1

The results of the comparisons show that the performance models for rigid pavements (transverse cracking and IRI) have changed since Pavement-ME version 2.0. Because of these changes and the availability of additional time-series data, re-calibration of the rigid pavement transverse cracking and IRI performance models is warranted. For flexible pavements, the previous local calibration coefficients can still be used because the prediction models have not been modified in the later Pavement-ME versions.
3.4 AVAILABLE PMS CONDITION DATA

The measured performance data were extracted from the MDOT PMS database for 1992 to 2016 for transverse cracking and for 1998 to 2015 for faulting and IRI. Figure 3-15(a) shows the measured transverse cracking for the rigid pavement sections. A few sections were either considered partially (i.e., cracking data were considered only until a treatment was applied) or removed entirely from the re-calibration dataset because of their distress index (DI) progression over time. The cracking performance of the sections retained for re-calibration is shown in Figure 3-15(b).

Figure 3-15 Measured transverse cracking performance (axes: year vs. transverse cracking, %)
Panels: (a) All pavement sections; (b) Pavement sections for re-calibration

Figure 3-16 presents the IRI progressions for the JPCP sections. The sections with abnormal IRI performance were not included in the re-calibration efforts.

Figure 3-16 Measured IRI (axes: year vs. IRI, inch/mile)
Panels: (a) All pavement sections; (b) Pavement sections for re-calibration

3.5 RE-CALIBRATION OF RIGID PAVEMENT MODELS

The local calibration of the pavement performance prediction models is a challenging task that requires a significant amount of preparation. The effectiveness of local calibration depends on the input values and on the measured pavement distress and roughness. Pavement-ME software version 2.3 was executed using the as-constructed inputs for all the selected pavement sections, and the predicted performance was extracted from the output files. The measured performance data from the reconstruct and unbonded overlay sections were used to re-calibrate the performance prediction models by minimizing the sum of squared errors between the measured and predicted distresses, which forms the basis of regression analysis.
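The least-squares idea described above can be sketched in a few lines. The snippet assumes the commonly cited Pavement-ME JPCP transfer function, CRK = 100 / (1 + C4 * FD^C5), which this section does not state explicitly. The data points, grid ranges, and function names are hypothetical, and a brute-force grid search stands in for whatever nonlinear least-squares solver was actually used in the study.

```python
def predicted_cracking(fatigue_damage, c4, c5):
    """Percent slabs cracked for a given fatigue damage (assumed transfer function)."""
    return 100.0 / (1.0 + c4 * fatigue_damage ** c5)


def sum_squared_error(data, c4, c5):
    """Sum of squared differences between predicted and measured cracking."""
    return sum((predicted_cracking(fd, c4, c5) - measured) ** 2
               for fd, measured in data)


def calibrate(data, c4_grid, c5_grid):
    """Return (SSE, C4, C5) for the grid point with the smallest SSE."""
    best = None
    for c4 in c4_grid:
        for c5 in c5_grid:
            sse = sum_squared_error(data, c4, c5)
            if best is None or sse < best[0]:
                best = (sse, c4, c5)
    return best


# Hypothetical (fatigue damage, measured cracking %) pairs, generated here from
# known coefficients so that the search has an exact answer to recover.
true_c4, true_c5 = 0.5, -2.0
data = [(fd, predicted_cracking(fd, true_c4, true_c5))
        for fd in (0.1, 0.3, 0.6, 1.0, 1.5)]

c4_grid = [i / 10 for i in range(1, 11)]        # 0.1, 0.2, ..., 1.0
c5_grid = [-3.0 + 0.5 * j for j in range(5)]    # -3.0, -2.5, ..., -1.0
best_sse, best_c4, best_c5 = calibrate(data, c4_grid, c5_grid)
print(best_c4, best_c5, best_sse)
```

A grid search is used only because it needs no external libraries; with real, noisy data a gradient-based nonlinear least-squares routine would normally be preferred.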
Since the available performance data are longitudinal (repeated measures on the same subject over time), mixed-effects models were used in the re-calibration in addition to the standard fixed-effects models. Fixed-effects models in general do not account for subject-specific effects, as illustrated in this section. Figure 3-17 shows a single model fit to all the data, ignoring the subject-specific effects. Figure 3-18 shows the boxplot of the residuals for each test section.

Figure 3-17 Single fit for the entire data (axes: fatigue damage vs. measured cracking, %)

Figure 3-18 Boxplot of residuals for each test section for a single fit (axes: section vs. residual cracking, %)

Figure 3-18 shows that the boxes lie mostly above or below zero, indicating that the single model fails to account for subject-specific effects. Figure 3-19 shows the models fit to each test section individually. These models yield a pair of parameters (C4 and C5) for each test section and a significant reduction in the sum of squared errors. Figure 3-20 shows the boxplot of the residuals for each test section. The individual models account for most of the subject-specific effects: the spread of the residuals is much smaller than in Figure 3-18, and the boxes are now mostly centered on zero. While the individual models successfully account for variations due to the specific subjects in the study, the subjects are not treated as representatives of a larger population in such analyses.

Figure 3-19 Individual models for each test section (axes: fatigue damage vs. measured cracking, %)

The sampling distribution from which the subjects are drawn is of more interest than the sample itself.
Therefore, the purpose of mixed-effects models is to account for subject-specific variations more broadly, as random effects varying around population means (4). Thus, if a mixed-effects model is fit to the available performance data, each test section will have its own set of parameters that deviate from the population means because of the inherent behavior of the test section. If one is interested in predicting the future performance of an existing test section that is part of the sample, the subject-specific parameters can be used. However, if the interest is in predicting the performance of any test section in the population, the population mean parameters can be used. Figure 3-21 shows the boxplot of the residuals for each test section for the nonlinear mixed-effects (NLME) model. Note that the residuals are based on the predicted values at the individual test section level.

Figure 3-20 Boxplot of residuals for each test section for individual models (axes: section vs. residual cracking, %)

Figure 3-21 Boxplot of residuals for each test section for mixed-effects model (axes: section vs. residual cracking, %)

Due to data limitations, especially the limited sample size for rigid pavements, bootstrapping was used as a more robust way of quantifying the model standard error and bias. Bootstrapping estimates statistical parameters, such as the population mean and its confidence interval, from the sample by resampling with replacement. The method repeatedly draws, with replacement, a new sample of the same size as the original dataset and builds the distribution of the parameters from the resulting estimates.
The major assumption behind bootstrapping is that the sample distribution is a good approximation of the population distribution (2).

The following rigid pavement performance models in the Pavement-ME were locally re-calibrated for Michigan conditions:

1. Transverse cracking
2. IRI

3.5.1 Transverse Cracking Model

The global and previously calibrated local transverse cracking models were verified by comparing the predicted and measured cracking. Model adequacy and goodness-of-fit were assessed by comparing the standard error of the estimate (SEE) and bias of the models. After the quality checks, measured data were available for 35 sections from 28 projects, for a total of 205 data points. As previously mentioned, two models were used to obtain the estimates of the calibration coefficients:

(a) Option 1: Nonlinear fixed-effects model

The transverse cracking model was calibrated using all the available MDOT JPCP sections with the "no sampling" technique. Figure 3-22 shows the comparison between the measured and predicted transverse cracking and a comparison of the transfer functions for the global and locally re-calibrated models. The SEE, bias, and model coefficients (C4 and C5) are summarized in Table 3-15. Based on the results, the SEE reduced from 6.1 to 4.9 percent slabs cracked, and the bias reduced from -2.23 to -1.61 percent slabs cracked. The C4 and C5 coefficients changed from 0.52 and -2.17 to 0.13 and -3.18, respectively.

Figure 3-22 Local calibration results for transverse cracking using entire dataset - Option 1
Panels: (a) Measured vs. predicted cracking; (b) Fatigue damage vs. predicted cracking

Next, 1000 bootstrap samples were selected randomly with replacement from the selected pavement sections. Figure 3-23 illustrates the parameter distributions for the 1000 bootstrap calibrations. Figure 3-24 illustrates the SEE and bias distributions for the 1000 bootstrap validations.
The red dotted lines show the 95% confidence interval for the mean value (red dashed line) of the distribution, while the blue line shows the median value of the distribution. The average and median values for SEE, bias, C4, and C5 are summarized in Table 3-15. Since the C4 and C5 coefficients are not normally distributed, the median values are the better representation of their central tendency. The non-normal distribution is caused by the one section that has nearly 80% cracking. Depending on whether that section is included in the bootstrap sample, the distribution of the coefficients varies, and hence a bi-modal distribution is seen. Based on the median results, the SEE reduced from 5.8 to 4.3 percent slabs cracked, and the bias reduced in magnitude from -2.2 to -1.3 percent slabs cracked. The SEE and bias increased for the validations. The C4 and C5 coefficients were changed from 0.52 and -2.17 to 0.16 and -2.82, respectively.
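The resampling scheme described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the names `bootstrap_statistic` and `percentile_interval`, the seed, and the section-level cracking values are all hypothetical, and a plain mean stands in for the full re-fit of C4 and C5 that each bootstrap sample would receive in an actual calibration.

```python
import random
import statistics


def bootstrap_statistic(values, statistic, n_boot=1000, seed=1):
    """Return n_boot estimates of `statistic`, each computed on a sample
    drawn with replacement and of the same size as `values`."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return estimates


def percentile_interval(estimates, level=0.95):
    """Percentile interval read directly from the sorted estimates."""
    ordered = sorted(estimates)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int((1.0 - tail) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical data (assumption): 34 sections with little cracking
# and one section near 80 % slabs cracked.
cracking = [0.0] * 20 + [2.0] * 10 + [5.0] * 4 + [80.0]

estimates = bootstrap_statistic(cracking, statistics.mean)
boot_mean = statistics.mean(estimates)
boot_median = statistics.median(estimates)
ci_low, ci_high = percentile_interval(estimates)
print(f"mean={boot_mean:.2f} median={boot_median:.2f} "
      f"95% interval=({ci_low:.2f}, {ci_high:.2f})")
```

In this toy setting, each estimate shifts depending on whether the 80% section is drawn zero, one, or several times, which is the same mechanism that the text gives for the bi-modal coefficient distributions.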
Figure 3-23 Bootstrap sampling calibration results – Option 1 (1000 bootstraps): (a) SEE; (b) Bias; (c) C4; (d) C5

Table 3-15 Local calibration summary for transverse cracking (Option 1)

Sampling technique   Parameter                   SEE     Bias    C4      C5
No sampling          Global model                6.07    -2.23   0.52    -2.17
No sampling          Local model                 4.93    -1.61   0.13    -3.18
Bootstrapping        Global Model Mean           5.90    -2.26   0.52    -2.17
Bootstrapping        Global Model Median         5.83    -2.21   0.52    -2.17
Bootstrapping        Local Model Mean            4.33    -1.14   0.66    -2.63
Bootstrapping        Local Model Median          4.35    -1.33   0.16    -2.82
Bootstrapping        Local Model Mean - Val      7.28    -1.76   0.66    -2.63
Bootstrapping        Local Model Median - Val    6.86    -1.84   0.16    -2.82

Note: Bold values are the recommended calibration coefficients

Figure 3-24 Bootstrap sampling validation results – Option 1 (1000 bootstraps): (a) SEE; (b) Bias; (c) C4; (d) C5

The standard error of the re-calibrated cracking models based on bootstrapping was used to establish the relationship between the standard deviation of the measured cracking and the mean predicted cracking (2). These relationships are used to calculate the cracking for a specific reliability. Figure 3-25 presents the relationship for the cracking model. This relationship is used to predict the standard error for the mean predicted cracking (50% reliability). This standard error is then used to calculate the cracking at a given reliability.

Figure 3-25 Reliability plot for transverse cracking

(b) Option 2: Nonlinear mixed-effects model

As previously discussed, it is assumed that each test section has its inherent trend, which is dictated by the values of C4 and C5. Therefore, each test section is assumed to have its unique C4 and C5 varying around the population C4 and C5 values.
If the interest is to predict the behavior of the individual sections used in the model, then the individual-specific parameters can be used. However, since we are interested in the predictions for test sections across the state of Michigan, the population parameters should be used instead of the individual-section parameters.

The covariance structure of the model must also be decided. There are two parts of the covariance structure (recall: $\operatorname{var}(Y_i) \approx Z_i D Z_i^{T} + R_i$). The structure of $R_i$ is assumed to be equal to $\sigma^2 I_{n_i}$ based on the discussion in Chapter 2. However, several options exist for the structure of $D$. Since $D$ is a $2 \times 2$ matrix, two options exist: (a) a diagonal matrix with no correlation between the coefficients, and (b) a matrix assuming a correlation between the two coefficients. Two models were developed, one with each covariance structure. The AIC and BIC values for the first model were 1073.9 and 1081.4. The AIC and BIC values for the second model were 1031.8 and 1040.7. Hence the second model was chosen because of the lower AIC and BIC values.

[Figure 3-25 plot: x-axis Predicted Transverse Cracking (%); y-axis Measured Standard Error (%); fitted relationship y = 3.8013x^0.2948, R² = 0.9122]

Figure 3-26 shows the comparison between the measured and predicted transverse cracking and a comparison of the transfer function for the global and locally re-calibrated models. The SEE, bias, and model coefficients (C4 and C5) are summarized in Table 3-16. Although the SEE and bias are higher than for the fixed-effects model when the entire dataset was used, the bootstrapping results yield unimodal distributions for C4 and C5 (see Figure 3-27). The validation results can be seen in Figure 3-28. This indicates that the mixed-effects model accounted for the subject-specific variations and was not influenced by the one test section that has 80 percent cracking.
While the SEE was not reduced, the mean bias was substantially reduced from -2.16 to -0.10, which is on par with the validation results. The C4 and C5 coefficients were changed from 0.52 and -2.17 to 0.72 and -1.82, respectively.

Figure 3-26 Local calibration results for transverse cracking using entire dataset – Option 2: (a) Measured vs. predicted cracking; (b) Fatigue damage vs. predicted cracking

Figure 3-27 Bootstrap sampling calibration results – Option 2 (1000 bootstraps): (a) SEE; (b) Bias; (c) C4; (d) C5

Figure 3-28 Bootstrap sampling validation results – Option 2 (1000 bootstraps): (a) SEE; (b) Bias; (c) C4; (d) C5

Table 3-16 Local calibration summary for transverse cracking (Option 2)

Sampling technique   Parameter                   SEE     Bias    C4      C5
No sampling          Global model                6.07    -2.23   0.52    -2.17
No sampling          Local model                 6.23    -1.68   0.76    -1.70
Bootstrapping        Global Model Mean           5.78    -2.16   0.52    -2.17
Bootstrapping        Global Model Median         5.90    -2.12   0.52    -2.17
Bootstrapping        Local Model Mean            6.74    -0.10   0.72    -1.82
Bootstrapping        Local Model Median          5.90    -0.98   0.58    -1.80
Bootstrapping        Local Model Mean - Val      6.99    -0.11   0.72    -1.82
Bootstrapping        Local Model Median - Val    5.89    -1.13   0.58    -1.80

3.5.2 IRI Model

The IRI model was re-calibrated after the local calibration of the transverse cracking model. This distress is considered directly in the IRI model along with the site factor and spalling predictions.
Measured data for 34 sections from 20 projects, for a total of 247 data points, were available after the quality checks. Similar to the re-calibration of the transverse cracking model, both fixed-effects and mixed-effects models were fit. However, in these models, only the intercepts were assumed to vary for each test section.

(a) Option 1: Linear fixed-effects model

Figure 3-29 (a) shows the predicted and measured IRI for the global model for JPCP sections. It can be seen that the global IRI model coefficients do not predict the measured IRI reasonably well. Figure 3-29 (b) shows a similar plot for the locally calibrated model using no sampling. The figures show that the local calibration improves the IRI predictions for JPCP sections.

Figure 3-29 Local calibration results for IRI using entire dataset – Option 1: (a) Measured vs. predicted IRI before calibration; (b) Measured vs. predicted IRI after calibration

The SEE, bias, and model coefficients are summarized in Table 3-17. Based on the results, the SEE reduced from 29.9 to 10.0 inch/mile, and the bias reduced from 11.9 to -0.5 inch/mile. Similar to the cracking model, 1000 bootstrap samples were used to re-calibrate the IRI model. Figure 3-30 shows the parameter distributions for the 1000 bootstrap calibrations. Figure 3-31 shows the parameter distributions for the 1000 bootstrap validations. The average and median values for SEE, bias, and model coefficients (C1 to C4) are summarized in Table 3-17. Based on the mean results, the SEE reduced from 29.0 to 9.7 inch/mile, and the bias reduced from 11.8 to -0.5 inch/mile. For the validations, the SEE reduced from 29.0 to 12.9 inch/mile, and the bias reduced from 11.8 to -0.7 inch/mile.
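The two goodness-of-fit measures quoted throughout this section can be sketched as follows. The formulas are not written out in this section, so the sketch assumes that bias is the mean of (predicted minus measured) and that SEE is the root of the summed squared errors divided by (n minus the number of fitted parameters); the function names and the IRI values are hypothetical.

```python
import math


def bias(measured, predicted):
    """Mean signed error; positive values indicate over-prediction."""
    errors = [p - m for m, p in zip(measured, predicted)]
    return sum(errors) / len(errors)


def standard_error_of_estimate(measured, predicted, n_params=0):
    """Root of the squared-error sum divided by (n - n_params)."""
    sse = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(sse / (len(measured) - n_params))


# Hypothetical IRI values in inch/mile, for illustration only.
measured_iri = [100.0, 110.0, 120.0]
predicted_iri = [102.0, 108.0, 123.0]

print(bias(measured_iri, predicted_iri))
print(standard_error_of_estimate(measured_iri, predicted_iri))
```

Under this sign convention, a positive bias such as the global IRI model's corresponds to over-prediction, and a negative bias such as the global cracking model's corresponds to under-prediction.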
Figure 3-30 Bootstrap sampling calibration results – Option 1 (1000 bootstraps): (a) SEE; (b) Bias; (c) C1; (d) C2; (e) C3; (f) C4

Table 3-17 Local calibration summary for IRI (Option 1)

Sampling technique   Parameter                   SEE     Bias    C1     C2     C3     C4
No sampling          Global model                29.89   11.92   0.82   0.44   1.49   25.24
No sampling          Local model                 9.96    -0.45   0.92   2.87   1.27   45.23
Bootstrapping        Global Model Mean           29.03   11.77   0.82   0.44   1.49   25.24
Bootstrapping        Global Model Median         28.62   11.34   0.82   0.44   1.49   25.24
Bootstrapping        Local Model Mean            9.72    -0.45   0.96   2.88   1.21   46.86
Bootstrapping        Local Model Median          9.73    -0.44   0.92   2.89   1.26   45.69
Bootstrapping        Local Model Mean - Val      12.93   -0.73   0.96   2.88   1.21   46.86
Bootstrapping        Local Model Median - Val    11.48   -1.11   0.92   2.89   1.26   45.69

Note: Bold values are the recommended calibration coefficients

Figure 3-31 Bootstrap sampling validation results – Option 1 (1000 bootstraps): (a) SEE; (b) Bias; (c) C1; (d) C2; (e) C3; (f) C4

(b) Option 2: Linear mixed-effects model

In this model, each test section is assumed to have its own intercept. Figure 3-32 (b) shows a similar plot for the locally calibrated model using no sampling. The figures show that the local calibration improves the IRI predictions for JPCP sections.

Figure 3-32 Local calibration results for IRI using entire dataset – Option 2: (a) Measured vs. predicted IRI before calibration; (b) Measured vs. predicted IRI after calibration

The SEE, bias, and model coefficients are summarized in Table 3-18. Based on the results, the SEE reduced from 29.9 to 10.8 inch/mile, and the bias reduced from 11.9 to 0.04 inch/mile. Similar to the cracking model, 1000 bootstrap samples were used to re-calibrate the IRI model. Figure 3-33 shows the parameter distributions for the 1000 bootstrap calibrations. Figure 3-34 shows the parameter distributions for the 1000 bootstrap validations. The average and median values for SEE, bias, and model coefficients (C1 to C4) are summarized in Table 3-18. Based on the mean results, the SEE reduced from 29.9 to 10.9 inch/mile, and the bias reduced from 12.4 to 0.19 inch/mile. For the validations, the SEE reduced from 29.9 to 11.9 inch/mile, and the bias reduced from 12.4 to -0.80 inch/mile. The bias is significantly reduced in the case of the mixed-effects model, while the SEE values are slightly higher than in the case of the fixed-effects model.
Table 3-18 Local calibration summary for IRI (Option 2)

Sampling technique   Parameter                   SEE     Bias    C1     C2     C3     C4
No sampling          Global model                29.89   11.92   0.82   0.44   1.49   25.24
No sampling          Local model                 10.83   0.04    1.78   3.13   1.38   24.59
Bootstrapping        Global Model Mean           29.93   12.44   0.82   0.44   1.49   25.24
Bootstrapping        Global Model Median         29.86   12.36   0.82   0.44   1.49   25.24
Bootstrapping        Local Model Mean            10.94   0.19    1.77   3.12   1.27   28.08
Bootstrapping        Local Model Median          10.86   0.22    1.81   3.10   1.37   27.49
Bootstrapping        Local Model Mean - Val      11.86   -0.80   1.77   3.12   1.27   28.08
Bootstrapping        Local Model Median - Val    11.53   -0.90   1.81   3.10   1.37   27.49

Figure 3-33 Bootstrap sampling calibration results – Option 2 (1000 bootstraps): (a) SEE; (b) Bias; (c) C1; (d) C2; (e) C3; (f) C4

Figure 3-34 Bootstrap sampling validation results – Option 2 (1000 bootstraps): (a) SEE; (b) Bias; (c) C1; (d) C2; (e) C3; (f) C4

3.6 IMPLEMENTATION CHALLENGES

After the re-calibration of the cracking and IRI models, several pavement projects previously designed with the Pavement-ME were redesigned using the revised coefficients. These design evaluations highlighted potential issues in implementing the Pavement-ME for rigid pavements in Michigan, as discussed below.

3.6.1 Impact of Re-calibration on Pavement Designs

The initial design thicknesses are based on AASHTO 93.
For the same design, different Pavement-ME versions were used to determine the thickness for which the predicted pavement performance at 20 years is 15% slabs cracked for transverse cracking or 172 inches/mile for IRI. Another criterion used by MDOT is to limit the slab thickness designed by the Pavement-ME to within ±1 inch of the AASHTO 93 design thickness. That means that even if the Pavement-ME slab thickness is more than 1 inch thinner than the AASHTO 93 thickness, it will be restricted to 1 inch thinner than the initial design, and vice versa. For example, if the designed slab thickness is 9 inches and the initial AASHTO 93 design thickness is 10.5 inches, the final slab thickness will be 9.5 inches. Figure 3-35 shows the comparison of PCC slab thicknesses designed by using AASHTO 93, the Pavement-ME versions 2.0 and 2.3 with the previous local calibration coefficients, and version 2.3 with the re-calibrated coefficients for four mainline pavement designs. The critical distress for all the designs was IRI, and all designs showed no predicted cracking. However, noticeable cracking was observed in a few pavement sections used for re-calibration. MDOT engineers were concerned with the prediction of no cracking by the Pavement-ME. The probable causes for the prediction of no cracking were investigated, and the findings are discussed in the next section.

Figure 3-35 Design thickness comparisons for mainline roads (x-axis: ESALs (million); y-axis: Thickness (inch); OC = originally locally calibrated models, NC = newly locally calibrated models)

3.6.2 Lessons Learned

The main reason for the zero cracking prediction is the very low fatigue damage predicted at the end of the design life (20 years). It should be noted that fatigue damage (top-down and bottom-up) depends on the material properties, pavement structure, climate, and traffic inputs for the pavement sections. The damage is calculated by mechanistic models in the Pavement-ME.
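The link between fatigue damage and percent slabs cracked can be illustrated with a short sketch. It assumes the JPCP transfer function has the form CRK = 100 / (1 + C4 · DF^C5), which is consistent with the C4 and C5 values reported in Section 3.5.1 but is not written out in this section; the function name is hypothetical, and the local coefficients used are the Option 1 bootstrap medians from Table 3-15.

```python
def percent_slabs_cracked(damage, c4, c5):
    """Percent slabs cracked for a cumulative fatigue damage value,
    assuming the form CRK = 100 / (1 + c4 * damage**c5)."""
    if damage <= 0.0:
        return 0.0  # no damage, no predicted cracking
    return 100.0 / (1.0 + c4 * damage ** c5)


GLOBAL = (0.52, -2.17)   # global coefficients quoted in the text
LOCAL = (0.16, -2.82)    # Option 1 bootstrap medians (Table 3-15)

for damage in (0.01, 0.1, 1.0, 10.0):
    crk_global = percent_slabs_cracked(damage, *GLOBAL)
    crk_local = percent_slabs_cracked(damage, *LOCAL)
    print(f"damage={damage:<6} global={crk_global:6.2f}%  local={crk_local:6.2f}%")
```

Under this assumed form, a damage of 0.1 gives less than 2% slabs cracked with either coefficient set, so designs whose inputs produce very low damage cannot show meaningful predicted cracking regardless of the calibration.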
The local calibration uses this damage and the calibration coefficients to match the predicted and measured cracking. The predicted damage for the re-calibration pavement sections was obtained from the Pavement-ME outputs. Figure 3-36 shows the predicted and measured cracking and the associated damage for all pavement sections used in the re-calibration. It can be seen that the predicted cracking fits the measured cracking reasonably well. For cumulative damage lower than 0.1, the cracking levels are negligible. If the inputs for a design result in very low damage, no cracking will be predicted.

Figure 3-36 Relationship between damage and percent slab cracked

A few pavement sections used in the re-calibration exhibited high levels of cracking (i.e., more than 30% slabs cracked in 10 years); however, the predicted cracking for those sections using the as-built design inputs was negligible. As a result, the input permanent curl/warp effective temperature difference for such a section was changed from -10 °F, which is the recommended default, to a lower value. The values of permanent curl/warp were selected such that the predicted cracking is close to the measured cracking. Note that similar adjustments for the permanent curl/warp were also made in the national calibration (5; 6). However, when designing a new pavement project, the default value of -10 °F is used since it is difficult to anticipate the as-constructed permanent curl/warp for future projects. More discussion on this topic can be found in Chapter 2. Also, no measured curl values for different pavement sections in Michigan are available. Since no guidance is available at the design stage, using a default permanent curl/warp value will result in a negligible damage prediction in Michigan.
Therefore, some guidance should be available to the designers for estimating a future project-specific permanent curl/warp value; however, this can be very challenging since the permanent curl/warp value is construction related. One has to realize that the Pavement-ME predicts only structural cracking. Changing the permanent curl/warp value to match the measured cracking implies that the predicted cracking also includes non-structural cracking (i.e., material and construction related), if any.

Generally, PCC paving is performed in the mornings of summer days in Michigan. These conditions can expose the PCC slabs to a high positive temperature difference from intense solar radiation plus the heat of hydration. Depending on the exposure conditions, a significant positive temperature gradient (the upper portion of the slab is much warmer than the bottom) may be present at the time of hardening of the PCC slabs. This gradient is termed the "built-in temperature gradient" or the "zero-stress temperature gradient" (7). When the temperature gradient in the slabs falls below the gradient locked in at the time of construction (the zero-stress gradient), the slabs will attempt to curl upward, causing tensile stress at the top of the slab, which can lead to top-down cracking of JPCP. Thus, an effective negative temperature gradient is permanently "built" into the slabs. The upward curling of PCC slabs is restrained by several factors, including the slab self-weight, dowels, and the weight of any base course bonded to the slab. It is possible that this negative curl is present on the severely cracked pavement sections in Michigan. To make matters worse, if the PCC paving is done in the morning, the maximum heat of hydration and the maximum solar radiation may occur at about the same time, resulting in a large built-in temperature gradient when the slab solidifies.
However, if PCC paving is performed later in the afternoon or at night, so that the highest temperature from the heat of hydration does not coincide with the most intense solar radiation, the permanent temperature gradient "built" into the slab will be much lower and could potentially even be negative. Also, moist curing can assist in producing a lower "zero-stress" or "built-in" permanent temperature gradient than a regular curing compound. Therefore, there is a need to estimate the permanent curl/warp value for a given location for future designs based on the material, climate, and design parameters. The caveat in adopting such an approach for increasing cracking predictions is the assumption that all the observed cracking is due to a combination of load and curling/warping and not to material-related distresses.

The challenges mentioned above for design can be addressed by considering the project-specific permanent curl/warp value. A procedure to estimate the permanent curl/warp value from the project-specific design, material, and climatic inputs was developed, as documented below (8).

3.7 PERMANENT CURL/WARP MODEL

The permanent curl/warp of a pavement section depends on its material, climate, and design parameters. Several inputs related to material properties (i.e., compressive strength, aggregate type, etc.), design parameters (slab thickness, slab width, joint spacing, etc.), and climatic data (wind speed, temperatures, precipitation, etc.) were obtained for each of the pavement sections used in the re-calibration. For each section, the permanent curl values were estimated by matching the measured and predicted cracking (version 2.3 global coefficients). These estimated permanent curl values were used as the dependent variable, and all other inputs mentioned above were used as independent variables. Several models were fit to explain the relationship between the site-specific permanent curl and the inputs.
Based on the goodness-of-fit ($R^2 = 0.918$) and the practical impact of the variables on the curl, the following model was chosen to predict permanent curl per inch. The predicted permanent curl is a function of compressive strength, average wind speed, mean annual precipitation, and maximum temperature range for a location in Michigan.

2'' 0.2090.08050.00 Curl/Warp2.9 270.2770.0003 755WSMRfcMP fcWS

where:
Curl/Warp = Permanent curl/warp effective temperature difference (°F) per inch
WS = Average wind speed (mi/hr)
MR = Maximum temperature range (°F)
f'c = 28-day PCC compressive strength (psi)
MP = Mean annual precipitation (inches)

The values for average wind speed and maximum temperature range should be obtained from the first row (month of construction) of the Pavement-ME output file "Monthly Climate Summary.csv." The mean annual precipitation is obtained from the pdf output file for a design project. Figure 3-37 shows the goodness-of-fit for the developed model. It should be noted that the predicted curl/warp obtained from the above model needs to be multiplied by the slab thickness times -1 to obtain its total value, as shown below:

$\mathrm{PCETD} = \mathrm{Curl/Warp} \times (-1) \times \mathrm{slab\ Th}$ (2)

where:
PCETD = Permanent curl/warp effective temperature difference (°F)
Curl/Warp = Obtained from Equation (1)
slab Th = Slab thickness (inches)

The PCETD value depends on the slab thickness, and hence it is recommended that the value be changed for each trial design in the Pavement-ME. Note that the actual permanent curl values in Figure 3-37 were not measured in the field. The values were obtained by matching the predicted to the measured cracking performance for each pavement section. Table 3-19 shows the ranges of all the inputs used to develop the model.

Figure 3-37 Predicted versus actual permanent curl in Michigan (axes: predicted and actual permanent curl, °F per inch; fitted line y = 0.8939x + 0.0537, R² = 0.918)

Table 3-19 Summary of attributes used in the model

Job No.    Predicted curl (°F)   Compressive strength (psi)   Maximum temp. range   Average wind speed (mi/hr)   Mean annual precipitation (in)
28215      -25                   5412                         28                    6.1                          27.25
28215      -22                   5412                         28                    6.1                          27.25
32516      -28                   4892                         41                    5.5                          29.78
34014      -26                   5765                         27                    6.8                          31.25
34014      -23                   5765                         27                    6.8                          31.25
34963      -40                   6398                         34                    6.1                          27.09
34963      -40                   6398                         34                    6.1                          27.09
36003      -30                   5334                         28                    6.1                          27.25
36003      -36                   5334                         28                    6.1                          27.25
36005      -17                   4765                         25                    5.6                          27.89
36005      -16                   4765                         25                    5.6                          27.89
38063      -27                   6498                         25                    5.6                          27.89
38094      -27                   5958                         33                    5.7                          30.24
38094      -40                   5958                         33                    5.7                          30.24
38100      -20                   4799                         25                    5.6                          27.89
53168      -19                   4735                         31                    6.9                          31.44
53168      -21                   4735                         31                    6.9                          31.44
54361      -30                   4961                         26                    7.7                          32.33
59066      -20                   5600                         27                    6.8                          31.25
59066      -18                   5600                         27                    6.8                          31.25
Average    -26                   5454                         29                    6.2                          29.26
Std        8                     590                          4                     0.6                          1.92
Min.       -                     4,735                        25                    5.5                          27.09
Max.       -                     6,398                        41                    7.7                          32.33

Note: * All these independent variables can be obtained from the Pavement-ME output in the climatic summary.

The model will be valid within these ranges. It is also recommended to use minimum and maximum values of -30 and -10 °F, respectively, for the permanent curl.

3.8 RE-CALIBRATION BASED ON PERMANENT CURL MODEL

Based on Equation (1), the permanent curl for each pavement section in the calibration dataset can be predicted. These values were used to predict the pavement damage for each section. The predicted damage was then used to match the measured cracking in rigid pavement sections. The results of the re-calibration of the cracking and IRI models are presented below.

Transverse Cracking Model

(a) Option 1: Nonlinear fixed-effects model

Figure 3-38 shows the comparison between the measured and predicted transverse cracking and a comparison of the transfer function for the global and locally re-calibrated models. The transverse cracking model was calibrated using all of the available MDOT JPCP pavement sections for the no sampling technique.
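The permanent curl input used for each section in this re-calibration follows from Equation (2) together with the limits recommended in Section 3.7. A minimal sketch of that conversion is given here; the function name `pcetd` and the example values are hypothetical, the per-inch curl is taken as a given input rather than computed from Equation (1), and applying the -30 and -10 °F limits as a clamp on the total value is an interpretation of the recommendation.

```python
def pcetd(curl_per_inch, slab_thickness_in, lower=-30.0, upper=-10.0):
    """Permanent curl/warp effective temperature difference (deg F).

    Equation (2): total value = per-inch curl x slab thickness x (-1).
    The result is then limited to the recommended range (assumed to act
    as a clamp): no lower than -30 deg F and no higher than -10 deg F.
    """
    total = -1.0 * curl_per_inch * slab_thickness_in
    return max(lower, min(upper, total))


# Hypothetical trial designs: (curl per inch in deg F/in, thickness in inches)
for curl, thickness in [(2.5, 10.0), (0.5, 9.0), (4.0, 10.0)]:
    print(curl, thickness, pcetd(curl, thickness))
```

Because the total value scales with thickness, the input would need to be recomputed for every trial thickness, as the text recommends.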
The SEE, bias, and model coefficients (C4 and C5) are summarized in Table 3-20.

Figure 3-38 Local calibration results for transverse cracking using entire dataset – Option 1: (a) Measured vs. predicted cracking; (b) Fatigue damage vs. predicted cracking

Figure 3-39 presents the parameter distributions for the 1000 bootstrap calibrations. Figure 3-40 presents the parameter distributions for the 1000 bootstrap validations. The average and median values for SEE, bias, C4, and C5 are summarized in Table 3-20. Since the C4 and C5 coefficients are not normally distributed, the median values are the better representation of their central tendency.

Table 3-20 Local calibration summary for transverse cracking – Option 1

Sampling technique   Parameter                   SEE     Bias    C4      C5
No sampling          Global model                6.51    -2.68   0.52    -2.17
No sampling          Local model                 6.06    -1.28   0.71    -1.38
Bootstrapping        Global Model Mean           6.40    -2.70   0.52    -2.17
Bootstrapping        Global Model Median         6.43    -2.66   0.52    -2.17
Bootstrapping        Local Model Mean            5.65    -1.24   1.01    -1.76
Bootstrapping        Local Model Median          5.67    -1.10   0.70    -1.34
Bootstrapping        Local Model Mean - Val      7.41    -1.51   1.01    -1.76
Bootstrapping        Local Model Median - Val    7.67    -1.48   0.70    -1.34

Note: Bold values are the recommended calibration coefficients

(b) Option 2: Nonlinear mixed-effects model

Similar to the calibration with no curl values, the model with a covariance structure with non-diagonal elements was chosen because of the lower AIC and BIC values. Figure 3-41 shows the comparison between the measured and predicted transverse cracking and a comparison of the transfer function for the global and locally re-calibrated models.
The transverse cracking model was calibrated using all of the available MDOT JPCP pavement sections for the no sampling technique. The SEE, bias, and model coefficients (C4 and C5) are summarized in Table 3-21.

Figure 3-39 Bootstrap sampling calibration results (1000 bootstraps) – Option 1: (a) SEE; (b) Bias; (c) C4; (d) C5

Figure 3-40 Bootstrap sampling validation results (1000 bootstraps) – Option 1: (a) SEE; (b) Bias; (c) C4; (d) C5

Figure 3-42 presents the parameter distributions for the 1000 bootstrap calibrations. Figure 3-43 presents the parameter distributions for the 1000 bootstrap validations. The average and median values for SEE, bias, C4, and C5 are summarized in Table 3-21. Although the SEE is higher than for the fixed-effects model when the entire dataset was used, the bootstrapping results yield unimodal distributions for C4 and C5. This indicates that the mixed-effects model accounts for the subject-specific variations and was not influenced by the one test section that has 80 percent cracking.

Figure 3-41 Local calibration results for transverse cracking using entire dataset – Option 2: (a) Measured vs. predicted cracking; (b) Fatigue damage vs. predicted cracking

Table 3-21 Local calibration summary for transverse cracking – Option 2

Sampling technique   Parameter                   SEE     Bias    C4      C5
No sampling          Global model                6.51    -2.68   0.52    -2.17
No sampling          Local model                 9.17    -0.45   0.09    -2.47
Bootstrapping        Global Model Mean           6.09    -2.40   0.52    -2.17
Bootstrapping        Global Model Median         6.10    -2.32   0.52    -2.17
Bootstrapping        Local Model Mean            8.53    0.58    0.47    -2.11
Bootstrapping        Local Model Median          7.56    -0.53   0.21    -2.19
Bootstrapping        Local Model Mean - Val      7.55    -0.97   0.47    -2.11
Bootstrapping        Local Model Median - Val    7.25    -1.99   0.21    -2.19

Note: Bold values are the recommended calibration coefficients

Figure 3-42 Bootstrap sampling calibration results (1000 bootstraps) – Option 2: (a) SEE; (b) Bias; (c) C4; (d) C5

Figure 3-43 Bootstrap sampling validation results (1000 bootstraps) – Option 2: (a) SEE; (b) Bias; (c) C4; (d) C5

IRI Model

(a) Option 1: Linear fixed-effects model

The IRI model was re-calibrated after the local calibration of the transverse cracking model. This distress is considered directly in the IRI model along with the site factor and spalling predictions. Figure 3-44 (a) shows the predicted and measured IRI for the global model for JPCP sections. It can be seen that the global IRI model coefficients do not predict the measured IRI reasonably well.
Figure 3 -44 (b) shows similar plot for the local calibrated model using no sampling. The figures show that the local calibration improves the IRI predictions for JPCP sections. (a) Measured vs. predicted IRI before calibration (b) Measured vs. predicted IRI after cali bration Figure 3 -44 Local calibration results for IRI using entire dataset Œ Option 1 The SEE, bias, and model coefficients are summarized in Table 3-22. Similar to the cracking model, 1000 bootstrap samples were used to recalibrate the IRI model. Figure 3 -45 shows the parameter distributions for the 1000 bootstrap calibrations. Figure 3 -46 shows the parameter distributions for the 1000 bootstrap validations. 04080120160200240Measured IRI (in/mile)04080120160200240Predicted IRI (in/mile)04080120160200240Measured IRI (in/mile)04080120160200240Predicted IRI (in/mile)129 Tabl e 3-22 Local calibration summary for IRI Œ Option 1 Sampling technique Parameter SEE Bias C1 C2 C3 C4 No sampling Global model 64.63 43.52 0.82 0.44 1.49 25.24 Local model 15.41 -1.28 0.53 9.90 0.43 43.10 Bootstrapping Global Model Mean 63.04 43.29 0.82 0.44 1.49 25.24 Global Model Median 62.67 42.89 0.82 0.44 1.49 25.24 Local Model Mean 14.34 -0.99 0.42 9.33 0.70 33.29 Local Model Median 14.30 -0.90 0.49 9.37 0.52 36.78 Local Model Mean - Val 22.26 -0.88 0.42 9.33 0.70 33.29 Local Model Median - Val 22.10 -2.02 0.49 9.37 0.52 36.78 Note: Bold values are the recommended calibration coefficients Figure 3 -45 Bootstrap sampling calibration results (1000 bootstraps) Œ Option 1 510152025(a) SEE0100200Frequency050100Percentage-4-202(b) Bias0200400Frequency050100Percentage0123(c) C105001000Frequency050100Percentage05101520(d) C202004006008001000Frequency050100Percentage00.511.52(e) C302004006008001000Frequency050100Percentage050100(f) C40100200Frequency050100Percentage130 Figure 3 -46 Bootstrap sampling validation results (1000 bootstraps) Œ Option 1 (b) Option 2 : Linear mixed -effects model Similar to the model where no curl was 
considered, each test section is assumed to have its own intercept. Figure 3-47 (b) shows a similar plot for the locally calibrated model using no sampling. The figures show that the local calibration improves the IRI predictions for JPCP sections. The bias is significantly reduced in the case of the mixed-effects models, while the SEE values are comparable to those of the fixed-effects model.

Figure 3-47 Local calibration results for IRI using entire dataset – Option 2 [panels: (a) Measured vs. predicted IRI before calibration, (b) Measured vs. predicted IRI after calibration; axes: measured IRI (in/mile) vs. predicted IRI (in/mile)]

Table 3-23 Local calibration summary for IRI – Option 2

| Sampling technique | Parameter | SEE | Bias | C1 | C2 | C3 | C4 |
|---|---|---|---|---|---|---|---|
| No sampling | Global model | 64.63 | 43.52 | 0.82 | 0.44 | 1.49 | 25.24 |
| No sampling | Local model | 15.72 | -0.40 | 0.90 | 11.27 | 0.59 | 21.25 |
| Bootstrapping | Global Model Mean | 69.89 | 47.23 | 0.82 | 0.44 | 1.49 | 25.24 |
| Bootstrapping | Global Model Median | 68.90 | 46.14 | 0.82 | 0.44 | 1.49 | 25.24 |
| Bootstrapping | Local Model Mean | 14.77 | -0.13 | 0.87 | 11.31 | 0.47 | 28.96 |
| Bootstrapping | Local Model Median | 14.78 | -0.20 | 0.96 | 11.13 | 0.43 | 27.93 |
| Bootstrapping | Local Model Mean - Val | 17.18 | -1.87 | 0.87 | 11.31 | 0.47 | 28.96 |
| Bootstrapping | Local Model Median - Val | 17.37 | -1.84 | 0.96 | 11.13 | 0.43 | 27.93 |

Note: Bold values are the recommended calibration coefficients

The SEE, bias, and model coefficients are summarized in Table 3-23. Similar to the cracking model, 1000 bootstrap samples were used to recalibrate the IRI model. Figure 3-48 shows the parameter distributions for the 1000 bootstrap calibrations. Figure 3-49 shows the parameter distributions for the 1000 bootstrap validations. The average and median values for SEE, bias, and model coefficients (C1 to C4) are summarized in Table 3-6.
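The resampling idea behind the bootstrap calibrations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: `calibrate_scale` is a hypothetical one-coefficient stand-in for the actual IRI and cracking transfer functions, the mixed-effects fitting is not reproduced, and SEE and bias are assumed here to be the root-mean-square and the mean of (predicted minus measured).

```python
import random
import statistics

def calibrate_scale(measured, predicted):
    """Hypothetical stand-in calibration: least-squares scale c with measured ~ c * predicted."""
    return sum(m * p for m, p in zip(measured, predicted)) / sum(p * p for p in predicted)

def bootstrap_calibration(measured, predicted, n_boot=1000, seed=0):
    """Recalibrate on n_boot resamples and summarize the coefficient, SEE, and bias."""
    rng = random.Random(seed)
    n = len(measured)
    coeffs, sees, biases = [], [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]      # resample with replacement
        m = [measured[i] for i in idx]
        p = [predicted[i] for i in idx]
        c = calibrate_scale(m, p)
        resid = [c * pi - mi for pi, mi in zip(p, m)]   # predicted minus measured
        coeffs.append(c)
        biases.append(sum(resid) / n)                   # assumed bias definition
        sees.append((sum(r * r for r in resid) / n) ** 0.5)  # assumed SEE definition
    return {
        "n": len(coeffs),
        "coef_mean": statistics.mean(coeffs),
        "coef_median": statistics.median(coeffs),
        "see_median": statistics.median(sees),
        "bias_median": statistics.median(biases),
    }
```

Reporting both the mean and the median of the resampled coefficients mirrors the "Mean" and "Median" rows of the calibration summary tables; the median is the more robust summary when the resampled distributions are skewed.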
Figure 3-48 Bootstrap sampling calibration results (1000 bootstraps) – Option 2 [panels: (a) SEE, (b) Bias, (c) C1, (d) C2, (e) C3, (f) C4]

Figure 3-49 Bootstrap sampling validation results (1000 bootstraps) – Option 2 [panels: (a) SEE, (b) Bias, (c) C1, (d) C2, (e) C3, (f) C4]

3.8.1 Impact of Re-Calibration on Pavement Designs

Table 3-24 shows the locations, design slab thicknesses (version 2.3), curl values, and failure criteria for a few of the design projects. Figure 3-50 compares the PCC slab thicknesses designed using AASHTO 93, Pavement-ME version 2.3 with the new calibration coefficients, and Pavement-ME version 2.3 with the calibration coefficients based on the curl model. The critical distress for all the designs is IRI. However, noticeable cracking levels were predicted for these sections. The designed slab thicknesses for all the projects increased by at least one inch compared with those from AASHTO 93.
Table 3-24 Local calibration summary for transverse cracking

| Project name | Location | Design thickness (in) | Curl value (°F) | Failure criteria |
|---|---|---|---|---|
| 118618 mainline | Grand Rapids | 10.5 | -17.32 | IRI |
| 117992 mainline | Grand Rapids | 11.5 | -14.42 | IRI |
| 115799 mainline | Flint | 11 | -29.84 | IRI |
| 115799 ramp | Flint | 9.5 | -25.77 | IRI |
| 100014 mainline | Flint | 13.5 | -36.6 | IRI |
| 100014 ramp | Flint | 10.5 | -28.5 | IRI |

Figure 3-50 Design thickness comparisons for mainline roads [axes: thickness (inch) vs. ESALs (million); series: AASHTO93, ME_2.3_NC, ME_2.3_NC_Curl]
NC = newly locally calibrated models, NC_Curl = newly locally calibrated models using the unique permanent curl/warp for each project.

3.9 SUMMARY

This section summarizes the local re-calibration of the transverse cracking and IRI models in the Pavement-ME. The local calibration process includes several sequential steps, as described elsewhere (2). The following is a summary of the findings:

• The initial local re-calibration of rigid pavement performance models showed no predicted cracking, mainly because the inputs used for design are different from those used for re-calibration. For example, a compressive strength of 5600 psi is used in design, while an average value of 3500 psi (MOR = 560 psi) was used in the initial re-calibration. All the previous ME designs showed no or very limited predicted cracking (i.e., negligible damage) and were controlled by IRI performance. As a consequence, the performance models were re-calibrated using the as-constructed compressive strength (an average value of 5600 psi).
• Since there was no predicted cracking using a permanent curl of -10°F (default value), the permanent curl was varied to match the measured performance for each pavement section in the calibration dataset.
• Climate data, material properties, and design parameters were used to develop a model for predicting permanent curl for each location. This model can be used at the design stage to estimate permanent curl for a given location in Michigan.
• The SEE and bias for the global cracking and IRI models are much higher than those of the locally re-calibrated models using all the selected pavement sections (i.e., the entire dataset).
• The main advantage of using a resampling technique such as bootstrapping is that it quantifies the variability (i.e., confidence interval) associated with the model predictions and parameters. Also, for a limited dataset, these techniques help in reducing the SEE and bias for the calibrated model. The quantification of the variability will also help in determining more robust design reliability in the Pavement-ME.
• Since the distributions of SEE and bias of the fixed-effects models are non-normal, the median model coefficients based on bootstrapping should be used. Those coefficients also showed much lower SEE and bias than the global coefficients for the cracking model.
• The previously calibrated local joint faulting model coefficients are still valid based on the lowest SEE and bias and should be used.
• The average IRI model coefficients from fixed-effects modeling using bootstrapping showed significantly lower SEE and bias as compared to the global coefficients.
• Using mixed-effects models instead of fixed-effects models reduces the bias significantly and accounts for the inherent subject variations. Mixed-effects models should therefore be used for calibration purposes in the future.

REFERENCES

[1] Haider, S. W., W. C. Brink, and N. Buch. Local calibration of rigid pavement performance models using resampling methods. International Journal of Pavement Engineering, 2015, pp. 1-13.
[2] Haider, S. W., N. Buch, W. Brink, K. Chatti, and G. Baladi. Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan Part 3: Local Calibration and Validation of the Pavement-ME Performance Models. Rep. No. RC-1595, Michigan State Univ., East Lansing, MI, 2014.
[3] Haider, S. W., W. C. Brink, N. Buch, and K. Chatti.
Process and Data Needs for Local Calibration of Performance Models in the AASHTOWARE Pavement ME Software. Transportation Research Record: Journal of the Transportation Research Board, No. 2523, 2015, pp. 80-93.
[4] MATLAB. Mixed-Effects Models Using nlmefit and nlmefitsa. https://www.mathworks.com/help/stats/mixed-effects-models-using-nlmefit-and-nlmefitsa.html. Accessed 07/10/2019, 2019.
[5] Sachs, S., J. M. Vandenbossche, and M. B. Snyder. Developing Recalibrated Concrete Pavement Performance Models for the Mechanistic-Empirical Pavement Design Guide. University of Pittsburgh, 2014.
[6] NCHRP Project 1-37A. Appendix FF: Calibration Sections for Rigid Pavements. ARA, Inc., ERES Division, 505 West University Avenue, Champaign, Illinois 61820, 2004, 2003.
[7] NCHRP. Guide for Mechanistic-Empirical Design of New and Rehabilitated Pavement Structures. Washington, D.C., 2004.
[8] Rao, C., L. Titus-Glover, B. Bhattacharya, M. Darter, M. Stanley, and H. Von Quintus. Estimation of Key PCC, Base, Subbase, and Pavement Engineering Properties from Routine Tests and Physical Characteristics. 2012.

CHAPTER 4 - DEVELOPMENT OF TRAFFIC INPUTS

This chapter presents the permanent traffic recorder (PTR) data collection and processing for developing traffic input defaults for the Pavement-ME. The methodologies used to develop the traffic inputs, including the classification models used to assign PTR sites to clusters, are also documented.

4.1 DATA COLLECTION AND PROCESSING

The data from the existing PTR sites maintained by MDOT were reviewed by a team of MDOT experts. Quality control checks were conducted on the data, which led to the final selection of PTR sites to be used for traffic data extraction.

4.1.1 Review of Existing Data Collection Sites

Continuous WIM and classification sites were utilized to acquire Level 1 traffic data throughout the State of Michigan.
These PTR sites are distributed across the entire State. Tables 4-1 and 4-2 list the MDOT permanent classification and WIM sites, respectively. The data from 62 PTR sites (19 Classification + 43 WIM & Classification) were evaluated. One classification site (Eagle) and two WIM sites (Omer and Luna Pier PrePass) did not pass the quality checks in the PrepME. Consequently, traffic data from the remaining 59 PTR sites were used, as shown in Table 4-3. The PrepME (1) was used for analyzing the raw traffic data from PTR sites and developing the Level 1 traffic inputs. MDOT provided the final PrepME database (January 2011 to December 2015) after quality checks (availability of at least one week of data in each month, front and gross vehicle weights). The research team extracted the Level 1 data for traffic inputs through PrepME.

Table 4-1 List of PTR sites with classification data only (Non-WIM)

| PTR Site | Route | Latitude | Longitude |
|---|---|---|---|
| 096479 | US-10 | 43.596 | -84.032 |
| 117139 | US-31 | 41.774 | -86.324 |
| 137069 | M-60 | 42.146 | -84.841 |
| 183029 | M-115 | 43.927 | -85.013 |
| 256309 | M-57 | 43.179 | -83.678 |
| 256349 | I-75 | 43.044 | -83.754 |
| 397109 | US-131 | 42.074 | -85.636 |
| 478219 | I-96 | 42.563 | -83.834 |
| 533269 | US-10 | 43.946 | -86.059 |
| 595249 | US-131 | 43.439 | -85.492 |
| 638209 | I-96 | 42.514 | -83.597 |
| 638409 | M-59 | 42.625 | -83.111 |
| 645269 | US-31 | 43.783 | -86.403 |
| 766069 | I-69 | 42.803 | -84.337 |
| 787329 | US-12 | 41.797 | -85.586 |
| 807289 | M-43 | 42.309 | -86.074 |
| 828440 | US-24 | 42.406 | -83.277 |
| 829799 | I-75 | 42.259 | -83.180 |

4.2 GENERATION OF MULTIPLE TRAFFIC INPUT LEVELS

Site-specific traffic inputs (Level 1) were generated for each of the 41 WIM sites using the PrepME after extensive QC checks. Development of regional inputs (Level 2) is crucial when site-specific data are not available. The averages from nearby sites (regional) with similar traffic characteristics (groups or clusters) can be used as Level 2 data (2). Two approaches were adopted for developing Level 2 inputs: (a) cluster analyses (Level 2A), and (b) grouping roads with similar attributes (Level 2B).
In addition, Level 3 data are further split into Levels 3A and 3B, where 3A represents the average of freeways and non-freeways, and 3B represents the overall statewide average for traffic inputs.

Table 4-2 List of PTR sites with WIM and classification data

| PTR Site | Route | Sensor Type | Latitude | Longitude |
|---|---|---|---|---|
| 037319 | I-196 | Quartz | 42.437 | -86.248 |
| 096429 | I-75 | Quartz | 43.629 | -83.960 |
| 117189 | I-94 | Quartz | 41.769 | -86.738 |
| 127269 | I-69 | Quartz | 41.849 | -84.996 |
| 137159 | I-94 | Quartz | 42.298 | -85.039 |
| 137169 | I-94 | Quartz | 42.283 | -84.873 |
| 195019 | US-127 | Quartz | 43.024 | -84.546 |
| 211459 | US-2 | Quartz | 45.728 | -87.231 |
| 212229 | US-2 | Quartz | 45.921 | -86.992 |
| 221199 | M-95 | Quartz | 46.029 | -88.060 |
| 238869 | I-69 | Quartz | 42.534 | -84.821 |
| 256119 | I-75 | Quartz | 43.210 | -83.770 |
| 256449 | I-69 | Quartz | 42.967 | -83.782 |
| 271009 | US-2 | Load Cell | 46.466 | -90.192 |
| 308129 | US-12 | Quartz | 41.990 | -84.647 |
| 338029 | US-127 | Quartz | 42.538 | -84.443 |
| 345299 | I-96 | Quartz | 42.879 | -85.057 |
| 387029 | I-94 | Quartz | 42.285 | -84.284 |
| 387049 | US-127 | Quartz | 42.174 | -84.365 |
| 403069 | US-131 | Quartz | 44.823 | -85.136 |
| 419759 | M-6 | Quartz | 42.850 | -85.607 |
| 478049 | I-96 | Quartz | 42.646 | -84.085 |
| 478219 | I-96 | Quartz | 42.563 | -83.834 |
| 492029 | US-2 | Quartz | 46.003 | -84.998 |
| 588729 | US-23 | Quartz | 41.784 | -83.696 |
| 615289 | US-31 | Quartz | 43.224 | -86.205 |
| 694049 | I-75 | Quartz | 45.144 | -84.670 |
| 705059 | I-196 | Quartz | 42.866 | -85.802 |
| 705099 | I-96 | Quartz | 43.053 | -85.937 |
| 724129 | US-127 | Quartz | 44.264 | -84.803 |
| 724149 | I-75 | Quartz | 44.317 | -84.451 |
| 752199 | M-28 | Quartz | 46.345 | -85.983 |
| 776369 | I-69 | Quartz | 42.978 | -82.803 |
| 776469 | I-94 | Quartz | 42.940 | -82.507 |
| 787119 | US-131 | Quartz | 41.842 | -85.677 |
| 807219 | I-94 | Quartz | 42.220 | -85.821 |
| 818239 | US-23 | Quartz | 42.414 | -83.765 |
| 828839 | I-94 | Quartz | 42.220 | -83.466 |
| 829189 | I-275 | Piezo BL | 42.180 | -83.388 |
| 829209 | I-275 | Quartz | 42.309 | -83.442 |
| 829699 | I-75 | Quartz | 42.110 | -83.241 |

Table 4-3 Number of sites with available WIM and classification data

| Type of Data | Available | QC Passed |
|---|---|---|
| Weight and Classification | 43 | 41 |
| Weight Only | 0 | 0 |
| Classification Only | 19 | 18 |
| Total | 62 | 59 |

4.2.1 Level 2A Inputs

Level 2A inputs were generated using cluster analysis. The methodology used to develop these inputs is detailed in the following sections. Cluster analysis is a data mining technique that identifies homogeneous subsets of data (also known as clusters) within a dataset using only the information found in the data. It groups data objects based on a mathematical measure of their similarity. The steps involved in a typical cluster analysis, briefly discussed in the next few sections, are (a) obtaining the data, (b) identifying the significant attributes of the data, (c) choosing a distance measure, (d) selecting a clustering technique, (e) deciding on the number of clusters, and (f) interpreting the results (3).

A dataset used for cluster analysis contains a collection of data objects. Each data object has some attributes that capture the object's fundamental characteristics. The primary aim of cluster analysis is to separate the dataset into subsets such that objects within a subset are like one another and are different from those in other subsets. The word 'clustering' commonly refers to an entire collection of clusters. The data used for cluster analysis are usually represented by a multivariate m × n data matrix, D, as shown below. Each row contains a data object, and the columns contain the attribute values describing each object in the dataset.

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{m1} & \cdots & d_{mn} \end{bmatrix} \qquad (1)$$

where:
$d_{ij}$ = value of the j-th attribute of object i.

Attributes of the data objects in the matrix may be continuous, ordinal, or a mixture of both. Some attributes of an object might not define it very well and might be irrelevant. Such attributes should be excluded if possible, or weights could be added to the essential attributes in the data matrix (3). Most clustering techniques convert the data matrix into an m × m 'distance matrix' of inter-object similarities or dissimilarities.
The similarity/dissimilarity between two objects with attributes on a continuous scale is a numerical measure of the degree to which they are alike. Two objects are said to be close to each other when the dissimilarity is small. Several similarity measures can express likeness between a pair of objects. In general, these measures fall into two broad categories: (a) distance, and (b) correlation-type. The most commonly used distance measure is the 'Euclidean distance', which is the straight-line distance between two objects. It is calculated using Equation (2).

$$d_{o_i,o_j} = \left[ \sum_{k=1}^{p} \left( o_{ik} - o_{jk} \right)^2 \right]^{1/2} \qquad (2)$$

where:
$o_{ik}, o_{jk}$ = k-th components of the p-dimensional objects $o_i, o_j$

To illustrate, consider six two-dimensional points (or objects) with x and y coordinates as follows (4): P1 (0.40, 0.53), P2 (0.22, 0.38), P3 (0.35, 0.32), P4 (0.26, 0.19), P5 (0.08, 0.41), and P6 (0.45, 0.30). The distance between P1 and P2 is calculated as follows:

$$dist(1,2) = \sqrt{(0.40 - 0.22)^2 + (0.53 - 0.38)^2} = 0.234 \qquad (3)$$

Likewise, the distances between all possible pairs of objects can be computed and expressed in the form of a 'distance matrix.' The non-diagonal elements in this matrix represent the distances between pairs of objects, and the diagonal elements represent the distance from each object to itself (hence, they are always zero), as seen in Equation (4) (5). The objects used in this example have only two dimensions (attributes), but distances for objects with more than two attributes can be computed in the same way (e.g., monthly adjustment factors would have twelve attributes, i.e., one for each month).

$$\text{dist-matrix} = \begin{bmatrix} 0 & 0.23 & 0.22 & 0.37 & 0.34 & 0.24 \\ 0.23 & 0 & 0.14 & 0.19 & 0.14 & 0.24 \\ 0.22 & 0.14 & 0 & 0.16 & 0.28 & 0.10 \\ 0.37 & 0.19 & 0.16 & 0 & 0.28 & 0.22 \\ 0.34 & 0.14 & 0.28 & 0.28 & 0 & 0.39 \\ 0.24 & 0.24 & 0.10 & 0.22 & 0.39 & 0 \end{bmatrix} \qquad (4)$$

The Minkowski distance is a more generalized form of the Euclidean distance, as seen in Equation (5).
When r is equal to 1, it is called the 'Manhattan' or 'City block' distance, which is the distance between two points measured at right angles along the axes. When r is equal to 2, it reduces to the Euclidean distance.

$$d(x,y) = \left( \sum_{k=1}^{n} \left| x_k - y_k \right|^{r} \right)^{1/r} \qquad (5)$$

The other commonly used similarity measure is the 'Mahalanobis' distance, as shown in Equation (6). It is a generalization of the Euclidean distance and can be used when the attributes of the data objects are correlated or have different ranges of values.

$$\text{Mahalanobis}(x,y) = (x - y)\, \Sigma^{-1} (x - y)^{T} \qquad (6)$$

where:
$\Sigma$ = covariance matrix whose ij-th entry is the covariance of the i-th and j-th attributes
$\Sigma^{-1}$ = the inverse of the covariance matrix

Correlation measures are the other category of similarity measures. The most widely used correlation measure is the Pearson correlation coefficient (6). The correlation between two data objects is a measure of the linear relationship between their attributes and is calculated using Equation (7). A correlation value of '1' indicates a perfect positive linear relationship, and a value of '-1' indicates a perfect negative linear relationship between the data objects.

$$corr(x,y) = \frac{\text{covariance}(x,y)}{stdev(x)\; stdev(y)} \qquad (7)$$

A number of similarity/dissimilarity measures exist, which makes choosing a measure for cluster analysis a challenge. Earlier research studies have categorized the similarity or dissimilarity measures based on the critical properties of the data (e.g., scale of data, metrics, and Euclidean properties of similarity matrices). However, these properties are not conclusive for choosing between the measures (6-10). The general observation is that the nature of the data should strongly influence the choice of the similarity measure. Sometimes, a similarity measure has already been used in earlier work on similar data, which settles the choice. Occasionally, the clustering technique might limit the choice of similarity measures that can be used. Different similarity measures can be used to see which ones produce better clusters.
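As a numerical check on the measures above, the following Python sketch recomputes the distance matrix for the six example points using the Minkowski form (r = 2 for Euclidean, r = 1 for Manhattan) and evaluates the Pearson correlation between two attribute vectors. It is a minimal illustration using only the standard library; the function names are chosen here for convenience and are not part of the study.

```python
import math

# The six example points (x, y) from the text.
points = {
    1: (0.40, 0.53), 2: (0.22, 0.38), 3: (0.35, 0.32),
    4: (0.26, 0.19), 5: (0.08, 0.41), 6: (0.45, 0.30),
}

def minkowski(x, y, r=2):
    """Minkowski distance; r=2 gives Euclidean, r=1 gives Manhattan."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)

def pearson(x, y):
    """Pearson correlation between the attribute vectors of two objects."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance_matrix(pts, r=2):
    """Square matrix of pairwise Minkowski distances, ordered by point id."""
    ids = sorted(pts)
    return [[minkowski(pts[i], pts[j], r) for j in ids] for i in ids]

euclid = distance_matrix(points, r=2)
manhattan = distance_matrix(points, r=1)

for row in euclid:
    print("  ".join(f"{d:.2f}" for d in row))
```

Rounded to two decimals, the printed rows can be compared with the distance matrix in Equation (4). Note that the Pearson coefficient is only informative for objects with more than two attributes, such as the twelve monthly adjustment factors.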
However, for continuous data, distance measures should be used when the magnitude of the data is essential. Assuming that the attributes of a data object are not correlated (e.g., the traffic volume in January does not affect the traffic volume in February), the Euclidean distance was chosen as the similarity measure in this study because it can be easily interpreted. More information on the choice of similarity or dissimilarity measure, including a decision-making table that may help choose the measure, can be found elsewhere (4; 9).

Clustering techniques divide data into a set of clusters commonly referred to as a 'clustering.' While clusterings can be described in many ways, the most common distinction between different types of clusterings is whether they are partitional or hierarchical. A partitional clustering is simply a division of the set of data objects into nonoverlapping subsets (clusters) such that each data object is in exactly one subset. If these clusters are permitted to have sub-clusters, then a hierarchical clustering is obtained. The clustering techniques evaluated in this study are discussed below (4).

4.2.1.1 K-means

K-means is a partitional clustering technique that divides the dataset into a predetermined number of clusters. The algorithm starts by choosing 'K' initial centroids (where K equals the desired number of clusters). Each data object is assigned to the closest centroid, and each collection of data objects assigned to a centroid forms a cluster. After each iteration, the centroid of each cluster is updated based on the data objects assigned to that cluster. The iterations stop when no objects change clusters, or equivalently, when the centroids remain the same. In practice, K-means only rarely reaches a state in which no objects shift from one cluster to another. Usually, a rule is set to stop the iterations after reaching a near-steady state (e.g., fewer than 1% of the objects change clusters) (4).
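The K-means steps described above can be sketched as follows. This is a minimal standard-library illustration, not the implementation used in the study: the function name, the random choice of initial centroids from the data, and the handling of empty clusters (keeping the previous centroid) are assumptions made here. The stopping rule follows the text, ending the iterations once fewer than 1% of the objects change clusters.

```python
import math
import random

def kmeans(data, k, max_iter=100, min_change=0.01, seed=0):
    """Basic K-means with Euclidean distance.

    Stops when the fraction of objects that switch clusters
    falls below `min_change` (1% by default).
    """
    rng = random.Random(seed)
    centroids = rng.sample(data, k)      # K initial centroids drawn from the data
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Assignment step: each object goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(x, centroids[c])) for x in data
        ]
        changed = sum(a != b for a, b in zip(labels, new_labels))
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:                  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if changed / len(data) < min_change:
            break
    # Sum of squared distances from each object to its assigned centroid.
    sse = sum(math.dist(x, centroids[lab]) ** 2 for x, lab in zip(data, labels))
    return labels, centroids, sse
```

Calling `kmeans(data, k)` with different `seed` values gives runs with different initial centroids; the returned sum of squared distances offers one way to compare such runs.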
Like every clustering technique, K-means needs a similarity measure to assign an object to a centroid. As discussed before, several options exist, but the most often used measure is the Euclidean distance. Given two sets of clusters from different K-means runs, the set with the smaller overall sum of squared error (SSE), defined below, is generally preferred (4).

$$SSE = \sum_{i=1}^{K} \sum_{x \in C_i} dist(c_i, x)^2 \qquad (8)$$

where:
$K$ = number of clusters
$C_i$ = the i-th cluster
$c_i$ = centroid of cluster $C_i$
$x$ = data object in cluster $C_i$

While K-means is a very general algorithm that can be used with different data types, it has various limitations. One of them is choosing the initial centroids. Randomly choosing the initial centroids often results in poor clusters and sometimes in empty clusters. While there are remedies, finding the optimum initial centroids is often an exhaustive procedure. One procedure is to run K-means multiple times with random initial centroids and use the run with the least SSE. Another procedure is to use hierarchical clustering first to find the clusters and their centroids and then run K-means starting from those centroids. Also, the number of clusters is an input into the K-means algorithm, and it can be difficult to determine before the clustering.

4.2.1.2 Hierarchical Clustering

Hierarchical clustering techniques are another prominent category of clustering methods. There are two approaches for generating hierarchical clusterings: (a) divisive, and (b) agglomerative. In the divisive clustering approach, all the objects in the dataset are considered as a single cluster at the beginning, and a cluster is divided into two at each stage until all the clusters contain only a single object. In the first step of the clustering, all the possible two-way partitions of the dataset need to be considered, which equals $2^{n-1} - 1$ combinations (where n is the number of data objects). This large number of combinations makes divisive clustering difficult to implement.
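To give a sense of scale for the divisive approach, the short Python sketch below evaluates the number of two-way partitions for a few dataset sizes: the six-point example used earlier, an arbitrary intermediate size, and the 41 WIM sites considered in this study. The function name is chosen here for illustration only.

```python
def two_way_partitions(n):
    """Number of ways to split n objects into two non-empty clusters."""
    return 2 ** (n - 1) - 1

for n in (6, 20, 41):
    print(n, two_way_partitions(n))
```

For n = 41 the count already exceeds one trillion candidate splits at the first step alone.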
The agglomerative approach performs clustering in the opposite manner to the divisive approach. Agglomerative clustering considers each object in the dataset as an individual cluster and merges clusters one pair at a time in a series of sequential steps (11). The number of clusters during the first step equals the number of objects in the dataset. At each subsequent step, the two clusters that are 'closest' (or most similar) to each other are merged to form a new cluster. The final step merges all the objects in the dataset into one single cluster. The objects that are merged to form clusters at each step cannot be reassigned to different clusters at a later stage. An agglomerative hierarchical clustering technique is characterized by two choices: (a) the measurement of similarity between two objects in a dataset, and (b) the type of linkage between clusters (4; 12).

The clusters formed using the hierarchical techniques can be represented using a two-dimensional diagram known as a 'dendrogram.' A dendrogram illustrates the clusters formed at each stage of the clustering process. Figure 4-1 shows an example of a dendrogram. At each step of the process, two clusters are merged at a distance given by the height on the y-axis. The clustering technique (or the 'linkage type') needs to be established after choosing the distance or similarity measure (5). The most commonly used methods are discussed below.

Figure 4-1 An example of the dendrogram

Single Linkage Method

This method is also called the nearest neighbor or minimum method. The proximity of two clusters is defined as the minimum distance between any two objects, one from each cluster (11). Consider the six two-dimensional points and the distance matrix discussed above [see Equation (4)]. In the first step of this clustering technique, points '3' and '6' are merged into a cluster (see Figure 4-2a) because they have the smallest distance of 0.102 between them.
Figure 4-2 Single linkage clustering [panels: (a) nested clusters, (b) dendrogram]

As seen in Figure 4-2b (dendrogram), 0.102 is also the height at which points '3' and '6' are merged into a single cluster. In the second step, points '2' and '5' are merged into a cluster. The distances between the newly formed clusters in step 3 are calculated as follows:

$$dist(\{3,6\},\{2,5\}) = \min\big(dist(3,2),\, dist(6,2),\, dist(3,5),\, dist(6,5)\big) = \min(0.143,\, 0.244,\, 0.285,\, 0.386) = 0.143$$

$$dist(\{3,6\},\{4\}) = \min\big(dist(3,4),\, dist(6,4)\big) = \min(0.158,\, 0.220) = 0.158$$

In the third step, the clusters {3,6} and {2,5} are combined to form a third cluster at the height of 0.143, as seen in the dendrogram. This process goes on until all the points are combined into one single cluster at the height of approximately 0.22.

Complete Linkage Method

This method is also called the furthest neighbor or maximum method. The proximity of two clusters is defined as the maximum distance between any two points, one from each cluster. As with the single linkage method, clusters {3,6} and {2,5} are formed first (see Figures 4-3a and 4-3b).

Figure 4-3 Complete linkage clustering [panels: (a) nested clusters, (b) dendrogram]

In the third step, instead of {3,6} and {2,5} merging next, {3,6} is merged with {4} because complete linkage merges the pair of clusters with the smallest maximum distance.

$$dist(\{3,6\},\{2,5\}) = \max\big(dist(3,2),\, dist(6,2),\, dist(3,5),\, dist(6,5)\big) = \max(0.143,\, 0.244,\, 0.285,\, 0.386) = 0.386$$

$$dist(\{3,6\},\{4\}) = \max\big(dist(3,4),\, dist(6,4)\big) = \max(0.158,\, 0.220) = 0.220$$

$$dist(\{3,6\},\{1\}) = \max\big(dist(3,1),\, dist(6,1)\big) = \max(0.216,\, 0.235) = 0.235$$

Group Average Method

For the group average version of hierarchical clustering, the proximity of two clusters is the average of the pairwise distances among all pairs of points in the different clusters. The group average method is an intermediate approach between the single and complete linkage approaches. Figure 4-4 shows the dendrogram for this method.
The proximity between two clusters A and B of sizes $n_A$ and $n_B$ is calculated using Equation (9).

$$proximity(A,B) = \frac{\sum_{a \in A,\, b \in B} dist(a,b)}{n_A \cdot n_B} \qquad (9)$$

Figure 4-4 Group average clustering [panels: (a) nested clusters, (b) dendrogram]

An illustration of the group average method is given below.

$$dist(\{3,6,4\},\{1\}) = \frac{dist(3,1) + dist(6,1) + dist(4,1)}{3 \cdot 1} = 0.273$$

$$dist(\{2,5\},\{1\}) = \frac{dist(2,1) + dist(5,1)}{2 \cdot 1} = 0.288$$

$$dist(\{3,6,4\},\{2,5\}) = \frac{dist(3,2) + dist(3,5) + dist(6,2) + dist(6,5) + dist(4,2) + dist(4,5)}{3 \cdot 2} = 0.256$$

Because dist({3,6,4},{2,5}) is smaller than dist({3,6,4},{1}) and dist({2,5},{1}), clusters {3,6,4} and {2,5} are merged in the fourth step.

Ward's Method

Ward's method is another general agglomerative hierarchical clustering technique. The proximity of two clusters depends on the optimal value of an objective function. The most widely used objective function is the error sum of squares, or within-cluster variance. At each step, the two clusters whose merger gives the least increase in the overall within-cluster variance are combined. Figures 4-5a and 4-5b show the clusters and the dendrogram for this method.

Figure 4-5 Ward's method of clustering [panels: (a) nested clusters, (b) dendrogram]

The proximity between two clusters A and B of sizes $n_A$ and $n_B$ can be calculated by using Equation (10), where $C_A$ and $C_B$ are the centroids of clusters A and B.

$$proximity(A,B) = \sqrt{\frac{2\, n_A n_B}{n_A + n_B}}\; dist(C_A, C_B) \qquad (10)$$

In the first two steps, {3,6} and {2,5} are formed separately because these mergers give the least increase in the sum of squares. The third step is illustrated below, where the distance on the right-hand side is measured between the cluster centroids.

$$dist(\{3,6\},\{4\}) = \sqrt{\frac{2 \cdot 2 \cdot 1}{2 + 1}}\; dist(C_{\{3,6\}}, C_{\{4\}}) = 0.213$$

$$dist(\{3,6\},\{1\}) = \sqrt{\frac{2 \cdot 2 \cdot 1}{2 + 1}}\; dist(C_{\{3,6\}}, C_{\{1\}}) = 0.254$$

$$dist(\{3,6\},\{2,5\}) = \sqrt{\frac{2 \cdot 2 \cdot 2}{2 + 2}}\; dist(C_{\{3,6\}}, C_{\{2,5\}}) = 0.373$$

The third step combines clusters {3,6} and {4} since this merger gives the smallest increase in the sum of squares among all the combinations.

The different linkage methods discussed above each define the distance between a pair of clusters in a different way.
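The worked examples above can be reproduced with a small, naive agglomerative routine. The Python sketch below is an illustration only, written with the standard library; it is not the software used in the study. The Ward proximity is assumed to take the centroid form sqrt(2·nA·nB/(nA+nB))·dist(CA, CB), which matches the numerical values in the Ward illustration. Exact ties between candidate merges (for example, points 2-3 and 2-5 are equidistant under single linkage) are broken by list order.

```python
import math
from itertools import combinations

# The six example points (x, y) from the text.
points = {
    1: (0.40, 0.53), 2: (0.22, 0.38), 3: (0.35, 0.32),
    4: (0.26, 0.19), 5: (0.08, 0.41), 6: (0.45, 0.30),
}

def centroid(cluster):
    """Mean position of the points in a cluster."""
    xs = [points[i] for i in cluster]
    return tuple(sum(c) / len(xs) for c in zip(*xs))

def pair_dists(a, b):
    """All pairwise Euclidean distances between members of clusters a and b."""
    return [math.dist(points[i], points[j]) for i in a for j in b]

LINKAGES = {
    "single": lambda a, b: min(pair_dists(a, b)),
    "complete": lambda a, b: max(pair_dists(a, b)),
    "average": lambda a, b: sum(pair_dists(a, b)) / (len(a) * len(b)),
    # Assumed centroid form of Ward's proximity (see lead-in).
    "ward": lambda a, b: math.sqrt(2 * len(a) * len(b) / (len(a) + len(b)))
    * math.dist(centroid(a), centroid(b)),
}

def agglomerate(method):
    """Return the merge history as (cluster_a, cluster_b, height) tuples."""
    link = LINKAGES[method]
    clusters = [frozenset([i]) for i in sorted(points)]
    history = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        history.append((set(a), set(b), link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history

for method in LINKAGES:
    print(method)
    for a, b, h in agglomerate(method):
        print(f"  merge {sorted(a)} + {sorted(b)} at height {h:.3f}")
```

Running the loop prints the merge sequence for each linkage type, which makes it easy to see where the methods start to diverge on the same six points.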
Each of these linkage algorithms can yield entirely different results when used on the same dataset. The single linkage method tends to produce one large cluster, with the other clusters containing very few objects, because several clusters may be joined together merely because one of their objects is close to an object from a separate cluster (the so-called 'chaining effect'). This problem is specific to single linkage because only the minimum distance between objects is used. As mentioned before, the objects that are merged to form clusters at one step cannot be reassigned to different clusters at a later stage, and therefore, the chaining effect could lead to abnormal clusters. However, the single linkage method can be used to detect outliers, as these will always be merged during the final steps of the clustering process (5; 11).

The complete linkage method solves the problem of chaining. However, it tends to produce large globular clusters. Outliers can profoundly influence the resulting clusters because this technique uses the maximum distance between the objects. The average linkage method is a compromise between the single and complete linkage methods as it takes into account the average distances between the objects. This method is relatively robust compared to the single and complete linkage methods as it takes into account the cluster structure (6). Ward's method is sensitive to outliers and tends to find clusters of similar size and spherical shape. The average linkage and Ward's methods are the most frequently used. There are no guidelines on choosing the right linkage method. However, some studies found that Ward's method performed better than the average linkage method (13).

4.2.1.3 Self-Organizing Maps

Self-Organizing Feature Maps (SOMs) are unsupervised learning techniques mainly used for clustering and data visualization.
They are based on neural networks and are used to represent multidimensional data in much lower-dimensional spaces. While supervised training techniques such as backpropagation need training data consisting of an input vector and a target vector, SOMs require no target vectors and learn to classify the training data without any external supervision. A SOM network consists of an input layer and a network layer with nodes (or neurons), each with a topological position given by coordinates (x, y) in a rectangular lattice, as shown in Figure 4-6. The goal of a SOM is to find a centroid for each node and to assign each object in the dataset to the node whose centroid is closest to it (4; 14; 15). The training of the network occurs in iterations as described below:

(a) The centroids of all nodes are initialized.
(b) A randomly chosen object from the dataset is presented to the lattice.
(c) Every node in the lattice is examined, and the node most similar to the object is selected. This winning node is known as the Best Matching Unit (BMU).
(d) The radius of the BMU neighborhood, which diminishes with every iteration, is calculated. The centroid of the winning node and the centroids of the nodes within the neighborhood of the BMU are updated according to their distance from the BMU.
(e) The process is repeated until no changes in the centroids are observed.
(f) Each object in the dataset is assigned to the closest centroid.

The BMU is determined by calculating the Euclidean distance between each node's centroid and the current input vector. The node with the centroid closest to the input vector is selected as the BMU. The next step is to find the radius of the neighborhood of the BMU. All the nodes found within this radius are deemed to be inside the BMU's neighborhood.

Figure 4-6 A hexagonal arrangement of a 10-by-10 set of neurons

The initial value of the radius is typically set to the radius of the lattice and diminishes with each iteration.
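The training steps (a) to (f) can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not taken from the study: a small rectangular lattice, node centroids initialized from randomly chosen data objects, a fixed number of iterations in place of the convergence check in step (e), and an exponentially decaying radius and learning rate combined with a Gaussian neighborhood. All function and parameter names are hypothetical.

```python
import math
import random

def train_som(data, rows=3, cols=3, n_iter=500, l0=0.5, seed=0):
    """Train a small rectangular SOM; returns {lattice position: centroid}."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # (a) initialize every node centroid with a randomly chosen object
    weights = {node: list(rng.choice(data)) for node in nodes}
    r0 = max(rows, cols) / 2.0                      # initial neighborhood radius
    # time constant chosen so the radius decays to about one node (assumption)
    time_const = n_iter / math.log(r0) if r0 > 1 else float(n_iter)
    for n in range(n_iter):
        obj = rng.choice(data)                      # (b) present a random object
        # (c) best matching unit: node whose centroid is closest to the object
        bmu = min(nodes, key=lambda k: math.dist(weights[k], obj))
        radius = r0 * math.exp(-n / time_const)     # (d) shrinking neighborhood
        rate = l0 * math.exp(-n / time_const)       # decaying learning rate
        for k in nodes:
            d = math.dist(k, bmu)                   # lattice distance to the BMU
            if d <= radius:
                influence = math.exp(-d ** 2 / (2 * radius ** 2))
                weights[k] = [s + rate * influence * (o - s)
                              for s, o in zip(weights[k], obj)]
    return weights

def assign(data, weights):
    """(f) assign each object to the node with the closest centroid."""
    return [min(weights, key=lambda k: math.dist(weights[k], x)) for x in data]
```

Because every update moves a centroid part of the way toward a data object, the centroids always stay within the range of the data, and objects with similar attribute patterns end up assigned to the same or neighboring nodes.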
An exponential decay function is used to estimate the radius, as shown below:

$$R(n) = R_o \exp\left(-\frac{n}{\tau}\right) \tag{11}$$

where:
$R_o$ = initial radius
$n$ = iteration number
$\tau$ = time constant

The time constant depends on the initial radius and the desired rate of decay. At the last iteration, the size of the neighborhood is the size of just one node. The weights of the centroids of the nodes in the BMU neighborhood are adjusted as follows:

$$S_k(n+1) = S_k(n) + L_k(n)\left(O(n) - S_k(n)\right) \tag{12}$$

At iteration $n+1$, the centroid of node $k$ in a neighborhood $S$ is updated by adding the term $L_k(n)\left(O(n) - S_k(n)\right)$, which is proportional to the difference between object $O$ and the centroid of node $k$ at iteration $n$. The learning rate $L_k(n)$ decreases with iterations and with distance from the BMU. The most commonly used function for the learning rate is the Gaussian function, as shown below, where $L_o$ is the initial learning rate:

$$L_k(n) = L_o \exp\left(-\frac{n}{\tau}\right)\exp\left(-\frac{\mathrm{dist}^2(\mathrm{BMU}, k)}{2R^2(n)}\right) \tag{13}$$

The major advantage of using self-organizing maps is the ease of data interpretation and cluster visualization. A study compared SOMs and hierarchical clustering and found that SOMs were more accurate when dealing with messy data (14). The limitations of SOMs include the fact that the neighborhood function, the grid type (e.g., hexagonal or rectangular), and the number of centroids must be chosen before training the network.

For example, vehicle class distribution (VCD) data for 893 PTR sites located across the United States and Canada were extracted from the LTPP database, and a network was trained using the data. The number of PTR sites associated with each node is shown in Figure 4-7. The PTR sites are not evenly distributed across the neurons, which indicates that there are very few sites with highly specific patterns.

Figure 4-7 Number of PTR sites in each node

When the objects in the dataset are high dimensional (i.e., there are 10 dimensions for VCD; classes 4 to 13), it is difficult to visualize all the weights at the same time.
However, SOMs allow visualizing the weights for each attribute or dimension (10 in this case) using the weight plane figure, as shown in Figure 4-9. The lighter colors represent larger weights, while the darker colors represent smaller weights. If the color patterns of two attributes are very similar, it can be assumed that they are highly correlated.

Figure 4-8 VCDs of PTR sites in various neurons: (a) Neuron 1, (b) Neuron 2, (c) Neuron 91, (d) Neuron 100 [axes: vehicle class vs. percentage]

Figure 4-9 SOM weight planes for VCD data

In Figure 4-9, the patterns of VC5 and VC9 are very different. The neurons are numbered from left to right and from bottom to top. Consider neuron 1 (bottom left) for both VC5 and VC9: the PTR sites in that neuron have higher VC5 levels and lower VC9 levels. Figure 4-8(a) shows the VCD of the PTR sites in neuron 1. Neuron 2 has a slightly lower VC5 level and a slightly higher VC9 percentage compared to neuron 1, as seen in Figures 4-8(b) and 4-9. For neuron 91 (top left), VC9 levels are higher than VC5 levels, and for neuron 100 (top right), the VC5 level is much lower than the VC9 level.

4.2.1.4 Choosing the Clustering Technique and the Optimal Number of Clusters

The clustering techniques discussed in the sections above have certain advantages and disadvantages. Research studies suggest that the clustering technique should be chosen based on the data to be clustered. The challenge lies in choosing the best clustering method for the traffic data in Michigan. The data could be clustered using all the discussed techniques, and the resulting clusters could be evaluated using various validity indices to choose the best technique. These validity indices can be categorized into two types: (a) external and (b) internal.
While external validity indices require a priori knowledge of the data, which can be difficult to obtain, internal validity indices do not require prior information about the dataset (16).

In addition, finding the optimal number of clusters from the data is a difficult task. As discussed earlier, the number of clusters is a required input for K-means. Also, for a SOM, the number of nodes must be decided before training the network, while hierarchical clustering provides limited guidance on the number of clusters to be retained from the data. Internal indices can also be used to find the optimal number of clusters. A research study (17) evaluated over 30 such indices and ranked the Calinski-Harabasz index as the top-performing index. The study also notes that the performance of the indices could be data dependent: several indices worked well, while some performed rather poorly. The following indices were selected to determine the optimal number of clusters and the better clustering technique.

Calinski-Harabasz Index

The Calinski-Harabasz index is also called the variance ratio criterion (VRC). It is defined as (18)

$$VRC_K = \frac{SS_B}{SS_W} \times \frac{N - K}{K - 1} \tag{14}$$

where:
$SS_B$ = overall variance between clusters
$SS_W$ = overall variance within clusters
$K$ = number of clusters
$N$ = number of observations

A dataset that has distinct clusters should have a large between-cluster variance ($SS_B$) and a small within-cluster variance ($SS_W$). The larger the $VRC_K$ ratio, the better the data partition; the $VRC_K$ value is highest for the optimal number of clusters. It should also be noted that the Calinski-Harabasz criterion is best suited for K-means clustering (19).

Davies-Bouldin Index

The Davies-Bouldin index is based on the ratio of within-cluster to between-cluster distances.
It is defined as

$$DB = \frac{1}{N}\sum_{n=1}^{N} \max_{m \neq n} \left(\frac{\sigma_n + \sigma_m}{d_{n,m}}\right) \tag{15}$$

where:
$\sigma_m, \sigma_n$ = the average distance of all points in clusters m and n to their centroids
$d_{n,m}$ = Euclidean distance between the centroids of clusters m and n
$N$ = number of clusters

For each cluster n in a set of N clusters, the maximum of the $(\sigma_n + \sigma_m)/d_{n,m}$ values is computed. This maximum represents the largest within-to-between cluster ratio for cluster n. The Davies-Bouldin index is the mean of these values over all clusters. Naturally, the partition with the smallest index is optimal (20).

Dunn Index

The Dunn index is based on the ratio of the minimum inter-cluster distance to the maximum cluster size. It is defined as:

$$Dunn = \frac{d_{\min}}{d_{\max}} = \frac{\min\limits_{n \neq m}\left\{\min\limits_{i \in C_n,\, j \in C_m} \delta_{ij}\right\}}{\max\limits_{1 \le n \le N}\left\{\max\limits_{i, j \in C_n} \delta_{ij}\right\}} \tag{16}$$

Here $\min_{i \in C_n,\, j \in C_m} \delta_{ij}$ measures the smallest distance between two clusters $C_n$ and $C_m$, whereas $d_{\min}$ is the smallest of such distances between any two clusters. Similarly, $\max_{i, j \in C_n} \delta_{ij}$ measures the maximum distance between any two points in cluster $C_n$, whereas $d_{\max}$ is the largest of such distances for any cluster. Larger distances between clusters and smaller cluster sizes lead to a higher Dunn index value and better clusters (20).

Silhouette Index

The silhouette index is typically computed for each object in a cluster. It measures the extent of an object's similarity to the other objects in its cluster. The silhouette width for an object k is defined as

$$S(k) = \frac{b(k) - a(k)}{\max\left(a(k), b(k)\right)} \tag{17}$$

where $a(k)$ is the average distance between object k and the other objects in its own cluster, and $b(k)$ is the minimum of the average distances of object k to the objects in each of the other clusters. The silhouette width ranges between -1 and 1. A value near 1 indicates that object k is well matched to its own cluster, and a value near -1 indicates that it is poorly matched.
The mean of the silhouette widths for a cluster $C_k$ with n objects is computed as:

$$s_k = \frac{1}{n}\sum_{i \in C_k} S(i) \tag{18}$$

Moreover, the silhouette index for all the K clusters is computed as follows:

$$S = \frac{1}{K}\sum_{k=1}^{K} s_k \tag{19}$$

If there is only one object in a cluster, then the silhouette coefficient cannot be defined.

Gap Criterion

The gap criterion is a recently developed method that can be used with virtually any clustering technique (21). The gap value is calculated as follows:

$$Gap_n(k) = E_n^{*}\{\log(W_k)\} - \log(W_k) \tag{20}$$

where:
$n$ = sample size
$k$ = number of clusters
$W_k$ = pooled within-cluster dispersion measurement

$$W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r \tag{21}$$

where:
$n_r$ = number of data points in cluster r
$D_r$ = the sum of pairwise distances for all objects in cluster r

The optimal number of clusters has the largest gap value (which could be local or global) within a tolerance range. Monte Carlo sampling from a reference (null) distribution is used to determine the expected value $E_n^{*}\{\log(W_k)\}$. The $\log(W_k)$ value is obtained from the sample data. Several other measures are widely used but are not very efficient and can often result in substantial errors (11).

The optimal number of clusters for hierarchical clustering was obtained by computing the Calinski-Harabasz and gap criterion indices. Figure 4-10 shows the values of these indices for hourly distribution factors (HDF). The results show that five clusters have the highest $VRC_K$ and gap values. Hence, the HDF dataset was split into five clusters. The Calinski-Harabasz criterion and the gap criterion were computed for the other traffic inputs as well.
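As an illustration of Eq. 14, the variance ratio criterion can be computed directly from a set of objects and their cluster labels. The following is a minimal Python sketch and not the code used in this study; the function name `calinski_harabasz` and the toy data are hypothetical, and $SS_B$ and $SS_W$ are assumed to be the between- and within-cluster sums of squared Euclidean distances.

```python
def calinski_harabasz(points, labels):
    """Variance ratio criterion: (SS_B / SS_W) * (N - K) / (K - 1).

    Assumes at least two clusters and a non-zero within-cluster sum of squares.
    """
    n_obs, dim = len(points), len(points[0])

    # Group the objects by cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)

    # Centroid of the whole dataset.
    overall = [sum(p[j] for p in points) / n_obs for j in range(dim)]

    ss_between = 0.0
    ss_within = 0.0
    for members in clusters.values():
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        # Between-cluster term: cluster size times squared centroid offset.
        ss_between += len(members) * sum(
            (c - o) ** 2 for c, o in zip(centroid, overall)
        )
        # Within-cluster term: squared distances of members to their centroid.
        ss_within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )

    return (ss_between / ss_within) * (n_obs - k) / (k - 1)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
vrc_good = calinski_harabasz(toy_points, [0, 0, 1, 1])  # natural partition
vrc_bad = calinski_harabasz(toy_points, [0, 1, 0, 1])   # partition that mixes the pairs
```

In practice, the index would be evaluated for each candidate number of clusters, and the partition with the highest value would be retained, subject to engineering judgment.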
Using both criteria and engineering judgment, the traffic datasets were split into the following number of clusters for each traffic input (see Table 4-4).

Figure 4-10 Optimum number of clusters for HDF

Table 4-4 Number of clusters for each traffic input

| Input | Number of clusters (Level 2A) |
|---|---|
| Hourly distribution factors (HDF) | 5 |
| Monthly adjustment factors (MAF) based on VC5 | 4 |
| MAF based on VC9 | 5 |
| Vehicle class distribution (VCD) | 5 |
| Single axle load spectra (SALS) | 4 |
| Tandem axle load spectra (TALS) | 5 |
| Tridem axle load spectra (TRALS) | 6 |
| Quad axle load spectra (QALS) | 3 |

All three clustering techniques were used to generate the clusters for each traffic input shown in Table 4-4. The resulting clusters were evaluated using the Calinski-Harabasz, Davies-Bouldin, Dunn, and silhouette indices. The results of this evaluation are presented in Table 4-5. The values in the table are color-coded from light green to dark green based on the magnitude of the value. The closer the values, the closer the shades of the color. The clustering technique with the darkest shade for any index is desirable for that input and is ranked 1.
[Figure 4-10 panels: Calinski-Harabasz method (Calinski-Harabasz values vs. number of clusters) and gap criterion (gap values vs. number of clusters)]

Table 4-5 Comparison of clustering techniques using various internal indices

| Input | Index | Hierarchical | SOM | K-Means |
|---|---|---|---|---|
| VCD | Davies-Bouldin | 0.800 | 0.377 | 0.650 |
| VCD | Dunn | 0.235 | 0.240 | 0.187 |
| VCD | Silhouette | 0.377 | 0.404 | 0.432 |
| VCD | Calinski-Harabasz | 69.120 | 74.320 | 65.420 |
| HDF | Davies-Bouldin | 0.620 | 0.683 | 0.633 |
| HDF | Dunn | 0.325 | 0.184 | 0.309 |
| HDF | Silhouette | 0.341 | 0.314 | 0.316 |
| HDF | Calinski-Harabasz | 35.850 | 34.080 | 35.477 |
| MAF5 | Davies-Bouldin | 0.606 | 0.530 | 0.498 |
| MAF5 | Dunn | 0.198 | 0.099 | 0.119 |
| MAF5 | Silhouette | 0.288 | 0.218 | 0.222 |
| MAF5 | Calinski-Harabasz | 53.090 | 40.630 | 40.690 |
| MAF9 | Davies-Bouldin | 0.310 | 1.426 | 1.422 |
| MAF9 | Dunn | 0.277 | 0.263 | 0.180 |
| MAF9 | Silhouette | - | - | - |
| MAF9 | Calinski-Harabasz | 13.247 | 12.994 | 10.711 |
| SALS | Davies-Bouldin | 0.665 | 0.900 | 0.899 |
| SALS | Dunn | 0.247 | 0.120 | 0.122 |
| SALS | Silhouette | - | 0.360 | 0.316 |
| SALS | Calinski-Harabasz | 69.610 | 31.980 | 31.740 |
| TALS | Davies-Bouldin | 1.400 | 0.350 | 1.300 |
| TALS | Dunn | 0.217 | 0.213 | 0.285 |
| TALS | Silhouette | 0.198 | - | 0.182 |
| TALS | Calinski-Harabasz | 13.130 | 12.618 | 12.450 |
| TRALS | Davies-Bouldin | 1.030 | 0.482 | 1.170 |
| TRALS | Dunn | 0.210 | 0.224 | 0.022 |
| TRALS | Silhouette | 0.360 | 0.360 | 0.223 |
| TRALS | Calinski-Harabasz | 23.510 | 24.040 | 18.763 |
| QALS | Davies-Bouldin | 1.160 | 0.930 | 0.987 |
| QALS | Dunn | 0.152 | 0.130 | 0.133 |
| QALS | Silhouette | 0.205 | 0.218 | 0.215 |
| QALS | Calinski-Harabasz | 20.604 | 22.234 | 21.931 |

The lightest shade is ranked 3. Table 4-6 lists the performance of each clustering technique, and hierarchical clustering outperforms the other two techniques. Also, the results obtained using hierarchical clustering are repeatable. For the other techniques, the results depend on the choice of the initial centroids, although the clusters formed in two different runs are nearly identical after several iterations. For these reasons, it was decided that hierarchical clustering should be used for clustering the traffic data in Michigan.
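The silhouette index of Eqs. 17 to 19, one of the internal indices used in the comparison above, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in this study: the function name `silhouette_index` and the toy data are hypothetical, Euclidean distance is assumed, at least two clusters are expected, and single-object clusters are simply skipped because their silhouette width is undefined.

```python
import math


def silhouette_index(points, labels):
    """Mean over clusters of the mean silhouette width (Eqs. 17 to 19)."""
    # Map each cluster label to the indices of its member objects.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        # Average distance from object i to the members, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    cluster_means = []
    for lab, members in clusters.items():
        if len(members) < 2:
            continue  # assumption: singleton clusters are left out of the average
        widths = []
        for i in members:
            a = mean_dist(i, members)  # a(k): cohesion within own cluster
            b = min(                   # b(k): nearest other cluster
                mean_dist(i, other)
                for other_lab, other in clusters.items()
                if other_lab != lab
            )
            widths.append((b - a) / max(a, b))
        cluster_means.append(sum(widths) / len(widths))

    return sum(cluster_means) / len(cluster_means)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
sil_good = silhouette_index(toy_points, [0, 0, 1, 1])  # natural partition
sil_bad = silhouette_index(toy_points, [0, 1, 0, 1])   # partition that mixes the pairs
```

A partition that follows the natural grouping gives a value close to 1, whereas a partition that mixes the groups gives a negative value, which is the behavior the index is meant to capture.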
Figures 4-11 and 4-12 show the dendrograms of all the traffic inputs using Euclidean distance and Ward's method, along with the cutoff line used to develop the required number of clusters listed in Table 4-4.

Figure 4-11 Cluster dendrograms for various traffic inputs - Michigan PTR sites: (a) HDF, (b) MAF (VC5), (c) MAF (VC9), (d) VCD

Figure 4-12 Cluster dendrograms for various traffic inputs - Michigan PTR sites: (a) SALS, (b) TALS, (c) TRALS, (d) QALS

1. Hourly distribution factors (HDF) - 5 clusters

The cluster analysis resulted in five clusters for HDF, as shown in Figure 4-13(a). Cluster 1 contains heavier evening proportions of trucks and average AADTT values of less than 500. Cluster 2 has a similar percentage of trucks as the sites in cluster 1, but its curve is, on average, shifted left by an hour, and its average AADTT values are less than 1000. It should be noted that the sites in both clusters 1 and 2 are mostly located on US-2 and I-75. The cluster 3 average has a roughly 1-2% lower truck percentage between the hours of 7:00 am and 4:00 pm than either cluster 1 or 2. Sites in this cluster are located on principal interstates with average AADTT values of more than 2300. Sites in cluster 4 have the highest HDF of all clusters from 8 am to 12 noon. Most of these sites are on US routes with varying traffic levels. Sites in cluster 5 have the flattest curve among all the clusters, with all the sites located on I-94, I-69, and I-75, suggesting long-haul traffic.

Figure 4-13 Cluster averages (Level 2A) for various traffic inputs: (a) HDF, (b) MAF (VC5), (c) MAF (VC9), (d) VCD

2. Monthly adjustment factors (MAF) based on vehicle class 5 - 4 clusters

Four cluster averages for MAF based on VC5 are shown in Figure 4-13(b). Cluster 1 exhibits slight seasonal variability (MAF > 1), with MAFs close to 1.4 during the summer months and lower values in winter.
Most of these sites were in the Lower Peninsula on a variety of roads with varying functional class and AADTT levels. Cluster 2 depicts very little seasonal variability, with MAFs close to 1. Major routes, such as I-94, I-96, and I-275, are present in this cluster, and most sites are in the Lower Peninsula. Cluster 3 displays higher MAF in summer and fall, with much lower MAF in winter and spring. However, there are only two sites (M-28 and US-2) in this cluster, both in the Upper Peninsula with low AADTT. Sites in cluster 4 also have higher MAF in summer and fall and are mostly located on north-south routes such as I-75 and US-127.

3. MAF based on vehicle class 9 - 5 clusters

Five cluster averages for MAF based on VC9 are shown in Figure 4-13(c). Almost all the sites in all the clusters have no seasonal variability between months. Since VC9 trucks are used for long haul throughout the year, a uniform presence of such trucks is expected at all the sites.

4. Vehicle class distribution (VCD) - 5 clusters

Figure 4-13(d) illustrates the five clusters, each distinguished by the percentages of VC5 and VC9. While four clusters have higher VC9 truck levels than VC5, their proportions are different. Sites in cluster 1 have VC9 truck percentages in the range of 45 to 70, while the VC5 truck percentage is in the range of 15 to 25. Most of these sites were found on state routes such as US-127 and US-2, with one-way AADTT ranging from 700 to 3600. Cluster 2 contains mostly sites with VC9 truck percentages of less than 45, while the VC5 truck percentage is in the range of 20 to 30. Sites in this cluster are located mostly on rural arterials, such as US-2, US-31, and M-95, generally with AADTT of less than 1500.
Cluster 3 has sites that have a slightly higher percentage of VC5 trucks than VC9 trucks. Most of these sites are on rural arterials with an AADTT of less than 800. Sites in cluster 4 have the highest percentage of VC9 trucks (above 75) with a very low percentage of VC5 trucks (below 10). All the sites in this cluster are located on I-94, I-69, and I-75, with AADTT values ranging from 2500 to 8000. Sites in cluster 5 have a percentage of VC9 trucks between 55 and 70 with a percentage of VC5 trucks between 10 and 20. Most of the sites in this cluster are located on the interstates, with a few sites on US-23. The AADTT values range from 1200 to 3500.

5. Single axle load spectra (SALS) based on vehicle class 5 - 4 clusters

The single axle load spectra clusters for VC5 trucks are presented in Figure 4-14(a). Four clusters were formed and are directly related to the peaks observed in the data. For all the sites in the clusters, the first peak occurs at approximately 4 to 6 kips, while the second peak occurs at 8 to 10 kips. A review of the individual single axles for all VCs at all sites revealed that the axle load spectra are influenced not so much by the shape of the axle load spectra itself as by the actual distribution of the truck traffic, particularly the presence of VC5. Cluster 1 has an almost equal proportion of axles in the 4-6 kip range and the 8-10 kip range. Cluster 2 has a higher proportion of 4-6 kip axles than 8-10 kip axles. The sites in clusters 1 and 2 are a mixture of interstates and state routes and have varying AADTT levels. Cluster 3 has only one site (US-2) in the Upper Peninsula, and the pattern seen in the figure is unique to that site. Cluster 4 has sites with a higher proportion of axles in the 8-10 kip range than in the 4-6 kip range. All the sites in this cluster are located on I-94, I-96, and I-75, with AADTT levels ranging from 1500 to 8300.

6. Tandem axle load spectra (TALS) based on vehicle class 9 - 5 clusters

The overall tandem axle load spectra clusters can be seen in Figure 4-14(b). Five clusters resulted from the data. The two peaks in the clusters correspond to unloaded (9-14 kips) and loaded (30-33 kips) trucks. Clusters 1, 3, and 4 have more light axles than heavy axles, whereas clusters 2 and 5 have heavier tandem axles. Clusters 1, 3, and 4 consist mostly of secondary arterials and rural freeways scattered throughout the state. All sites have AADTT less than 3500. Nearly all sites in cluster 2 are located on major routes (I-94, I-96, and I-69) in the Lower Peninsula and have AADTT ranging from above 200 to 8000. Cluster 5 has sites mostly located on I-94 and I-96, with AADTT ranging from 400 to 5000.

Figure 4-14 Cluster averages (Level 2A) for various traffic inputs: (a) SALS, (b) TALS, (c) TRALS, (d) QALS [axes: load (kips) vs. frequency]

7. Tridem axle load spectra (TRALS) based on vehicle class 13 - 6 clusters

A total of six tridem axle load spectra clusters were generated, as shown in Figure 4-14(c). The general trend of the tridem axle clusters shows a large proportion of light axles around 12 kips, followed by a peak value around 40-45 kips. Sites in the first cluster have the lowest average AADTT and were primarily located on I-75, M-28, and I-94. Sites in cluster 2 were also mainly on I-94 and I-69, with AADTT ranging from 400 to 5200. All the sites in the other clusters have varying functional classes and AADTT levels.

8. Quad axle load spectra (QALS) based on vehicle class 13 - 3 clusters

The quad axle load spectra clusters can be seen in Figure 4-14(d). A total of three clusters were formed. Peak values for the quad axle load spectra occur in the 18-24 kip and 45-60 kip ranges.
Dominant characteristics could not be established for the clusters, as they have varying functional classifications and AADTT levels.

Figures 4-15 and 4-16 show the geographical distributions of the PTR locations and associated clusters for all traffic inputs.

Figure 4-15 Geographical distributions for PTRs by clusters for traffic inputs: (a) HDF, (b) MAF VC5, (c) MAF VC9, (d) VCD
Note: blue = cluster 1, green = cluster 2, red = cluster 3, orange = cluster 4, and black = cluster 5

Figure 4-16 Geographical distributions for PTRs by clusters for all traffic inputs: (a) SALS, (b) TALS, (c) TRALS, (d) QALS
Note: blue = cluster 1, green = cluster 2, red = cluster 3, orange = cluster 4, black = cluster 5, dark blue = cluster 6

4.2.1.5 Cluster Assignment Methodology

The next step after generating the clusters is to develop a cluster assignment methodology. A cluster assignment methodology (classification technique) usually involves developing a classification model that assigns new sites to one of the previously developed clusters. Examples of classification techniques include decision tree classifiers, discriminant analysis, neural networks, support vector machines, and naïve Bayes classifiers. The input data for a classification model includes a data array D, as shown below. Each row contains a data object, and the columns contain the attribute values and the class label of each data object.

$$D = \begin{bmatrix} x_{11} & \cdots & x_{1n} & y_1 \\ \vdots & \ddots & \vdots & \vdots \\ x_{m1} & \cdots & x_{mn} & y_m \end{bmatrix} \tag{22}$$

where:
$x_{ij}$ = value of the jth attribute of object i
$y_i$ = class label of object i

The attribute values (e.g., road class and development type) could be either discrete or continuous, while the class label (cluster) should always be discrete. All the classification techniques use a learning algorithm to identify a model that best fits the relationship between the attribute set and the class label of the input data. For this study, the attribute set includes various attributes of the PTR locations. The class labels are the pre-defined cluster numbers.
The models generated by the learning algorithms should fit the input data well and correctly predict the cluster of a new PTR location that the model has never seen before. The general approach in developing a model is to have a training set of PTR locations whose cluster numbers are known. This training set is used to develop a classification model. The model is then applied to a test set, which consists of PTR locations and cluster numbers that were not used to build the model. The evaluation of any classification model is based on the number of cluster numbers it predicts correctly, which can be presented in a tabular form called the confusion matrix (see Table 4-6).

Table 4-6 Confusion matrix for a dataset with two class labels (clusters)

| Actual class | Predicted class: Cluster = 1 | Predicted class: Cluster = 2 |
|---|---|---|
| Cluster = 1 | P11 | P12 |
| Cluster = 2 | P21 | P22 |

The diagonal elements (P11 and P22) are the accurate predictions, and the non-diagonal elements are the inaccurate predictions. One could use a performance metric of a model, such as the accuracy, defined below.

$$\text{Accuracy} = \frac{\text{Number of accurate predictions}}{\text{Total number of predictions}} = \frac{P_{11} + P_{22}}{P_{11} + P_{12} + P_{21} + P_{22}} \tag{23}$$

Consequently, the error rate is

$$\text{Error rate} = \frac{\text{Number of inaccurate predictions}}{\text{Total number of predictions}} = \frac{P_{12} + P_{21}}{P_{11} + P_{12} + P_{21} + P_{22}} \tag{24}$$

The error rate of a classification model can be divided into two categories: (a) training error and (b) testing error. The training error is the misclassification rate of the model on the training records. The testing error is the misclassification rate of the model on data that it has not seen before. A good classification model should have high accuracy and a low error rate. As previously mentioned, several techniques exist for building a classification model. A few of these models are discussed below.

Decision Tree Classifiers

Decision tree classifiers are widely used classification techniques and are very popular for their ease of use. An example of a decision tree can be seen in Figure 4-17.
A decision tree has three types of nodes: (a) a root node, which has no incoming edges and zero or more outgoing edges (Development Type), (b) internal nodes, which have exactly one incoming edge and two or more outgoing edges (Functional Class), and (c) leaf or terminal nodes, which have exactly one incoming edge and no outgoing edges. Each leaf node is assigned a cluster number. To predict the cluster number, the decisions must be followed from the root node to one of the leaf nodes. Many decision trees can be constructed from a given set of attributes. While some of the trees are more accurate than others, finding the optimal tree is computationally infeasible.

Figure 4-17 Example of a decision tree [root node: Development Type (Urban, Rural); internal nodes: Functional Class (Freeway, Non-Freeway), Number of Lanes (Two, Three), COHS (National, Regional); leaf nodes: Clusters 1 to 5]

A decision tree is grown in a recursive fashion by splitting the training data into purer subsets (i.e., a parent node is split into child nodes). Trees are grown from top to bottom by choosing the attribute that best splits the training data. Several metrics are used to measure the quality of the splits. If $p(i/k)$ represents the fraction of training data belonging to cluster i at node k, then various measures of impurity can be calculated as follows:

$$\text{Gini}(k) = 1 - \sum_{i=1}^{n} \left[p(i/k)\right]^2 \tag{25}$$

$$\text{Entropy}(k) = -\sum_{i=1}^{n} p(i/k)\,\log_2 p(i/k) \tag{26}$$

where:
$p(i/k)$ = the fraction of training data belonging to cluster i at node k
$n$ = number of clusters

The impurity of the parent node before splitting is compared to the impurities of the child nodes. The larger the difference, the better the split (4). The difference in the impurities is the gain and is calculated as follows.
$$\Delta = I(P) - \sum_{i=1}^{n} \frac{N(c_i)}{N}\, I(c_i) \tag{27}$$

where:
$N(c_i)$ = number of objects at child node $c_i$
$N$ = total number of objects at the parent node
$I(P)$ = impurity measure at the parent node
$I(c_i)$ = impurity at child node $c_i$
$n$ = number of child nodes

The advantages of decision trees are that they are easy to understand and interpret. They can handle both numerical and categorical input data and can be used to solve problems with multi-class labels (as opposed to certain techniques that can handle only binary class labels). However, decision tree classifiers can create long and complex trees that tend to overfit the data. Overfitting can occur due to a lack of representative samples or the presence of noise. A deep and complex decision tree (i.e., a tree with more nodes and leaves) has high accuracy for training data and very low accuracy for testing data, while a shallow tree behaves in the opposite way. The decision trees formed for HDF and TALS can be seen in Figures 4-18 and 4-19. Table 4-7 presents the error rates of the decision trees.

Table 4-7 Training and testing losses for single decision trees

| Model | HDF | MAF VC5 | MAF VC9 | VCD | SALS | TALS | TRALS | QALS |
|---|---|---|---|---|---|---|---|---|
| Entire Data | 0.00 | 0.00 | 0.05 | 0.02 | 0.00 | 0.00 | 0.12 | 0.00 |
| Training Data (75-25) | 0.10 | 0.06 | 0.03 | 0.06 | 0.03 | 0.00 | 0.10 | 0.06 |
| Testing Data (75-25) | 0.71 | 0.71 | 0.52 | 0.57 | 0.40 | 0.71 | 0.64 | 0.26 |

Figure 4-18 Single decision tree for HDF using entire data [splits on VC9, VC4, and VC6 percentages; leaf nodes: Clusters 1 to 5]

Ensemble Classifiers

Decision trees use a single classifier to predict the cluster number of unknown data. To improve the classification accuracy, multiple classifiers can be developed, and their predictions can be aggregated. Such techniques are called ensemble methods.
Figure 4-19 Single decision tree for TALS using entire data

Ensemble methods construct multiple classifiers from the training data, predict the class label with each of them, and select the most frequently predicted class label. Bagging and boosting are two examples of ensemble methods. Bagging (also known as bootstrap aggregating) is a technique that repeatedly samples (with replacement) from a data set according to a uniform probability distribution. Each bootstrap sample has the same size as the original data. On average, a bootstrap sample contains approximately 63% of the original training data because each record has a probability of $1 - (1 - 1/N)^N$ of being selected, which approaches $1 - 1/e \approx 0.632$ for large N. After training the classifiers, a data object is assigned to the cluster that receives the highest number of votes (4).

Random forests are also a class of ensemble methods, specifically designed for decision tree classifiers. They combine the predictions made by multiple decision trees, where each tree is generated based on the values of an independent set of random vectors. The fundamental difference between random forests and bagging is that, in random forests, only a subset of attributes is selected at random out of the total, and the best split feature from that subset is used to split each node in a tree, whereas in bagging all the attributes are considered for splitting a node. It was theoretically proven that the upper bound for the testing error of random forests converges to the following when the number of trees is sufficiently large (4):

$$\text{Testing error} \le \frac{\bar{\rho}\,(1 - s^2)}{s^2} \tag{28}$$

where:
$\bar{\rho}$ = average correlation among the trees
$s$ = average performance of the classifiers

Random forests were developed using the clustered traffic data. The number of trees grown was 500.
Table 4-8 presents the training and testing losses for the classification models of all traffic inputs. One of the trees grown as part of the random forests for HDF can be seen in Figure 4-20, while one of the trees for TALS can be seen in Figure 4-21. Note that the data from only 41 WIM sites are used for the models. Although random forests perform better than a single decision tree, the accuracy cannot be improved drastically unless there are more WIM sites or more data become available that describe these 41 PTR sites better.

Table 4-8 Training and testing losses for random forests

| Model | HDF | MAF VC5 | MAF VC9 | VCD | SALS | TALS | TRALS | QALS |
|---|---|---|---|---|---|---|---|---|
| Entire Data | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Training Data (75-25) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Testing Data (75-25) | 0.48 | 0.68 | 0.19 | 0.57 | 0.26 | 0.52 | 0.64 | 0.26 |

Figure 4-20 Random forests (100th) tree for HDF using entire data

Figure 4-21 Random forests (180th) tree for TALS using entire data

Naïve Bayes Classifiers

In most applications, the relationship between the attribute set and the class variable is nondeterministic. The class label of a test record cannot be predicted with certainty even though its attribute set is identical to some of the training examples. This situation may arise because of noisy data or the presence of certain confounding factors that affect classification but are not included in the analysis.
In this classification technique, during the training phase, the posterior probabilities $P(Y|X)$ for every combination of attributes and the corresponding clusters are determined from the training data. Based on these probabilities, a new site $X'$ can be classified by finding the cluster $Y'$ that maximizes the posterior probability $P(Y'|X')$.

Estimating the posterior probabilities accurately for every possible combination of the class label and the attribute values is a difficult problem because it requires a very large training set, even for a moderate number of attributes. The Bayes theorem is useful because it allows the posterior probability to be expressed in terms of the prior probability $P(Y)$, the class-conditional probability $P(X|Y)$, and the evidence $P(X)$:

$$P(Y|X) = \frac{P(X|Y)\,P(Y)}{P(X)} \tag{29}$$

When comparing the posterior probabilities for different values of Y (clusters), the denominator term, $P(X)$, is always constant and thus can be ignored. The prior probability $P(Y)$ can be easily estimated from the training set by computing the fraction of training records that belong to each cluster. To estimate the class-conditional probabilities $P(X|Y)$, the naïve Bayes classifier can be used. A naïve Bayes classifier estimates the class-conditional probability by assuming that the attributes are conditionally independent, given the cluster Y. The conditional independence assumption can be formally stated as follows:

$$P(X \mid Y = y) = \prod_{i=1}^{d} P(X_i \mid Y = y) \tag{30}$$

where each attribute set $X = \{X_1, X_2, \ldots, X_d\}$ consists of d attributes. With the conditional independence assumption, instead of computing the class-conditional probability for every combination of X, only the conditional probability of each attribute $X_i$, given the cluster Y, needs to be estimated. This approach does not require a very large training set to obtain a good estimate of the probability.
To classify a test record, the naïve Bayes classifier computes the posterior probability for each class $Y$:

$$P(Y \mid X) = \frac{P(Y)\prod_{i=1}^{d} P(X_i \mid Y)}{P(X)} \tag{31}$$

Since $P(X)$ is fixed for every $Y$, it is enough to choose the class that maximizes the numerator term, $P(Y)\prod_{i=1}^{d} P(X_i \mid Y)$. For a categorical attribute $X_i$, the conditional probability $P(X_i = x_i \mid Y = y)$ is estimated as the fraction of training instances in class $y$ that take on the particular attribute value $x_i$. Continuous attributes are discretized, and each value is replaced with its corresponding discrete interval. This approach transforms the continuous attributes into ordinal attributes. The conditional probability $P(X_i \mid Y = y)$ is then estimated by computing the fraction of training records belonging to cluster $y$ that fall within the corresponding interval for $x_i$ (4).

Naïve Bayes classifiers generally have the following characteristics:
(a) They are robust to isolated noise points because such points are averaged out when estimating conditional probabilities from data. Naïve Bayes classifiers can also handle missing values by ignoring the example during model building and classification.
(b) They are robust to irrelevant attributes. If $X_i$ is an irrelevant attribute, then $P(X_i \mid Y)$ becomes almost uniformly distributed, and the class-conditional probability for $X_i$ has no impact on the overall computation of the posterior probability.
(c) Correlated attributes can degrade the performance of naïve Bayes classifiers because the conditional independence assumption no longer holds for such attributes (4).

Table 4-9 presents the training and testing losses for the naïve Bayes classification models of all traffic inputs. It can be seen that while the training errors reduced, significant errors on the testing data still arise. This is due to the lack of sufficient data and to many clusters having only one site in them.
Table 4-9 Training and testing losses for naïve Bayes classifier
| Model | HDF | MAF VC5 | MAF VC9 | VCD | SALS | TALS | TRALS | QALS |
|---|---|---|---|---|---|---|---|---|
| Entire Data | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Training Data (75-25) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Testing Data (75-25) | 0.71 | 0.71 | 0.55 | 0.57 | 0.55 | 0.68 | 0.64 | 0.26 |

The advantages of cluster analyses are that the groups are formed objectively (based on a mathematical function) and are not subject to bias or subjective decisions. The other advantage is that cluster analysis finds patterns in the data that are not intuitively obvious, therefore providing new insights into traffic patterns in a region. One main disadvantage is the lack of guidelines for establishing the optimal number of clusters. While various techniques have been developed to identify the optimal number of clusters, none of them is perfect, and each has its drawbacks. The other disadvantage is that, since clustering is purely a mathematical technique, the objects in a cluster might not have the same identifiable attributes, making the assignment of data from a new location to the existing clusters difficult.

Single decision tree classifiers are very convenient to use but have errors (resubstitution errors) when the entire data set is used and even higher errors when the data are split into testing and training sets. Random forests and naïve Bayes classifiers were used to reduce the classification error rate. While they do a better job of reducing the training data errors, the testing data errors still exist. These errors could be attributed to limited data and, in many instances, to clusters having very few sites within them (as low as one site in some cases). Hence, it was deemed necessary to develop an alternative methodology for developing Level 2 traffic inputs that is intuitive and easy to use. The inputs developed using this alternative methodology are called Level 2B inputs and are discussed in the next section.
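The effect of clusters with very few sites on the testing loss can be illustrated with a small sketch. If a cluster contains a single site and that site falls in the testing set, the cluster is absent from the training set, so no classifier can predict it. The function `unseen_class_loss` and the cluster labels below are hypothetical; the sketch gives only a lower bound on the testing loss for a given split and is not the loss calculation used in this study.

```python
from collections import Counter


def unseen_class_loss(train_labels, test_labels):
    """Fraction of test records whose cluster never appears in the training set.

    Such records are misclassified by any classifier trained on train_labels,
    so this fraction is a lower bound on the testing loss for the split.
    """
    if not test_labels:
        return 0.0
    seen = set(train_labels)
    missed = sum(1 for label in test_labels if label not in seen)
    return missed / len(test_labels)


# Hypothetical cluster assignments for eight sites (not the study data)
all_labels = ["C1", "C1", "C1", "C2", "C2", "C3", "C4", "C5"]
singletons = [c for c, n in Counter(all_labels).items() if n == 1]

# One possible 75-25 split: six training sites and two testing sites
train = ["C1", "C1", "C2", "C2", "C3", "C5"]
test = ["C1", "C4"]

print(singletons)                      # clusters with only one site
print(unseen_class_loss(train, test))  # guaranteed share of test errors
```

In this split, one of the two testing sites belongs to a cluster that has no training record, so at least half of the testing records are misclassified regardless of the classifier.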
4.2.2 Level 2B Inputs

Grouping roads by attributes is more subjective and involves identifying roads that are expected to behave similarly (i.e., to have similar traffic patterns). Attributes of the roadways (e.g., road class: freeway vs. non-freeway) can be used to identify groups that have similar traffic patterns. Groups based on these attributes are easy for users to interpret. The roadways that have unique traffic patterns could consist of the same functional classification. Attributes specific to a State road network could be used to define road subgroups. The results of data analyses and knowledge of specific route information should determine the appropriate number of these groups. The Traffic Monitoring Guide (TMG) recommends a minimum of three to six groups, but more groups may be appropriate if significant regional differences exist (22). The advantages of this methodology are that the creation of groups is intuitive and that the short-term count from a new site can be used to assign it to an existing group. The drawback of this process is that it is not entirely objective (it involves many subjective decisions, which may not explain the variability of traffic patterns within a group).

The attributes used to classify groups need not be the same for all the traffic inputs and instead should depend on the type of traffic input. For monthly adjustment and hourly distribution factors, the primary groups could be based on road class and development type, such as (a) rural interstates, (b) rural non-interstates, (c) urban interstates, and (d) urban non-interstates. The TMG recommends adding a fifth group consisting of roadways used by recreational traffic (22). Recreational traffic patterns should be identified using local knowledge of specific locations that could generate recreational traffic; they cannot be defined based on functional class or area boundaries.
An SHA could further expand these four groups, but more groups would require more data (more WIM and classification sites), which in turn would increase the cost to the agency. The coefficient of variation of monthly patterns in urban areas is usually under 10 percent, while in rural areas it ranges between 10 and 25 percent. Values higher than 25 percent indicate highly variable travel patterns, which reflect recreational patterns but may also be due to reasons other than recreational travel. For MAF and HDF, functional class (freeway or non-freeway), development type (rural or urban), and geographic location within the State have been the traditional characteristics for grouping the roadways. Recreational (or geographic) designations can be used for roads that are occasionally affected by sizeable recreational traffic generators (22).

For VCD, the characteristics of roadways to be considered for groupings should be different. Previous research has found that the functional class of a roadway has an inconsistent relationship with truck travel patterns (23; 24). The amount of long-distance truck traffic versus the amount of locally oriented truck traffic significantly affects the truck traffic patterns on a route. Also, the existence of significant truck traffic generators along a roadway, such as agriculture or significant industrial activity, can affect these patterns. Functional road classification helps only in a limited way to differentiate between roads with heavy through-traffic and those with only local traffic. Typically, interstates and principal arterials have higher through-truck traffic volumes. However, there are roads with lower functional classifications that carry more through-truck volumes than the interstates and principal arterials. Developing these groups also requires an understanding of the freight movements in the State (23; 24).
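The coefficient of variation thresholds quoted above (under 10 percent, 10 to 25 percent, and above 25 percent) can be applied to a set of monthly factors as in the following sketch. It assumes the population standard deviation of the twelve monthly values, which the text does not specify, and the monthly factors shown are hypothetical. The function names are illustrative only.

```python
from statistics import mean, pstdev


def monthly_cv_percent(monthly_factors):
    """Coefficient of variation (percent) of twelve monthly factors.

    Assumption: the population standard deviation is used; a sample
    standard deviation would give slightly larger values.
    """
    return 100.0 * pstdev(monthly_factors) / mean(monthly_factors)


def describe_pattern(cv_percent):
    """Map a CV value to the ranges quoted in the text."""
    if cv_percent < 10.0:
        return "typical of urban areas"
    if cv_percent <= 25.0:
        return "typical of rural areas"
    return "highly variable (possibly recreational)"


# Hypothetical monthly factors, January to December (not the study data)
flat_site = [1.0] * 12
seasonal_site = [0.5, 0.5, 0.6, 0.8, 1.0, 1.4, 1.8, 1.8, 1.3, 1.0, 0.7, 0.6]

for name, factors in (("flat", flat_site), ("seasonal", seasonal_site)):
    cv = monthly_cv_percent(factors)
    print(name, round(cv, 1), describe_pattern(cv))
```

A high value only flags a site for review; as noted above, variability above 25 percent may also arise for reasons other than recreational travel.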
Communication with the staff at the SHA would help in identifying the local and statewide patterns. Cluster analyses could be used to identify the traffic patterns initially, and each cluster could then be examined to gain a fundamental idea of the roadway characteristics of the PTR locations in that cluster.

Based on the above discussion, for Level 2B inputs, the challenge lies in identifying a combination of attributes that can be used to group the PTR locations. The traffic patterns at the PTR locations should be similar within a group and different between the groups. An automated process was developed to help identify such combinations of attributes. The following attributes from MDOT's sufficiency database were identified for grouping different PTR locations:
- Functional classification (Freeway vs. Non-Freeway)
- Development type (Urban vs. Rural)
- One-way AADTT level (<1000, 1000-3000, >3000)
- Corridors of the highest significance, COHS (National, Regional, and Statewide)
- Number of lanes (2, 3, and 4)
- Road type (Non-freeway divided, Non-freeway undivided, and Freeway)
- Vehicle class 9 (VC9) distribution levels (<45, 45-70, >70)

Several attributes can be chosen at a time to divide the PTR locations into groups. Tables 4-10 to 4-12 list the possible 2-, 3-, and 4-way combinations of the attributes listed above. Each attribute has sublevels (e.g., functional classification has two sublevels, freeway and non-freeway), and hence a combination of attributes has different sublevel combinations. The Level 1 traffic inputs of the PTR sites belonging to a combination of sublevels are averaged. For example, the VCD traffic inputs for the combination of functional class and development type (2-way combination) can be seen in Table 4-13.
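The enumeration of attribute combinations and of their possible sublevel combinations can be sketched as follows. The attribute names and sublevels are taken from the list above; the dictionary `SUBLEVELS` and the two helper functions are hypothetical names introduced for illustration and are not the automated process developed in this study.

```python
from itertools import combinations
from math import comb, prod

# Attributes and sublevels as listed in the text
SUBLEVELS = {
    "Functional class": ["Freeway", "Non-freeway"],
    "Development type": ["Urban", "Rural"],
    "Commercial AADT": ["<1000", "1000-3000", ">3000"],
    "COHS": ["National", "Regional", "Statewide"],
    "Number of lanes": ["2", "3", "4"],
    "Road type": ["Non-freeway divided", "Non-freeway undivided", "Freeway"],
    "VCD level": ["<45", "45-70", ">70"],
}


def attribute_combinations(k):
    """All ways of choosing k attributes out of the seven available."""
    return list(combinations(SUBLEVELS, k))


def possible_sublevel_combinations(attributes):
    """Number of sublevel combinations if every combination had a PTR site."""
    return prod(len(SUBLEVELS[name]) for name in attributes)


for k in (2, 3, 4):
    combos = attribute_combinations(k)
    # The count agrees with the binomial coefficient 7-choose-k
    print(k, len(combos), comb(len(SUBLEVELS), k))

print(possible_sublevel_combinations(("Functional class", "Development type")))
print(possible_sublevel_combinations(("VCD level", "Development type")))
```

The number of possible sublevel combinations grows multiplicatively with each added attribute, while the number of PTR sites stays fixed, so the sites available per group shrink quickly.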
Table 4-10 Possible combination of attributes when chosen two at a time
| Attribute 1 | Attribute 2 |
|---|---|
| Functional Class | Road Type |
| Functional Class | Number of Lanes |
| Functional Class | Commercial AADT |
| Functional Class | COHS |
| Functional Class | Development Type |
| Functional Class | VCD Level |
| Road Type | Number of Lanes |
| Road Type | Commercial AADT |
| Road Type | COHS |
| Road Type | Development Type |
| Road Type | VCD Level |
| Number of Lanes | Commercial AADT |
| Number of Lanes | COHS |
| Number of Lanes | Development Type |
| Number of Lanes | VCD Level |
| Commercial AADT | COHS |
| Commercial AADT | Development Type |
| Commercial AADT | VCD Level |
| COHS | Development Type |
| COHS | VCD Level |
| Development Type | VCD Level |
Note: There are 21 2-way attribute combinations when a total of seven attributes are available (based on 7C2 = 21).

Table 4-11 Possible combination of attributes when chosen three at a time
| Attribute 1 | Attribute 2 | Attribute 3 |
|---|---|---|
| Functional Class | Road Type | Number of Lanes |
| Functional Class | Road Type | Commercial AADT |
| Functional Class | Road Type | COHS |
| Functional Class | Road Type | Development Type |
| Functional Class | Road Type | VCD Level |
| Functional Class | Number of Lanes | Commercial AADT |
| Functional Class | Number of Lanes | COHS |
| Functional Class | Number of Lanes | Development Type |
| Functional Class | Number of Lanes | VCD Level |
| Functional Class | Commercial AADT | COHS |
| Functional Class | Commercial AADT | Development Type |
| Functional Class | Commercial AADT | VCD Level |
| Functional Class | COHS | Development Type |
| Functional Class | COHS | VCD Level |
| Functional Class | Development Type | VCD Level |
| Road Type | Number of Lanes | Commercial AADT |
| Road Type | Number of Lanes | COHS |
| Road Type | Number of Lanes | Development Type |
| Road Type | Number of Lanes | VCD Level |
| Road Type | Commercial AADT | COHS |
| Road Type | Commercial AADT | Development Type |
| Road Type | Commercial AADT | VCD Level |
| Road Type | COHS | Development Type |
| Road Type | COHS | VCD Level |
| Road Type | Development Type | VCD Level |
| Number of Lanes | Commercial AADT | COHS |
| Number of Lanes | Commercial AADT | Development Type |
| Number of Lanes | Commercial AADT | VCD Level |
| Number of Lanes | COHS | Development Type |
| Number of Lanes | COHS | VCD Level |
| Number of Lanes | Development Type | VCD Level |
| Commercial AADT | COHS | Development Type |
| Commercial AADT | COHS | VCD Level |
| Commercial AADT | Development Type | VCD Level |
| COHS | Development Type | VCD Level |
Note: There are 35 3-way combinations (based on 7C3 = 35).

Table 4-12 Possible combination of attributes when chosen four at a time
| Attribute 1 | Attribute 2 | Attribute 3 | Attribute 4 |
|---|---|---|---|
| Functional Class | Road Type | Number of Lanes | Commercial AADT |
| Functional Class | Road Type | Number of Lanes | COHS |
| Functional Class | Road Type | Number of Lanes | Development Type |
| Functional Class | Road Type | Number of Lanes | VCD Level |
| Functional Class | Road Type | Commercial AADT | COHS |
| Functional Class | Road Type | Commercial AADT | Development Type |
| Functional Class | Road Type | Commercial AADT | VCD Level |
| Functional Class | Road Type | COHS | Development Type |
| Functional Class | Road Type | COHS | VCD Level |
| Functional Class | Road Type | Development Type | VCD Level |
| Functional Class | Number of Lanes | Commercial AADT | COHS |
| Functional Class | Number of Lanes | Commercial AADT | Development Type |
| Functional Class | Number of Lanes | Commercial AADT | VCD Level |
| Functional Class | Number of Lanes | COHS | Development Type |
| Functional Class | Number of Lanes | COHS | VCD Level |
| Functional Class | Number of Lanes | Development Type | VCD Level |
| Functional Class | Commercial AADT | COHS | Development Type |
| Functional Class | Commercial AADT | COHS | VCD Level |
| Functional Class | Commercial AADT | Development Type | VCD Level |
| Functional Class | COHS | Development Type | VCD Level |
| Road Type | Number of Lanes | Commercial AADT | COHS |
| Road Type | Number of Lanes | Commercial AADT | Development Type |
| Road Type | Number of Lanes | Commercial AADT | VCD Level |
| Road Type | Number of Lanes | COHS | Development Type |
| Road Type | Number of Lanes | COHS | VCD Level |
| Road Type | Number of Lanes | Development Type | VCD Level |
| Road Type | Commercial AADT | COHS | Development Type |
| Road Type | Commercial AADT | COHS | VCD Level |
| Road Type | Commercial AADT | Development Type | VCD Level |
| Road Type | COHS | Development Type | VCD Level |
| Number of Lanes | Commercial AADT | COHS | Development Type |
| Number of Lanes | Commercial AADT | COHS | VCD Level |
| Number of Lanes | Commercial AADT | Development Type | VCD Level |
| Number of Lanes | COHS | Development Type | VCD Level |
| Commercial AADT | COHS | Development Type | VCD Level |
Note: There are 35 4-way combinations (based on 7C4 = 35).

Table 4-13 VCD traffic inputs for the combination of functional class and development type
| Functional class | Development type | VC4 | VC5 | VC6 | VC7 | VC8 | VC9 | VC10 | VC11 | VC12 | VC13 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Freeway | Rural | 1.6 | 14.8 | 3.5 | 0.4 | 4.1 | 62.4 | 6.7 | 1.3 | 0.6 | 4.7 |
| Freeway | Urban | 1.5 | 18.4 | 5.1 | 0.8 | 5.4 | 55.7 | 6.1 | 1.3 | 0.6 | 5.0 |
| Non-freeway | Rural | 2.3 | 25.0 | 4.7 | 0.9 | 6.1 | 40.8 | 7.4 | 0.8 | 0.4 | 11.5 |
| Non-freeway | Urban | 0.8 | 18.8 | 4.7 | 0.7 | 5.1 | 49.8 | 11.2 | 1.8 | 0.3 | 6.8 |

Pairwise Euclidean distances between the sublevel combinations were calculated to identify the combinations of attributes that show different traffic patterns. Pairwise distances for the sublevel combinations in Table 4-13 are shown in Table 4-14.

Table 4-14 Pairwise Euclidean distances between the sublevel combinations
| Sublevel combination | Freeway_Rural | Freeway_Urban | Non-freeway_Rural | Non-freeway_Urban |
|---|---|---|---|---|
| Freeway_Rural | 0.0 | 8.0 | 24.9 | 14.2 |
| Freeway_Urban | 8.0 | 0.0 | 17.6 | 8.0 |
| Non-freeway_Rural | 24.9 | 17.6 | 0.0 | 12.6 |
| Non-freeway_Urban | 14.2 | 8.0 | 12.6 | 0.0 |

The maximum distance between the sublevel combinations increases with the number of attributes used for grouping. However, the higher the number of attributes used for grouping, the lower the number of PTR locations in each sublevel combination. For example, in Table 4-15, the total number of sublevel combinations should have been 12 [Road type (4) x VCD level (3)]; however, due to a limited number of PTR locations, only 7 sublevel combinations exist. When three attributes are chosen (see Table 4-16), only 10 out of a possible 18 sublevel combinations exist.
Similarly, when four attributes are chosen (see Table 4-17), only 14 out of a possible 72 sublevel combinations exist, and many of them have only one or two PTR locations. Hence, it is more appropriate to use only two-attribute combinations to form road groups for developing Level 2B inputs.

Table 4-15 Number of PTR locations in each sublevel combination (2-way) for road type/VCD level combination
| Road type | VCD level | Number of PTR locations |
|---|---|---|
| Divided (partial or no access control) | Low VC9 | 2 |
| Freeway (full access control) | High VC9 | 10 |
| Freeway (full access control) | Low VC9 | 6 |
| Freeway (full access control) | Medium VC9 | 15 |
| Two travel lanes with the center left-turn lane | Medium VC9 | 1 |
| Two-way undivided (any number of lanes) | Low VC9 | 4 |
| Two-way undivided (any number of lanes) | Medium VC9 | 3 |

Table 4-16 Number of PTR locations in each sublevel combination (3-way) for number of lanes/development type/VCD level combination
| Number of lanes | Development type | VCD level | Number of PTR locations |
|---|---|---|---|
| Four | Rural | Medium VC9 | 1 |
| Three | Rural | High VC9 | 1 |
| Three | Rural | Medium VC9 | 2 |
| Three | Urban | High VC9 | 1 |
| Three | Urban | Medium VC9 | 2 |
| Two | Rural | High VC9 | 8 |
| Two | Rural | Low VC9 | 11 |
| Two | Rural | Medium VC9 | 11 |
| Two | Urban | Low VC9 | 1 |
| Two | Urban | Medium VC9 | 3 |

The next step is to obtain the pairwise distances between sublevel combinations and to identify the missing ones for each attribute combination. The descriptive statistics of these distances for each two-way combination of attributes (all 7 attributes and 21 combinations), the number of missing sublevel combinations, and the number of combinations with only one PTR site are listed in Table 4-18 for VCD. Similar data for other traffic inputs were evaluated to identify the attributes. After careful evaluation of the results, the following attribute combinations were chosen based on the availability of the sublevel combinations and the distances between them.
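The pairwise distance step can be reproduced in a short sketch using the group-average VCD values printed in Table 4-13. Because those values are rounded to one decimal place, the computed distances may differ from Table 4-14 by about 0.1. The dictionary `VCD_GROUPS` and the function `pairwise_distances` are illustrative names, not part of the automated process developed in this study.

```python
from itertools import combinations
from math import dist

# Group-average VCD (percent, VC4 to VC13) as printed in Table 4-13
VCD_GROUPS = {
    "Freeway_Rural": [1.6, 14.8, 3.5, 0.4, 4.1, 62.4, 6.7, 1.3, 0.6, 4.7],
    "Freeway_Urban": [1.5, 18.4, 5.1, 0.8, 5.4, 55.7, 6.1, 1.3, 0.6, 5.0],
    "Non-freeway_Rural": [2.3, 25.0, 4.7, 0.9, 6.1, 40.8, 7.4, 0.8, 0.4, 11.5],
    "Non-freeway_Urban": [0.8, 18.8, 4.7, 0.7, 5.1, 49.8, 11.2, 1.8, 0.3, 6.8],
}


def pairwise_distances(groups):
    """Euclidean distance between every pair of sublevel combinations."""
    return {(a, b): dist(groups[a], groups[b]) for a, b in combinations(groups, 2)}


distances = pairwise_distances(VCD_GROUPS)
for pair, value in distances.items():
    print(pair, round(value, 1))

# Descriptive statistics of the kind summarized per attribute combination
largest = max(distances.values())
smallest = min(distances.values())
print(round(largest, 1), round(smallest, 1), round(largest - smallest, 1))
```

A large maximum distance indicates that at least two groups have clearly different traffic patterns, whereas a small minimum distance suggests that some groups could be merged without much loss.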
The traffic data of all the PTR sites in each of the sublevel combinations (road groups) for the chosen attributes are averaged to obtain the Level 2B inputs. Figure 4-22 shows the averages of road groups for various traffic inputs.

Table 4-17 Number of PTR locations in each sublevel combination (4-way) for road type/number of lanes/development type/VCD level combination
| Road type | Number of lanes | Development type | VCD level | Number of PTR locations |
|---|---|---|---|---|
| Divided (partial or no access control) | Two | Rural | Low VC9 | 2 |
| Freeway (full access control) | Four | Rural | Medium VC9 | 1 |
| Freeway (full access control) | Three | Rural | High VC9 | 1 |
| Freeway (full access control) | Three | Rural | Medium VC9 | 2 |
| Freeway (full access control) | Three | Urban | High VC9 | 1 |
| Freeway (full access control) | Three | Urban | Medium VC9 | 2 |
| Freeway (full access control) | Two | Rural | High VC9 | 8 |
| Freeway (full access control) | Two | Rural | Low VC9 | 5 |
| Freeway (full access control) | Two | Rural | Medium VC9 | 8 |
| Freeway (full access control) | Two | Urban | Low VC9 | 1 |
| Freeway (full access control) | Two | Urban | Medium VC9 | 2 |
| Two travel lanes with the center left-turn lane | Two | Urban | Medium VC9 | 1 |
| Two-way undivided (any number of lanes) | Two | Rural | Low VC9 | 4 |
| Two-way undivided (any number of lanes) | Two | Rural | Medium VC9 | 3 |

a) Hourly distribution factors: VCD Level and Development Type
The attributes of VCD level and development type resulted in six groups for HDF, as shown in Figure 4-22(a). The sites having low VC9 levels in urban areas have the highest peak among all groups between 8:00 am and 4:00 pm, suggesting local traffic patterns. The sites in this group are state routes with an AADTT of less than 1300. Sites having high VC9 levels have the flattest peaks in both urban and rural areas, suggesting long-haul traffic patterns. All the sites in the high VC9 groups are on interstate routes.
b) Monthly adjustment factors: Commercial AADT and Development Type
The attributes of commercial AADT and development type resulted in six groups of inputs for MAFs, as shown in Figure 4-22(b) and (c). Almost all the groups have similar MAF patterns for VC5, except for sites with low AADTT in rural areas, which suggest seasonal traffic patterns. Almost all the sites in this group are on US-2, US-12, and US-127, with an AADTT level of less than 1000. No differences in MAFs for VC9 trucks were found between the groups, and the values are always close to 1.

Table 4-18 Descriptive statistics of the pairwise distances between the sublevel combinations for various attribute combinations (VCD)
| Attribute 1 | Attribute 2 | Distance: Max | Distance: Min | Distance: Avg. | Distance: Std. | Distance: Range | Sublevel combinations: Total | Available | With only one PTR location | Missing |
|---|---|---|---|---|---|---|---|---|---|---|
| VCD Level | Development Type | 50.0 | 1.5 | 26.5 | 15.1 | 48.5 | 6 | 6 | 0 | 0 |
| Commercial AADT | Development Type | 46.8 | 1.9 | 21.3 | 11.9 | 44.9 | 6 | 6 | 0 | 0 |
| COHS | Development Type | 28.3 | 2.9 | 18.0 | 7.1 | 25.4 | 6 | 6 | 1 | 0 |
| Road Type | Development Type | 27.7 | 4.8 | 17.6 | 6.8 | 22.9 | 6 | 6 | 3 | 0 |
| Functional Class | Development Type | 25.9 | 4.8 | 17.2 | 7.4 | 21.1 | 4 | 4 | 0 | 0 |
| Functional Class | VCD Level | 50.4 | 6.7 | 25.5 | 14.4 | 43.7 | 6 | 5 | 0 | 1 |
| Number of Lanes | Development Type | 41.6 | 4.3 | 19.0 | 16.1 | 37.3 | 6 | 5 | 2 | 1 |
| Functional Class | Commercial AADT | 40.2 | 5.4 | 21.1 | 11.5 | 34.8 | 6 | 5 | 0 | 1 |
| Functional Class | COHS | 28.7 | 8.0 | 15.9 | 6.7 | 20.7 | 6 | 5 | 0 | 1 |
| COHS | VCD Level | 54.5 | 2.5 | 23.0 | 13.2 | 52.1 | 9 | 7 | 1 | 2 |
| Commercial AADT | VCD Level | 50.3 | 1.9 | 24.3 | 14.3 | 48.4 | 9 | 7 | 0 | 2 |
| Commercial AADT | COHS | 42.0 | 4.2 | 22.0 | 11.9 | 37.9 | 9 | 7 | 1 | 2 |
| Road Type | COHS | 28.7 | 2.1 | 16.1 | 7.7 | 26.7 | 9 | 7 | 3 | 2 |
| Functional Class | Number of Lanes | 28.0 | 6.9 | 15.8 | 7.8 | 21.2 | 6 | 4 | 1 | 2 |
| Road Type | VCD Level | 56.1 | 7.6 | 25.3 | 14.0 | 48.5 | 9 | 6 | 0 | 3 |
| Number of Lanes | VCD Level | 53.2 | 1.7 | 23.9 | 14.7 | 51.4 | 9 | 6 | 1 | 3 |
| Road Type | Commercial AADT | 42.7 | 5.4 | 21.0 | 10.9 | 37.3 | 9 | 6 | 0 | 3 |
| Number of Lanes | Commercial AADT | 42.3 | 1.1 | 18.1 | 11.0 | 41.1 | 9 | 6 | 1 | 3 |
| Functional Class | Road Type | 23.1 | 9.7 | 17.8 | 7.2 | 13.5 | 6 | 3 | 0 | 3 |
| Road Type | Number of Lanes | 28.8 | 6.9 | 16.4 | 7.5 | 22.0 | 9 | 5 | 1 | 4 |
| Number of Lanes | COHS | 26.1 | 5.5 | 14.9 | 7.0 | 20.6 | 9 | 5 | 1 | 4 |
Note: Shaded cells indicate the selected attribute combination for the generation of Level 2B inputs.

Figure 4-22 Group averages (Level 2B) for various traffic inputs: (a) HDF versus hour, by VC9 level and development type; (b) MAF (VC5) versus month, by AADTT level and development type; (c) MAF (VC9) versus month, by AADTT level and development type; (d) VCD, percentage versus vehicle class, by VC9 level and development type

c) Vehicle class distribution: VCD Level and Development Type
The attributes of VCD level and development type resulted in six groups for VCD, as shown in Figure 4-22(d). Since the attribute used is VCD level, three distinct patterns can be seen with varying levels of VC9, irrespective of the development type. All the sites in the high VC9 groups are located on interstates, while most of the sites in the low VC9 groups are located on state routes. Sites in the medium VC9 groups are a mix of interstates and state routes in rural and urban areas.

d) Single Axle Load Spectra: COHS and Development Type
The attributes of COHS and development type resulted in six groups for SALS, as shown in Figure 4-23(a). For all the sites in the different groups, the first peak occurs at approximately 4-6 kips, while the second peak occurs at 8-10 kips.
Road groups in urban areas have an almost equal proportion of axles in the 4-6 kip range and the 8-10 kip range, while the sites in rural areas have a higher proportion of 4-6 kip axles than 8-10 kip axles. The road group of the regional corridor in the urban area has only one site, on US-2, with a unique loading pattern.

e) Tandem Axle Load Spectra: Number of Lanes and Development Type
The attributes of number of lanes and development type resulted in five groups for TALS, as shown in Figure 4-23(b). The two peaks seem to correspond to unloaded (9-14 kips) and loaded (30-33 kips) tandem axles. Other characteristics could not be established for the groups because they have varying functional classifications and AADTT levels and because some groups have only one site.

Figure 4-23 Group averages (Level 2B) for various traffic inputs: (a) SALS, (b) TALS, (c) TRALS, (d) QALS; each panel plots frequency versus axle load (kips), by COHS and development type for (a), (c), and (d) and by number of lanes and development type for (b)

f) Tridem Axle Load Spectra: COHS and Development Type
The attributes of COHS and development type resulted in six groups for TRALS, as shown in Figure 4-23(c). The general trend of the tridem axle groups appears to be a large proportion of light axles around 12 kips, followed by a peak around 40-45 kips. All the sites in the national corridors are located on interstates, while the sites on regional and statewide corridors are on state routes with varying AADTT levels, irrespective of the development type.
g) Quad Axle Load Spectra: COHS and Development Type
The attributes of COHS and development type resulted in six groups for QALS, as shown in Figure 4-23(d). Again, all the sites in national corridors are on interstates, while the sites on regional and statewide corridors are on state routes with varying AADTT levels, irrespective of the development type.

4.2 SUMMARY

Site-specific traffic inputs (Level 1) were generated for each of the 41 WIM sites using PrepME after extensive QC checks. Development of regional inputs (Level 2) is crucial when site-specific data are not available. The averages from nearby sites (regional) with similar traffic characteristics (groups or clusters) can be used as Level 2 data (2). Two approaches were adopted for developing Level 2 inputs: (a) cluster analyses (Level 2A) and (b) grouping roads with similar attributes (Level 2B). Also, Level 3 data are further split into Levels 3A and 3B, where 3A represents the averages of freeways and non-freeways, and 3B represents the overall statewide average for traffic inputs.

The advantages of cluster analyses are that the groups are formed objectively (based on a mathematical function) and are not subject to bias or subjective decisions. Also, cluster analysis finds patterns in the data that are not intuitively obvious, therefore providing new insights into traffic patterns in a region. One main disadvantage is the lack of guidelines for establishing the optimal number of clusters. While various techniques have been developed to identify the optimal number of clusters, none of them is perfect, and each has its drawbacks.
The other disadvantage is that, since clustering is purely a mathematical technique, the sites in a cluster might not have the same identifiable attributes, making the assignment of data from a new location to the existing clusters difficult. After comparing several clustering techniques, the hierarchical clustering method, with Euclidean distance as the similarity measure and Ward's method as the linkage method, was used to cluster the traffic data. The Calinski-Harabasz criterion, the gap criterion, and engineering judgment were used to determine the optimal number of clusters for each traffic input. Classification models such as decision trees, random forests, and naïve Bayes classifiers were developed to assign a new site to these clusters. Since clustering is purely a mathematical technique, the sites in a cluster did not have the same identifiable attributes, which made the assignment of a new site to the existing clusters difficult.

Grouping roads by attributes is more subjective and involves identifying roads that are expected to behave similarly (i.e., to have similar traffic patterns). Attributes of the roadways (e.g., road class: freeway vs. non-freeway) can be used to identify groups that have similar traffic patterns. Groups based on these attributes are easy for users to interpret. A minimum of three to six groups is required, but more groups may be appropriate if significant regional differences exist (22). The advantages of this methodology are that the creation of groups is intuitive and that the short-term count from a new site can be used to assign it to an existing group. The drawback of this process is that it is not entirely objective (it involves many subjective decisions, which may not explain the variability of traffic patterns within a group). Also, the challenge lies in identifying a combination of attributes that can be used to group the PTR locations.
The traffic patterns at the PTR locations should be similar within a group and sho uld be different between the groups. An automated process was developed to help identify such combinations of attributes. The traffic input levels developed in this chapter are listed below. a. Level 1 Œ Site -specific inputs b. Level 2A Œ Averages of clusters based on cluster analyses 202 c. Level 2B Œ Averages of groups based on roadway characteristics (attributes) d. Level 3A Œ Averages of groups based on the freeway and non -freeway road class e. Level 3B Œ Statewide averages Table 4 -19 lists the number of clusters and road groups formed that could be used as Level 2 traffic inputs. Table 4 -19 Number of clusters and road groups formed for Level 2 inputs Input Number of clusters (Level 2A) Road groups (Level 2B) Hourly distribution factors (HDF) 5 6 Monthly adjustment factors (MAF) based on VC 5 4 6 MAF based on VC9 5 6 Vehicle class distribution (VCD) 5 6 Single axle load spectra (SALS) 4 6 Tandem axle load spectra (TALS) 5 5 Tridem axle load spectra (TRALS) 6 6 Quad axle load spectra (QALS) 3 6 203 REFERENCES 204 REFERENCES [1] Wang, K. C., Q. J. Li, V. Nguyen, M. Moravec, and D. Zhang. Prep -ME: A multi -agency effort to prepare data for DARWin -ME.In Airfield and Highway Pavement 2013: Sustainable and Efficient Pavements , 2013. pp. 516 -527. [2] Turochy, R. E., D. H. Timm, and D. Mai. Development of Alabama Traffic Factors for use in Mechanistic -Empirical Pavement Design.In, 2015. [3] Milligan, G. W. Clustering Validation: Results and Implications for Applied Analyses.In Clustering and Classification , World Scientific, 1996. p p. 341-375. [4] Tan, P. N., M. Steinbach, and V. Kumar. Introduction to Data Mining . Pearson, 2013. [5] Mooi, E., and M. Sarstedt. Cluster analysis.In A Concise Guide to Market Research , Springer, 2010. pp. 237 -284. [6] Everitt, B. S., S. Landau, M. Leese, and D. Stahl. 
Cluster analysis: Wiley series in probability and statistics.In, Chichester: Wiley, 2011. [7] Baulieu, F. A classification of presence/absence based dissimilarity coefficients. Journal of Classification, Vol. 6, No. 1, 1989, pp. 233 -246. [8] Cheetham, A. H., and J. E. Hazel. Binary (presence -absence) similarity coefficients. Journal of Paleontology , 1969, pp. 1130-1136. [9] Gower, J. C., and P. Legendre. Metric and Euclidean properties of dissimilarity coefficients. Journal of classi fication, Vol. 3, No. 1, 1986, pp. 5 -48. data: an evaluation. Biological Reviews, Vol. 57, No. 4, 1982, pp. 669 -689. [11] Yim, O., and K. T. Ramdeen. Hierarch ical cluster analysis: comparison of three linkage measures and application to psychological data. Quant. Methods. Psychol, Vol. 11, 2015, pp. 8 -21. [12] Steinbach, M., L. Ertöz, and V. Kumar. The Challenges of Clustering High Dimensional Data.In New dire ctions in Statistical Physics , Springer, 2004. pp. 273 -309. [13] Ferreira, L., and D. B. Hitchcock. A comparison of hierarchical methods for clustering functional data. Communications in Statistics -Simulation and Computation, Vol. 38, No. 9, 2009, pp. 1925-1949. 205 [14] Mangiameli, P., S. K. Chen, and D. West. Comparison of SOM neural network and hierarchical clustering. European Journal of Operational Research, Vol. 93, No. 2, 1996, pp. 402-417. [15] L'Hermite, R., J. Chefdeville, and J. Grieu. Memoires sur la Mécanique -Physique du Beton: Nouvelle Contribution a L™Etude du Retrait des Ciments. Liants Hydrauliques.In Annales de L'Institut Technique du Batiment et des Travaux Publics, No. 106 , 1949. pp. 2-28. [16] Rendón, E., I. Abundez, A. Arizmendi, and E. M. Quiroz. Internal versus external cluster validation indexes. International Journal of computers and communications, Vol. 5, No. 1, 2011, pp. 27-34. [17] Milligan, G. W., and M. C. Cooper. An examination of procedures for determining the number of clusters in a data set. Psychometrika, Vol. 50, No. 2, 1985, pp. 
CHAPTER 5 - SIGNIFICANT TRAFFIC INPUTS

In Chapter 4, the Pavement-ME traffic inputs were generated for Levels 1, 2A, 2B, 3A, and 3B. Level 1 inputs should always be used for design purposes wherever possible because they are the actual traffic data specific to the site. When Level 1 inputs are unavailable, either Level 2 or Level 3 inputs should be used. The results of the sensitivity analyses, based on statistical significance and maximum life difference, should be used to decide on the appropriate traffic input level for a site where Level 1 traffic data are not available. The primary purpose of the sensitivity analyses is to identify whether the traffic defaults developed by clustering (Level 2A inputs) or road grouping (Level 2B inputs) provide significantly different pavement life predictions.
The statewide defaults (Level 3A or 3B inputs) would suffice for those traffic inputs for which the Level 2 inputs do not have a significant impact on pavement design outcomes. The steps involved in the sensitivity analyses include establishing base designs, performance criteria, and other input parameters in the Pavement-ME and then evaluating the impact of Levels 2 and 3 traffic inputs. The impact of Level 2 inputs on pavement designs was evaluated by changing one input at a time to Level 2 while keeping all other inputs at Level 1.

5.1 SENSITIVITY ANALYSES

Table 5-1 contains the baseline flexible pavement design used for the sensitivity analyses. Material inputs used in these designs were as per MDOT guidance. Pavement-ME Version 2.3 was used for the sensitivity analyses. For both the flexible and rigid pavements, locally calibrated performance models were used (1; 2) with only one climate station (Lansing). The pavement design life was assumed to be 20 years, with 95% design reliability. For each of the 41 WIM locations, the HMA surface layer thickness was designed to achieve a 20-year design life for a bottom-up fatigue cracking threshold of 20% for flexible pavements. Level 1 inputs were used in this process. For these designs, the rut depth values at the end of 20 years were also recorded.

Table 5-2 presents the baseline rigid pavement design used for the sensitivity analyses. For each of the 41 WIM locations, the slab thickness was designed to achieve a 20-year design life for an IRI threshold of 172 inches/mile for the rigid pavements. Faulting and transverse cracking values were also recorded at the end of 20 years. For both the flexible and rigid pavement designs, one traffic input was changed at a time to the values of the appropriate cluster or road group for the PTR site (Levels 2A and 2B) to determine the effect of that input on the design life.
Levels 3A and 3B inputs for each design (one input at a time) were also tested in the Pavement-ME to determine their impact on the design life. The times for the distress values (for Levels 2 and 3) to reach the threshold values in the Level 1 designs were documented. The differences in design lives between different input levels were quantified for further analyses.

Table 5-1 Baseline designs for flexible pavements

Layer/Detail                 Elastic modulus (psi)        Thickness (in)
Asphalt                      Estimated by the software    Variable
Aggregate base (A-1-a)       33000                        6
Sand subbase (A-1-b)         20000                        18
Sandy clay subgrade (A-4)    4400                         Semi-infinite
Climate                      Lansing, MI

Table 5-2 Baseline designs for rigid pavements

Layer/Detail                 Elastic modulus (psi)        Thickness (in)
JPCP                         5600 (f'c)                   Variable
Open graded base (A-1-a)     33000                        6
Sand subbase (A-1-b)         20000                        10
Sandy clay roadbed (A-6)     4400                         Semi-infinite
Joint spacing                15 ft
Dowel bar diameter           1.25 in (<10 in); 1.5 in (>=10 in)
Climate                      Lansing, MI

Many statistical analyses can be performed to understand the characteristics of a data set and the differences between datasets. Statistical analyses could detect differences in the datasets, but the differences might not have much practical significance, or vice versa. Statistically significant differences can be found even with minimal differences between datasets of considerable size. That is, statistical significance of the results does not always imply practical consequence. Hence, in addition to finding the likelihood of a value (significance value) outside the 95% confidence interval (CI), the maximum life difference (MLD) values between two input levels were also adopted as an indicator of the variability in the data. The MLD is the maximum difference in life between levels among the PTR locations. Table 5-3 lists the criteria used to determine the impact (sensitivity) of the difference between traffic inputs and, correspondingly, to select the proper input level needed for the design.
These designations are used to measure the performance of each traffic characterization against site-specific values and to determine its impact on life differences.

Table 5-3 Impact designation on predicted pavement performance

Designation of impact    Maximum life difference (MLD) in years
Significant              MLD > 5
Moderate                 2 < MLD < 5
Negligible               MLD < 2

5.1.1 Level 2A Sensitivity Analyses

A one-way analysis of variance (ANOVA) can be used to determine the statistical differences in the means of two or more groups. If the p-value is less than 0.05 (i.e., a 95% confidence level), at least one group is different from the others. Additionally, multiple comparisons can be made to identify which group means differ from the others. One-way ANOVA was performed on the absolute life differences (|Life Level 1 - Life Level 2A|) to detect the differences between the clusters for each traffic input. Table 5-4 and Figure 5-1 show the results of the ANOVA for Level 2A VCD clusters for flexible pavements. Since the p-value is below 0.05, the results indicate that the cluster averages are different from each other (Clusters 2 and 4) and that their use in pavement design would result in statistically different design lives. However, this does not indicate whether the differences are of practical significance. Similar tables and figures were developed for the other traffic inputs.

Figure 5-2 presents the differences in predicted performance for flexible pavements with the use of different Level 2A clusters for each WIM location. Each plot is divided into three regions (negligible, moderate, and significant) based on the MLD values presented in Table 5-3. Figure 5-2(a) shows the WIM locations in VCD Cluster 1 and the life differences when Cluster 1 VCD values are used in the Pavement-ME. Note that all the other traffic input values are at Level 1.
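The Table 5-3 designations can be expressed as a small lookup. The sketch below is illustrative only: the function name and the example site values are hypothetical, and because Table 5-3 leaves the boundaries at exactly 2 and 5 years open, the code assumes that boundary values fall into the lower class.

```python
def impact_designation(life_difference_years: float) -> str:
    """Return the Table 5-3 designation for a life difference in years."""
    mld = abs(life_difference_years)
    if mld > 5:
        return "Significant"
    if mld > 2:
        return "Moderate"
    # Assumption: exactly 2 years is treated as negligible and exactly
    # 5 years as moderate; Table 5-3 leaves both boundaries open.
    return "Negligible"


# Hypothetical life differences (years) for three sites, for illustration only.
example_sites = {"site_A": 0.8, "site_B": 3.1, "site_C": 6.4}
for site, diff in example_sites.items():
    print(site, impact_designation(diff))
```

The absolute value is taken because the analyses in this chapter are based on absolute life differences.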
Table 5-4 ANOVA results for Level 2A VCD clusters for flexible pavements (bottom-up fatigue cracking)

Source         DF    Adj SS    Adj MS    F-value    p-value
VCD Cluster    4     9.454     2.3634    2.78       0.041
Error          36    30.599    0.85
Total          40    40.052

VCD Cluster    N     Mean     StDev    95% CI
1              7     0.794    0.35     (0.087, 1.501)
2              8     1.834    1.568    (1.173, 2.495)
3              4     1.02     1.193    (0.085, 1.955)
4              9     0.379    0.399    (-0.244, 1.002)
5              13    0.872    0.77     (0.353, 1.390)

DF = degrees of freedom, SS = sum of squares, MS = mean square

(a) Mean differences between clusters (b) Tukey test results [interval plots of life difference (years) by VCD cluster; if an interval does not contain zero, the corresponding means are significantly different]
Figure 5-1 Mean design life comparisons between different Level 2A VCD clusters for flexible pavements (bottom-up fatigue cracking)

None of the WIM locations in Cluster 1 have 'moderate' or 'significant' life differences. Three VCD clusters result in moderate differences in design life [see Figures 5-2(b), 5-2(c), and 5-2(e)]. If there is at least one WIM location in any cluster with a 'moderate' or 'significant' life difference, then the cluster was considered sensitive. Similar analyses were conducted for other Level 2A inputs. Table 5-5 summarizes the statistical sensitivity of flexible and JPCP pavements to different traffic inputs. The letter 'Y' means that at least one of the cluster means is different from the other cluster means (i.e., sensitive), whereas an 'N' means insensitive. Table 5-6 summarizes the sensitivity of flexible and JPCP pavements to different traffic inputs that led to moderate practical significance in the Pavement-ME design outcomes.
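As a consistency check, the F-value and the confidence intervals in Table 5-4 can be recomputed from the reported sums of squares. This is a minimal sketch, not the original analysis: it assumes the intervals are based on the pooled (error) mean square, and the critical t-value of 2.028 for 36 degrees of freedom is taken from standard tables rather than from the source.

```python
import math

# Values copied from Table 5-4 (Level 2A VCD clusters, flexible pavements).
ss_between, df_between = 9.454, 4
ss_error, df_error = 30.599, 36

ms_between = ss_between / df_between   # mean square between clusters
ms_error = ss_error / df_error         # pooled (error) mean square
f_value = ms_between / ms_error

# Assumption: two-sided 95% critical t for 36 degrees of freedom,
# taken from standard t-tables (not stated in the source).
T_CRIT_36 = 2.028


def pooled_ci(mean, n):
    """95% CI for a cluster mean using the pooled standard deviation."""
    half_width = T_CRIT_36 * math.sqrt(ms_error / n)
    return (mean - half_width, mean + half_width)


ci_cluster_1 = pooled_ci(0.794, 7)
ci_cluster_2 = pooled_ci(1.834, 8)

print(round(f_value, 2))
print([round(x, 3) for x in ci_cluster_1])
print([round(x, 3) for x in ci_cluster_2])
```

Under these assumptions the recomputed values agree with the tabulated F-value of 2.78 and with the intervals reported for Clusters 1 and 2.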
Table 5-5 Sensitivity of rigid and flexible pavements to statistical significance - Level 2A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   N            Y           Y
HDF              -            -                   N            N           Y
MAF              N            N                   N            N           Y
SALS             N            N                   N            N           N
TALS             N            N                   N            N           N
TRALS            N            N                   N            N           N
QALS             N            Y                   N            N           N

Table 5-6 Sensitivity of rigid and flexible pavements to moderate MLD criteria - Level 2A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   N            Y           Y
HDF              -            -                   N            N           Y
MAF              N            N                   N            N           N
SALS             Y            Y                   N            N           N
TALS             Y            Y                   N            N           Y
TRALS            N            N                   N            N           N
QALS             N            N                   N            N           N

(a) Cluster 1 (b) Cluster 2 (c) Cluster 3 (d) Cluster 4 (e) Cluster 5 [bar charts of life difference (years) by site ID, with the 'moderate' and 'significant' regions marked]
Figure 5-2 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 2A VCD clusters

5.1.2 Level 2B Sensitivity Analyses

Similar to the Level 2A sensitivity analyses, one-way ANOVA was performed on the absolute life difference data (|Life Level 1 - Life Level 2B|) to detect the differences between the road groups for each traffic input. Table 5-7 and Figure 5-3 show the results of the ANOVA for Level 2B VCD as an example. Since the p-value is above 0.05, the results indicate that the group averages are not different from each other and that their use in pavement design would not result in statistically different design lives. Table 5-8 summarizes the statistical sensitivity of flexible and JPCP pavements to different traffic inputs. However, the maximum life difference among the groups needs to be evaluated for practical significance.
Table 5-7 ANOVA results for Level 2B groups

Source        DF    Adj SS    Adj MS    F-value    p-value
VCD Groups    5     7.64      1.53      1.54       0.203
Error         35    34.71     0.99
Total         40    42.35

VCD Groups         N     Mean     StDev    95% CI
HighVC9_Rural      8     0.334    0.509    (-0.381, 1.049)
HighVC9_Urban      2     0.125    0.063    (-1.305, 1.55)
LowVC9_Rural       10    1.399    1.660    (0.760, 2.038)
LowVC9_Urban       2     0.080    0.000    (-1.349, 1.509)
MediumVC9_Rural    7     0.987    0.536    (0.223, 1.751)
MediumVC9_Urban    12    0.698    0.761    (0.115, 1.282)

(a) Mean differences between clusters (b) Tukey test results [interval plots of absolute life difference by road group; if an interval does not contain zero, the corresponding means are significantly different]
Figure 5-3 Mean design life comparisons between different clusters for VCD

Table 5-8 Sensitivity of rigid and flexible pavements to statistical significance - Level 2B

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              N            N                   N            N           N
HDF              -            -                   Y            N           Y
MAF              N            N                   N            N           N
SALS             N            Y                   N            Y           N
TALS             N            N                   N            N           N
TRALS            N            N                   N            N           N
QALS             N            Y                   N            N           N

Figure 5-4 presents the differences in predicted performance in flexible pavements with the use of different Level 2B groups for each WIM location. Similar to Figure 5-2, each plot is divided into three regions (negligible, moderate, and significant) based on the MLD values in Table 5-3.
Figure 5-4(a) shows the WIM locations in the road group with high VC9 levels in a rural area, and the life differences when the values for that road group are used in the Pavement-ME. Note that all the other traffic input values are at Level 1. None of the WIM locations in this road group have 'moderate' or 'significant' life differences. However, three other road groups resulted in moderate to significant differences in design life [see Figures 5-4(c), 5-4(e), and 5-4(f)]. Note that if there is at least one WIM location in any road group with a 'moderate' or 'significant' life difference, then that road group was deemed sensitive. Similar analyses were conducted for other Level 2B inputs. Table 5-9 summarizes the sensitivity of flexible and JPCP pavements to different traffic inputs based on moderate life differences.

Table 5-9 Sensitivity of rigid and flexible pavements to moderate MLD criteria - Level 2B

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   N            Y           Y
HDF              -            -                   N            N           Y
MAF              N            N                   N            N           N
SALS             N            Y                   N            N           N
TALS             Y            Y                   N            N           Y
TRALS            N            N                   N            N           N
QALS             N            N                   N            N           N

5.1.3 Level 3A Sensitivity Analyses

Level 3A has two road groups for all traffic inputs, i.e., freeways and non-freeways. The procedure used for the Level 2A and 2B sensitivity analyses was followed. Since there are only two groups, either a one-way ANOVA or a two-sample t-test could be used to find the differences between the two sets. Table 5-10 and Figure 5-5 show the results of the ANOVA for Level 3A VCD. Since the p-value is below 0.05, the results indicate that the group averages are different from each other and that their use in pavement design would result in statistically different design lives.
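With only two groups, the one-way ANOVA F-value equals the square of the pooled two-sample t-statistic, which is why either test can be used. The sketch below illustrates this with the freeway and non-freeway summaries (N, mean, standard deviation) reported for Level 3A VCD in Table 5-10; the function name is hypothetical and the calculation is a check on the published summary values, not the original analysis.

```python
import math


def pooled_two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t-statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df, pooled_var


# Freeway (F) and non-freeway (NF) summaries from Table 5-10.
t_stat, df, pooled_var = pooled_two_sample_t(31, 0.6735, 0.442, 10, 1.707, 1.622)
f_equiv = t_stat**2  # equals the one-way ANOVA F-value for two groups

print(df, round(pooled_var, 4), round(t_stat, 2), round(f_equiv, 2))
```

The pooled variance reproduces the error mean square in Table 5-10 (about 0.757), and the squared t-statistic matches the reported F-value of 10.67 to within the rounding of the summary values.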
Table 5-10 ANOVA results for Level 3A VCD clusters or groups

Source    DF    Adj SS    Adj MS    F-value    p-value
Class     1     8.075     8.0753    10.67      0.002
Error     39    29.529    0.7571
Total     40    37.604

Class    N     Mean      StDev    95% CI
F        31    0.6735    0.442    (0.3574, 0.9897)
NF       10    1.707     1.622    (1.150, 2.264)

F = freeway, NF = non-freeway

(a) High VC9 and Rural (b) High VC9 and Urban (c) Low VC9 and Rural (d) Low VC9 and Urban (e) Medium VC9 and Rural (f) Medium VC9 and Urban [bar charts of life difference (years) by site ID, with the 'moderate' and 'significant' regions marked]
Figure 5-4 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 2B VCD road groups

(a) Mean differences between clusters (b) Tukey test results
Figure 5-5 Mean design life comparisons between different clusters for VCD

Figure 5-6 presents the differences in predicted performance in flexible pavements (bottom-up cracking) with the use of different Level 3A groups for each WIM location. Figure 5-6(a) shows the WIM locations in the freeway VCD group and the life differences when the freeway VCD group values are used in the Pavement-ME. Only one WIM location in the freeway VCD group has a 'moderate' life difference. Moderate to significant differences can be seen for WIM locations in the non-freeway VCD group [Figure 5-6(b)]. Table 5-11 summarizes the statistical sensitivity of flexible and JPCP pavements to different traffic inputs for Level 3A. Table 5-12 summarizes the sensitivity of flexible and JPCP pavements to different traffic inputs based on maximum life differences.
Note for Figure 5-5: the pooled standard deviation is used to calculate the intervals; if an interval does not contain zero, the corresponding means are significantly different.

(a) Freeway cluster (b) Non-freeway cluster [bar charts of life difference (years) by site ID, with the 'moderate' and 'significant' regions marked]
Figure 5-6 Differences in flexible pavement life predictions (bottom-up fatigue cracking) for Level 3A VCD groups

Table 5-11 Sensitivity of rigid and flexible pavements to statistical significance - Level 3A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   N            N           N
HDF              -            -                   N            N           N
MAF              N            N                   N            N           Y
SALS             N            Y                   N            N           N
TALS             N            Y                   N            Y           N
TRALS            N            Y                   N            N           N
QALS             N            N                   N            N           N

Table 5-12 Sensitivity of rigid and flexible pavements to MLD criteria - Level 3A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   N            Y           Y
HDF              -            -                   N            N           Y
MAF              N            N                   N            N           Y
SALS             N            Y                   N            N           Y
TALS             Y            N                   N            N           Y
TRALS            N            N                   N            N           N
QALS             N            N                   N            N           N

5.1.4 Choosing the Appropriate Traffic Input Level

As mentioned before, Level 1 traffic inputs should always be used for pavement design if available. In the absence of Level 1 inputs, either Level 2 or Level 3 inputs should be used. The results of the sensitivity analyses based on statistical significance and maximum life difference should be used to decide on the appropriate traffic input level. The criterion used in this evaluation is that a traffic input level is sensitive if the life difference is at least moderate (> 2 years).
The recommendations for each traffic input are as follows:

5.1.4.1 Vehicle class distribution

The statistical sensitivity analyses show that the use of different VCD clusters (Level 2A) would result in statistically different design lives for both flexible and rigid pavements (see Table 5-5). This observation is also valid for the MLD sensitivity analyses. The results indicate the existence of localized traffic patterns that would yield different pavement thicknesses in the design process. While the statistical analyses do not show that Level 2B inputs would result in statistically different design lives for flexible or rigid pavements, the MLD criteria indicate that the use of Level 2B inputs would result in moderate life differences. Hence, for VCD, either Level 2A or 2B inputs could be used in the absence of Level 1 inputs. The next step is to identify whether there are any statistical differences between Levels 2A and 2B. If there are no differences between the two levels, then Level 2B can be used since it will simplify the VCD input selection process. A paired t-test was used to verify whether there are significant differences between the values of (|Life Level 1 - Life Level 2A|) and (|Life Level 1 - Life Level 2B|). Table 5-13 shows the results based on the paired t-test for various traffic inputs. It can be seen from the table that there is a statistical difference between Levels 2A and 2B for rut depth in flexible pavements. However, the difference in design lives for flexible pavements in terms of rutting between Levels 2A and 2B, although statistically significant, is practically insignificant (a mean difference of 0.25 years), as shown in Table 5-14. Further, the number of times the pavement sections are overdesigned or under-designed when using Levels 2A and 2B was calculated. Figure 5-7 shows that the number of under-designed locations is higher for Level 2A compared to Level 2B.
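The paired t-test summary reported in Table 5-14 for rutting can be reproduced from the mean and standard deviation of the paired differences (Level 2B minus Level 2A). This is a minimal sketch under stated assumptions: the critical t-value of 2.021 for 40 degrees of freedom comes from standard tables, not from the source, and the p-value is not recomputed.

```python
import math

# Paired differences (Level 2B minus Level 2A) summarized in Table 5-14.
n = 41
mean_diff = -0.25
sd_diff = 0.762

# Assumption: two-sided 95% critical t for 40 degrees of freedom,
# taken from standard t-tables (not stated in the source).
T_CRIT_40 = 2.021

se = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se       # paired t-statistic
ci = (mean_diff - T_CRIT_40 * se, mean_diff + T_CRIT_40 * se)

print(round(se, 3), round(t_stat, 2), [round(x, 3) for x in ci])
```

The interval excludes zero, so the difference is statistically significant, yet its magnitude of about a quarter of a year is far below the 2-year threshold for a moderate impact in Table 5-3.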
The descriptive statistics for design differences for all traffic inputs can be found elsewhere. A pavement at a WIM location will be overdesigned when the difference in design lives (Life Level 1 - Life Level x) is positive and under-designed when the difference (Life Level 1 - Life Level x) is negative. While a positive life difference would suggest increasing the thicknesses, making the project overdesigned, a negative life difference would lead to reduced thicknesses, making the project under-designed relative to Level 1. Tables 5-15 and 5-16 show the results of the sensitivity analyses between Levels 2A and 3A and between Levels 2B and 3A, respectively, based on the paired t-test for various traffic inputs. It can be seen from Table 5-16 that there are statistical differences between Levels 2B and 3A. Therefore, Level 2B inputs are recommended for VCD for both flexible and rigid pavements.

Table 5-13 Summary of statistical significance - Levels 2A vs. 2B

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            N                   N            N           N
HDF              -            -                   Y            N           Y
MAF              N            N                   N            N           N
SALS             N            N                   N            N           N
TALS             N            N                   Y            Y           N
TRALS            N            N                   N            N           N
QALS             N            N                   N            N           N

Table 5-14 Paired t-test results between Levels 2A and 2B for rutting

Sample          N     Mean     StDev    SE Mean
Ldiff_Rut_2B    41    0.929    1.134    0.177
Ldiff_Rut_2A    41    1.179    1.08     0.169

Paired difference:
Mean     StDev    SE Mean    95% CI
-0.25    0.762    0.119      (-0.491, -0.009)

t-value    p-value
-2.1       0.042

5.1.4.2 Hourly distribution factors

Level 2A HDF inputs have been shown to have a statistically significant impact on rigid pavement designs compared to Level 2B inputs (see Table 5-13).
Note that the HDF inputs are only used in the rigid pavement design process. However, the differences in design lives for rigid pavements in terms of IRI and transverse cracking between Levels 2A and 2B are insignificant (0.04 and 0.9 years) from a practical standpoint, as shown in Table 5-17. Figure 5-8 shows that Level 2A is slightly better in terms of the number of under-designed PTR locations for transverse cracking. However, note that predicted cracking levels are less than 5% at 20 years for all 41 PTR locations; therefore, the difference of 0.9 years between Levels 2A and 2B may not be of any practical significance. Therefore, Level 2B is recommended for HDF.

(a) Rutting (b) Fatigue cracking [bar charts of the number of normal-design, overdesigned, and under-designed locations for Levels 2B and 2A]
Figure 5-7 Number of over and under-designed PTR locations - Levels 2A and 2B (VCD)

Table 5-15 Summary of statistical significance - Levels 2A vs. 3A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              N            N                   Y            Y           Y
HDF              -            -                   N            N           Y
MAF              N            N                   N            N           N
SALS             N            N                   N            N           N
TALS             Y            Y                   Y            Y           Y
TRALS            N            N                   N            N           N
QALS             N            N                   Y            Y           N

Table 5-16 Summary of statistical significance - Levels 2B vs. 3A

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              Y            Y                   Y            Y           Y
HDF              -            -                   Y            N           Y
MAF              N            N                   N            N           N
SALS             N            N                   N            N           N
TALS             N            N                   N            N           N
TRALS            N            N                   N            N           N
QALS             N            N                   N            N           N

Table 5-17 Paired t-test results between Levels 2A and 2B for HDF (IRI and transverse cracking)

Sample            N     Mean      StDev     SE Mean
Ldiff_IRI_2B      41    0.071     0.0764    0.0119
Ldiff_IRI_2A      41    0.0324    0.0906    0.0142

Sample            N     Mean      StDev     SE Mean
Ldiff_Crack_2B    41    2.539     1.909     0.298
Ldiff_Crack_2A    41    1.624     1.591     0.248

(a) IRI (b) Transverse cracking [bar charts of the number of normal-design, overdesigned, and under-designed locations for Levels 2B and 2A]
Figure 5-8 Number of over and under-designed PTR locations - Levels 2A and 2B (HDF)
5.1.4.3 Monthly adjustment factors

No statistical differences in design lives between Level 2A clusters and Level 2B road groups were observed for either flexible or rigid pavements based on the sensitivity analyses (see Table 5-13). Based on Figure 5-9, Level 2B should be chosen because Levels 2A and 2B result in a similar number of under-designed PTR locations. It can be seen from Table 5-16 that there are no statistically significant differences between Levels 2B and 3A. The next step is to identify whether there are any statistical differences between Levels 3A and 3B. If there are no differences between the two levels, then Level 3B can be used. Table 5-18 shows the results of the sensitivity analyses between Levels 3A and 3B based on the paired t-test for various inputs. Since there are statistically significant differences between Levels 3A and 3B, Level 3A inputs are recommended for MAF for both flexible and rigid pavements.

Table 5-18 Summary of statistical significance - Levels 3A vs. 3B

                 Flexible pavements               Rigid pavements
Traffic input    Rut depth    Bottom-up fatigue   IRI          Faulting    Transverse
                 (in)         cracking (%)        (in/mile)    (in)        cracking (%)
VCD              N            N                   N            N           N
HDF              -            -                   N            N           N
MAF              N            N                   N            N           Y
SALS             N            N                   N            N           Y
TALS             N            N                   N            N           Y
TRALS            N            N                   N            N           N
QALS             N            N                   Y            Y           N

(a) Flexible pavements (rutting) (b) Flexible pavements (fatigue cracking) (c) Rigid pavement (IRI) (d) Rigid pavement (transverse cracking)
Figure 5-9 Number of over and under-designed PTR locations - Levels 2A and 2B (MAF)

5.1.4.4 Axle load spectra

For single axle load spectra, no differences were observed between Levels 2A and 2B (see Table 5-13) for either flexible or rigid pavements. Based on Figure 5-10, there is an almost equal number of under-designed PTR locations for Levels 2A and 2B. Hence, Level 2B inputs should be used.
Also, since there are no differences between Levels 2B and 3A (see Table 5-16), and a difference exists between Levels 3A and 3B (see Table 5-18), Level 3A can be used for single axle load spectra.

For tandem axle load spectra, some differences were observed between Levels 2A and 2B for rigid pavements in terms of IRI and faulting (see Table 5-13). Based on Figure 5-11, Level 2A is slightly better in terms of the number of under-designed PTR locations for IRI and faulting. However, the differences in design lives (see Tables 5-19 and 5-20) for rigid pavements in terms of IRI and faulting between Levels 2A and 2B are insignificant (0.07 and 0.28 years) from a practical standpoint. Hence, Level 2B inputs can be chosen over Level 2A inputs.

Table 5-19 Paired t-test results between Levels 2A and 2B for TALS (IRI)

Sample          N     Mean      StDev     SE Mean
Ldiff_IRI_2B    41    0.1585    0.1356    0.0212
Ldiff_IRI_2A    41    0.0907    0.0919    0.0143

Paired difference:
Mean      StDev     SE Mean    95% CI
0.0678    0.1701    0.0266     (0.0141, 0.1215)

t-value    p-value
2.55       0.015

Table 5-20 Paired t-test results between Levels 2A and 2B for TALS (Faulting)

Sample            N     Mean      StDev     SE Mean
Ldiff_Fault_2B    41    0.6015    0.4855    0.0758
Ldiff_Fault_2A    41    0.3256    0.2336    0.0365

Paired difference:
Mean      StDev     SE Mean    95% CI
0.2759    0.5126    0.0801     (0.1140, 0.4377)

t-value    p-value
3.45       0.001

For tridem and quad-axle load spectra, no differences were observed between Levels 2A and 2B (see Table 5-13) for either flexible or rigid pavements. Also, there are no differences between Levels 2B and 3A (see Table 5-16). Therefore, Level 3A is recommended for both tridem and quad-axle load spectra.
(a) Flexible pavements (rutting) (b) Flexible pavements (fatigue cracking) (c) Rigid pavement (IRI) (d) Rigid pavement (transverse cracking)
Figure 5-10 Number of over and under-designed PTR locations - Levels 2A and 2B (SALS)

(a) Flexible pavements (rutting) (b) Flexible pavements (fatigue cracking) (c) Rigid pavement (IRI) (d) Rigid pavement (faulting)
Figure 5-11 Number of over and under-designed PTR locations - Levels 2A and 2B (TALS)

5.1.5 Identifying the Changes in Traffic Patterns

In this study, 5 years of data (2011-2015) were averaged to obtain Level 1 inputs. However, there will be inherent year-to-year variability in traffic data, reflecting changes in economic growth as well as additional and downgraded WIM sites. Therefore, there is a need to identify changes in traffic patterns and update the traffic inputs so that the pavement sections are not over-designed or under-designed. To this effect, the changes in design lives when the traffic inputs were changed from Level 1 to Level 2A or Level 2B were tabulated and modeled to see the effect of a change in vehicle class distribution values on the design lives predicted by Pavement-ME. Figures 5-12 and 5-13 show the predicted life differences using these models compared to the estimated life differences using Pavement-ME for flexible and rigid pavements.
For example, for rutting in flexible pavements, the relationship between the change in vehicle class distribution values and the estimated life differences is as follows (the signs of the terms are not legible in the source and are shown as ±):

$$Life_{Rut} = \pm 0.277 \pm 0.021\,VC5 \pm 0.036\,VC8 \pm 0.117\,VC9 \pm 0.182\,VC10 \pm 0.237\,VC11 \pm 0.431\,VC13 \pm 1.47\,LSF \quad (1)$$

For bottom-up fatigue cracking,

$$Life_{Crack} = \pm 0.144 \pm 0.0533\,VC5 \pm 0.013\,VC8 \pm 0.042\,VC9 \pm 0.077\,VC10 \pm 0.175\,VC11 \pm 0.377\,VC13 \pm 0.151\,LSF \quad (2)$$

In the case of rigid pavements, for IRI,

$$Life_{IRI} = \pm 0.138 \pm 0.017\,VC5 \pm 0.010\,VC8 \pm 0.013\,VC9 \pm 0.004\,VC10 \pm 0.037\,VC11 \pm 0.042\,VC13 \pm 0.398\,LSF \quad (3)$$

For faulting,

$$Life_{Fault} = \pm 0.065 \pm 0.054\,VC5 \pm 0.009\,VC8 \pm 0.078\,VC9 \pm 0.039\,VC10 \pm 0.057\,VC11 \pm 0.198\,VC13 \pm 0.057\,LSF \quad (4)$$

For transverse cracking,

$$Life_{Crack} = \pm 0.191 \pm 0.149\,VC5 \pm 0.041\,VC8 \pm 0.061\,VC9 \pm 0.143\,VC10 \pm 0.154\,VC11 \pm 0.191\,VC13 \pm 1.2\,LSF \quad (5)$$

where VC5, VC8, VC9, VC10, VC11, and VC13 are the changes in the corresponding vehicle class distributions from the existing values, and LSF is the load spectra factor for the tandem axle, defined as follows (3):

$$LSF_j = \frac{1}{x_s^4}\left[p_1\mu_1^4 + 6p_1\mu_1^2\sigma_1^2 + 3p_1\sigma_1^4 + p_2\mu_2^4 + 6p_2\mu_2^2\sigma_2^2 + 3p_2\sigma_2^4\right] \quad (6)$$

where:
$LSF_j$ = load spectra factor (LSF) for the jth axle configuration (i.e., equivalent average ESALs per repetition of the axle load spectra)
$\mu_1, \sigma_1, p_1$ = parameters for the first distribution
$\mu_2, \sigma_2, p_2$ = parameters for the second distribution
$x_s$ = reference axle load (assumed; the symbol is not defined in the surviving text)

The short-term counts from the PTR sites can be used as inputs into these equations to check if there are any substantial differences in design life predictions. If the life differences are deemed significant enough by the highway agencies at a PTR location for at least 3 years, then the new 3 years of traffic data should be used to update the traffic inputs. Otherwise, the new data should be combined with the available traffic database.
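One reading of the load spectra factor, assumed here because the printed equation is garbled, is the fourth raw moment of a two-component (bimodal) normal mixture of axle loads divided by the fourth power of a reference axle load. The sketch below implements that reading; the function name, the reference load `x_s`, and all example parameter values are hypothetical and are not taken from the study.

```python
def load_spectra_factor(p1, mu1, sigma1, p2, mu2, sigma2, x_s):
    """LSF as the normalized fourth moment of a two-component normal mixture.

    Assumption: LSF = E[x^4] / x_s^4, where x follows a mixture of two
    normal distributions with weights p1 and p2 (fourth-power damage law).
    """
    def fourth_raw_moment(mu, sigma):
        # E[X^4] for X ~ Normal(mu, sigma^2)
        return mu**4 + 6 * mu**2 * sigma**2 + 3 * sigma**4

    mixture_moment = (p1 * fourth_raw_moment(mu1, sigma1)
                      + p2 * fourth_raw_moment(mu2, sigma2))
    return mixture_moment / x_s**4


# Hypothetical tandem axle spectrum (loads in kips), for illustration only.
example_lsf = load_spectra_factor(p1=0.4, mu1=15.0, sigma1=3.0,
                                  p2=0.6, mu2=32.0, sigma2=4.0,
                                  x_s=33.0)
print(round(example_lsf, 3))
```

A spectrum concentrated entirely at the reference load returns a factor of 1, which is a quick sanity check on the normalization.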
(a) Rutting (R² = 0.9424) (b) Fatigue cracking (R² = 0.9319) [scatter plots of predicted vs. measured life difference (years)]
Figure 5-12 Predicted vs. measured life differences for flexible pavements

(a) IRI (R² = 0.6474) (b) Faulting (R² = 0.9737) (c) Transverse cracking (R² = 0.8477) [scatter plots of predicted vs. measured life difference (years)]
Figure 5-13 Predicted vs. measured life differences for rigid pavements

5.2 SUMMARY

Five traffic input levels were developed in this study, as listed below.
a. Level 1 - Site-specific inputs
b. Level 2A - Averages of clusters based on cluster analyses
c. Level 2B - Averages of groups based on roadway characteristics (attributes)
d. Level 3A - Averages based on freeway and non-freeway road classes
e. Level 3B - Statewide averages

Level 1 inputs should always be used for design purposes wherever possible because they are the actual traffic data specific to the site. When Level 1 traffic inputs are unavailable, either Level 2 or Level 3 inputs must be used for pavement designs. The impact of Level 2 inputs on predicted pavement performance can be evaluated by using sensitivity analyses. If no differences in the predicted performance or pavement lives are observed between Levels 1 and 2 inputs, Level 3 traffic inputs will suffice for pavement design. The steps involved in the sensitivity analyses include establishing base designs, performance criteria, and other input parameters in the Pavement-ME and then evaluating the impact of Levels 2 and 3 traffic inputs. For the sensitivity analyses, the pavement design life was assumed to be 20 years, with 95% design reliability for flexible and rigid pavements.
For each of the 41 WIM locations, the HMA surface layer thickness was designed to achieve a 20-year design life at a bottom-up fatigue cracking threshold of 20% for flexible pavements, since it is the critical structural distress for pavement design. Level 1 inputs were used in this process. For these designs, the rut depth values at the end of 20 years were also recorded. In addition, for each of the 41 WIM locations, the slab thickness was designed to achieve a 20-year design life at an IRI threshold of 172 inches/mile for the rigid pavements because it controls most of the designs. Faulting and transverse cracking values were also recorded at the end of 20 years. For both the flexible and rigid pavement designs, one traffic input was changed at a time to Levels 2A and 2B to determine their effects on the design lives. Levels 3A and 3B inputs for each design (one input at a time) were also used in the Pavement-ME to determine their impact on the design lives. The time for the distress values (for Levels 2 and 3) to reach the threshold values in the Level 1 designs was documented. The differences in design lives between the input levels were quantified for further analyses. Statistical analyses could detect differences between clusters or groups, but the differences might not have much practical significance. Hence, in addition to statistical significance, the maximum life difference (MLD) between two input levels was adopted as an indicator of the variability in the data and was used to select the proper input level for the design. One-way ANOVA was performed on the absolute life differences (|Life Level 1 - Life Level X|) to detect differences between the clusters for each traffic input. If the p-value is below 0.05, the results indicate that the cluster or group averages are different from each other and that their use in pavement design would result in statistically different design lives.
It does not indicate whether the differences are of practical significance. The absolute differences in predicted lives were estimated relative to the Level 1 design life of 20 years. A cluster or group was considered sensitive, or of practical significance, if the absolute life difference of at least one WIM location was higher than two years. Once the sensitivity of inputs at Levels 2 and 3 was determined, the next step was to identify whether there were any differences between predicted lives for Levels 2A and 2B. If there are no differences between Levels 2A and 2B, then Level 2B can be used since it simplifies the input selection process. A paired t-test was used to verify whether there are significant differences between the values of (|Life Level 1 - Life Level 2A|) and (|Life Level 1 - Life Level 2B|). Also, the number of under- and over-designed WIM sites due to the use of Levels 2A and 2B inputs was determined. A pavement at a WIM location is over-designed when the difference in design lives (Life Level 1 - Life Level x) is positive and under-designed when the difference is negative. A positive life difference suggests increasing the thicknesses, making the project over-designed, whereas a negative life difference leads to reduced thicknesses, making the project under-designed relative to Level 1. If there were statistically significant differences, either Level 2A or 2B was selected for that traffic input after careful evaluation of the average design life differences. If there were no differences between Levels 2A and 2B, comparisons were made between Levels 2A and 3A or 2B and 3A to see whether Level 3A would suffice for pavement designs. Again, if there were no differences between Levels 2 and 3A, comparisons were made between Levels 3A and 3B to see whether Level 3B would suffice for pavement designs.
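The practical-significance criterion, the over- and under-design counts, and the paired comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only. The function names are hypothetical, and treating a zero difference as a "normal" design is an assumption. The paired t statistic is returned without a p-value; in practice it would be compared against a t distribution with n - 1 degrees of freedom using a statistics package.

```python
import math
from statistics import mean, stdev


def life_differences(life_level1, life_level_x):
    """Signed differences (Life Level 1 - Life Level X) in years, one per site."""
    return [l1 - lx for l1, lx in zip(life_level1, life_level_x)]


def is_practically_significant(diffs, threshold_years=2.0):
    """True if at least one site has an absolute life difference above the threshold."""
    return any(abs(d) > threshold_years for d in diffs)


def design_counts(diffs):
    """Count sites by the sign of (Life Level 1 - Life Level X)."""
    over = sum(1 for d in diffs if d > 0)    # positive difference: over-designed
    under = sum(1 for d in diffs if d < 0)   # negative difference: under-designed
    return {"over": over, "under": under, "normal": len(diffs) - over - under}


def paired_t_statistic(abs_diffs_a, abs_diffs_b):
    """Paired t statistic for two equal-length samples of absolute life differences.

    Requires at least two pairs and a non-zero spread in the pairwise differences.
    """
    d = [a - b for a, b in zip(abs_diffs_a, abs_diffs_b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


# Hypothetical lives (years) at four sites; Level 1 designs reach 20 years.
level1 = [20.0, 20.0, 20.0, 20.0]
diffs_2a = life_differences(level1, [19.0, 22.5, 20.0, 18.5])
diffs_2b = life_differences(level1, [19.5, 21.0, 20.0, 19.0])
t_stat = paired_t_statistic([abs(d) for d in diffs_2a], [abs(d) for d in diffs_2b])
```

Keeping the signed differences separate from the absolute ones mirrors the text: the sign drives the over- and under-design counts, while the absolute values feed the ANOVA and the paired t-test.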
After the sensitivity analyses, classification models (decision trees) were developed for cluster assignment. The criteria used in these analyses to establish significant traffic inputs are based on engineering judgment and local experience. Statistical analyses alone may not be reliable; practical significance should always support them. Based on the sensitivity analyses of Levels 2A, 2B, 3A, and 3B, it is recommended that, for design purposes, VCD be estimated from short-term counts, while classification trees for HDF and TALS can be used for cluster assignment if Level 2A inputs are needed. Due to relatively high misclassification rates, the practically insignificant difference between Levels 2A and 2B, and ease of use, Level 2B inputs can be used for design purposes. The following input levels are recommended for each traffic input (see Table 5-21). Also, to identify the changes in traffic patterns that could affect the pavement designs, relationships between the change in vehicle class distributions and the change in predicted design lives were developed. Using these equations, highway agencies can identify the changes in traffic patterns that could affect pavement designs and can update the traffic inputs.

Table 5-21 Recommended traffic input levels

                 Recommended traffic input level
Traffic input    Flexible pavements    Rigid pavements
VCD              2B                    2B
HDF              -                     2B
MAF              3A                    3A
SALS             3A                    3A
TALS             2B                    2B
TRALS            3A                    3A
QALS             3A                    3A

REFERENCES

[1] Haider, S. W., G. Musunuru, M. E. Kutay, M. A. Lanotte, and N. Buch. Recalibration of Mechanistic-Empirical Rigid Pavement Performance Models and Evaluation of Flexible Pavement Thermal Cracking Model, 2017. pp. 986-997.
[2] Haider, S. W., N. Buch, W. Brink, K. Chatti, and G. Baladi. Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan Part 3: Local Calibration and Validation of the Pavement-ME Performance Models. Rep. No.
RC-1595, Michigan State Univ., East Lansing, MI, 2014.
[3] Haider, S. W., R. S. Harichandran, and M. B. Dwaikat. Estimating bimodal distribution parameters and traffic levels from axle load spectra, 2008.

CHAPTER 6 - CONCLUSIONS, RECOMMENDATIONS

Based on the recalibration of rigid pavement performance models and analyses of traffic data collected during the years 2011 to 2015, the following conclusions and recommendations are drawn for traffic inputs for the Pavement-ME analysis and design in the State of Michigan.

6.1 CONCLUSIONS

6.1.1 Findings based on the Recalibration

The following are the findings based on the recalibration:

Performance models for rigid pavements (transverse cracking and IRI) have changed since Pavement-ME version 2.0. Because of these changes, and because additional time-series data are available, recalibration of the models is warranted.

For flexible pavements, the previous local calibration coefficients can still be used because the prediction models have not been modified since Pavement-ME version 2.0.

To predict transverse cracking in rigid pavements with the current design requirements of MDOT, the permanent curl/warp values should be changed based on the project location.

The SEE and bias of the global models for rigid pavement cracking and IRI are much higher than those of the locally recalibrated models.

The median model coefficients based on bootstrapping should be used since the distributions of SEE and bias are non-normal. These coefficients also showed much lower SEE and bias than the global coefficients for the cracking model.

The previously calibrated local joint faulting model coefficients are still valid, based on the lowest SEE and bias, and should be used.

The average IRI model coefficients using bootstrapping showed significantly lower SEE and bias as compared to the global coefficients.
Using mixed-effects models instead of fixed-effects models reduces the bias significantly and accounts for the inherent variation between subjects.

6.1.2 Findings based on the Cluster Analysis and Traditional Approaches

The following hierarchical traffic inputs can be used in the Pavement-ME:

Level 1 - Convert WIM and classification site-specific data to the Pavement-ME format using PrepME.
Level 2 - Utilize groups based on road attributes with similar traffic characteristics. The group traffic characteristics were averaged to create Level 2 traffic inputs.
Level 3 - Average traffic characteristics from all PTR sites were used to generate freeway and non-freeway Level 3 data.

Two approaches were adopted for developing Level 2 inputs: (a) cluster analyses (Level 2A), and (b) grouping roads with similar attributes (Level 2B). Hierarchical clustering techniques were found to be better than other clustering techniques and hence were used in the development of Level 2A inputs. The development of Level 2A inputs established the following findings:

Vehicle class distribution (VCD) clustering identified five specific traffic patterns, each distinguished by the proportions of VC5 and VC9. Sites in cluster 1 have VC9 truck percentages in the range of 45 to 70, with VC5 truck percentages in the range of 15 to 25. Cluster 2 contains a majority of sites with VC9 truck percentages less than 45, with VC5 truck percentages in the range of 20 to 30. Cluster 3 has sites with a slightly higher percentage of VC5 trucks than VC9 trucks. Sites in cluster 4 have the highest percentage of VC9 trucks (above 75) with a very low percentage of VC5 trucks (below 10). Sites in cluster 5 have a percentage of VC9 trucks between 55 and 70 with a percentage of VC5 trucks between 10 and 20.

Monthly adjustment factors (MAF) clustering resulted in four clusters based on VC5.
Cluster 1 exhibits reasonable seasonal variability, having MAF close to 1.4 during summer months with lower values in winter. Cluster 2 depicts very little seasonal variability, with MAF close to 1. Cluster 3 displays higher MAF in summer and fall, with much lower MAF in winter and spring. Sites in cluster 4 also have higher MAF in summer and fall and are mostly located on north-south routes such as I-75 and US-127. Cluster analysis based on VC9 resulted in five clusters. Almost all the sites in all the clusters have no seasonal variability between months. Since VC9 trucks are used for long haul throughout the year, a uniform presence of such trucks is expected at all the sites.

Hourly distribution factors (HDF) were grouped into five clusters. Cluster 1 contains heavier evening proportions of trucks. Cluster 2 has a similar percentage of trucks as sites in cluster 1 but, on average, is shifted left by an hour. The cluster 3 average has roughly a 1-2% lower truck percentage between the hours of 7:00 am and 4:00 pm than either cluster 1 or 2. Sites in cluster 4 have the highest HDF from 8:00 am to 12:00 noon of all clusters. Sites in cluster 5 have the flattest curve among all the clusters, suggesting minimum hourly variations for long-haul trucks.

Single axle load spectra (SALS) were grouped into four clusters based on VC5 trucks. For all the sites in the clusters, the first peak occurs at approximately 4 to 6 kips while the second peak occurs at 8 to 10 kips. Cluster 1 has an almost equal proportion of axles in the 4-6 kip range and the 8-10 kip range. Cluster 2 has a higher proportion of 4-6 kip axles than 8-10 kip axles. Cluster 3 has only one site (US-2) in the Upper Peninsula, and the pattern seen in the figure is unique to that site. Cluster 4 has sites with a higher proportion of axles in the 8-10 kip range than the 4-6 kip range.

Tandem axle load spectra (TALS) based on VC9 resulted in five clusters.
The two peaks in the clusters correspond to unloaded (9-14 kips) and loaded (30-33 kips) trucks. Clusters 1, 3, and 4 have more light axles than heavy axles, whereas Clusters 2 and 5 have heavier tandem axles.

Tridem axle load spectra (TRALS) based on VC13 formed six clusters. The general trend of the tridem axle clusters shows a large proportion of light axles around 12 kips followed by a peak value around 40-45 kips.

Quad axle load spectra (QALS) based on VC13 resulted in three clusters. Peak values for the quad axle load spectra occur in the 18-24 kip and 45-60 kip ranges.

The advantages of cluster analyses are that the groups are formed objectively (based on a mathematical function) and are not subject to bias or subjective decisions. The other advantage is that cluster analysis finds patterns in the data that are not intuitively obvious, therefore providing new insights into traffic patterns in a region. One main disadvantage is the lack of guidelines for establishing the optimal number of clusters. While various techniques have been developed to identify the optimal number of clusters, none of them is perfect, and each has its drawbacks. The other disadvantage is that, since clustering is purely a mathematical technique, the objects in a cluster might not share identifiable attributes, making the assignment of data from a new location to the existing clusters difficult. Single decision tree classifiers are very convenient to use but have errors (resubstitution errors) when the entire data set is used and even higher errors when the data are split into testing and training sets. Random forest and Naïve Bayes classifiers were used to reduce the classification error rate. While they do a better job of reducing the training data errors, the testing data errors still exist. These errors can be attributed to limited data and, in many instances, to clusters having very few sites within them (as low as one site in some cases).
Hence, it was deemed that there is a need for an alternative methodology to develop traffic inputs at Level 2 that is intuitive and easy to use. As previously noted, the inputs developed using this alternative methodology are called Level 2B inputs. The development of Level 2B inputs established the following findings:

It was anticipated that MDOT would know the AADTT at a site (i.e., from historical traffic data or short-term counts). Therefore, AADTT was grouped into low, medium, and high traffic volumes. Low was under 1000 AADTT, medium was from 1000 to 3000 AADTT, and high was greater than 3000 AADTT for the design lane in one direction. Fourteen sites had low AADTT, eighteen sites had medium AADTT, and the remaining nine had high AADTT.

For VCD, the attributes of the VCD level and development type resulted in six groups. Three distinct patterns with varying levels of VC9, irrespective of the development type, were observed. All the sites in the high VC9 groups are located on interstates, while most of the sites in the low VC9 groups are located on state routes. Sites in the medium VC9 groups have a mix of both interstates and state routes in rural and urban areas.

For MAF, the attributes of commercial AADT and development type resulted in six groups. Almost all the groups have similar MAF patterns for VC5, except for sites with low AADTT in rural areas, which suggest seasonal traffic patterns. No differences in MAF for VC9 trucks were found between the groups; the values are always close to 1.

For HDF, the attributes of the VCD level and development type resulted in six groups. The sites having low VC9 levels in urban areas have the highest peak among all groups between 8:00 am and 4:00 pm, suggesting local traffic patterns. Sites having high VC9 levels have the flattest peaks in both urban and rural areas, suggesting long-haul traffic patterns. All the sites in high VC9 groups are on interstate routes.
For SALS, the attributes of COHS and development type resulted in six groups. For all the sites in the different groups, the first peak occurs at approximately 4-6 kips while the second peak occurs at 8-10 kips. Road groups in urban areas have an almost equal proportion of axles in the 4-6 kip range and the 8-10 kip range, while the sites in rural areas have a higher proportion of 4-6 kip axles than 8-10 kip axles. The road group of the regional corridor in the urban area has only one site, on US-2, with a unique loading pattern.

For TALS, the attributes of number of lanes and development type resulted in five groups. The two peaks seem to correspond to unloaded (9-14 kips) and loaded (30-33 kips) tandem axles. Other characteristics could not be established for the groups because they have varying functional classifications and AADTT levels and because some groups have only one site.

For TRALS, the attributes of COHS and development type resulted in six groups. The general trend of the tridem axle groups appears to be a large proportion of light axles around 12 kips, followed by a peak value around 40-45 kips. All the sites in the national corridors are located on interstates, while the sites on regional and statewide corridors are on state routes with varying AADTT levels irrespective of the development type.

For QALS, the attributes of COHS and development type resulted in six groups. Again, all the sites in national corridors are on interstates, while the sites on regional and statewide corridors are on state routes with varying AADTT levels irrespective of the development type.

6.1.3 Significant Traffic Inputs

For pavement design, it is recommended that site-specific data be used if available. For sites with no site-specific data, it is necessary to know whether Level 2 or Level 3 data are acceptable for design purposes.
To investigate the impact of traffic input levels on predicted pavement performance for flexible and rigid pavements, the Pavement-ME was used. The results of the sensitivity analyses were used to establish the appropriate traffic input levels. The analyses were performed on the absolute life differences (|Life Level 1 - Life Level X|) to detect the differences between the clusters or groups for each traffic input. The sensitivity of inputs at Levels 2 and 3 was determined using statistical and practical significance criteria. The next step was to identify whether there were any differences between predicted lives for Levels 2A and 2B. If there are no differences between Levels 2A and 2B, then Level 2B can be used since it simplifies the input selection process. Also, the number of under- and over-designed WIM sites due to the use of Levels 2A and 2B inputs was determined. A pavement at a location is over-designed when the difference in design lives (Life Level 1 - Life Level x) is positive and under-designed when the difference is negative. A positive life difference suggests increasing the thicknesses, making the project over-designed, whereas a negative life difference leads to reduced thicknesses, making the project under-designed relative to Level 1. If there were statistically significant differences, either Level 2A or 2B was selected for that traffic input after careful evaluation of the average design life differences. If there were no differences between Levels 2A and 2B, comparisons were made between Levels 2A and 3A or 2B and 3A to see whether Level 3A would suffice for pavement designs. Again, if there were no differences between Levels 2 and 3A, comparisons were made between Levels 3A and 3B to see whether Level 3B would suffice for pavement designs. The following is the summary of findings:

1. VCD significantly impacts predicted rigid and flexible pavement performance.
Thus, VCD groups (Level 2B) are suggested for use in flexible and rigid pavement design.
2. MAF has a negligible impact on predicted rigid and flexible pavement performance. Therefore, it is recommended that a statewide average (Level 3A) be used.
3. HDF significantly impacts rigid pavement performance. Due to relatively high misclassification rates, the practically insignificant difference between Levels 2A and 2B, and ease of use, group average (Level 2B) HDFs can be utilized for rigid pavement design.
4. AGPV depends on the truck fleet, which does not change in Michigan. Therefore, it is suggested that statewide averages (Level 3B) be used.
5. Single axle load spectra have a significant effect on predicted flexible pavement performance. Cluster (2A) and group (2B) averages produced comparable results. Also, no significant difference was detected between Levels 2B and 3A. Therefore, it is recommended that statewide averages (Level 3A) be used.
6. Tandem axle load spectra significantly impacted rigid and flexible pavement performance. Therefore, group averages (Level 2B) are suggested for both rigid and flexible pavement designs.
7. Tridem and quad axle load spectra do not have a significant impact on rigid and flexible pavement performance. Consequently, statewide average tridem and quad axle load spectra (Level 3A) can be used.
8. The Pavement-ME default traffic inputs do not accurately reflect the local traffic conditions in the State of Michigan. In general, statewide or cluster averages produced performance lives that were closer to the site-specific values than the Pavement-ME defaults. Consequently, the Pavement-ME defaults are not recommended for use in the State of Michigan.
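The comparison sequence described at the start of this section (Level 2A vs. 2B, then Level 2 vs. 3A, then Level 3A vs. 3B) can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the procedure as implemented in the study. Each flag is assumed to summarize the analyst's combined judgment of statistical and practical significance, and the function name and arguments are hypothetical. The sketch also stops as soon as Levels 2A and 2B differ, which is one reading of the text.

```python
def select_input_level(diff_2a_vs_2b, preferred_level2, diff_2_vs_3a, diff_3a_vs_3b):
    """Return the simplest input level consistent with the comparison flags.

    diff_2a_vs_2b    : True if Levels 2A and 2B give different design lives
    preferred_level2 : "2A" or "2B", chosen from the average life differences
                       when the two Level 2 alternatives differ
    diff_2_vs_3a     : True if Level 2 and Level 3A give different design lives
    diff_3a_vs_3b    : True if Levels 3A and 3B give different design lives
    """
    if diff_2a_vs_2b:
        return preferred_level2
    if diff_2_vs_3a:
        return "2B"  # no 2A/2B difference, so the simpler Level 2B is kept
    if diff_3a_vs_3b:
        return "3A"
    return "3B"


# Hypothetical flags: 2A and 2B agree, Level 2 and 3A agree, 3A and 3B differ.
chosen = select_input_level(False, "2B", False, True)
```

The function walks from the most detailed level to the least detailed one and stops at the first comparison that shows a difference, so the simplest adequate input level is returned.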
Table 6-1 Recommended traffic input levels

                          Impact on pavement performance      Suggested input levels (when Level 1 is unavailable)
Traffic characteristic    Rigid pavement   Flexible pavement  Rigid pavement   Flexible pavement
VCD                       Moderate         Moderate           Level 2B         Level 2B
HDF                       Moderate         -                  Level 2B         -
MAF                       Negligible       Negligible         Level 3A         Level 3A
AGPV                      Negligible       Negligible         Level 3B         Level 3B
Single ALS                Negligible       Moderate           Level 3A         Level 3A
Tandem ALS                Moderate         Moderate           Level 2B         Level 2B
Tridem ALS                Negligible       Negligible         Level 3A         Level 3A
Quad ALS                  Negligible       Negligible         Level 3A         Level 3A

6.1.4 Assigning a Site to a Cluster or a Group

Once the appropriate input levels for each of the traffic characteristics were established, it was necessary to determine how these could be implemented in design. For the traffic inputs where site-specific (Level 1) data or only statewide values (Levels 3A or 3B) need to be used, selection of the appropriate traffic input is obvious. Note that only VCD, HDF, and TALS require Level 2B inputs. The following road attributes can be used to obtain Level 2B inputs:

Vehicle class distribution (VCD): VC9 distribution (< 45%, 45-70%, > 70%) and development type (urban vs. rural)
Hourly distribution factor (HDF): VC9 distribution (< 45%, 45-70%, > 70%) and development type (urban vs. rural)
Tandem axle load spectra (TALS): number of lanes (2, 3, and 4) and development type (urban vs. rural)

6.1.5 General Findings

Additionally, the following observations were made based on the analyses of the traffic inputs:

In general, insignificant seasonal (month-to-month) variations existed in the axle load spectra for most vehicle classes, except the vehicle classes (VC4, VC7, VC8, VC11, and VC12) that constitute a very low percentage of the traffic volume and are on roads with low AADTT.

The impact of the directional difference in axle load spectra for most vehicle classes is negligible. Only VC10 and VC13 exhibited directional difference.
This is most likely due to the local nature of these specific VC trips (e.g., traveling to and from a logging site or gravel pit). This is an important observation as it substantiates the need to analyze only a single direction.

The single axle load distribution depends on the percentages of VC5 and VC9 in the traffic stream. The sites with higher proportions of VC5 peak at 4-8 kips, while sites with higher proportions of VC9 peak at 8-10 kips.

The tandem axle load distributions are mostly dependent on the axle load spectra of VC9.

The tridem and quad axle load spectra are a function of VC7, VC10, and VC13.

6.2 RECOMMENDATIONS

The following are the recommendations based on the recalibration of performance models:

1. The local calibration model coefficients shown in Table 6-2 can be used to replace the previous local model coefficients in Michigan if negligible cracking predictions are acceptable for rigid pavement designs.
2. The local calibration model coefficients shown in Table 6-3 can be used to replace the previous local model coefficients in Michigan if cracking predictions are critical for rigid pavement designs.
3. The significant input variables that are related to the various reconstruction and rehabilitation projects should be an integral part of a database for construction and material-related information. Such information will be beneficial for future design projects and local calibration of the performance models in the Pavement-ME.
Table 6-2 Summary of rigid pavement performance models with local coefficients (Initial recalibration)

Performance prediction model    Performance models and transfer functions                       Local coefficients
Transverse cracking             CRK_{BU/TD} = \frac{100}{1 + C_4 (DI_F)^{C_5}}                  C_4 = 0.72, C_5 = -1.82
IRI                             IRI = IRI_I + C_1 CRK + C_2 SPALL + C_3 TFAULT + C_4 SF         C_1 = 1.77, C_2 = 3.12, C_3 = 1.27, C_4 = 28.08

Table 6-3 Summary of rigid pavement performance models with local coefficients (Permanent curl model)

Performance prediction model    Performance models and transfer functions                       Local coefficients
Transverse cracking             CRK_{BU/TD} = \frac{100}{1 + C_4 (DI_F)^{C_5}}                  C_4 = 0.47, C_5 = -2.11
IRI                             IRI = IRI_I + C_1 CRK + C_2 SPALL + C_3 TFAULT + C_4 SF         C_1 = 0.87, C_2 = 11.31, C_3 = 0.47, C_4 = 28.96

The following are the recommendations based on the development of traffic inputs:

It is recommended, wherever possible, to expand the geographic coverage of traffic data in Michigan. When a new PTR site needs to be installed, it should be located in an area where limited traffic data are available.

Short-duration and continuous counts should be shared between agencies to ensure wider and recurrent data collection coverage.

Effective communication between traffic data collection personnel and pavement design engineers is recommended for addressing the traffic input requirements of the Pavement-ME.

Additionally, the following specific traffic data collection efforts should be considered, as recommended by the Traffic Monitoring Guide (TMG):

The short-duration volume coverage count program should provide comprehensive coverage across the roadway infrastructure on a cycle of 6 years.

Short-duration classification counts should account for at least 25-30% of all volume counts being conducted wherever possible. In addition, at least one vehicle classification count should be made on each route annually.

To obtain 95% confidence and 10% error in the precision of the traffic factors formed within a seasonal group, five to eight continuous counters should be established per group.
New seasonal factors should be compared to the ones formed and placed into the appropriate group.

At least six continuous vehicle classification counters should be established for each factor group. Continuous counts should be placed on different functional classes and in different geographic regions within the state. Emphasis should be placed on roads that are primarily local or long haul. When new sites are added, the data should be compared and placed into the appropriate existing factor groups.

A minimum of six WIM sites should be monitored within a group, with at least one of the WIM sites operating continuously and recording two or more lanes of traffic. The number of permanent WIM stations and discontinuous portable systems is a function of the number of groups, the accuracy at which the measured weights are taken, and the budget of the State agency. With proper coverage of existing groups and gradual expansion into unmonitored areas within the State through the installation of permanent devices, the data collection program could be more robust.

In addition to the above-mentioned general suggestions, the following are the specific recommendations, based on the results of this study, to improve traffic data collection and facilitate the use of the Pavement-ME design process in the State of Michigan:

1. The attributes selected for the road group development for different traffic inputs were determined based on the traffic data collected at 41 WIM sites distributed across the State. Also, most of the traffic data were collected between the years 2011 and 2015. However, there will be a need to revise these groups for Level 2B traffic inputs if the following updates or changes are anticipated:
a. Addition of new classification and WIM sites at different geographical locations or a change in the status of an existing site (e.g., down- or up-grading from WIM to classification or vice versa).
b.
Significant change in the land use (e.g., industrial development or commercial zoning) in the vicinity of the existing WIM locations.
c. Change in the WIM technology for a number of locations, for example, if less accurate piezo sensors are replaced with more accurate quartz or bending plate sensors. The accuracy and bias of the WIM sensor will affect the axle load spectra, which might influence the selection of attributes for Level 2B inputs.
If MDOT anticipates the above-mentioned updates or changes in the foreseeable future (e.g., 5 or 10 years), then there will be a need to revise the attributes or the group averages for all traffic inputs.
2. For a few sites, the traffic patterns over time were compared. Some changes in truck traffic distribution were observed for PTR locations with one-way AADTT < 1000. However, for sites having one-way AADTT > 1000, the truck traffic and tandem axle distributions did not vary substantially over the last 5 years (2011 to 2015). The equations developed should be used to identify the changes in traffic patterns that could significantly affect the pavement designs. If changes are observed in traffic patterns (classifications and loadings) at a PTR location for at least 3 years, then the new 3 years of traffic data should be used to update the traffic inputs. Otherwise, the new data should be combined with the available traffic database.
3. The existing PTR locations were reviewed, and the following specific WIM additions are recommended for the various regions in the State:
a. Superior Region: Because of the presence of heavy to very heavy axle loads, an additional WIM site is recommended along M28 between Ewen and Kenton. To capture interstate truck traffic (between Michigan and Wisconsin), an additional WIM site should be considered on US2 west of Watersmeet.
b. North Region: The current WIM site distribution seems adequate with the addition of WIM site 3069 to cover the west side of the region.
Additional WIM sites should be considered on the eastern side along M-32 if land-use demands change in the future.
c. Grand Region: The axle loading analysis revealed light to medium axle loadings in this region. Therefore, the current WIM site distribution seems adequate at this time.
d. Bay Region: This region also contains light to medium axle loadings. The current WIM site distribution seems adequate at this time.
e. Metro Region: An additional WIM site can be considered on I-75 between Flint and Auburn Hills. No PTR is located on this part of I-75, where commercial truck traffic is anticipated.
f. University Region: Based on the new axle loading data, the current WIM site distribution seems adequate at this time.
g. Southwest Region: An additional WIM site is recommended in the future on US31 near Sodus Township. This location will capture traffic coming north from I-90.
4. It is strongly recommended that VCD based on short-term (48-hour) counts be used to verify the Level 2B group VCD, especially for locations on interstate highways with a higher frequency of VC13.
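The Level 2B attribute lookup in Section 6.1.4 and the short-term count check in recommendation 4 can be illustrated with a brief Python sketch. It is a hedged example only. The bin labels "low", "medium", and "high" follow the wording used for the VC9 groups, the treatment of the 45% and 70% boundaries as part of the middle bin is an assumption, and the count dictionary (vehicle class number mapped to a 48-hour truck count) and all function names are hypothetical.

```python
def vc9_level(vc9_percent):
    """Bin a VC9 percentage into the three VCD levels (< 45%, 45-70%, > 70%)."""
    if vc9_percent < 45:
        return "low"
    if vc9_percent <= 70:
        return "medium"
    return "high"


def vc9_percent_from_counts(class_counts):
    """VC9 share (%) of a short-term count; assumes the dict holds truck classes only."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("short-term count contains no trucks")
    return 100.0 * class_counts.get(9, 0) / total


def level_2b_vcd_group(vc9_percent, development_type):
    """Group key built from the two VCD attributes: VC9 level and urban/rural."""
    if development_type not in ("urban", "rural"):
        raise ValueError("development_type must be 'urban' or 'rural'")
    return (vc9_level(vc9_percent), development_type)


def count_confirms_group(assumed_group, class_counts, development_type):
    """True if the short-term count falls in the same group as the one assumed in design."""
    observed = level_2b_vcd_group(vc9_percent_from_counts(class_counts), development_type)
    return observed == assumed_group


# Hypothetical 48-hour count at a rural site: 300 VC5, 600 VC9, and 100 VC13 trucks.
counts = {5: 300, 9: 600, 13: 100}
confirmed = count_confirms_group(("medium", "rural"), counts, "rural")
```

A mismatch between the assumed and observed group would suggest re-selecting the Level 2B VCD for that site rather than relying on the group that was first assigned.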