u... .. . .35... i... . . : . .-...u......r..... :3; 2 an} .. .x... . I... .3 :1.) 1 L4: I) 90.313... 1‘. .\ m; .... Mugquwhxanm TATE N a... limilfl'r‘l’ll‘l’ililmniriflii‘imll‘iimiifiml 7m 3 1293 01410 7324 This is to certify that the dissertation entitled DYNAMIC ESTIMATION OF ORIGIN-DESTINATION MATRICES FOR AN URBAN NETWORK USING TRAFFIC COUNTS presented by Mahkame Megan Khoshyaran has been accepted towards fulfillment of the requirements for Ph.D. Civil Engineering degree in W/z/x/Zw e . 6%». Major professor Date October 18, 1995 MSU is an Affirmative Action/Equal Opportunity Institution 0- 12771 LIBRARY -. Michigan State I University PLACE ll RETURN BOX to tomavothb chodtoutftom your record. TO AVOID FINES Mum on or before date duo. DATE DUE DATE DUE DATE DUE .- ‘ 1‘ " -. mini/w ‘3 ‘W_g'_“ I DYNAMIC ESTIMATION OF ORIGIN-DESTINATION MATRICES FOR AN URBAN NETWORK USING TRAFFIC COUNTS By Mahkame Megan Khoshyaran A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Civil and Environmental Engineering 1995 ABSTRACT DYNAMIC ESTIMATION OF ORIGIN-DESTINATION MATRICES FOR AN URBAN NETWORK USING TRAFFIC COUNTS By Mahkame Megan Khoshyaran The purpose of this research is to design a model for estimating a dynamic or time dependent origin-destination trip table for a large urban area and to test the model using traffic count data obtained from inductive loops on a sample of arterial streets. So far the only reliable way to obtain an origin destination trip table has been through surveys at origins and destinations. These surveys are expensive and they do not reflect changes in traffic flow that occur over time. The hope of this research is to provide a reliable substitute for the traditional method of obtaining an O-D trip table. The network chosen to test the model is downtown Boston. The method of 0-D flow estimation used in this study is sequential. The algorithm used is the Generalized Least Square Method. The input data used to test the model is the on-line traffic information obtained from inductive loops on a sample of local streets. The study period is from 6:00 am. to 9:00 am. The I length of this study is 5 weekdays. The 1987 O-D table provided by the Boston Area Regional Planning Authority is used as the apriori 0-D table for the first interval of day one. Local street count data were provided by the City of Boston Traffic Control Center. The validity of 0-D estimates were checked by examining how closely the estimated link counts derived from the 0-D estimates match the actual link counts. Statistics used to measure the degree of closeness are Root Mean Square Error, and Percent Root Mean Square Error. The model was validated based on testing using synthetic data where the true O- D volumes were known. The results of the validation proved the mathematical integrity of the model and its robustness. The model, when applied to the urban traffic data, did not produce estimates of link counts which would be suitable for real time traffic control due to the following causes: 1) partial traffic count data; 2) insufficient detector coverage; 3) lack of speed and delay information on all the links in the network; 4) lack of sufficient centroid connectors. ACKNOWLEDGMENTS I would like to thank my advisor, Dr. William C. Taylor, for his guidance and very generous support during my years at MSU. Above all, I sincerely appreciate his friendship. I am grateful to Professor Moshe Ben-Akiva of the Massachusetts Institute of Technology for accepting being my dissertation supervisor, and part of my dissertation committee. I have learned a great deal from him, and his scientific views exerted a lasting influence on me. I would like also to thank Professor James Stapleton for his guidance throughout my research. A particular note of appreciation is due to Professor Lyles and Professor Maleck for accepting being part of my dissertation committee. My deepest gratitude goes to my mother Mahin, for her love, patience and support. To her, this work is dedicated. iv Chapter 1 Chapter 2 2.1 2.2 2.2.1 2.2.2 2.2.3 2.2.4 2.3 2.3.1 2.3.1.1 2.3.1.2 2.3.2 2.3.2.1 2.3.3 2.3.3.1 2.3.3.2 2.4 Chapter 3 3.1 3.2 Chapter 4 4.1 4.2 4.2.1 4.2.2 4.2.3 4.2.4 4.3 4.4 4.4.1 4.4.2 TABLE OF CONTENTS Introduction ............................................................................................. 1 Literature Review .................................................................................... 4 Dynamic O-D Estimation Models. ........................................................ 5 Intersection Flow Models ........................................................................ 5 Method of cross-correlation matrices ............................................... 6 Method of constrained optimization ............................................... 10 Method of Recursive Estimation ..................................................... 12 Estimation by Kalman Filtering. ..................................................... 13 Complex Networks ................................................................................ 16 Linked Static - Dynamic Correlation Method. ............................... 17 Calculation of the Coefficient of Correlation w(i,j) .................. 19 0-D Flow Estimation For a Complex Network ........................ 20 Kalman Filtering Algorithm. ........................................................... 22 Estimating Time Varying O-D Trip Matrices From Traffic Counts .............................................................................. 22 Generalized Least Square Method .................................................. 25 Simultaneous Estimation. ........................................................... 28 Sequential Estimation ................................................................. 29 Research Objective ............................................................................ 30 Model Development ............................................................................... 33 Recursive Generalized Least Square Estimation ................................. 33 Evaluation of the Accuracy of the Sequential Generalized Least Square Estimation Procedure. ..................................................... 39 Parameter Estimation ............................................................................. 41 Introduction ............................................................................................ 41 Calculating the Probability of Choosing a Path p(k/ t,r) ..................... 43 Procedure For Finding Multiple Paths ........................................... 44 Calculation of Path Total Travel Time ............................................ 46 Calculating Link Travel Times ........................................................ 47 Estimation of parameter p ................................................................ 48 Procedure For Finding Link-Path Incidence Fraction on (k,t,l,j) ..... 51 Calculation of Error Variances .............................................................. 55 Formulation of The Error Variance Calculation Procedure .......... 56 Calculation of Error Variances For The Estimated Origin- Destination Volumes ........................................................................ 59 Chapter 5 Model Validation .................................................................................... 60 5.1 Introduction ............................................................................................ 60 5.2 Analysis of The Results. ......................................................................... 61 5.3 Sensitivity Analysis ................................................................................ 61 Chapter 6 Data .......................................................................................................... 71 6.1 Representation of an Urban Area asNetwork ...................................... 71 6.2 Data Collection ........................................................................................ 73 6.3 Data Description ..................................................................................... 74 6.4 Estimating Speed and Delay For All the Links in the Network ......... 75 6.5 Adjustment of the Apriori (1987) 0-0 Matrix. .................................... 76 6.6 Choosing the Parameter u ...................................................................... 77 6.7 Choosing the Bonus Factor B ............................................................... .78 6.8 Test of Model Responsiveness When Different Weights are Assigned to the Count Data versus Prior O-D Estimates Data .......... 78 Chapter 7 Analysis of the Results ........................................................................... 85 7.1 Introduction ............................................................................................ 85 7.2 The OLS Procedure ................................................................................. 85 7.3 The GLS Procedure ................................................................................. 86 7.4 Revision of the Assumptions ................................................................. 88 7.5 Adjustment Procedure ........................................................................... 89 7.6 An alternate OLS-GL5 approach ........................................................... 92 7.7 Possible Sources of Differences Between the Estimated and Actual Link Counts ................................................................................. 93 Chapter 8 Conclusion ............................................................................................. 142 8.1 Model Development ............................................................................. 142 8.2 Implementation ..................................................................................... 143 8.3 Model Verification ................................................................................ 144 8.4 Application ............................................................................................ 144 8.4.1 OLS runs .......................................................................................... 145 8.4.2 GLS Runs ......................................................................................... 145 8.4.3 Alternate OLS-G15 Runs ............................................................... 145 8.5 Revision of Assumptions 145 8.6 Listing of the Assumptions .................................................................. 146 REFERENCES ........................................................................................................... 148 vi Table 1. Table 2. Table 3. Table 4. Table 5. Table 6. Table 7. Table 8. Table 9. Table 10. Table 11. Table 12. Table 13. Table 14. Table 15. Table 16. Table 17. Table 18. Table 19. LIST OF TABLES Comparison of Predicted Versus Observed Link Flows - GLS runs - day 1 and day 2 - Sample Data ..................................................... 64 RMSE and %RMSE for Different Values of the Constant Parameter u -interval 1 of day 2 - Sample Data ...................................... 70 RMSE for different values of the constant parameter p -interval 1 of day 1 ................................................................................................... 81 RMSE and %RMSE for different values of the bonus factor B- interval 1 of day 1 .................................................................................... 2.82 RMSEs For Different Values of the Weighting Factor on - 12 intervals -GLS run ..................................................................................... 83 Comparison of Actual Versus Estimated Link Counts - 12 intervals - GLS run .................................................................................... 84 Comparison of Predicted Versus Observed Link Flows - OLS Run - Day 1 .............................................................................................. 100 Comparison of Predicted Versus Observed Link Flows - OLS Run - Day 2 .............................................................................................. 101 Comparison of Predicted Versus Observed Link Flows - OLS Run - Day 3 .............................................................................................. 102 Comparison of Predicted Versus Observed Link Flows - OLS Run - Day 4 .............................................................................................. 103 Comparison of Predicted Versus Observed Link Flows - OLS Run - Day 5 .............................................................................................. 104 Comparison of Predicted Versus Observed Link Flows - GLS Run - Day 1 .............................................................................................. 110 Comparison of Predicted Versus Observed Link Flows - GLS Run - Day 2 .............................................................................................. 111 Comparison of Predicted Versus Observed Link Flows - GL5 Run - Day 3 .............................................................................................. 112 Comparison of Predicted Versus Observed Link Flows - GLS Run - Day 4 .............................................................................................. 113 Comparison of Predicted Versus Observed Link Flows - GLS Runs - Day 5 ............................................................................................. 114 Comparison of Predicted Versus Observed Link Flows - OLS Run - Adjusted Data - Day 5 ................................................................ 120 Comparison of Predicted Versus Observed Link Flows - GLS Run - Adjusted Data - Day 5 .................................................................. 126 Comparison of predicted to observed link flows - Alternate GLS run - day 1 ....................................................................................... 132 vii Table 20. Comparison of Predicted Versus Observed Link Flows - Alternate GLS Run - day2 ...................................................................... 133 Table 21. Comparison of Predicted Versus Observed Link Flows - Alternate GLS Runs -day 3 ..................................................................... 134 Table 22. Comparison of Predicted Versus Observed Link Flows - Alternate GLS Runs -day 4 ..................................................................... 135 Table 23. Comparison of Predicted Versus Observed Link Flows - Alternate GLS Runs - day 5 .................................................................... 136 viii LIST OF FIGURES Figure 1. Drawing, Depicting the Synthetic Network .......................................... 63 Figure 2. GIS run - actual vs. estimated link counts interval 1 - day 2 ................ 65 Figure 3. GIS run - actual vs. estimated link counts interval 3 - day 2 ................ 66 Figure 4. GLS run — actual vs. estimated link counts interval 6 - day 2 ................ 67 Figure 5. GIS run - actual vs. estimated link counts interval 9 - day 2 ................ 68 Figure 6. GIS run - actual vs. estimated link counts interval 12 - day 2 .............. 69 Figure 7. Flow Chart Depicting the Overall OLS - GLS Procedure ...................... 97 Figure 8. Drawing, Depicting the Location of the Over-Estimated Access / Egress Links to a Centroid 98 Figure 9. OLS run - actual vs. estimated link counts interval 1 - day 5 ............ 105 Figure 10. OLS run - actual vs. estimated link counts interval 3 - day 5 ............ 106 Figure 11. OLS run - actual vs. estimated link counts interval 6 - day 5 ............ 107 Figure 12. OLS run - actual vs. estimated link counts interval 9 - day 5 ............ 108 Figure 13. OLS run - actual vs. estimated link counts interval 12 - day 5 .......... 109 Figure 14. GLS run - actual vs. estimated link counts interval 1 - day 5 ............ 114 Figure 15. GIS run - actual vs. estimated link counts interval 3 - day 5 ............ 115 Figure 16. GLS run - actual vs. estimated link counts interval 6 - day 5 ............ 116 Figure 17. GLS run - actual vs. estimated link counts interval 9 - day 5 ............ 117 Figure 18. GL5 run - actual vs. estimated link counts interval 12 - day 5 .......... 118 Figure 19. OLS run - actual vs. estimated link counts interval 1 - adjusted day 5 data .................................................................................................. 121 Figure 20. OLS run - actual vs. estimated link counts interval 3 - adjusted day 5 data .................................................................................................. 122 Figure 21. OLS run - actual vs. estimated link counts interval 6 - adjusted day 5 data .................................................................................................. 123 Figure 22. OLS run - actual vs. estimated link counts interval 9 - adjusted day 5 data. ................................................................................................ 124 Figure 23. OLS run - actual vs. estimated link counts interval 12 - adjusted day 5 data .................................................................................................. 125 Figure 24. GLS run - actual vs. estimated link counts interval 1 - adjusted day 5 data .................................................................................................. 127 Figure 25. GLS run - actual vs. estimated link counts interval 3 - adjusted day 5 data .................................................................................................. 128 Figure 26. GLS run - actual vs. estimated link counts interval 6 - adjusted day 5 data .................................................................................................. 129 Figure 27. GLS run - actual vs. estimated link counts interval 9 - adjusted day 5 data 130 Figure 28. GIS run - actual vs. estimated link counts interval 12 - adjusted day 5 data .................................................................................................. 131 Figure 29. Alternate GLS run - actual vs. estimated link counts interval 1 - adjusted day 5 data .................................................................................................. 137 ix Figure 30. Alternate GLS run - actual vs. estimated link counts interval 3 - adjusted day 5 data .................................................................................................. 138 Figure 31. Alternate GLS run - actual vs. estimated link counts interval 6 - adjusted day 5 data .................................................................................................. 139 Figure 32. Alternate GLS run - actual vs. estimated link counts interval 9 - adjusted day 5 data .................................................................................................. 140 Figure 33. Alternate GLS run - actual vs. estimated link counts interval 12 - adjusted day 5 data .................................................................................................. 141 Chapter 1 Introduction The purpose of this research is to design a model for estimating a dynamic or time dependent origin-destination trip table for a large urban area and to test the model using traffic count data obtained from inductive loops on a sample of the arterial streets in the network. So far the only reliable way to obtain an origin destination trip table has been through surveys at origins and destinations. These surveys are expensive and they do not reflect changes in traffic flow that occur over time. The hope of this research is to provide a reliable substitute to the traditional method of obtaining an OD trip table. One potential use for time dependent O-D data is in controlling traffic flow in freeway corridors. Ramp metering strategies need O-D data to determine the optimum on-ramp diversion rates to parallel arterials. Likewise, incident management strategies which divert traffic from freeways to parallel service roads should consider the prevailing corridor O-D pattern. The need for 0D data is especially acute in freeways with express / collector and transfer lanes, and in metropolitan-wide control systems which consider freeway to freeway diversions during peak periods. Dynamic O-D estimates are required for the analysis of time-varying dynamics of freeway congestion growth and decay. The use of time dependent O-D data is not limited to freeway traffic analysis. An advantage of estimating dynamic 0D data for a large urban area is for use in on-line control of traffic signals. The new generation of traffic signal models, 1 such as the IVHS model "MOTION" for traffic dependent signal control, rely on 0-D flows for the determination of flow profiles, as well as the incremental optimization to adapt signal settings to changing traffic conditions. Control systems in use today can not adapt to the near term projected traffic conditions in the network, because they have no information about the existing O—D flows. These models, currently depend on historical data, if the system includes any forward looking capability. To develop more effective strategies for better on- line signal control it is necessary to improve the existing models for estimating O-D flows. The ultimate goal of future looking traffic adaptive control systems is to predict variations of flows in a network so that traffic signals can react accordingly. The conventional way of collecting traffic data by traffic surveys is not suitable for this purpose. There have been a considerable number of studies conducted to determine if an O-D matrix could be derived from traffic counts, because these data can be collected and processed automatically [Cremer,Keller, Okutani, Casetta] [3,4,7,1]. The need for reliable time-dependent O-D data for traffic analysis in a metropolitan area has been identified by many European researchers as well. The European research projects DRIVE and PROMETHEUS are designed to collect more and better data about real time traffic flow. MOTION [8] is an on-line signal control system of the DRIVE project. MOTION has been developed by using the experience gained from the German off-line signal control model SIGMA and existing on-Iine systems in Italy. This model determines: (1) split as a function of traffic volume and turning movements at the intersections, (2) cycle time according to the traffic volume at critical intersections, (3) number of phases, and on a future development level, phase sequence as a function of the minimum sum of intergreen times and the time-space-diagrams and (4) offsets as a multi-fimctional result of origin and destination flows in the network, traffic volumes and an impact analysis of traffic and environmental criteria. Different modules of this system include O-D based estimates of flow at the intersections, and a dynamic optimization plan based on network O-D flow estimates. The reference to signal timing adjustment is to a traffic adaptive signal timing arrangement, not to traditional actuated operation of traffic signals. Under advanced adaptive signal control strategies, adjustments are made in response to 0-D demand profiles in a network for various time slices. This allows the system to select the most appropriate signal plan from a predeveloped library. Individual cycle phases are then adjusted based on the volume of traffic on individual streets during a cycle. Chapter 2 2. Literature Review The subject of dynamic origin destination trip table estimation using on-line traffic count data has been investigated by many researchers. So far the main area of concentration has been in estimating time dependent O-D trips for a section of a freeway with few entering and exit ramps, or an intersection with. (m>1 )entries and (n>1) exits [Cremer, Keller, Cascetta] [3,1]. Until now, only limited studies have been done on estimating time dependent O-D trips for an urban area , using traffic counts received from detectors on arterial roads. A few researchers have tested dynamic O-D estimation models on very small networks; using simulated data [Keller, Ploss, Okutani][6,7]. The networks used in testing different O—D estimation models have 6 or 7 centroids, and a minimum of 13 links and maximum of 40 links [Okutani][7] . These dynamic O-D estimation models produce good O—D estimates under controlled conditions. To validate the dynamic O-D matrix, comparisons have been made with data obtained either from a sample or through simulation. The idea in dynamic 0D trip table calculations is to estimate time dependent variations in 0-D patterns. The assumption is that the 0-D volumes are not constant, but rather vary between successive time periods. In estimating O-D volumes for each interval, no matter what kind of modeling approach is used, information such as departure time from the origin, route taken to go from the origin to the destination, total travel time from an origin to a destination given a particular route, and arrival time on various links of a route are needed. The following paragraphs describe some of the issues that need to be considered in modeling O-D travel. In addition several O-D modeling approaches will be reviewed. 2.1 Dynamic O-D estimation models Several dynamic O-D estimation methods have been tried by researchers. In the early stages of dynamic 0D estimation, the type of geometry considered was an intersection. The parameter estimation was limited to finding the fraction of the volume that entered through entrance i, which exits from exit j. Later, freeway segments with a few entering and exit ramps, and small networks were considered. The following are the most significant algorithms originally used in dynamic O-D estimation for an intersection [Cremer, Keller] [2,3], and later extended for use in estimating time-varying O-D matrices for networks with few intersections: Constrained optimization method, Ordinary least square estimation, Recursive estimation, and estimation by Kalman filtering. The following are the most significant algorithms used in dynamic O-D estimation for complex networks such as freeways with entering and exit ramps: Linked static -dynamic estimation method [Keller, Ploss] [6], estimation by Kalman filtering[Okutani][7], Generalized Least Square Estimation[Cascetta][1]. The method of Recursive Generalized Least Square Estimation is the method used in this study to estimate O-D trip for an urban area. Therefore a thorough description of this method is provided in a later section. 2.2 Intersection Flow Models In many cases traffic flows through a complex intersection can not be measured directly, and therefore must be estimated from vehicle counts at the entries and exits of the intersection. The following methods were used in estimating the magnitude of traffic flows through complex intersections by using vehicle counts at the entries and exits of the intersection . Furthermore, it was shown that by using the time variant quality of volume counts a unique solution to the problem of estimating unknown flows was obtained. 2.2.1 Method of cross-correlation matrices [Cremer, Keller][2,3] This is a least square estimation method, and was used in estimating O—D flows for an intersection with (m>1) entries, and (n>1) exits. A period H was chosen. This period was then divided into k intervals of length T each. A sample of observations was taken for each period. During each sampling period, the volume q(i,k) that entered the entrance (i) during (k-1) T St < kT was observed. Volume which left exit j, W) k) during (k-I) T + t S t < kT + ‘C , 1 >0 was also observed. The objective was to estimate the part of volume q(i,k),f(i,j,k) which left the intersection through exitj during (k-I) T + t S t < kT + t. where t is the average travel time a vehicle needs to pass from an entry to an exit. f(i,j,k) is formulated as: f(i.j,k) = WM. 120', j, k) (1) Mi, 1', k) is the fraction of q(i,k) that go from i to j. The following assumptions hold for b( i, j, k) 0.<.b(i, j, k) 31 for all i,j,k (2) fl 2: b(i, j, k) = 1 for all i,k, (3) i=1 b(i, i, k) = 0 for i=1,...,min(m,n) (4) Balancing the departing and entering vehicles, for each time interval yields the following relationship m yfj,k) = X f(i,j,k) (5) .=1 since each O-D flow is a certain portion of the ith entering flow, f(i,j,k) can be replaced by its equivalent in equation (1). The following relationship then holds m y(j,k) = 2 q(i,k). b(i, j, k) (6) i=1 The matrix form of equation (6) is written as: Y'(k) = (1'00 - WC) (7) where y’(k) is a (1 x n) row vector with elements y(j,k), q’(k) is a (1 x m) row vector with elements q(i,k) and B(k) is an (m x n) matrix with elements b( i, j, k). Since q’(k) and y’(k) are obtained through sampling, the only parameter that needs to be estimated is B(k). To calculate B(k) first, the mean values ti, 3’, and B are calculated. K E =(1/K)2 q’(k) k=1 K y’ =(1/I<)E y’(k) k=1 K ”E = 0/10 2: B(k) k=1 K is the total number of sampling intervals. q’(k) , y’(k), and B(k) can be expressed as the sum of their mean plus a random deviation cl’(k) = i + AQ’Uc) Y'(k) = 'y" + A)"(k) (8) B(k) = B + AB(k) inserting this into equation (7) gives y’(k) = (1'00 - (E + A1300) (9) Or 7’ + AY’UC) = (E + Acl'(k)) - (E + AB(1‘)) (10) taking the mean on both sides of equation (10) over a whole observation period, average exit flows are obtained as K y’ = 'q'. E + UK 2 Aq’(k). AB(k) (11) k=1 subtracting equation (10) from equation (11) and substituting the summation index k with l the following relation is obtained K Ay’(k) = Aq’(k) . '1? + {q’(k) . AB(k) - 1/1< 2 Aq’(l). AB(I) (12) l=1 assuming AB(k)=0 causes the second term in equation (12) to vanish resulting in the following A3"(k) = AQ’UC) - E (13) introducing the following abbreviations Q = [Aq'(1).Aq'(2)...... Aq'aol' (14) y = [Arm ,Ay(2) . ...... . Ay’(1<)]’ prime denotes transpose of a vector. Partitioning the observation period into ' subsequent time intervals and using time sequence of traffic counts provides additional information which solves the problem of an underdetermined system of equations and leads to a unique solution. The least square error solution to equation (13) is given by E =(Q'Q)'1. 0'. Y (15) where Q Q = Z Aq(k)°Aql(k) = Kq’qq Q . Y = 2 Aq(k). Ay’(k) = K. ¢qy where ((1% and ¢qy are the cross-correlation matrices correlating the sequences of Aq(k) with Ay(k), respectively over the observation period k=1,..,K. With this notation equation (15) can be rewritten as A _ -I B ' ‘l’ 44% B is the least square solution of B. 10 The above estimation procedure was tested using synthetic and real data from different studies. A 5—leg intersection was considered for testing this model. The synthetic entry flows q(i,k) were generated as integer random numbers. For the deviations AM i, j, k) random numbers from a normal distribution with zero mean and a standard deviation of 0.05 were taken first. Then these deviations were updated such that conditions (3), and (4) were preserved, and gave realistic integer flows fl i,j,k) . Comparison of the OD table of synthetic data with those from the dynamic estimation procedure showed a close match [Cremer, Keller] [3,4]. 2.2.2 Method of constrained optimization [Cremer, Keller] [3,4] Here the estimation problem described above is reformulated as a constrained optimization problem. Given a sample of entries and exits over K intervals, an estimate for B , B is obtained such that the squared error between the observed exit flow, and estimated exit flow is minimized. K I=min(1/KZIly(k)-9(k)l|2) (16) k=1 where m y(i,k) = 2 q(i,k) . b(i,j,k) for all j,k i=1 0 _<. b(i,j,k) $1 for all i,j,k n 2 b(i,j,k) =1 for all i,k i=1 b(i,j,k) = 0 for i=1,...,min(m,n) 11 91k) =q' . sac) - 2 0'06) .1? + q'(k) 1‘3 finqtk» (18> =1 the optimal solution for B can be obtained by setting the first derivative of J with respect to B equal to zero. K K I= -2/I< ( 2 q(k).y'(k> ) + 2/I< (2 q(k) q'(k» . 13* = o (19) k=1 k=1 Here 13* represents the optimal solution under the assumption that the solution lies inside the admissible parameter space. The optimal solution 13* to equation (16) is given by ., _ -1 B ‘ ¢ qq°¢qy ¢qq and ¢qy were defined earlier. This method when tried under the same . geometry and data described for the cross-correlation method gave better O-D estimates mainly because it uses the information from all equations formulated. 12 2.2.3 Method of Recursive Estimation [Cremer,Keller] [3,4] Unlike the two methods described before, the predicted exit flow deviations are computed from the measured entry flow deviations based on the estimation of the (k-1)th step by a model similar to equation (13) as follows Ay’(k) =Aq'(k). ink-1) (20) this prediction then is inserted into the following recursive correction formula; 130.): B(k—1)+ y. K(1 /1< Z Aq(k) . (Ay(k) - 119(k)), summed over k=1,...,K (21) where y is a gain factor. The following assumptions assure the convergence and stability of the above equation 1) The expected values of Aq(k) and AB(k) are zero. AB(k) is an (m x n) matrix of random deviations in split parameter B. E( Aq(k) ) = o E( AB(k)) = o (22) 2) Covariances are given by an (m x m) matrix E{Aq(k). Aq’(1)} = (Sq, 0) (23) S for k: l, and Ofor k¢l q 13 E( Ab(i, k).Ab’(j, 1)) = (V), 0) (24) V,- for i=j,and k=l,and 0for i¢jork¢l E( Aq(k).Ab’(i, 1)) = 0 for all i,k,l 3) the gain parameter 7 should be chosen according to O< 7 <2/ (m .03) a; = max ( a Aqi2(k)) (25) The exit flows are updated based on the estimated entry flows Aq(k). This procedure was tested under the same geometry and with synthetic data as the previous models. The results were better than the other two methods above when the split parameters b( i, j, k) varied significantly from hour to hour (i.e. when the OD pattern was different at different hours) due to better tracking abilities. 2.2.4 Estimation by Kalman Filtering [Cremer, Keller] [3,4] The Kalman filtering technique is another recursive estimation method. Consider the equation (6) m y(i,k) = 23 (103k). b(i, j, k) = q’(k).b(i,k) (26) i=1 14 b(i,]c) is the jth column of matrix B(k) containing all split parameters which are related to the jth exit flow. The dynamic nature of the split parameter b(i,k) is incorporated via the state equation, wherein b(iJc) = b(i-I. k) + w(k) <27) w(k) being a random error term with mean zero and known covariance W. E{ w(k)} = 0 ElAw(k). Aw’(k)} = (w, 0) (28) W for k=l, and zero for k¢l. Information about the state variables is given by the measurement of the jth exit flows which are related to the b(j,k) by equation (26), plus a random measurement error vector v(k). The relationship between y(i,k) and q(k) is linear and is specified by the measurement equation yolk) = q’(k) - b(i, k) + v(k) (29) with E{ v(k)} = 0 Etv2(k)} = (11.0) (30) Kalman filter generates bias-free estimates only if the system model is completely observable. This implies that rank[ q(k-K), q(k-K+1), q(k-K+2),....,q(k)] = n (31) where n is the number of exits, K is the total number of intervals, and k represents each interval. This condition implies that the matrix Q of 15 equation(14) or equivalently matrix ¢qq has full rank. It is important to note that choosing the OD flows f(i,j,k) directly as state variables would set up a system description which is not completely observable and for which the Kalman filter could not be applied. The split parameter b(j, k) can be estimated by the following set of equations d(k) = [vac-1) +Wlq(k)[q’(k)[l’(k-1)+W]q(k)+R]'1 (32) d(k) is called the filter gain, and P is the estimation error covariance matrix. Updating the estimation error covariance matrix is obtained from Mt) = [I - d(k) - Q’(k)][P(k-1)+Wl (33) b(i, k) is estimated by the following equation 130,1.) = Bonk-1) + d(k) (yank) - q(k)-130.1(4) (34) For the start of this computation an initial value of the covariance matrix of estimation errors P(0) must be specified. If the assumptions of the Kalman filter are fulfilled, the filter equations compute bias-free, minimum variance estimates. Otherwise, estimates may not necessarily be optimal, but have minimum variance. The Kalman filtering process when applied on the same geometry as the above models with the same synthetic data does not produce the best results. The reason for this is that the assumption of the random variations as being Gaussian white noise is only partially fulfilled. The results showed considerable fluctuations trying to follow the real variations. 16 2.3 Complex Networks Three major algorithms that are used in estimating O-D matrices for complex networks will be reviewed here. The first algorithm is the Linked Static- Dynamic Correlation Method developed by Keller and Ploss [6]. This algorithm is an extension of the model developed by Cremer and Keller [3,4] which was explained earlier under the section on single intersections or partial networks. The linked static-dynamic method for the estimation of O-D flows in . transportation systems from traffic counts is based on the following concept. The model finds the coefficient of correlation (or weights) between an entry volume and an exit volume profile. The coefficient of correlation is an indicator that shows to what extent the variation of one variable is determined by the variation of another variable. These coefficients are used as weights for the OD flows, and are linked with an entropy maximization model to find a unique O-D estimate. This new approach uses all the information that can be used for partial networks or single intersections to improve the results for the total system. The Linked Static-Dynamic Correlation Method is the algorithm used in project "MOTION" for estimating O-D flows. "MOTION" is an on-line signal control system developed using the German off-line signal control model SIGMA and existing on-line systems in Italy. The Kalman filtering algorithm for complex networks was developed by Okutani [7]. In this paper, the ideas introduced by Keller and Cremer [3,4] are extended to a general network. Provisions are made to solve the problem of an underdetermined system. 17 The third model that will be reviewed is the dynamic estimation of O-D flows based on time-varying traffic counts and any other available information such as outdated O-D matrices. Two types of optimization procedures are introduced. A simultaneous GLS estimation which produces joint estimates of the whole set of O-D matrices, one for each time interval, and a sequential GIS estimation which produces a sequence of CD estimates for successive time intervals. 2.3.1 Linked Static-Dynamic Correlation Method [Keller, Ploss] [6] The relationship between link flows and OD demand is essential for dynamic O-D estimation. This relationship is expressed by I I V0) = X X p(i,j,l) . f(i,j) (35) i=1 j=1 V(1) represents volume on link 1; p(i,j,l) is the fraction of O-D flows f(i,j) on a path that includes link I. The system of linear equations based on equation (35) is highly underdetermined. However, by including the estimated turning flows at intersections and cordon line counts on the periphery of the system, additional equations for the determination of the OD flows can be developed. This addition reduces the degree of under determination of the systems considerably. As a result, this method has two parts: part one, is estimation of the O—D flows at intersections using a linked static dynamic correlation model. This information is then added to part two to decrease the degree of under determination. Part two is the estimation of the O-D flows in complex traffic networks with the linked static-dynamic method. 18 PART ONE: Estimation of the OD flows at intersections in simple networks using a linked static-dynamic correlation model. Definitions I Number of entries in a system I Number of exits in a system T Number of intervals with the length At in a time period p q(i,k) volume passing entry i during interval k, (k=1,..,T, i=1,..,I) Fifi) = UT 2 q(i,k), summed over k=1,..,T y(j, k) volume passing exitj during interval k, ( k=1,...,T, j=1,....,]) 70') = [If 2 110) k), summed over k=1,...,T t(i,j) travel time from i to j (integer multiple of At ) f(i,j) partial volume from i to 1' during the observation period p b(i,j,k) = f( i,j)/q( i) (split parameter), 02 b(i,j,k) $1 The following static conditions have to be satisfied I (1) q( i) = X f(i,j) , j=1 I (2) y(j) = X f(i,j), i=1 J (3) Z b(i,j,k)= 1, i=1 19 2.3.1.1 Calculation of the Coefficient of Correlation, w(i,j) The coefficient of correlation w(i,j) can be obtained from the following equation which is similar to the correlation coefficient . The correlation coefficient gives a quantitative measure of how strong the linear relationship is between x and y- i.e., to what degree x and y are related. T w(i,j) = { Z ((q(i,k) - 5(2)) . ( y(j,(k+t(i,j)) - 7(1)) )2 } k=I T +{ 2 ((40310 --q-(i»2. 2t y(j,(k+t(i,j)) - r0»?! k=1 large variations in w(i,j) can result if the observation period p is small. This can be avoided by exponential smoothing of w(i,j), using a smoothing factor or as follows w(i.j) = a . w( 231', p) + (l-a) . w(ij. p-l) O-D flows can be expressed by f(i,j) = w(ij) . Q(z') . Y0) the unknown values Q( i) and Y( j) can be obtained by l Q(i) =q(i) + 2 (YO) . w(i,j)). i=1 26) 1 Y0) = Y(i) +2 (Q(i) . w(i,j)). i=1 Using an iterative solution method such as an entropy model, a unique solution to this system can be found that satisfies the static conditions (1) and (2) exactly and includes the dynamic weights w(i,j) for the most recent period. This system of equations are added to equation (34). PART TWO: 2.3.1.2 O-D Flow Estimation For a Complex Network The first step in this procedure is to find the shortest path between each O-D pair that includes link 1. Given information such as entry i, counting site a, and exit j, and using deterministic or stochastic successive capacity restrained algorithms, these set of paths can be found. The fraction p(i,j,l) of the total number of vehicles f(i,j) that take a certain route that includes link I can be determined given a set of paths. p(i,j,l) is 1 if link 1 falls on a path going from i to j, and 0 otherwise. Travel time from i to j, t(i,j) is the travel time on the shortest path between i and j. Based on the travel times, the weights w(i,j) are recalculated and applied to the flows from any entry to any exit at the cordon line of the system. O-D volumes can be formulated as f(i,j) .-. w(i,j) . xa» . II xa) . p(i,j,l) I p(i,j,l) is equal tol if link I is on the path between i,j and is equal to zero otherwise. 21 with I I I I X(0)={2 Xfli,j)}+{2 Xw(i,j)}, i=1 j=1 i=1 j=1 A solution for X(l) can be found iteratively by applying the balancing algorithms of the entropy model of Van Zuylen [10] V0) = 22 p(ijJ) . w(i,j) . X(l) . II(X(1) . p(i,j,l)) I By including the turning flows of intersections inside the observed network that have already been estimated according to the algorithm in part one, new equations can be set up which decrease the degree of under determination. The entropy algorithm should converge reasonably well. The method suggested by Keller and Ploss [8] has been incorporated in the system ” MOTION ”, but this model is still in the experim ental stage. Many improvements can be made by exploring the effects of multiple paths on the estimation of dynamic weights w(i,j), and travel times t(i,j). The above method considers the entropy model as the only way of finding the optimal O-D flows for a network and does not consider other optimization algorithms which may be more efficient in 0D flow estimation. 22 2.3.2 Kalman Filtering Algorithm 2.3.2.1 Estimating Time Varying O—D Trip Matrices From Traffic Counts [Okutani][7],[12,13] This is an extension to a larger network of the Kalman filter formulation for estimating O-D flows at an intersection introduced by Cremer and Keller [3,4]. The model is formulated for a network with n O-D pairs and M links. Let f(i,t) denote the ith O-D flow during a time period t and y(j,t) denote the observed traffic counts on link j during the same period t. Let f( t) be an n dimensional vector with elements f(i,t) , then the next period O-D flows can be expressed by the following relation known as the state equation: f(t+1) = 2 Hm(t)f(t-m) + w( t), summed over m=0,..,q (36) where the input, or plant, noise w(t) is a zero-mean white-noise process, with covariance E! w( t) } = 0 cov {w(t) ,w(s) I = R515 5ts = 1 for t=s 6ts = 0 for tats Setting q (the delay time) equal to zero and H0( t) equal to an identity matrix, equation (36) can be rewritten as f(t+1) = f(t) +w(t) The link volume is expressed as a fraction of O-D flows f( t) given by the linear algebraic relation, known as the measurement equation 23 y( t) =2 P(m,t) . f(t-m) + 'v( t), summed over m= ,..,p (37) where the measurement noise v(t) is a zero-mean white noise process, with 13100)) = 0 covaa),v(s)) = R155 815 = 1 or t=s ts = 0 for tats where y( t) = (y1(t),y2(t)....,yM(t)) In equation (37) P(m,t) is the proportion of O-D f(t-m) flows which use link j during period t. This is an (M x n) matrix. p represents delay. Some transformation of equations (36) and (37) are necessary to show delay in O-D flows. Therefore, O-D flows are redefined as f(m,i,t) = f(i,(t-m)) (i=1,2,...,n, m=0,1,..,r) 1' = max(p,q) Let f(m,t) denote a matrix with elements f(m,i,t), i.e. f(m,t) = ( f1(m,t), f2(m,t),...., fn(m,t))’ where prime denotes a transpose. Let F(t) be the vector of dimension n(r+1) F(t) = (f0(t),f1(t),...,f'(t))’ equations (36) and (37) can be rewritten as F(t+1) = (b(t) P(t) + W(t) Y(t) = P(t)F(t)+ v(t) where 24 ‘ H(t) a...) = j" - - I 0 d>(t) is an n(r+1) x n(r+1) matrix. W( t) = (wt(t), 0) W(t) is an n(r+1) x 1 matrix. pa) = ((p0(t),p1(t),...,pr(t» P(t) is an (M x n(r+1)) matrix. H(t) = (H0(t),H1(t),....,Hr(t)) H( t) is an (n x n(r+1)) matrix. H(m,t) = 0: (n x 11) matrix for m > q P(m,t) = 0 (M x 11) matrix for m > p The estimate of F( t), F( t) can be obtained from the following set of equations 17:04.1): «mafia-1) + K(t)[Y(t) - Panama-1)] (38) K(t) is called the Kalman gain matrix with dimensions ( n(r+1) x M) and is estimated by 1(a) = S(t)Pt(t)[P(t)S(t)Pt(t) + Kl]-1 S( t) is and ( n(r+1) x n(r+1)) estimation error covariance matrix called Apriori variance. S(t) satisfies the following system of matrix difference equations sa) = (t-1)Q(t-1)d>t(t-1) + R (40) Q(t) = [1- K(t)P(t)]S(t) (41) Q(t) is called the A posteriori variance. 25 The initial conditions for HO), and 8(0) have to be selected before using the above results for estimating O-D volumes. The Kalman Filtering methodology was tested using a network with five and eight links. Although the method performed well for a small network, it is still in the early stage. Before definitive conclusions are drawn, extensive experimentation must be carried out. The main computational difficulty of this method is that the inversion of a large matrix is required in the algorithm. Therefore, a way of reducing the dimensionality must be sought along with an efficient algorithm to achieve this inversion when applying the method to a network of practical size [Okutani] [7]. 2.3.3 Generalized Least Square Method [Cascetta][1] The relationship between link flows and OD flows is not static, but varies across time intervals. The dynamic link O-D flow equation is expressed as "0d )1 v(l,h) = X Z p(r,t,l,h) . f(r,t), r=1 t=1 where 00,11) is flow on link 1 during interval h. f(r,t) represents O-D flows for an OD pair r that left the origin during some previous interval t and is on link 1 during h. p(r,t,l,h) is the fraction of O-D flows contributing to the flow on link 1 during interval h. The link O-D flow equation can be expressed in term of path flows through the following equation P(k,t) = d(r, t) . q(k,t) (42) 26 120,11) = a(k,t,l,h) . F(k,t), k=1,...,Kr t=1,..h (43) P(k,t) is the flow between O-D pair r following path k that left the origin during previous intervals t. q(k/t) is the probability of choosing path k conditional on the departure interval t. a(k,t,l,h) is called the link-path incidence fraction which is the fraction of path flow P(k,t) contributing to link flow v(l,h). By combining the two equations (42) and (43) the following relationship is obtained Kr p(r,t,l,h) = 2 (10:111.) . q(k/t) , k=1 Kr denotes a set of all paths connecting O-D pairs r. An estimate for p(r,t,l,h) can be found by obtaining estimates for a(k,t,l,h) and q(k/t) represented by d(k,t,l,h) and fiat/t). q(k/t) can be found by a random utility model such as q(k/t) = prob(C(k,t) + 0(k,t) .<_ C(m,t) + 0(m,t)) V m e Kr where C(k,t) is the cost associated with taking path k, conditional on departure time t C(m,t) is the cost associated with taking path m, m at k, conditional on departure time t 0(k,t) is the difference between the perceived cost and the actual cost of taking path k 22 0(m,t) is the difference between the perceived cost and the actual cost of taking path 171 A path choice model can be obtained by making different assumptions on the joint distribution of error term 9(k,t) . The most popular specifications based on zero variance, Weibull-Gumble and a multivariate normal distribution of residuals are Logit and Probit models. The link - incident fractions a(k,t,l,h) can be found based on the assumption that path flow F(k,t) is uniformly spread over the departure interval of length T and remain equally distributed over the same time length while moving on the network. The fractions a(k,t,l,h) can be estimated by a(k,t,l,j) = 1 - (t(k,t,l)/T - ( j-1)) if (j-1)T < t(k,t,l) < ]T = 1 - a(k,t,l,(j-1)) if (j-2)T < t(k,t,l) < (j-1)T = 0 otherwise t(k,t,l) is the arrival time at the head of the path flow F(k,t) on link 1. Knowledge of average link speed is required to calculate average travel time on each path k, and related arrival time on a link of a path. Since d(k,t,l,h) and 6 (k/t) are approximations of a(k,t,l,j) and q(k/,t), estimated link flows differ from the actual ones by a random error term 11 "ad v(l,h)=2 )3 130,111.). f(r,t) + 50,11), t=1 r=1 in matrix form the above equation can be written as 28 11 V01) = 2 P(t,h). f(t) + 50.), (44) t=1 The system of equations set up this way can be solved using either simultaneous or sequential generalized least square (GLS) estimation models. 2.3.3.1 Simultaneous Estimation This method provides O-D estimates for all intervals in one step. Using GLS as the desired optimization method the system is set up as h so.) = V(h) - ( Z P(t,h). 5(1)), t=1 s(t) is the unknown O-D demand vector. Optimal estimates of O-D flows for all intervals can be computed by setting the first derivative of the quadratic equation (45) with respect to O-D flow estimates 1' 4.“) equal to zero. h (r ‘10.), r *2(n),..., 1' *ha.) ) = min 2‘. «5(1) im'V1tt)(s(t)-i(t))+5'hw15h =1: :1: :1: fl 20,f 220,...1' hZO i=1 (45) f (t) is an O-D matrix obtained through a survey. V(h) is the variance - covariance matrix of sampling errors. W(h) is the variance — covariance matrix of assignment errors. f *1(h), f *2(h) ..... f a"b(h) are optimal estimates of O-D flows obtained by solving the GLS equation (45). 29 2.3.3.2 Sequential Estimation In this case, an OD demand vector for O-D pair r is estimated for a single interval )1 at each step. Equation (45) can be stated in a new form as h to.) = 2 (WM). 1' *(t)) + P(h,h). $01) + 5),, i=1 . f *( t) is the previous interval O-D estimate. The GIS formulation of the model is as follows: h 51. = $0.) . 2: (P(h,t). r *(t)) + P(h,h). s(h) i=1 )1 r *(h) = min 2 «5(1) - i (1)) Wm (s(t) - i (t))+8’hW'1(h)8h r *(n) 20 t=1 The sequential generalized least square estimation method suggested by Cascetta was tested on a motorway which is part of the motorway system located in northern Italy linking Turin and Venice. True O-D volumes were available for comparison. Consistent and significant estimates of the true O—D flows over 15 minute intervals were obtained with reasonable computational requirements. The sequential generalized least square estimation method was not applied to a practical size network [Cascetta] [1]. 30 2.4 Research Objective Cascetta [1] suggests that further research is needed to assess the possibilities of dynamic O—D estimation with dynamic assignment on a more complex network. The network used by Cascetta is a linear network (Brescia-Vicenza-Padova motorway) where there is only one possible path between an origin and destination. The path choice is always the same; and therefore there is no need for dynamic path assignment. The assignment fractions used to test the model were taken to be identity matrices. Dynamic assigrunent matrices were not considered. The objective of the present research is to test the sequential GLS O-D flow estimation suggested by Cascetta for an urban network using time dependent traffic data obtained from count stations on local streets. In this study path choice varies during each time interval, and the link - incident fractions a(k,t,l,h) are time dependent. The methods discussed so far have only been tried on small networks, such as intersections, with synthetic data. None of the models have been tested on a large network with real time traffic data. So far only the approach suggested by Cacsetta [1] was tested on a segment of a highway, with traffic data being collected for several time intervals. In all the methods discussed so far, the shortest path between an OD pair is the only path considered. The issue of multiple path choice has not been addressed or considered. It is not clear from the description of these methods how they would deal with a situation where the system is underdetermined. 31 The general approach for finding O-D estimates in this research is the Least Square Estimation. This is the overall approach used by all the methods described so far that were tested on networks. The method used in this study is new both in its setup and in its implementation. In the setup the model is similar to the Kalman Filtering system of equations. It consists of two sets of equations: The observation equation, and the state equation. The observation equation describes the relationship between link counts and O-D trips. The State equation describes the relationship between the present interval O-D trips and the previous interval O-D trips. Though this setup is similar to the Kalman setup in concept, the State equation is a variation of the original. The approach to finding time dependent O-D estimates in the first stage of the implementation is to find the Ordinary Least Square Estimates (OLS) of O-D trips. These estimates are by definition the best estimates given a particular data set. The OLS, O—D estimates are used in finding both the observation error and the error associated with the State equation. The second stage of the implementation is to find the Generalized Least Square Estimates (GLS) of the OD trips. The O-D estimates obtained through the GLS application are the unbiased minimum variance estimates. The variance incorporated in the GLS process is a function of both the observation errors and the errors associated with the State equation. This process is repeated sequentially for all the intervals in the study period. This implementation of the model is a new approach to finding the best O—D estimates for a network. 32 The contribution of this research is: 0 Introduction of a new model for estimating time dependent O-D trips for a large network. 0 Implementing the model in an urban network using existing traffic data. The study area chosen is the City of Boston. Traffic data used is data that is available for . every 15 minute interval on selected links in the network. Chapter 3 Model Development 3.1 Recursive Generalized Least Square Estimation The objective of this study is to develop a method for estimating time dependent O-D matrices. By time dependent it is meant that if a time period is divided into smaller intervals, the OD matrices vary for each interval. The model development relies on formulating link flow as a fraction of O-D matrices during each time interval plus an error term which reflects the deviation between the estimated link flow and the real link flow. An O-D matrix for an interval is expressed as a linear combination of the OD matrix of the previous interval plus an error term. The model is a linear system consisting of two equations, the link flow equation or measurement equation, and the O-D matrix equation. The objective is to find an estimated O-D matrix such that a quadratic measure is minimized. The estimate which minimizes the quadratic expression will be called the Least Square estimate. The procedure used to minimize the quadratic measure, or the objective function, by adjusting the O-D matrix based on the link flow estimate error variances and the OD estimate error variance is the Generalized Least Square Estimation method. In this section some basic concepts of the recursive GLS algorithm are given. . Start with the assumption that volume on any link 1 during interval j, z (I, j) is a fraction h( r, t, l, j), of O - D volumes x( r, t), leaving the origin during either the 33 34 same or previous interval and are contributing to the flow on link 1 during interval j. In this formulation, r represents O-D pair r, and t represents the departure time from the origin. Therefore, traffic flow on any link 1 during interval j can be expressed by the following equation: "0d f z( 1, j) = 2 X h( r, t, l, j) . x( r, t) + 120, j), (46) r=1 t =1 where "ad is the number of CD pairs, t represents intervals from 1 to interval j . 120, j) is a zero-mean observation error term, E! v(l, j)} = 0 with covariance, cov{v(l, j), 00, k)} = Vv(j)5jk (47) In matrix form, equation (46) can be written as: Z(j) = H(t, j) X(t) + V(j), 1:1,.) (48) Due to the short duration of total travel times from an origin to a destination in the proposed study area, only two analysis periods are considered in this study, the present interval (j), and the previous interval (j-I). Equation (46) is then modified as follows: nod j z(l,j) = 2 2 Mr, t, l,j) . x(r, t) + v(l,j), r=1 t=j-1 (49) 35' or "0d "0d z(l,j) = 2 h(r,j-1, I, j). £(r,j-1) + 2 h(r, j, I,j) x(r,j) + v(l, j) , (50) r =1 r=1 where i (r, j-1) is the estimate of the previous interval O-D volume. Equation (50) can be rewritten as: "0d 20.)) = X h(r,f,l,j).x(r,j) +v(l,j), (51) r=1 where "0d 20.17 = 2(1,j)- 2 h(r,j-1,1,j). £0,131), (52) r=1 the matrix form of equation (51) is written as: 2(1) = H(J'J)X(j) + W!) Z is an (n) x 1) vector, H is an ( n) x nod) matrix, X is an ("ad x1) vector, and V is an (111 x 1) vector. For j 2 2, :2 (j) can be written as 20') = Z(j) - H(j-1, j) 20-1) (53) for j=1 2 (j) is equal to 20'). The O - D volume for interval j can be interpolated from the O - D estimates of interval (j-l) as follows: 36 120-1) = 1(0) + W(j) (54) where J? (j-l) is a vector of estimated O-D volumes for the previous interval; X( j) is the present interval O—D vector; W is a (nod x 1 ) error or noise vector. The noise W( j) is a zero-mean estimation error term, Et W(j)} = o with covariance cov{W(j),W(k)} = mm). (55) . we can adjoin the O- D estimates for interval X (j-l) with 2 (j) to obtain Z*(j) = H*(j) 12(1) + V*(j) (56) where 22(j) =1 20'). 220-1)}1‘ H*(j) = { H(j, j) , I)t V*(j) =1 (V(j), I}, W(j))t The augmented matrices have the following dimensions: 2* is a ((nj+n0d) x1) vector, )2' is a ((n1+nod) x1) matrix, H * is an ((n)+n0d) x(n1+n0d)) matrix. V” is an augmented matrix of the variance-covariance of the observation errors V0 with the variance-covariance of O-D estimate errors Vw. The partitioned form of V" can be written as V*=(Vvl 0V I Vw’ 0w) V0 is a (njxnj) matrix, 0v is an (n jxnod) matrix with zero entries, Vw is an 37 ("0d xnod) matrix, 0W is an ("ad X11 1) matrix of zeros. The augmented matrix, V" is an «111-mod) x(n1+nod)) matrix. The objective is to select an estimate if (j) such that the quadratic measure 102(1)) = 1/2 (2*(j) - H*(j) )2(j))t V*'1 (2*(1') - H*(j) 12(1)) (57) is minimized. Because minimization of I ( J? (j)) is an ordinary deterministic minimization problem, the least square estimate is obtained by setting d (1(Ji(j)))/d (12(1)) = 0 (58) By using equation (57) for if (j) and carrying out the indicated partial differentiation, we obtain d (1026)» + d (12(1)) = H*t(j) V*'1(Z*((j)- H*(j) it)» = 0 (59) The minimum variance unbiased estimate of the OD matrix for interval j is expressed as: 12(1) = (114(1) V*’1H*(j))'1H*t(j) v.4 21(0) (60) A description of how the fraction matrix H, and error variance-covariance matrices V1,, Vw are derived are contained in a later section. Two system of linear equations were defined. The first equation defines a cauSal relationship between link counts and OD trips. The second equation provides a connection between the previous interval O-D trips and the present interval O-D 38 trips. This equation states that the previous interval O-D volumes are equal to the present interval O-D volumes plus some deviation. The causal relationship defined in the second equation is significant since it allows the analyst to compensate for the problem of rank deficiency, by converting the under- determined system into a full rank system. 39 3.2 Evaluation of the Accuracy of the Sequential Generalized Least Square Estimation Procedure The test of the credibility of the results of the GIS model (estimated OD volumes) is to find out how close the estimated link counts are to the observed link counts. Link volumes are estimated by the following eqautions: i nod 2(l,j)= 2 z h(r.t,l.j).£(r.t), t=j—lr=1 or nod nod 2(1,j)= 2 h(r,j-1.l,j). i(r.j-1) + 2 Mr, j, 1, j) 526,1) , (61) r=1 r=l 2 (l,j) is the estimated counts on link 1 during interval j. i( r, t) is the estimated volume for O-D pair r during interval t. h(r, t, l, j) is an assignment matrix, specifying the fraction of rth O-D pair volumes that left the origin during interval t and are on link 1 during interval j . The statistic used to measure the degree of closeness between predicted and observed link counts is the root mean square error (rmse). The rmse measures the degree of disagreement between two series of link flow values (estimated and observed) and it is given for each interval j by: nl rmse(j) = sqrt ({ 2 (z(l,j) - 20.1))21/711) (61) 1:1 ' %rmse (j) = (rmse(j) /2 (j)) * 100 (62) 4o nl z(j)=(1/n)) 2 2(1)) (:1 z(l,j) is the observed counts on link 1 during interval j. Chapter 4 Parameter Estimation 4.1 Introduction To apply the model it is necessary to estimate the value of the model parameters. If z(l, j) denotes traffic counts on any link 1 during an interval j, then z(l, j) can be defined as follows: i nod 20, j) = Z 2 h(r, t, l,j) . x(r, t) + v(l, j), (63) t=lr=1 In the above equation h(r, t, l,j) is the fraction of rth O-D volume that departed the origin during interval t, and is on link 1 during interval j. The procedure for estimating h( r, t, I,j) is adopted from Cascetta[1]. x(r, t) is the rth O-D traffic volume that left the origin during interval t and is contributing to flow on link 1 during interval j. 00, j) is the observation error on link 1 during interval j. To estimate the fractions h( r, t, l,j), link traffic counts must be expressed in terms of path flows. Letting x( k, t) denote the traffic flow following path k that left the origin during period t, x( k, t) can be expressed as follows: x(k, t) = x(r, t) . p(k/t) k e K, (64) where p(k / t) is the probability of choosing path k going between an OD pair r given that the trip left the origin during interval t. The observed link traffic counts can be rewritten in terms of the path flow x( k, t) as follows: 41 42 i Kr 20, j) = Z 2 (a(k, t, I,j) . x(k, t)) + 120, j), (65) t=j—1k=1 where a( k, t, I,j) is the fraction of path flow x( k, t) contributing to link flow z(l, j). The assignment fraction h( r, t, I,j) can be expressed in terms of path-link incidence fractions a(k, t, I,j), and path choice probabilities p( k / t) . Combining equation (63), and equation (64), the assignment fraction h( r, t, I,j) is expressed by the following relationship: . Kr h(r, t, I, j): 2 a(k, t, l, j).p(k/t), (66) k=l Let h (r,t,l,j) denote the estimate of h( r, t, I, j), and a(k, t, I, j), and ('5 (k / t) denote estimates of a(k, t, l, j), and p( k / t) respectively. Equation (66), can be written in terms of these estimates as follows: If the link arrival time t is t=j, then the fraction of traffic on link 1 during interval j that arrived during j is: Kr z a(k, j, I, j). (3 (k /j), (67) =1 28. :14) = k If the link arrival time t is t=j-1 ~ K h(r,j-1,1,j) = 2' d(k)-1, I, j). 6 (k /j-1), (68) k = l . In the next two sections, steps taken to develop estimates of the parameters p(k / t), and a(k, t, I, j) are described. 43 4.2 Calculating the Probability of Choosing a Path p( k / t, r) p(k / t,r) can be estimated by a multinomial logit model. The probability of choosing any path k during any interval t can be expressed by: p(k/ t, r) = Prob ( C(k,t) + q(k,t) < C(m,t)+q(m,t) } (69) where, C(k,t) is the cumulative travel time for path k leaving the origin during. interval t, and q(k,t) is the error term. The use of a multinomial logit model to estimate choice probabilities requires the error terms to have the following properties: 1) The error terms, q(k,t) are independent, and 2) identically distributed. Based on the above assumption, the multinomial logit form [12] of the choice probabilities is expressed as: p(k /t, r) = exp(-u* (tt(r,k,t) + [3" dk)) / 2 exPHl " (tt(r,m,t) + [3" dm)) (70) where u is a scale parameter, and tt(r,k,t) is the total travel time on any path k for an O/ D pair r, given that departure time from the origin is during interval t. The parameter m represents any other reasonable path . A brief explanation is required before variables [3, and dk could be defined. A network is a collection of local streets; the only exceptions are expressways. Since expressways have no traffic signals, it is reasonable to assume that the travel time for those paths 44 connecting origins and destinations that include an expressway should be shorter than the alternative paths. It is not possible to calculate travel time on a path that includes an expressway based on on-line traffic data, since no routine on-line traffic data for this type of roadway is available. To compensate for the lack of on-line data to estimate travel time; each segment of an expressway is assigned a speed. Travel time is then calculated based on the assigned speed and distance. In addition, a bonus is assigned to each . segment to denote savings in time due to the absence of traffic signals. Lets denote, B< 0 to represent a bonus; implying savings in travel time in minutes per mile and dk as the distance of path k on an expressway in miles. If a path does not include the expressway, then B = 0. To use equation (70) to calculate choice probabilities, three parameters need to be estimated, one is the total travel time tt(r,k,t), another is the value of u; and once u is set; then for a given value of u, [3 needs to be estimated . 4.2.1 Procedure For Finding Multiple Paths The first step in OD estimation, whether it is static or dynamic estimation, is to find paths between O-D pairs. In all static O-D estimation procedures, a set of shortest paths are found. In dynamic modeling, it is assumed that there is usually more than one path choice between O-D pairs, and drivers sometimes choose a path other than the shortest path if they anticipate delay. Some models use what is called an all-or-nothing algorithm to find a set of shortest paths and ignore the possibility of multiple paths even though the modellers recognize 45 that in real life all drivers do not choose the shortest path to go from an origin to a destination. As was mentioned before, much of the past research on dynamic O-D estimation has been based on freeway sections including ramps, and intersections that have simple constrained geometry. Both of these conditions have only one possible path between O-D pairs. Therefore in traffic assignment models all trips are loaded onto this one path [Cassetta][1]; and if the network represents an urban area, trips are loaded onto a set of shortest paths, one path per O-D pair. In this study the issue of multiple path choices is dealt with as follows. Normally the most desirable approach in finding multiple paths based on variable travel time is a dynamic assignment model. In the absence of a dynamic assignment algorithm, the paths are found through a static assignment algorithm, and then the path choice probabilities are calculated. These two steps are not simultaneous, but sequential. First, all reasonable paths are found by running a static algorithm, and then the dynamic part (which is the probability of choosing a particular path during a particular interval) is found through a separate algorithm. An equilibrium static assignment program was adopted to find a set of reasonable paths. A maximum of three most likely paths were identified. The first possible path is the shortest path obtained by running a ten iteration equilibrium assignment algorithm of the UROAD program of the UTPS [14] package. To find the second possible path, some of the major links (streets) were deleted. The choice of which links to omit is based on tracking the shortest routes for individual O-D pairs, and familiarity with the area. The equilibrium 46' assignment program is run again to find shortest path given the missing links. To find the third path, the same process was repeated this time with additional links missing from the second path. 4.2.2 Calculation of Path Total Travel Time Calculation of link total travel times requires information on link speed, link - delay, and time of departure from the origin. In developing a dynamic trip table, the relevant data can be obtained by collecting traffic data for a number of days. This information is usually received from probe vehicles transmitting to a monitoring center, [Cremer,Putensen] [5,9] . Based on the sample collected, general traffic characteristics including delay are determined. For signalized arterial links, delay is calculated implicitly within travel time estimation by looking at the variability in the link travel time experienced by individual probe vehicles on an arterial. Total link travel time within a given period is split into 3 components; the travel time prior to reaching the tail of the over saturated queue, the subsequent time required to traverse the over saturated queue, and time to reach the tail of the uniform queue and clear the stop line. The reliability of information from probe vehicles is heavily dependent on the number of probe vehicles, their origin-destination, the path that the probe vehicle is on, and the relationship between the departure time of the probe vehicle with respect to other vehicles. Unless all vehicles are equipped and information is received continuously, the information required to estimate travel times must be derived from sample data. In this study an attempt has been made to extract the required information from data received via inductive loops 47 that are installed on some arterial links in an urban area. This approach is preferred over the other methods of collecting traffic information because traffic data from inductive loops is received daily and can be collected on a daily basis. Calculation of path total travel time for each O-D pair requires two kinds of information; link average speed, and delay per vehicle. The first step is to come up with a reasonable speed for all the links in the network. Delay per vehicle is measured by dividing total delay experienced by all vehicles having to stop during the red light within time interval h by the total number of vehicles on links during the same interval h. 4.2.3 Calculating Link Travel Times Total travel time on a path is calculated by adding link travel times between the origin and the destination. Link travel time is found by dividing distance by link average speed. Since both speed and delay vary from interval to interval, link travel times vary accordingly. tt(r,k,t)= 2 (8(1)) *d(I) )1 (60) + 8(l,t) ) (71) l ek tt(r,k,t) = Total travel time on path k between O-D pair r, for vehicles leaving the origin during interval t. s(l,t) = Speed on link 1, during interval t . d(l) = Distance of link I. 5(l,t) = Delay on link 1 during interval t. 48 Due to short distances between origins and destinations in the study network, almost all O-D pairs can be reached within one 15 minute interval. Therefore, to calculate total travel time during an interval, it was assumed that drivers leave the origin within the same interval. 4.2.2 Estimation of parameter )1 The choice of u is arbitrary, but the assumption that the error terms are independent and identically distributed constrains all error terms to have the same value of u. The fact that each error term has the same value of )1 implies that the variances of the error terms are equal. A suitable value for p. was chosen through an iterative process. Given that the choice probabilities must satisfy the following axioms: 1) 0 .aBua - mum . a 2:9“. paieumsa mum Sn now com «mm 03 3383mm 8v mom 08 N F N 000 3m .33 patewgtsa Km mom 4... ..11....m.£6.mm m nnv now a Em mom >3 - n 328:. 2:38 :5. 335:3 .m> .258 . who - n 959". com mom w _ v 03 p 8 8m coEmEcmm _mau< 67 .386 1% 08— N >3 . o .8225 8.53 x5. 385:3 .9, .388 . who . v 052... paieumsa 0mm 03 m _ v won one new 3385”.— Sum 8m 2w 3m 80 now 383.. 68 323 000— oom 08 00v CON 0 4. . .4. .m .89. 3280 _ w m, m M. 88 D e. . ONO . : . . 82 mg N 0% >3 - m 322:. 8:33 :5. 33:53 .9 330a . who . m 832“. m3 03 omm nofiEcmm 30 N3 00v va 2K 333. 69' 320a >a - a. .8225 2550 x:.. 338.88 .m> 320a . who . m 059“. paieumse $0 30 ONV _ Om SR mow noueEEm :6 Be as mmv 86 he see... 70 Table 2 RMSE and %RMSE for Different Values of the Constant Parameter u - interval 1 of day 2 - Sample Data u RMSE %RMSE 0.3 42 13 0.8 41 13 1 41 12 1.5 43 13 2.0 43 13 3.0 44 13 4.0 45 13 Chapter 6 6.1 Representation of an Urban Area as a Network Having determined that the model gives good results when all factors are known, the next phase of the study is to see if it can be applied to an underdetermined system with several estimated parameters. In order to test the model, the first step was to define an area of interest and to represent the area as a network which is a graph (N ,A) of a set of N points which are known as nodes, and a set A of ordered pairs of these nodes known as directed arcs. Arcs represent the streets. Each arc has properties such as distance, and direction. The next step was to identify the points that traffic emanates from or tends towards such as parking lots, residences, and places of work (in other words traffic collectors/ generators). Traffic collectors/ generators are represented by a set of nodes that are not part of the roadway network . These nodes are called centroids. They are points on the network that represent the approximate location of traffic collectors/ generators in the area. These centroids are either origins or destinations. Traffic collectors/ generators are connected to the network by links with special attributes. Once the network was coded it was necessary to match the location of traffic counters with the corresponding links on the network. This was done using special schematic maps that show the location of detectors on streets, their proximity from intersections, the direction of traffic covered by loops, and the number of lanes covered by these detectors. Coding the location of traffic counters on the network was done by identifying the arcs that represent the 71 corresponding streets on the schematic maps. The inner study area consists of parts of the Beacon Hill, Back Bay, and Kenmore neighborhoods, and Commonwealth Avenue to the Boston University Bridge. The external areas include Cambridge, West end, Financial district, Downtown crossing, Chinatown, Theater district, Prudential / Copley area, Fenway, the edge of Brookline; and Brighton. The area is divided into 34 zones. The zoning scheme was adopted from the 1975 DPW (Department of Public Works) zoning systems. The zones are chosen where there is either a concentration of business activities or residents. Parking lots are chosen as traffic collectors or centroids for each zone. (See Map of Boston). The network representation of the area has 1124 links and 34 centroids. twelve centroids represent the inner study areas, and 22 centroids represent outside areas or externals. There are a total of 77 traffic detectors in this area. The majority of traffic detectors cover only one lane, and are located in one side of a street only. In a few cases, such as Cambridge Street, loops cover both directions, and two lanes on each side. The 77 links that had detectors were identified and the links A- node, and B-node were stored in a data file. Path files for all O-D pairs are listings of A-nodes, and B-nodes of the links on paths that join origins to destinations. The path file was matched with the 77 links file to find out which links on each of the possible paths had detectors. Of the 77 links, a total of 58 links fall on at least one paths that connects O—D pairs. 73 6.2 Data Collection After coding the network and identifying the location of the loops, the next task was to collect data for the analysis. Local street traffic count data were provided by the City of Boston Traffic Control Center. In the Traffic Control Center, traffic controllers monitor traffic by watching link data displayed on computer screens. The location of the link on the network is displayed on a large screen where a ' series of flashing lights display traffic signals. A flashing arrow on a link signals either uninterrupted flow or congestion. A green arrow signals normal flow, and red congested flow. If congestion occurs traffic controllers attempt to correct the problem by adjusting the traffic signal timing. The data is not ordinarily saved since no further analyses are done; but if requested, data can be saved for any particular period, in 15 minute increments. A hard copy of the traffic data was obtained from the Traffic Control Center. Each traffic detector is identified by a number. A set of detectors clustered in one area are identified by their area number. Each report provides two blocks of information; the first block gives traffic statistics per detector such as volume, percent occupancy, average speed, number of stops, delay, historical volume (which is an hourly average based on the previous 5 weeks), and historical occupancy (which is an average based on the previous 5 weeks). The second block of information is a summary statistic for the section. Data for this block is obtained by averaging the detector data in a section. For this study, data were obtained for every 15 minute interval for the period from 6:00 a.m. to 9:00 a.m. for six consecutive days from a Monday to the 74. Monday of the following week not including the weekend. Data were collected between August 3 and August 14, 1992. The data explicitly used in parameter estimation of the recursive Generalized Least Square algorithm are; volume, average speed, and delay. 6.3 Data Description A brief description of each of the items on the report is given below: 0 Sensor number This is a reference number used for locating traffic detectors on maps used by the traffic controllers. These maps have detailed intersection schematics. 0 Traffic volume (VEH) per 15 minute interval. Counts are received and stored during each traffic signal cycle, and then added together to get 15 minute counts. 0 Average speed (MPH) An average speed for a particular link is calculated as follows: An algorithm sets an average vehicle length to be equal to 20 ft. A detector is assumed to have 5 ft x 5 ft dimensions. As the front of a vehicle passes over a loop, 5 ft. is added to the length of a vehicle, and as the back of a vehicle crosses the loop another 5 ft. is added to the length of a vehicle, making a vehicle 30 ft long. While a loop is occupied by a vehicle, it sends a signal to the computer. The duration of the signal is taken to be the time over the 100p. 75 The algorithm then calculates speed by dividing the vehicles length (which is assumed to be a constant 30 ft.) by the duration of the signal. During the green interval of each cycle vehicle speeds are calculated and then divided by the number of vehicles that passed over a loop to obtain an average speed for a link during that cycle. The average speeds for each cycle are calculated and summed, and then divided by the number of cycles in an interval to obtain an average speed for a link for that interval. The section average speed is the arithmetic average of the calculated average speeds of the sensors in a section. - Stops (VEH) This is a derived parameter, assuming that some fraction of vehicles that pass over a sensor during the yellow signal are assumed to stop at the intersection for the red light. 0 Average stopped time delay (min) This is the number of stops per cycle, multiplied by delay per vehicle stopped. This is the average delay experienced by vehicles at the stop line due to the red light. Delay is calculated per cycle summed over the number of cycles in a 15 minute interval, and then averaged. 6.4 Estimating Speed and Delay For All The Links in The Network The network for the study area consists of 1124 links. A total of 77 links equipped with detectors fall within the study area. Of these 77 links, 58 links fell on one or more of the minimum paths that connect origins to destinations. 76 Therefore, assumptions had to be made to estimate speed and delay for the reminder of links in the network. To do this, the network was divided into 5 sections. The division followed the segmentation scheme used by the City of Boston. Using these section boundaries, the following assumptions were made: 0 For each interval, all the links within the boundaries of a section are assigned the average section speed. 0 For each interval, all the links within the boundaries of a section are ' assigned a constant section delay. This delay is calculated as follows: Emu.) 8 = 131—— (77) m ZVUJI) [=1 5 = Section delay 8(1)!) = Delay on link 1 during interval h. Only links that have traffic detectors are considered . v( 1,11) = Volume on link 1 during interval h. 6.5 Adjustment of the Apriori (1987) O-D Matrix The historical trip matrix was derived from the 1987 Census survey data. The O-D volumes were factored to update this table to be more consistent with the observed link counts. The following steps were taken to adjust the apriori O-D table: 77 1) A screen line was picked. In this case due to the particular geometry of the area, and the availability of on-line count data from the streets that cross the screen line Massachusetts avenue was chosen. The direction of traffic chosen was west to east. 2) The apriori (1987) CD, 3 hour volumes for those O-D pairs whose paths cross the screen line were summed. 3) The on-line traffic counts on the links that crossed the screen line were summed for the 12 intervals of day 1. i 4) This volume was then divided by the total observed volumes from the links over the 3 hour period to get the adjustment factor. The apriori O-D matrix was adjusted by the following ratio: 11 12 m ratio: ( 2 x(i)) / ( X 2 (JUN) i=1 h=1 i=1 where x(i) = trips going from origin (0) to destination (D) that cross the screen line 11 = Number of x(i)s. v(j,h) = Observed counts on link j during interval h, on those links on the O-D paths that cross the screen line. m = Number of v(j,h)s. Using the above calculation, the adjustment factor was found to be equal to 1.36. 6.6 Choosing the Parameter u The RMSE values of the OLS run of interval one of day one were calculated for values of 11 between 0.3 and 4.0. Table 3 lists the value of 11 vs. RMSE. The 78 results show that RMSE is not very sensitive to the value of 11. Choosing u =1.5, causes the shortest path to be the prominent path choice; but still assigns some trips to the second and third paths defined for each origin and destination. A value of )1 =1 means that the probability of trips assigned to the minimum path is not significantly higher than an alternate path. As 11 increases, the probability of trips being assigned to the minimum path as opposed to an alternate path increases. The minimum path update algorithm developed by the Bureau of Public Roads (BPR) uses an update value of 1.5 for the exponent of v/ c, which-is somewhat analogous, and supports the choice of u for this study. 6.7 Choosing the Bonus factor B Using a )1 value of 1.5, different values of B from -0.2 to -1.0 were used to obtain the RMSE values of the OLS run of interval one of day one. Table 4 shows the values of B vs. RMSE. Since B is the travel time bonus used to modify the travel time on Storrow Drive (an expressway), the value is constrained by reasonableness. Due to the link lengths, the maximum travel time on any segment of ”Storrow Drive” is 0.5 minutes, therefore a bonus factor B = -0.3 minutes was used. 6.8 Test of Model Responsiveness When Different Weights are Assigned to the Count Data versus Prior O-D Estimates Data To test how sensitive the model is to input data (Count data), a weighting factor a was assigned to the counts and (1- a) was assigned to the prior O-D estimates. The purpose of this exercise was to find out if the link estimates improve by assigning more weight to the counts and less weight to the prior O-D matrix. 79 The test was performed as follows: The inverse counts error variance Were multiplied by or, and the inverse O-D estimates error variances were multiplied by (l-a). The model was tested for several values of a = 0.05, 0.2, 0.6 for one day only, without a prior day O-D estimate. The prior interval O-D estimate used came from the historical" O-D matrix. The measure of how close the estimated counts were to the actual counts was the RMSE values. Table 5 shows the RMSEs for different values of 01. Compare this Table with table 6 which contains the RMSE values of a 12 interval GLS run when equal weights are assigned to the inverse error variance for the counts and the OD estimates. The model produced better estimates of link counts when almost all the weight was assigned to the counts. The implication is that more count data would produce better link count estimates which in turn implied better O-D estimates. Basically, the less underdetermined the system, the better the estimating ability of the model. These results indicate that the model acts rationally, since it produces better estimates of link counts when more weight is assigned to the counts and less to the prior O-D estimates. This weight assigning process though reasonable, could not be adopted to the test network in the study because count data from all links are not available. The obvious deduction from this exercise is that to get better O-D estimates additional traffic count data is needed. However, since these data are not available, and it can be assumed that successive O-D flows are not randomly distributed, but in fact are related to the previous interval flows, a I value of a = l was selected. Therefore, the weights assigned to the counts and the OD estimates are as described earlier derived from observation error 80 variances and OD estimates error variances; and are not artificially augmented by any factor a. 81 Table 3 RMSE for different values of the constant parameter u - interval 1 of day 1 u RMSE 0.3 46 0.8 45 l 44 1.5 43 2 42 3 41 4 40 82 Table 4 RMSE and %RMSE for different values of the bonus factor (B) - interval 1 of day 1 (3 RMSE %RMSE 41.2 63 240 -O.3 64 242 OS 63 243 83 Table 5 RMSEs For Different Values of the Weighting Factor (I. 12 intervals - GLS Run Interval a = 0.05 L a = 0.2 T a = 0.6 RMSE l 20 23 31 2 17 21 29 3 20 23 29 4 27 so 36 5 37 42 50 6 39 44 51 7 43 50 61 8 52 61 74 9 64 75 93 10 69 80 98 ll 74 86 105 12 77 86 105 Table 6 Comparison of Actual Versus Estimated Counts - 12 intervals - GLS Run Interval RMSE 32 31 59 72 95 l 02 l 02 ngaooouomtswm-o Chapter 7 Analysis of the Results 7.1 Introduction The intent of this chapter is to report the results of the application of the model described in chapters 3 and 4 to estimate O-D trips for the study network using the data described in chapter 6. The approach used in the analyses is depicted in the flow chart shown Figure 7. The results of the initial model application were used to identify problems with the data. The data were then adjusted, and the analyses were repeated. 7.2 The OLS Procedure As shown in Figure 7, the model was initially run using the OLS procedure. The OLS procedure will result in the O-D estimate that produces the lowest %RMSE given a particular data set. The OLS procedure is identical to the GLS; except that the error variance of the link counts and the prior O-D estimates are not weighted. The OLS O-D estimation equation is similar to equation (60) of chapter 3 with an the inverse variance matrix V" '1: I, an identity matrix. )2 (j) = (1141;) 111-1141(1))4141t (j) w-1 21:11)) The OLS procedure was performed in a sequential manner for 12 consecutive) intervals for each of 5 days. To check the validity of the OD estimates, these estimates were used to calculate link count estimates for each interval using 85 86 equation 61 of chapter 3. The RMSE and %RMSE statistics measure how close the link estimates were to the actual counts. The RMSEs and %RMSEs are exhibited in Tables 7 through 11. A comparison of RMSE’s and %RMSE’s of the 12 intervals shows little change in RMSE values between day 4 and day 5. The predicted and observed link flows for intervals (1 ,3,6,9,12) of day 5 are shown in Figures 9-13. A review of these charts shows: 1) A significant over-estimation of almost all links in interval 1 (54 of 58). 2) The identification of 3 outlier links (overestimated). 3) Most of the links being underestimated by interval 12 (39 of 58 links). 7.3 The GLS Procedure The GLS procedure attempts to minimize the deviations of the estimated counts from the observed counts while also weighting the magnitude of deviations between the prior O-D estimates and the present O-D estimates. The O-D estimates and the link count estimates of the OLS procedure for the same interval of the same day are used in calculating errors and estimating the variance as was described in section 4.3. The GLS procedure was also run sequentially for 12 intervals of each day for 5 days. The O-D estimates were obtained using equation (60) of section 3.1 (shown here as a reminder) )2 (j) = (11*t (j) V*'1H*(j))'1H*t (j) vr1 21(1)) 87 The link counts were then estimated from the OD estimates for each interval. The link counts are estimated using equation (50) of chapter 3 2(1, j) = 2 h(r, j-l, l, j) . 5H r, j-l) + E h( r, j, l, j) x(r,j) + 00, j) , r=1,..,nod in matrix form the above equation is stated as Z(j) = H(j-l) 12(1) + H(j) 12(1) The measure of the goodness- of --fit of the estimated link counts to the actual link counts was the RMSE and %RMSE statistics . A comparison of RMSE’s and %RMSE’s of the 12 intervals for 5 days are shown in Tables 12 through 16 show little change in the RMSEs and %RMSEs between day 4 and day 5. The predicted and observed link flows for intervals (1 ,3,6,9,12) of day 5 are shown in Figures 14-18. A review of these charts show: 1) Most of the links are overestimated in interval 1, with 3 links identified as outliers. 2) In interval 3 and 6, the majority of the links are underestimated, with both high and low outliers. 3) In interval 9, there is nearly an equal number of links that are over and under estimated. The low outliers are the most significant. 4) In interval 12, there is an equal number of links overestimated and underestimated, with no evidence of convergence between the actual and estimated link counts. 88 7.4 A Revision of the Assumptions At the end of day 5 there is still a considerable deviation between the actual and estimated counts for some of the links. This means that either the number of trips assigned to the links by the assignment matrix is not correct, or there are errors in the actual traffic counts. Since calculation of the assignment matrix is dependent on travel times, which are a function of speed and delay, errors in estimating these parameters could lead to assignment errors. In an attempt to improve the assignment matrix, the assumptions made in deriving average speed and delay were reviewed. The under-estimation of many of the interval 12 day 5 link counts was alarming. One explanation for under-estimation could be a low average speed on some links or in some sections of the network which would lead to the link being assigned low volumes. In checking the raw data obtained from the Traffic Control Center, some links were being recorded with an average speed as low as 5 mph. Since this does not appear reasonable, the following procedure was adopted to increase average speed and reduce average stopped delay. In each section those detectors that showed speeds that were not within the range of the level of service A, B, and C were removed from the calculation of an average speed for the section. The link average stopped delay was also recalculated considering only those , links used to calculate the section average speed. The section average delay was then calculated from equation (77) of chapter 6 89 where the numerator is the sum of delay from detectors that show level of service A, B, or C and the denominator is the sum of the volume of those detectors that fall within these levels of service. A review of the detector data used to represent "actual" counts on the links indicated that in some cases the detectors only covered one lane of a multi-lane street. In these locations, the detector counts were multiplied by the ratio of total lanes on the links to the number of lanes covered by the counter to obtain new link volumes. 7.5 Adjustment Procedure The counts were adjusted and the speed and delay were revised as described above and the OLS procedure was run again for 12 intervals of day 5. Table 17 shows the values of the RMSEs and %RMSEs for day 5. The %RMSE was reduced from 64 to 45 by interval 12, and the distribution around the mean appears to be better with the adjusted data, (Figures 19-23). Observations from these charts are: 1) Almost all links are overestimated in interval 1 (similar to the original OLS results). 9O 2) The trend of most links being overestimated in interval 1 and most links being underestimated in interval 12 is retained. 3) The same three links that were overestimated in the initial OLS runs remain outliers with the adjusted data. The location of these three links and their proximity to a centroid is shown in Figure 8. The over-estimation of these three links constitutes 15 ”/6 of the RMSE values of the adjusted OLS estimates for interval 12. Another indicator of how close the link estimates are to the actual counts is the frequency with which links fall within a specified percent absolute difference between the estimated and the actual counts for the final interval of day 5: 1) 16 Percent of the links fall in interval { 0, 25] 2) 31 Percent of the links fall in interval {25, 50] 3) 27 Percent of the links fall in interval (50,100] 4) 26 Percent of the links fall in interval { >100 } Comparison of the RMSEs and %RMSEs (Table 18 and Table 16) of the GIS runs before and after data adjustment show lower RMSEs and %RMSEs for the unadjusted GLS runs. This is a result of the weighting factors used to calculate successive estimates. In the GLS estimation procedure, the inverse of the error variance ( this includes link count error variance and O-D error variance) act as weighting factors. These weights adjust for the differences between the estimated link counts and the observed counts, and smooth out large variations that might occur between the present interval and the previous interval O-D 91 estimates. Therefore, high weights are assigned to data with a low error variance, and low weights are assigned to data with a high error variance. In the adjusted GLS estimation procedure the weights assigned to the link counts for intervals 4-12 are very small compared to the weights assigned to the link counts in the unadjusted GIS runs. The weights assigned to prior O-D estimates in the adjusted GLS estimation procedure are also smaller than the weights assigned to prior O-D estimates in the unadjusted GLS estimation. The effect of the high volume links (which are generally overestimated) being assigned extremely low weights in the GLS process is an over-estimation of the OD trips which use these links. This results in an increase in the volume assigned to these links and consequently to even larger estimated link flows in the next interval. Comparison between the estimated and actual link counts for the GLS runs using the adjusted data show some improvement in the distribution around the mean (Figures 24-28). Reviewing these figures leads to the following observations: 1) There is approximately an equal number of links overestimated and underestimated in interval 1. 2) There are a number of outliers on both the high and low side in interval 1. 3) By interval 12, most of the links are overestimated, but outliers remain on both the high and low side. 4) The three outliers from the previous runs are no longer identifiable as a special case. 92 7.6 An alternate OLS-GLS approach In an attempt to improve the distribution of link estimates around the mean, a variation of the OLS - GLS procedure was tried. The procedure is as follows: Interval one OLS O-D estimates are calculated. Subsequently interval one GLS O-D estimates are calculated. For the second and each subsequent interval OLS O-D estimation, the prior interval GLS O-D estimates were used as an apriori O-D matrix. A comparison of the RMSEs and %RMSEs of 12 intervals of 5 days are shown in Tables 19-23, both the RMSE and %RMSE values increase towards the later intervals. However, they are much lower than the GLS results shown in Table 18 when the sequential OLS - GLS was employed. The predicted and observed link flows for intervals (1, 3, 6, 9, 12) of day 5 are shown in Figures 29-33. Comparison between the estimated and actual link counts for the alternate GLS runs using the adjusted data show some improvement in the distribution around the mean. Reviewing these figures leads to the following observations: 1) There is approximately an equal number of links overestimated and underestimated in interval 1. 2) There are a number of outliers on both the high and low side in interval 1. 3) By interval 12, most of the links are overestimated, but outliers remain on both the high and low side. 93 7.7 Possible Sources of Differences Between the Estimated and Actual Link Counts There are three outliers that appear in many of the figures showing link counts versus estimated link volumes. In the alternating OLS-GL8 procedure they represent 32% of the RMSE and 54% of the %RMSE terms in interval 12 day 5 results. Two of these links are on the one way street which provides the only access to and from the centroid shown in Figure 8. Therefore, in the estimating process, all paths going from any origin to this particular centroid (destination) must include link 1. Likewise, all paths departing this origin must include link 2. These links are also used by trips from other origins to other destinations. Because path assignment must be continous, trips on a link must continue on the adjacent links on the same paths. Thus overestimating link 2 results in overestimating link 3. The concentration of a significant percent of the error on these three links is thought to be a result of the zone structure. Using centroids with multiple access links to the network should result in fewer outliers, and a lower RMSE. Certain other assumptions or simplifications had to be made in the application of the model structure to the data available. These items are perceived to contribute to the error terms when applying the model to the study area network. First, the selection of links to be eliminated in order to find the second and third alternative path for the trip assignment process was heuristic, and constrained to be on paths that included one of the 58 links where the detectors were located. 94 The use of a probabilistic assigrunent algorithm that would identify the three alternate paths in one run might result in a different set of paths. If the constraint that all paths must include one of the 58 links were relaxed (by virtue of having more detectors), the vehicle trip distribution would be more uniformly distributed over the network, thus reducing the probability of over assignment on the links with detectors. A second parameter that would change the number of trips assigned to various paths in the network is the parameter B. This parameter (as was described in chapter 4) represents savings in travel time (min) for trips using expressway links. Thus B represents the drivers perception of time saved by choosing a route that includes a freeway segment. In this research, the value of B was selected as a maximum reasonable value for the shortest freeway link in the network. In fact, different values of B might apply to links other than freeway links. For example, motorists may perceive that a one way street with good progression would provide a higher average speed than a two way arterial with a high density of traffic signals. The use of B for freeway links, B2 for one-way links and B3 for other arterial links may provide a better representation of driver perception of the disutility of using links on various paths. In addition, Bis considered to be time independent in the model formulation. Thus, the motorist is assigned the same perceived bonus at 6:00 in the morning 95 and in the height of rush hour. A feature that should be considered in future applications of the model would be to use a B(t) formulation, with B varying according to the time of day. Other assumptions or procedural steps used in the application of the model may contribute to the high RMSE and %RMSE values include: 1) Grouping of the link counts and the estimated O-D volumes. Link counts and O-D volumes were categorized into 3 groups of low, medium) and high. Other grouping schemes such as clustering data into groups of 2 or 4 were not considered. 2) Adjustment of the prior O-D estimates. During interval (j), the prior O-D estimate X(d,j-l), (d=day, j=interval) is multiplied by a correction factor. The correction factor is: if X(d-1,j-1) > 0 then X(d-Lj) / X(d-1,j-1) if X(d-1,j-1) = 0 then X(d-1,j) / X(d-1,j-1) = 1.0 3) Problems with the On-line data. Observed link counts. Volume data collected by the detectors in the streets often provides volume data for only one or two lanes out of a multi - lane street. Therefore, the volume data that is a major input data in the model is incomplete. To adjust for lack of data, a correction factor based on the number of lanes was used. 4) Observed link speeds The speed information obtained from traffic counters is an average speed of . vehicles going over detectors during the green phase of traffic signals. This speed does not represents the distribution of a vehicle’s speed while traveling across a link. It only reflects the speed of the vehicles passing over the detectors, 96 and this may vary according to the distance from the counter location to the intersection. 5) Observed link delay Delay values obtained from the detectors represent delay due to stopping at an intersection due to a red traffic signal. No on-line delay information is available for random rnidblock delay on streets. 97 Figure 7 Flow Chart Depicting the overall OLS - GLS Procedure l Data Preparation Calculation of link travel time Calculation of path total travel time Calculation of the Assignment ‘ (.____1 ‘ tnx Interval t=1, day 1 only Apriori 1987 O-D table #1 OLS OLS O-D estimates 1...... traffic counts interval (t) Starting from t>1 ) . Prior interval (t-l) '- O-D estimate Link Estimation Procedure Link counts estimates interval (t) Error Variance O-D error Calculation Counts error variance variance Augmented Inverse error variance <19 Interval t=l, day 1 only requiresApriori 98 1987 O-D table , GLS GLS ‘ O-D estimates interval (t) Link traffic counts interval (t) Starting from t>1requires . , Prior interval (t-l) O-D estimates 1 ) Link Estimation Procedure 1.1 Link counts estimates interval (1) Check the validity of O-D estimates by calculating RMSE and V %RMSE for the estimated link counts =t+1 0“. 411')» 1 l \ ...”...ao . 1 .94.}. . 1.). : L ..M... .hu. eve-511.5%. . . ...e o. _ . o. / .5 £31.. .. 1:0 : .H.._...> ...5 8.3.2.. HI: .. {JR ’8". .03 )1) .7... / ii... \\ ...-:3”. . [Uphill ....1..Biu...fl. ... ..Wur. ll. VmMMW-MWWnflmmflh Ell" tohu1\ .EZmu \. on mask. :28? v . . 7 .... 2030. a .zoJmom / v. .. .. ../_... .3 l W In": =- Eozzmu m ,_ a 2 3:3 wmocwmimmuxx VBmEzmm—LEO 2: .o 5:80.— 9: 956380 .wEBED w 935... 100 Table 7 Comparison of Predicted Versus Observed Link Flows - OLS run - Day 1 Intervals Xobs Xest RMSE %RMSE 1 1536 4184 64 242 2 1485 2995 43 167 3 2205 3592 46 121 4 3055 4660 57 108 5 3613 4985 61 98 6 3904 4990 57 85 7 4532 5658 65 83 8 5648 6391 70 72 9 6495 6958 81 72 10 6878 6903 83 70 11 6920 6930 80 67 12 7070 6688 80 65 101 Table 8 Comparison of Predicted Versus Observed Link Fows - OLS run - Day 2 Intervals Xobs Xest RMSE %RMSE 1 1509 3823 55 212 2 1638 2950 42 148 3 2569 3765 46 103 4 2928 4109 49 95 5 3757 4877 55 86 6 4348 5418 63 83 7 5071 5842 62 71 8 5928 6630 66 65 9 6760 7129 78 67 10 7076 7153 81 66 11 7172 7040 82 66 12 7384 7098 83 65 102 Table 9 Comparison of Predicted Versus Observed Link Flows - OLS run - Day 3 Intervals Xobs Xest RMSE %RMSE 1 1553 3935 58 217 2 1648 3021 45 157 3 2428 3602 44 106 4 3010 4204 50 96 5 3809 4841 60 91 6 4468 5576 63 82 7 5011 6026 65 75 8 S922 6762 73 72 9 6594 7215 80 70 10 7133 7616 84 69 11 7570 7816 88 67 12 7421 7436 85 66 103 Table 1 0 Comparison of Predicted Versus Observed Link Flows - OLS run - Day 4 Intervals Xobs Xest RMSE %RMSE 1 1532 3780 57 215 2 1622 2820 42 150 3 2431 3598 46 109 4 3058 4355 51 98 5 4060 5421 61 86 6 4490 5631 65 84 7 5134 6284 70 79 8 6150 7303 79 74 9 6754 7652 87 75 10 6661 6668 82 71 11 7524 6507 84 65 12 7419 6302 87 68 104 Table 11 Comparison of Predicted Versus Observed Link Flows - OLS run - Day 5 Intervals Xobs Xest RMSE %RMSE l 1521 3788 54 205 2 1616 3055 43 153 3 2378 3485 42 102 4 2872 4128 50 102 5 3805 4890 54 83 6 4157 5132 57 79 7 4882 5986 65 77 8 5616 6248 67 69 9 6374 6973 78 71 10 6654 6705 73 64 11 7245 7422 83 66 12 7331 7249 81 64 105 OS of 03 ON_ 89 .0200 0m 00 ON ill..- ow ow § om— OS Oo— m >00 - _ .282... ...—.500 0.5. 00.08.30 .»> .0200 - c2 30 - o 059“. om— pamwusa 106 8m ow— Oo— 03 ON. .0200 8 . ow om om ll ow Ii om. ov— oo— om. m >00 - n .0202. £5.00 03.00.08.210 9.0200 - c2 30 - 9 059. SN papwusa 107 0mm 8m .03 .0200 on. 8. ..I II Ii m >00 - o .0202. 3.500 0.5. 00.05.30 .»> .0200 - 2 05m: -8— 00— 8m 0mm 8m 09.. palowusa 108 omv 0mm. 8m 0mm .0200 on. 8. m >00 . o .0202. 2.500 0.5. 00.05.00 ....> .0200 - :2 20 - a. 050.“. 00. on. oow 0mm com 0mm 80 Om.V papwgse 169 090. 80 0mm 8m, .0200 8m 8m m >00 - a. .0202. 22:00 2.. 00.02.00 ....> .0200- :2 30 - n. 050.“. 8. 0m. 8m Dom 8m 8m 80 on.0 pawwysa 110 Table 12 Comparison of Predicted Versus Observed Link Flows — GL5 run - Day 1 Intervals Xobs Xest RMSE %RMSE 1 1536 2787 51 194 2 1485 2269 43 166 3 2205 2335 43 113 4 3055 2439 49 93 5 3613 2445 57 91 6 3904 2475 58 85 7 4532 2623 67 86 8 5648 2746 80 82 9 6495 2935 95 85 10 6878 2945 101 85 11 6920 3057 102 85 12 7070 3131 100 82 Comparison of Predicted Versus Observed Link Flows - GLS run - Day 2 111 Table 13 Intervals Xobs Xest RMSE %RMSE 1 1509 2630 47 179 2 1638 2217 42 149 3 2569 2313 40 91 4 2928 2466 42 83 5 3757 2724 49 76 6 4348 2879 57 76 7 5071 3114 62 71 8 5928 3703 68 67 9 6760 4159 79 68 10 7076 4316 83 68 11 7172 4424 86 69 12 7384 4441 88 69 112 Table 14 Comparison of Predicted Versus Observed Link Flows - GL5 run - Day 3 Intervals Xobs Xest RMSE %RMSE 1 1553 2621 47 176 2 1648 2123 39 136 3 2428 2204 34 81 4 3010 2463 38 73 5 3809 2931 46 70 6 4468 3285 51 67 7 5011 3635 56 65 8 5922 4590 59 57 9 6594 5064 67 59 10 7133 5338 74 60 11 7570 5711 80 61 12 7421 5655 79 62 113 Table 15 Comparison of Predicted Versus Observed Link Flows - GL5 run - Day 4 Intervals Xobs Xest RMSE %RMSE 1 1532 2601 48 180 2 1622 1869 33 118 3 2431 2197 33 78 4 3058 2591 37 70 5 4060 3209 47 67 6 4490 3531 51 66 7 5134 3976 58 65 8 6150 5131 63 59 9 6754 5661 71 61 10 6661 5410 70 61 11 7524 5293 81 63 12 7419 5503 82 64 114 Table 16 Comparison of Predicted Versus Observed Link Flows - GLS run - Day 5 Intervals Xobs Xest RMSE %RMSE 1 1521 2769 49 186 2 1616 2020 33 119 3 2378 2195 30 73 4 2872 2583 35 71 5 3805 3659 51 77 6 4157 3933 55 77 7 4882 4423 58 69 8 5616 5512 60 62 9 6374 6387 71 64 10 6654 6967 75 65 11 7245 7558 86 69 12 7331 7662 87 69 115 ow. .0200 oo. ow. ON. 8. 8 ow 0m 0 . . M I O I I H - om - I l I I - I O? I I W _. 8 I I II 1 .. 8 oo. .. .- ON. I 00. 00. I I ow. m >00 - . .0202. £500 2.. 00.02.00 ....> .0200 .. :2 30 .. v. 050.“. 9310011133 116 Co. 00. ON. . CO. .0200 on 00 m >00 - n .0202. «.2300 02.. 00.02.00 .2. .0200 - :2 30 - m. 050.“. ON 00 0w 8. ON. 00. 00. pagownsa 117 own .0200 8m .8m 8m on. 8. om II I I I I III I l .....rl . I I I I I I . .. I I .r I I I I ‘ I I I m >00 .. o .0202. 32:00 2.. 00.02.00 .»> .0200 .. :2 «.0 - o. 050.“. 8. .8. 8m 8m 8m omm pamnsa 118 own 0mm .0200 8m 00. 8. m >00 - o .0202. 02:00 2.. 00.02.00 .0> .0200 - :2 30 .. 2 059. 8. 8m 00m 119 omv .0200 80 can . 8m 0mm 8m 0m. 8. on I I I IF I I I I I 5 I I I I I I I III I + I I I I I ml? 7 I I I I I I III I I I u I I m >00 - a. .0202. 02:00 2.. 00.02.00 .0> .0200- :2 30 - o. 059“. 80 09.» 126 Table 17 Comparison of Predicted Versus Observed Link Flows - OLS run - Adjusted Data - Day 5 Intervals Xobs Xest RMSE %RMSE 1 2917 5735 69 138 2 3109 4879 55 102 3 4580 6052 60 76 4 5483 7037 72 76 5 7331 8838 84 66 6 8052 9480 91 66 7 9452 10821 99 61 8 11072 12184 107 56 9 12213 13524 121 57 10 12844 12907 105 47 11 13949 13781 115 48 12 14271 13895 112 45 121 0mm Dow .0200 om . 8. 0.00 0 >00 00.3.0.0 .. — .0202. 2:300 V...: 00.02.00 .o> .0200- :2 30 .. o— 0.30... 122‘ .0200 80 0mm 8 0mm 8 on. 8. on o I -‘ I. I III WW I. I 5 I I I I I 0.00 n >00 00—3—00 .. n .0202. «2:00 v...... 0208.00 £10200 .. :2 30 .. cw 059.. O 18. on. omw 8m own 9910'” 123 8. .0200 0.00 m >00 000200 - o .0202. 2:00 2.. 00.02.00 ....> .0200 - :2 30 .. .a 050.. .8, §§§ 93100-11193 124 .0200 0.00 m >00 000200 - o .0202. 02:00 2.. 00.02.00 .2, .0200 .5. 30 - «a 050.. -8. §§§§§§§§§ . 93100-31139 125 8> .0200 8m 8w I 0.00 0. >00 000200 .. a. .0202. 02:00 2.. 00.02.00 .0> .0200 - 2.. 30 - mm 050.“. -8. 8w §§§§§§§ WWII“ 126 Table 18 Comparison of Predicted Versus Observed Link Flows - GLS run - Adjusted Data - Day 5 Intervals Xobs Xest RMSE %RMSE 1 2917 2963 49 94 2 3109 2773 38 67 3 4580 3592 55 67 4 5483 5352 75 75 5 7331 7954 123 94 6 8052 8978 141 101 7 9452 11855 165 101 8 11072 18480 284 149 9 12213 23404 396 188 10 12844 26780 479 216 11 13949 29884 543 226 12 14271 32507 630 256 127'- .0200 On. OO— 03 ON. 8. Co 00 00 ON 0 I I I I o I I I I I I “Illlll I I I-IIII ON II I qI I I I “WEIIIIIfi 00 I I I I I I “u 00 I II . .I. 8 8. I I I ON. I I 00. I 00. . . 8. 0.00 m >00 000200 - . .0202. 02:00 2.. 00.02.00 .0> .0200 :2 30 - cm 050.“. Palm-””99 128 .0200 cow 8 on. 8. on 0 ill I I IIIII I E I II I I IIII I I I I I I I III: I I I I I I I I I I II I I I I I I I I I I I I 0.00m >00 000200 - n .0202. 02:00 2.. 00.02.00 .m> .0200 - :2 30 - mm 050.“. 8. on. 8N 0mm pawwusa 129 .0200 0.00 m >00 000:.00 .. 0 .0202. 02:00 2.. 00.02.00 .0> .0200 .. :2 30 - ca 050.“. § § § papwusa § § 130 .0200 8. 0.00 m >00 000:.00 - 0 .0202. 02:00 2.. 00.02.00 .0> .0200 .5. 30 .. R 050.“. 335 0010041130 é 8mm 131 8mm .0200 8 . 8. m >00 00.00.00 - a. .0202. «.2300 v.2. 00.02.00 .m> .0200 .. 30 - mm 0.30.“. 0.00 E Palowllsa 132 Table 19 Comparison of predicted to observed link flows - Alternate GLS run - day 1 Intervals Xobs Xest RMSE %RMSE 1 1536 3517 58 219 2 1485 3556 61 238 3 2205 4145 67 177 4 3055 3838 58 109 5 3613 3959 58 93 6 3904 3732 47 69 7 4532 4441 54 69 8 5648 4945 61 62 9 6495 5766 68 61 10 6878 6201 69 58 11 6920 6549 70 59 12 7070 6739 67 58 133" Table 20 Comparison of Predicted Versus Observed Link Flows Alternate GLS run - day 2 Intervals Xobs Xest RMSE %RMSE 1 1509 3278 52 200 2 1638 3367 58 207 3 2569 3928 56 126 4 2928 3653 49 97 5 3757 4109 47 73 6 4348 4495 46 62 7 5071 5221 48 55 8 5928 5908 54 53 9 6760 7169 66 57 10 7076 8127 79 65 11 7172 8427 87 70 12 7384 8693 86 67 134 Table 21 Comparison of Predicted Versus Observed Link Flows - Alternate GL5 run - day 3 hnuuvak; IXobs tht RLASE QGRIASE 1 1553 3176 51 192 2 1648 3240 56 196 3 2428 3722 51 123 4 3010 3687 47 91 5 3809 4453 50 76 6 4468 5453 75 98 7 5011 5889 78 91 8 5922 5873 94 92 9 6594 6976 68 6O 10 7133 7764 80 65 11 7570 3953 99 76 12 7421 4985 80 62 135 Table 22 Comparison of Predicted Versus Observed Link Flows - Alternate GLS run - day 4 Intervals Xobs Xest RMSE %RMSE 1 1532 2601 47 179 2 1622 2724 51 181 3 2431 3092 45 107 4 3058 3143 38 73 5 4060 3895 43 61 6 4490 4888 49 63 7 5134 4699 70 79 8 6150 4285 67 64 9 6754 6162 67 57 10 6661 6690 72 62 11 7524 7293 109 84 12 7419 9776 203 160 136 Table 23 Comparison of Predicted Versus Observed Link Flows - Alternate GLS run - day 5 Intervals Xobs Xest RMSE %RMSE 1 2917 3066 48 90 2 3109 3647 45 78 3 4580 4521 47 56 4 5483 5936 57 56 5 7331 8508 86 69 6 8052 11005 121 87 7 9452 8125 93 57 8 11072 15775 159 83 9 12213 22161 287 136 10 12844 24096 382 172 11 13949 20149 246 102 12 14271 19732 231 94 om. oo— . 03 ON. .0200 ow ov 137 D 00.0200 - F .0202. 02:00 2.. 00.02.20 .0> .0200 . 2:. 0.0 0.02.0.3 . 0a 050.". 0.00 0 >00 ov oo— on. 00— patamusa 138 .0200 0.00 0 >00 - :0 u. 2.0 00 0. . . 0 n .0202. 02:00 2.. 00.02.00 .0: .0200 .. 2:. 0.0 0.0 ..< 00.0: 0 - paiaumsa 139 .0200 com com. 000 com 8m 8. o 0 .00 00.0200 . m .0202. 02:00 2.. 00.02.00 .0> .0200 . 2:. 01.0 0.02.024 - 5 050.". DGIBLUIISG 140 .0200 oom — coo —. com com 000 8m 0.00 0 .00 00.0200 - 0 .0202. 02:00 2.. 00.02.00 .0> .0200 - 2:. 0.0 0.020.? - an 050.”. paxewusa 141 333 83 009 . 000— 8m 8m 09. m E. 8.3.8 - up .9595 «:53 x5. $38.60 .m> 338a . E: mac 29:22 - an 2:9“. DGIBWIISO Chapter 8 Conclusion This research started with two objectives in mind: 1) Development and validation of a new dynamic O-D estimation method. 2) A determination of model accuracy when applied to an urban area where full data is not available. 8.1 Model Development A new method for estimating origin-destination trips in an urban area based on dynamic traffic count data was proposed. The method is configured as an optimization model consisting of two linear systems of equations. The first equation treats the OD volumes as time-variant variables, which are dependent on the time series of link flows. The second equation expresses a causal relationship between the past and present O-D volumes. This causal relationship is similar to the Kalman State equation, but varies from it in its spatial dependency. In the Kalman filtering setup the State equation defines the future interval O-D volumes as a linear combination of the present interval O-D volumes. In the proposed method, unlike the State equation, the previous interval O-D estimate is defined in terms of the present interval O-D volumes. With this approach additional information can be obtained, which is used to identify the structure and size of the flows in urban road system. 142 143 8.2 Implementation The optimization method used to derive estimates of GB volumes is the Least Square Estimation. This approach is common in all time-dependent O-D estimation models. However, each model uses the LS method in a specific manner. For example, in the Kalman filtering approach the model starts the optimization process with some initial values for the coefficient and variance matrices, applies the LS estimation procedure, updates the coefficients and variances for the next interval and repeats the process. No initial values for the parameters are assumed except the prior O-D trips of interval one of day one. All the parameters needed to calculate the assignment (fraction) matrix are estimated based on specified relationships. The sequential OLS application finds O-D estimates for the 12 intervals of each day of the study period. The link count estimates are then derived from the estimated O-D volumes. The GLS estimates are based on an optimization process that finds an unbiased minimum variance estimate. This is achieved through insertion of weights, which are an inverse function of link error variances, and adjust for the deviations from the observed counts. Prior O—D estimates are adjusted by applying weights which are an inverse function of the O-D error variances to control the magnitude of variations from the previous O-D estimate. TheGLS estimation process is also applied sequentially for 12 intervals. During each subsequent interval, the OD estimates of the previous interval are used as apriori O-D trips. 144 8.3 Model Verification The validity of the mathematical relationships was tested by applying the process to a fully determined artificial network consisting of 7 links and 7 OD pairs. Data such as volume on the links,and link total travel times were obtained through a random number generator process. The true O-D volumes were known. The objective was to replicate O—D volumes by applying the model. The outcome of the model implementation showed close match between estimated O-D volumes and real O-D volumes. The conclusion drawn from implementing the model on synthetic data is that the model is mathematically sound and given enough information can produce a reliable outcome. 8.4 Application The model was applied to a network representing a portion of the city of Boston. In this network, flow on only 58 of 1124 links was known. Travel speed and delay were also only available on these same links, and no "true" O-D data exists. 8.4.1 OLS Runs The OLS was run sequentially for 12 intervals per day for five days. The objective was to minimize the magnitude of the RMSE and %RMSE statistics. Both RMSE and %RMSE stabilized by day five. The distribution of the 145 difference between the estimated link counts and the actual link counts improved significantly over this analysis period. 8.4.2 GLS Runs Similar to the OLS runs the RMSEs and the %RMSEs stabilize after day five. Both RMSEs and %RMSEs remain high. 8.4.3 Alternate OLS-GL5 Runs A comparison of the RMSEs and %RMSEs of 12 intervals of day 5, show the RMSE and %RMSE values increase towards the later intervals. Comparison between the estimated and actual link counts for the alternate GLS runs of day 5 using the adjusted data show some improvement in the distribution around the mean . 8.5 Revision of Assumptions In an attempt to improve the results, assumptions made on speed, delay, and link counts were scrutinized. All three were revised. Speed was generally raised, delay was adjusted accordingly to reflect changes in speed; and counts were raised based on the number of lanes covered by detectors. 146 8.6 Listing of the Assumptions The following assumptions were made in this study. These assumptions will influence the results of the estimating procedure, and should be verified ( or modified) prior to using the model. 1) The value of the constant p: This constant was set equal to 1.5 based on the results of the sensitivity analysis, reasonableness of the (31.5 run RMSEs, and the fact that the minimum path update algorithm developed by the Bureau of Public Roads (BPR) used an update value of 1.5 for the exponent of v/c. This value effects the choice of paths between O-D pairs, and thus link flow. 2) The value of the constant B: This constant was chosen to be equal to -O.3. This choice was based on a reasonable expectation of time savings on the expressway links contained in the network. This value effects the travel time on paths which include an expressway, and thus the path choice and link flow. 3) Speed was calculated based on the average speed of monitored links in a section of the city. This average speed was adjusted by eliminating all the links with what was considered to be unreasonably slow speeds. This speed was then assigned to all the links that fell in that section of the city. This assumption was made because all the individual link speeds were not available. This value effects the travel time on paths passing through the section, and thus route choice and link flow. 4) Average delay was derived from total delay and total volume data for the monitored links within each section of the city. All the links that fell in a 147 particular section were assigned the same delay value. Delay has the same effect as a change in the link speed. The OLS procedure was repeated based on the adjusted data. The distribution of the difference between the estimated link counts and the actual link counts was significantly improved. Both the RMSEs and %RMSEs of the adjusted day 5 GLS estimates are higher than the unadjusted RMSEs and %RMSEs. The error distribution is not significantly better than the unadjusted GLS estimated link count distribution. Overall, this dynamic optimization method is a valid mathematical model. Its applicability to an urban network requires reliable traffic data. When tested on a network with synthetic data, model gives reliable results. When the model was applied to the City of Boston network, where full data were not available, and the parameters had to be estimated from data samples, the model did not give reliable results. REFERENCES 1. 10. 11. Cascetta, E., Inaudi, D., Marquis, G. : Dynamic Estimation of Origin- Destination Matrices Using Traffic Counts, 1991. Cascetta, E., Nguyen, S. : A Unified Framework For Estimating or Updating Origin/ Destination Matrices From Traffic Counts, Transportation Research-B, Vol. 22B, No. 6, pp. 437-455, 1988. Cremer, M., Keller, H. : A New Class of Dynamic Methods For the Identification of Origin-Destination Flows, Transportation Research- B, Vol. 21B, No. 2, pp. 117-132, 1987. Cremer, M., Keller, H. : Dynamic Identification of Flows From Traffic Counts at Complex Intersections. Proc. 8th Int. Symp. on Transp. and Traffic Theory, University of Toranto press, Toranto, Canada. Cremer, M., Putensen, K.: Monitoring of Traffic Density Profiles Using Measurement From Loop Detectors and Vehicle Trip Data. Proc. 11th Int. Symp. on Transp. and Traffic Theory, Yokohama, Japan. Keller, H., Ploss, G. : Real-Time Identification of CD Network Flows From Counts for Urban Traffic Control. Transportation and Traffic Theory (Gartner N. H. and Wilson N. H. M. eds.), Elsevier Science Publishing Co., Inc. Okutani, I. : The Kalman Filtering Approaches in Some Transportation and Traffic Problems, Transportation and Traffic Theory, pp. 1-11, 1987. Ploss, G., Philipps, P., Inaudi, D., Keller, H. : MOTION - A New Traffic Control Concept Based on Real-Time Origin-Destination Information, Transportation and Traffic Theory, 1990. Van Arde, M., Hellinga, B., Yu, L., Rakha, H. : Vehicle Probes as Real- Time ATMS Sources of Dynamic OD and Travel Time Data. CITE ’93 Conference Compendium. Edmonton, Alberta. Van Zuylen, H. j.: Some Improvements in the Estimation of an O-D matrix from Traffic Counts, Proc. of the 8th Int. symp. on Transportation and Traffic Theory, Toronto, 1981. Ben-Akiva, M., Lerman, S. : Discrete Choice Analysis : Theory and Application to TravelDemand. MIT Press, 1985. I48 HIGH A to N STATE UNIV. LIBRARIES 1|“WWW"IWMWWIHIIIHIINWHWWI 31293014107324