u... .. .
.35...
i... . .

: .
.-...u......r.....
:3; 2
an}

.. .x...

. I...
.3 :1.)

1
L4: I)
90.313...
1‘.

.\
m;

....

 

Mugquwhxanm

 

 

TATE N

a... limilﬂ'r‘l’ll‘l’ililmniriﬂii‘imll‘iimiiﬁml

7m 3 1293 01410 7324

 

 

 

This is to certify that the

dissertation entitled

DYNAMIC ESTIMATION OF ORIGIN-DESTINATION
MATRICES FOR AN URBAN NETWORK USING
TRAFFIC COUNTS

presented by

Mahkame Megan Khoshyaran

has been accepted towards fulﬁllment
of the requirements for

Ph.D. Civil Engineering

degree in

 

 

W/z/x/Zw e . 6%».

Major professor

Date October 18, 1995

 

MSU is an Afﬁrmative Action/Equal Opportunity Institution 0- 12771

 

 

LIBRARY -.
Michigan State I
University

 

 

 

PLACE ll RETURN BOX to tomavothb chodtoutftom your record.
TO AVOID FINES Mum on or before date duo.

DATE DUE DATE DUE DATE DUE

 

.- ‘ 1‘ " -.
mini/w ‘3

‘W_g'_“ I

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

DYNAMIC ESTIMATION OF ORIGIN-DESTINATION
MATRICES FOR AN URBAN NETWORK USING
TRAFFIC COUNTS

By

Mahkame Megan Khoshyaran

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Civil and Environmental Engineering

1995

ABSTRACT

DYNAMIC ESTIMATION OF ORIGIN-DESTINATION
MATRICES FOR AN URBAN NETWORK USING
TRAFFIC COUNTS

By

Mahkame Megan Khoshyaran

The purpose of this research is to design a model for estimating a dynamic or
time dependent origin-destination trip table for a large urban area and to test the
model using trafﬁc count data obtained from inductive loops on a sample of
arterial streets. So far the only reliable way to obtain an origin destination trip
table has been through surveys at origins and destinations. These surveys are
expensive and they do not reﬂect changes in traffic ﬂow that occur over time.
The hope of this research is to provide a reliable substitute for the traditional
method of obtaining an O-D trip table. The network chosen to test the model is

downtown Boston.

The method of 0-D ﬂow estimation used in this study is sequential. The
algorithm used is the Generalized Least Square Method. The input data used to
test the model is the on-line traffic information obtained from inductive loops on
a sample of local streets. The study period is from 6:00 am. to 9:00 am. The I
length of this study is 5 weekdays. The 1987 O-D table provided by the Boston

Area Regional Planning Authority is used as the apriori 0-D table for the first

interval of day one.

Local street count data were provided by the City of Boston Traffic Control
Center. The validity of 0-D estimates were checked by examining how closely
the estimated link counts derived from the 0-D estimates match the actual link
counts. Statistics used to measure the degree of closeness are Root Mean Square

Error, and Percent Root Mean Square Error.

The model was validated based on testing using synthetic data where the true O-
D volumes were known. The results of the validation proved the mathematical
integrity of the model and its robustness. The model, when applied to the urban
traffic data, did not produce estimates of link counts which would be suitable for
real time trafﬁc control due to the following causes: 1) partial traffic count
data; 2) insufficient detector coverage; 3) lack of speed and delay information

on all the links in the network; 4) lack of sufficient centroid connectors.

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. William C. Taylor, for his guidance and very
generous support during my years at MSU. Above all, I sincerely appreciate his

friendship.

I am grateful to Professor Moshe Ben-Akiva of the Massachusetts Institute of
Technology for accepting being my dissertation supervisor, and part of my dissertation
committee. I have learned a great deal from him, and his scientific views exerted a
lasting inﬂuence on me. I would like also to thank Professor James Stapleton for his
guidance throughout my research. A particular note of appreciation is due to Professor

Lyles and Professor Maleck for accepting being part of my dissertation committee.

My deepest gratitude goes to my mother Mahin, for her love, patience and support. To

her, this work is dedicated.

iv

Chapter 1
Chapter 2

2.1

2.2
2.2.1
2.2.2
2.2.3
2.2.4
2.3
2.3.1
2.3.1.1
2.3.1.2
2.3.2
2.3.2.1

2.3.3
2.3.3.1
2.3.3.2
2.4

Chapter 3

3.1
3.2

Chapter 4

4.1
4.2
4.2.1
4.2.2
4.2.3
4.2.4
4.3
4.4
4.4.1
4.4.2

TABLE OF CONTENTS

Introduction ............................................................................................. 1
Literature Review .................................................................................... 4
Dynamic O-D Estimation Models. ........................................................ 5
Intersection Flow Models ........................................................................ 5
Method of cross-correlation matrices ............................................... 6
Method of constrained optimization ............................................... 10
Method of Recursive Estimation ..................................................... 12
Estimation by Kalman Filtering. ..................................................... 13
Complex Networks ................................................................................ 16
Linked Static - Dynamic Correlation Method. ............................... 17
Calculation of the Coefficient of Correlation w(i,j) .................. 19
0-D Flow Estimation For a Complex Network ........................ 20
Kalman Filtering Algorithm. ........................................................... 22
Estimating Time Varying O-D Trip Matrices From
Traffic Counts .............................................................................. 22
Generalized Least Square Method .................................................. 25
Simultaneous Estimation. ........................................................... 28
Sequential Estimation ................................................................. 29
Research Objective ............................................................................ 30
Model Development ............................................................................... 33
Recursive Generalized Least Square Estimation ................................. 33
Evaluation of the Accuracy of the Sequential Generalized
Least Square Estimation Procedure. ..................................................... 39
Parameter Estimation ............................................................................. 41
Introduction ............................................................................................ 41
Calculating the Probability of Choosing a Path p(k/ t,r) ..................... 43
Procedure For Finding Multiple Paths ........................................... 44
Calculation of Path Total Travel Time ............................................ 46
Calculating Link Travel Times ........................................................ 47
Estimation of parameter p ................................................................ 48
Procedure For Finding Link-Path Incidence Fraction on (k,t,l,j) ..... 51
Calculation of Error Variances .............................................................. 55
Formulation of The Error Variance Calculation Procedure .......... 56
Calculation of Error Variances For The Estimated Origin-
Destination Volumes ........................................................................ 59

Chapter 5 Model Validation .................................................................................... 60
5.1 Introduction ............................................................................................ 60
5.2 Analysis of The Results. ......................................................................... 61
5.3 Sensitivity Analysis ................................................................................ 61
Chapter 6 Data .......................................................................................................... 71
6.1 Representation of an Urban Area asNetwork ...................................... 71
6.2 Data Collection ........................................................................................ 73
6.3 Data Description ..................................................................................... 74
6.4 Estimating Speed and Delay For All the Links in the Network ......... 75
6.5 Adjustment of the Apriori (1987) 0-0 Matrix. .................................... 76
6.6 Choosing the Parameter u ...................................................................... 77
6.7 Choosing the Bonus Factor B ............................................................... .78
6.8 Test of Model Responsiveness When Different Weights are

Assigned to the Count Data versus Prior O-D Estimates Data .......... 78
Chapter 7 Analysis of the Results ........................................................................... 85
7.1 Introduction ............................................................................................ 85
7.2 The OLS Procedure ................................................................................. 85
7.3 The GLS Procedure ................................................................................. 86
7.4 Revision of the Assumptions ................................................................. 88
7.5 Adjustment Procedure ........................................................................... 89
7.6 An alternate OLS-GL5 approach ........................................................... 92
7.7 Possible Sources of Differences Between the Estimated and

Actual Link Counts ................................................................................. 93
Chapter 8 Conclusion ............................................................................................. 142
8.1 Model Development ............................................................................. 142
8.2 Implementation ..................................................................................... 143
8.3 Model Verification ................................................................................ 144
8.4 Application ............................................................................................ 144
8.4.1 OLS runs .......................................................................................... 145
8.4.2 GLS Runs ......................................................................................... 145
8.4.3 Alternate OLS-G15 Runs ............................................................... 145
8.5 Revision of Assumptions 145
8.6 Listing of the Assumptions .................................................................. 146
REFERENCES ........................................................................................................... 148

vi

Table 1.

Table 2.

Table 3.

Table 4.

Table 5.

Table 6.

Table 7.

Table 8.

Table 9.

Table 10.

Table 11.

Table 12.

Table 13.

Table 14.

Table 15.

Table 16.

Table 17.

Table 18.

Table 19.

LIST OF TABLES

Comparison of Predicted Versus Observed Link Flows - GLS

runs - day 1 and day 2 - Sample Data ..................................................... 64
RMSE and %RMSE for Different Values of the Constant

Parameter u -interval 1 of day 2 - Sample Data ...................................... 70
RMSE for different values of the constant parameter p -interval

1 of day 1 ................................................................................................... 81
RMSE and %RMSE for different values of the bonus factor B-

interval 1 of day 1 .................................................................................... 2.82
RMSEs For Different Values of the Weighting Factor on - 12

intervals -GLS run ..................................................................................... 83
Comparison of Actual Versus Estimated Link Counts - 12

intervals - GLS run .................................................................................... 84
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Day 1 .............................................................................................. 100
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Day 2 .............................................................................................. 101
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Day 3 .............................................................................................. 102
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Day 4 .............................................................................................. 103
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Day 5 .............................................................................................. 104
Comparison of Predicted Versus Observed Link Flows - GLS

Run - Day 1 .............................................................................................. 110
Comparison of Predicted Versus Observed Link Flows - GLS

Run - Day 2 .............................................................................................. 111
Comparison of Predicted Versus Observed Link Flows - GL5

Run - Day 3 .............................................................................................. 112
Comparison of Predicted Versus Observed Link Flows - GLS

Run - Day 4 .............................................................................................. 113
Comparison of Predicted Versus Observed Link Flows - GLS

Runs - Day 5 ............................................................................................. 114
Comparison of Predicted Versus Observed Link Flows - OLS

Run - Adjusted Data - Day 5 ................................................................ 120
Comparison of Predicted Versus Observed Link Flows - GLS

Run - Adjusted Data - Day 5 .................................................................. 126
Comparison of predicted to observed link flows - Alternate

GLS run - day 1 ....................................................................................... 132

vii

Table 20. Comparison of Predicted Versus Observed Link Flows -

Alternate GLS Run - day2 ...................................................................... 133
Table 21. Comparison of Predicted Versus Observed Link Flows -

Alternate GLS Runs -day 3 ..................................................................... 134
Table 22. Comparison of Predicted Versus Observed Link Flows -

Alternate GLS Runs -day 4 ..................................................................... 135
Table 23. Comparison of Predicted Versus Observed Link Flows -

Alternate GLS Runs - day 5 .................................................................... 136

viii

LIST OF FIGURES

Figure 1. Drawing, Depicting the Synthetic Network .......................................... 63
Figure 2. GIS run - actual vs. estimated link counts interval 1 - day 2 ................ 65
Figure 3. GIS run - actual vs. estimated link counts interval 3 - day 2 ................ 66
Figure 4. GLS run — actual vs. estimated link counts interval 6 - day 2 ................ 67
Figure 5. GIS run - actual vs. estimated link counts interval 9 - day 2 ................ 68
Figure 6. GIS run - actual vs. estimated link counts interval 12 - day 2 .............. 69
Figure 7. Flow Chart Depicting the Overall OLS - GLS Procedure ...................... 97
Figure 8. Drawing, Depicting the Location of the Over-Estimated Access / Egress
Links to a Centroid 98
Figure 9. OLS run - actual vs. estimated link counts interval 1 - day 5 ............ 105
Figure 10. OLS run - actual vs. estimated link counts interval 3 - day 5 ............ 106
Figure 11. OLS run - actual vs. estimated link counts interval 6 - day 5 ............ 107
Figure 12. OLS run - actual vs. estimated link counts interval 9 - day 5 ............ 108
Figure 13. OLS run - actual vs. estimated link counts interval 12 - day 5 .......... 109
Figure 14. GLS run - actual vs. estimated link counts interval 1 - day 5 ............ 114
Figure 15. GIS run - actual vs. estimated link counts interval 3 - day 5 ............ 115
Figure 16. GLS run - actual vs. estimated link counts interval 6 - day 5 ............ 116
Figure 17. GLS run - actual vs. estimated link counts interval 9 - day 5 ............ 117
Figure 18. GL5 run - actual vs. estimated link counts interval 12 - day 5 .......... 118
Figure 19. OLS run - actual vs. estimated link counts interval 1 -

adjusted day 5 data .................................................................................................. 121
Figure 20. OLS run - actual vs. estimated link counts interval 3 -

adjusted day 5 data .................................................................................................. 122
Figure 21. OLS run - actual vs. estimated link counts interval 6 -

adjusted day 5 data .................................................................................................. 123
Figure 22. OLS run - actual vs. estimated link counts interval 9 -

adjusted day 5 data. ................................................................................................ 124
Figure 23. OLS run - actual vs. estimated link counts interval 12 -

adjusted day 5 data .................................................................................................. 125
Figure 24. GLS run - actual vs. estimated link counts interval 1 -

adjusted day 5 data .................................................................................................. 127
Figure 25. GLS run - actual vs. estimated link counts interval 3 -

adjusted day 5 data .................................................................................................. 128
Figure 26. GLS run - actual vs. estimated link counts interval 6 -

adjusted day 5 data .................................................................................................. 129
Figure 27. GLS run - actual vs. estimated link counts interval 9 -

adjusted day 5 data 130
Figure 28. GIS run - actual vs. estimated link counts interval 12 -

adjusted day 5 data .................................................................................................. 131
Figure 29. Alternate GLS run - actual vs. estimated link counts interval 1 -
adjusted day 5 data .................................................................................................. 137

ix

Figure 30. Alternate GLS run - actual vs. estimated link counts interval 3 -

adjusted day 5 data .................................................................................................. 138
Figure 31. Alternate GLS run - actual vs. estimated link counts interval 6 -
adjusted day 5 data .................................................................................................. 139
Figure 32. Alternate GLS run - actual vs. estimated link counts interval 9 -
adjusted day 5 data .................................................................................................. 140
Figure 33. Alternate GLS run - actual vs. estimated link counts interval 12 -
adjusted day 5 data .................................................................................................. 141

Chapter 1

Introduction

The purpose of this research is to design a model for estimating a dynamic or
time dependent origin-destination trip table for a large urban area and to test the
model using traffic count data obtained from inductive loops on a sample of the
arterial streets in the network. So far the only reliable way to obtain an origin
destination trip table has been through surveys at origins and destinations.
These surveys are expensive and they do not reﬂect changes in traffic ﬂow that
occur over time. The hope of this research is to provide a reliable substitute to

the traditional method of obtaining an OD trip table.

One potential use for time dependent O-D data is in controlling traffic ﬂow in
freeway corridors. Ramp metering strategies need O-D data to determine the
optimum on-ramp diversion rates to parallel arterials. Likewise, incident
management strategies which divert traffic from freeways to parallel service
roads should consider the prevailing corridor O-D pattern. The need for

0D data is especially acute in freeways with express / collector and transfer
lanes, and in metropolitan-wide control systems which consider freeway to
freeway diversions during peak periods. Dynamic O-D estimates are required
for the analysis of time-varying dynamics of freeway congestion growth and

decay.

The use of time dependent O-D data is not limited to freeway traffic analysis.
An advantage of estimating dynamic 0D data for a large urban area is for use

in on-line control of traffic signals. The new generation of traffic signal models,

1

such as the IVHS model "MOTION" for traffic dependent signal control, rely on
0-D ﬂows for the determination of ﬂow profiles, as well as the incremental
optimization to adapt signal settings to changing traffic conditions. Control
systems in use today can not adapt to the near term projected traffic conditions
in the network, because they have no information about the existing O—D ﬂows.
These models, currently depend on historical data, if the system includes any
forward looking capability. To develop more effective strategies for better on-
line signal control it is necessary to improve the existing models for estimating
O-D ﬂows.

The ultimate goal of future looking traffic adaptive control systems is to predict
variations of ﬂows in a network so that traffic signals can react accordingly. The
conventional way of collecting traffic data by traffic surveys is not suitable for
this purpose. There have been a considerable number of studies conducted to
determine if an O-D matrix could be derived from traffic counts, because these
data can be collected and processed automatically [Cremer,Keller, Okutani,

Casetta] [3,4,7,1].

The need for reliable time-dependent O-D data for traffic analysis in a
metropolitan area has been identified by many European researchers as well.
The European research projects DRIVE and PROMETHEUS are designed to
collect more and better data about real time traffic ﬂow. MOTION [8] is an
on-line signal control system of the DRIVE project. MOTION has been
developed by using the experience gained from the German off-line signal

control model SIGMA and existing on-Iine systems in Italy.

This model determines: (1) split as a function of traffic volume and turning
movements at the intersections, (2) cycle time according to the traffic volume at
critical intersections, (3) number of phases, and on a future development level,
phase sequence as a function of the minimum sum of intergreen times and the
time-space-diagrams and (4) offsets as a multi-fimctional result of origin and
destination ﬂows in the network, traffic volumes and an impact analysis of
traffic and environmental criteria. Different modules of this system include O-D
based estimates of ﬂow at the intersections, and a dynamic optimization plan

based on network O-D ﬂow estimates.

The reference to signal timing adjustment is to a traffic adaptive signal timing
arrangement, not to traditional actuated operation of traffic signals. Under
advanced adaptive signal control strategies, adjustments are made in response to
0-D demand profiles in a network for various time slices. This allows the
system to select the most appropriate signal plan from a predeveloped library.
Individual cycle phases are then adjusted based on the volume of traffic on

individual streets during a cycle.

Chapter 2
2. Literature Review

The subject of dynamic origin destination trip table estimation using on-line
traffic count data has been investigated by many researchers. So far the main
area of concentration has been in estimating time dependent O-D trips for a
section of a freeway with few entering and exit ramps, or an intersection with.
(m>1 )entries and (n>1) exits [Cremer, Keller, Cascetta] [3,1]. Until now, only
limited studies have been done on estimating time dependent O-D trips for an
urban area , using traffic counts received from detectors on arterial roads. A few
researchers have tested dynamic O-D estimation models on very small
networks; using simulated data [Keller, Ploss, Okutani][6,7]. The networks used
in testing different O—D estimation models have 6 or 7 centroids, and a minimum
of 13 links and maximum of 40 links [Okutani][7] . These dynamic O-D
estimation models produce good O—D estimates under controlled conditions. To
validate the dynamic O-D matrix, comparisons have been made with data

obtained either from a sample or through simulation.

The idea in dynamic 0D trip table calculations is to estimate time dependent
variations in 0-D patterns. The assumption is that the 0-D volumes are not
constant, but rather vary between successive time periods. In estimating O-D
volumes for each interval, no matter what kind of modeling approach is used,
information such as departure time from the origin, route taken to go from the
origin to the destination, total travel time from an origin to a destination given a
particular route, and arrival time on various links of a route are needed. The

following paragraphs describe some of the issues that need to be considered in

modeling O-D travel. In addition several O-D modeling approaches will be

reviewed.

2.1 Dynamic O-D estimation models

Several dynamic O-D estimation methods have been tried by researchers. In the
early stages of dynamic 0D estimation, the type of geometry considered was an
intersection. The parameter estimation was limited to finding the fraction of the
volume that entered through entrance i, which exits from exit j. Later, freeway
segments with a few entering and exit ramps, and small networks were
considered. The following are the most significant algorithms originally used in
dynamic O-D estimation for an intersection [Cremer, Keller] [2,3], and later
extended for use in estimating time-varying O-D matrices for networks with few
intersections: Constrained optimization method, Ordinary least square
estimation, Recursive estimation, and estimation by Kalman filtering. The
following are the most significant algorithms used in dynamic O-D estimation
for complex networks such as freeways with entering and exit ramps: Linked
static -dynamic estimation method [Keller, Ploss] [6], estimation by Kalman
filtering[Okutani][7], Generalized Least Square Estimation[Cascetta][1]. The
method of Recursive Generalized Least Square Estimation is the method used in
this study to estimate O-D trip for an urban area. Therefore a thorough

description of this method is provided in a later section.

2.2 Intersection Flow Models

In many cases traffic ﬂows through a complex intersection can not be measured

directly, and therefore must be estimated from vehicle counts at the entries and

exits of the intersection. The following methods were used in estimating the
magnitude of traffic ﬂows through complex intersections by using vehicle counts
at the entries and exits of the intersection . Furthermore, it was shown that by
using the time variant quality of volume counts a unique solution to the problem

of estimating unknown ﬂows was obtained.

2.2.1 Method of cross-correlation matrices [Cremer, Keller][2,3]

This is a least square estimation method, and was used in estimating O—D ﬂows
for an intersection with (m>1) entries, and (n>1) exits. A period H was chosen.
This period was then divided into k intervals of length T each. A sample of
observations was taken for each period. During each sampling period, the
volume q(i,k) that entered the entrance (i) during (k-1) T St < kT was observed.
Volume which left exit j, W) k) during (k-I) T + t S t < kT + ‘C , 1 >0 was also
observed. The objective was to estimate the part of volume q(i,k),f(i,j,k) which
left the intersection through exitj during (k-I) T + t S t < kT + t. where t is the
average travel time a vehicle needs to pass from an entry to an exit. f(i,j,k) is

formulated as:

f(i.j,k) = WM. 120', j, k) (1)

Mi, 1', k) is the fraction of q(i,k) that go from i to j. The following assumptions
hold for b( i, j, k)

0.<.b(i, j, k) 31 for all i,j,k (2)
fl

2: b(i, j, k) = 1 for all i,k, (3)
i=1

b(i, i, k) = 0 for i=1,...,min(m,n) (4)

Balancing the departing and entering vehicles, for each time interval yields the

following relationship
m
yfj,k) = X f(i,j,k) (5)
.=1

since each O-D ﬂow is a certain portion of the ith entering ﬂow, f(i,j,k) can be

replaced by its equivalent in equation (1). The following relationship then holds
m

y(j,k) = 2 q(i,k). b(i, j, k) (6)
i=1

The matrix form of equation (6) is written as:

Y'(k) = (1'00 - WC) (7)

where y’(k) is a (1 x n) row vector with elements y(j,k), q’(k) is a (1 x m) row

vector with elements q(i,k) and B(k) is an (m x n) matrix with elements b( i, j, k).

Since q’(k) and y’(k) are obtained through sampling, the only parameter that
needs to be estimated is B(k). To calculate B(k) first, the mean values ti, 3’, and

B are calculated.

K
E =(1/K)2 q’(k)
k=1
K
y’ =(1/I<)E y’(k)
k=1
K

”E = 0/10 2: B(k)
k=1

K is the total number of sampling intervals. q’(k) , y’(k), and B(k) can be

expressed as the sum of their mean plus a random deviation
cl’(k) = i + AQ’Uc)
Y'(k) = 'y" + A)"(k) (8)

B(k) = B + AB(k)
inserting this into equation (7) gives

y’(k) = (1'00 - (E + A1300) (9)

Or

7’ + AY’UC) = (E + Acl'(k)) - (E + AB(1‘)) (10)

taking the mean on both sides of equation (10) over a whole observation period,

average exit flows are obtained as
K

y’ = 'q'. E + UK 2 Aq’(k). AB(k) (11)
k=1

subtracting equation (10) from equation (11) and substituting the summation

index k with l the following relation is obtained

K
Ay’(k) = Aq’(k) . '1? + {q’(k) . AB(k) - 1/1< 2 Aq’(l). AB(I) (12)
l=1

assuming AB(k)=0 causes the second term in equation (12) to vanish resulting in

the following

A3"(k) = AQ’UC) - E (13)

introducing the following abbreviations

Q = [Aq'(1).Aq'(2)...... Aq'aol'
(14)

y = [Arm ,Ay(2) . ...... . Ay’(1<)]’

prime denotes transpose of a vector. Partitioning the observation period into '
subsequent time intervals and using time sequence of traffic counts provides
additional information which solves the problem of an underdetermined system

of equations and leads to a unique solution.

The least square error solution to equation (13) is given by

E =(Q'Q)'1. 0'. Y (15)

where
Q Q = Z Aq(k)°Aql(k) = Kq’qq
Q . Y = 2 Aq(k). Ay’(k) = K. ¢qy
where ((1% and ¢qy are the cross-correlation matrices correlating the sequences of

Aq(k) with Ay(k), respectively over the observation period k=1,..,K. With this

notation equation (15) can be rewritten as

A _ -I
B ' ‘l’ 44%

B is the least square solution of B.

10

The above estimation procedure was tested using synthetic and real data from
different studies. A 5—leg intersection was considered for testing this model.
The synthetic entry flows q(i,k) were generated as integer random numbers. For
the deviations AM i, j, k) random numbers from a normal distribution with zero
mean and a standard deviation of 0.05 were taken first. Then these deviations
were updated such that conditions (3), and (4) were preserved, and gave realistic
integer ﬂows ﬂ i,j,k) . Comparison of the OD table of synthetic data with those
from the dynamic estimation procedure showed a close match [Cremer, Keller]

[3,4].

2.2.2 Method of constrained optimization [Cremer, Keller] [3,4]

Here the estimation problem described above is reformulated as a constrained
optimization problem. Given a sample of entries and exits over K intervals, an
estimate for B , B is obtained such that the squared error between the observed

exit flow, and estimated exit ﬂow is minimized.

K
I=min(1/KZIly(k)-9(k)l|2) (16)
k=1
where
m
y(i,k) = 2 q(i,k) . b(i,j,k) for all j,k
i=1
0 _<. b(i,j,k) $1 for all i,j,k
n
2 b(i,j,k) =1 for all i,k
i=1

b(i,j,k) = 0 for i=1,...,min(m,n)

11

91k) =q'<k).ﬁ (17)

where y'(k) is a (1 x n ) vector of observed exit ﬂows, and y’(k) is the (1 x n)
vector of measured flows. q'(k) is a (1 x m ) vector of volumes that enter through
entrance i. B is an

(m x n) matrix of estimate for the unknown matrix B. The optimal solution for
B can be computed within the admissible parameter space. The solution is

found as follows by substituting for ya.) in equation (16)

K

I =(<1/I<’) 2 y'(k> . sac) - 2 0'06) .1? + q'(k) 1‘3 ﬁnqtk» (18>
=1

the optimal solution for B can be obtained by setting the first derivative of J

with respect to B equal to zero.

K K
I= -2/I< ( 2 q(k).y'(k> ) + 2/I< (2 q(k) q'(k» . 13* = o (19)
k=1 k=1

Here 13* represents the optimal solution under the assumption that the solution
lies inside the admissible parameter space. The optimal solution 13* to equation

(16) is given by
., _ -1
B ‘ ¢ qq°¢qy

¢qq and ¢qy were defined earlier. This method when tried under the same .
geometry and data described for the cross-correlation method gave better O-D

estimates mainly because it uses the information from all equations formulated.

12

2.2.3 Method of Recursive Estimation [Cremer,Keller] [3,4]

Unlike the two methods described before, the predicted exit ﬂow deviations are
computed from the measured entry ﬂow deviations based on the estimation of

the (k-1)th step by a model similar to equation (13) as follows
Ay’(k) =Aq'(k). ink-1) (20)
this prediction then is inserted into the following recursive correction formula;

130.): B(k—1)+ y. K(1 /1< Z Aq(k) . (Ay(k) - 119(k)), summed over k=1,...,K
(21)

where y is a gain factor.

The following assumptions assure the convergence and stability of the above
equation
1) The expected values of Aq(k) and AB(k) are zero. AB(k) is an (m x n) matrix

of random deviations in split parameter B.

E( Aq(k) ) = o
E( AB(k)) = o (22)

2) Covariances are given by an (m x m) matrix
E{Aq(k). Aq’(1)} = (Sq, 0) (23)

S for k: l, and Ofor k¢l

q

13

E( Ab(i, k).Ab’(j, 1)) = (V), 0) (24)

V,- for i=j,and k=l,and 0for i¢jork¢l
E( Aq(k).Ab’(i, 1)) = 0 for all i,k,l
3) the gain parameter 7 should be chosen according to

O< 7 <2/ (m .03)
a; = max ( a Aqi2(k)) (25)

The exit ﬂows are updated based on the estimated entry ﬂows Aq(k). This
procedure was tested under the same geometry and with synthetic data as the
previous models. The results were better than the other two methods above
when the split parameters b( i, j, k) varied significantly from hour to hour (i.e.
when the OD pattern was different at different hours) due to better tracking
abilities.

2.2.4 Estimation by Kalman Filtering [Cremer, Keller] [3,4]

The Kalman filtering technique is another recursive estimation method.

Consider the equation (6)
m
y(i,k) = 23 (103k). b(i, j, k) = q’(k).b(i,k) (26)

i=1

14

b(i,]c) is the jth column of matrix B(k) containing all split parameters which are
related to the jth exit ﬂow. The dynamic nature of the split parameter b(i,k) is

incorporated via the state equation, wherein

b(iJc) = b(i-I. k) + w(k) <27)

w(k) being a random error term with mean zero and known covariance W.

E{ w(k)} = 0

ElAw(k). Aw’(k)} = (w, 0) (28)

W for k=l, and zero for k¢l.

Information about the state variables is given by the measurement of the jth exit
ﬂows which are related to the b(j,k) by equation (26), plus a random
measurement error vector v(k). The relationship between y(i,k) and q(k) is linear

and is specified by the measurement equation

yolk) = q’(k) - b(i, k) + v(k) (29)
with

E{ v(k)} = 0

Etv2(k)} = (11.0) (30)

Kalman filter generates bias-free estimates only if the system model is

completely observable. This implies that

rank[ q(k-K), q(k-K+1), q(k-K+2),....,q(k)] = n (31)

where n is the number of exits, K is the total number of intervals, and k

represents each interval. This condition implies that the matrix Q of

15

equation(14) or equivalently matrix ¢qq has full rank. It is important to note that
choosing the OD ﬂows f(i,j,k) directly as state variables would set up a system
description which is not completely observable and for which the Kalman filter
could not be applied.

The split parameter b(j, k) can be estimated by the following set of equations

d(k) = [vac-1) +Wlq(k)[q’(k)[l’(k-1)+W]q(k)+R]'1 (32)

d(k) is called the filter gain, and P is the estimation error covariance matrix.

Updating the estimation error covariance matrix is obtained from
Mt) = [I - d(k) - Q’(k)][P(k-1)+Wl (33)
b(i, k) is estimated by the following equation

130,1.) = Bonk-1) + d(k) (yank) - q(k)-130.1(4) (34)

For the start of this computation an initial value of the covariance matrix of
estimation errors P(0) must be specified. If the assumptions of the Kalman
filter are fulfilled, the filter equations compute bias-free, minimum variance
estimates. Otherwise, estimates may not necessarily be optimal, but have
minimum variance. The Kalman filtering process when applied on the same
geometry as the above models with the same synthetic data does not produce the
best results. The reason for this is that the assumption of the random variations
as being Gaussian white noise is only partially fulfilled. The results showed

considerable ﬂuctuations trying to follow the real variations.

16

2.3 Complex Networks

Three major algorithms that are used in estimating O-D matrices for complex
networks will be reviewed here. The first algorithm is the Linked Static-
Dynamic Correlation Method developed by Keller and Ploss [6]. This algorithm
is an extension of the model developed by Cremer and Keller [3,4] which was
explained earlier under the section on single intersections or partial networks.
The linked static-dynamic method for the estimation of O-D ﬂows in .

transportation systems from traffic counts is based on the following concept.

The model finds the coefficient of correlation (or weights) between an entry
volume and an exit volume profile. The coefficient of correlation is an indicator
that shows to what extent the variation of one variable is determined by the
variation of another variable. These coefficients are used as weights for the OD
ﬂows, and are linked with an entropy maximization model to find a unique O-D
estimate. This new approach uses all the information that can be used for partial

networks or single intersections to improve the results for the total system.

The Linked Static-Dynamic Correlation Method is the algorithm used in project
"MOTION" for estimating O-D ﬂows. "MOTION" is an on-line signal control
system developed using the German off-line signal control model SIGMA and

existing on-line systems in Italy.

The Kalman filtering algorithm for complex networks was developed by
Okutani [7]. In this paper, the ideas introduced by Keller and Cremer [3,4] are
extended to a general network. Provisions are made to solve the problem of an

underdetermined system.

17

The third model that will be reviewed is the dynamic estimation of O-D ﬂows
based on time-varying traffic counts and any other available information such as
outdated O-D matrices. Two types of optimization procedures are introduced.
A simultaneous GLS estimation which produces joint estimates of the whole set
of O-D matrices, one for each time interval, and a sequential GIS estimation

which produces a sequence of CD estimates for successive time intervals.

2.3.1 Linked Static-Dynamic Correlation Method [Keller, Ploss] [6]

The relationship between link ﬂows and OD demand is essential for dynamic

O-D estimation. This relationship is expressed by
I I
V0) = X X p(i,j,l) . f(i,j) (35)
i=1 j=1

V(1) represents volume on link 1; p(i,j,l) is the fraction of O-D ﬂows f(i,j) on a
path that includes link I. The system of linear equations based on equation (35)
is highly underdetermined. However, by including the estimated turning ﬂows
at intersections and cordon line counts on the periphery of the system, additional

equations for the determination of the OD ﬂows can be developed.

This addition reduces the degree of under determination of the systems
considerably. As a result, this method has two parts: part one, is estimation of
the O—D ﬂows at intersections using a linked static dynamic correlation model.
This information is then added to part two to decrease the degree of under
determination. Part two is the estimation of the O-D ﬂows in complex traffic

networks with the linked static-dynamic method.

18

PART ONE:
Estimation of the OD ﬂows at intersections in simple networks using a linked

static-dynamic correlation model.

Definitions
I Number of entries in a system
I Number of exits in a system

T Number of intervals with the length At in a time period p

q(i,k) volume passing entry i during interval k, (k=1,..,T, i=1,..,I)
Fiﬁ) = UT 2 q(i,k), summed over k=1,..,T

y(j, k) volume passing exitj during interval k, ( k=1,...,T, j=1,....,])
70') = [If 2 110) k), summed over k=1,...,T

t(i,j) travel time from i to j (integer multiple of At )

f(i,j) partial volume from i to 1' during the observation period p
b(i,j,k) = f( i,j)/q( i) (split parameter), 02 b(i,j,k) $1

The following static conditions have to be satisfied

I
(1) q( i) = X f(i,j) ,
j=1
I
(2) y(j) = X f(i,j),
i=1

J
(3) Z b(i,j,k)= 1,

i=1

19

2.3.1.1 Calculation of the Coefficient of Correlation, w(i,j)

The coefficient of correlation w(i,j) can be obtained from the following equation
which is similar to the correlation coefficient . The correlation coefficient gives a
quantitative measure of how strong the linear relationship is between x and y-

i.e., to what degree x and y are related.

T
w(i,j) = { Z ((q(i,k) - 5(2)) . ( y(j,(k+t(i,j)) - 7(1)) )2 }
k=I
T
+{ 2 ((40310 --q-(i»2. 2t y(j,(k+t(i,j)) - r0»?!
k=1

large variations in w(i,j) can result if the observation period p is small. This can
be avoided by exponential smoothing of w(i,j), using a smoothing factor or as

follows

w(i.j) = a . w( 231', p) + (l-a) . w(ij. p-l)
O-D ﬂows can be expressed by
f(i,j) = w(ij) . Q(z') . Y0)

the unknown values Q( i) and Y( j) can be obtained by
l

Q(i) =q(i) + 2 (YO) . w(i,j)).
i=1

26)

1
Y0) = Y(i) +2 (Q(i) . w(i,j)).

i=1
Using an iterative solution method such as an entropy model, a unique solution
to this system can be found that satisfies the static conditions (1) and (2) exactly
and includes the dynamic weights w(i,j) for the most recent period. This system

of equations are added to equation (34).

PART TWO:
2.3.1.2 O-D Flow Estimation For a Complex Network

The first step in this procedure is to find the shortest path between each O-D pair
that includes link 1. Given information such as entry i, counting site a, and exit j,
and using deterministic or stochastic successive capacity restrained algorithms,
these set of paths can be found. The fraction p(i,j,l) of the total number of
vehicles f(i,j) that take a certain route that includes link I can be determined
given a set of paths. p(i,j,l) is 1 if link 1 falls on a path going from i to j, and 0
otherwise. Travel time from i to j, t(i,j) is the travel time on the shortest path
between i and j. Based on the travel times, the weights w(i,j) are recalculated
and applied to the ﬂows from any entry to any exit at the cordon line of the

system. O-D volumes can be formulated as

f(i,j) .-. w(i,j) . xa» . II xa) . p(i,j,l)
I

p(i,j,l) is equal tol if link I is on the path between i,j and is equal to zero
otherwise.

21

with
I I I I
X(0)={2 Xﬂi,j)}+{2 Xw(i,j)},
i=1 j=1 i=1 j=1

A solution for X(l) can be found iteratively by applying the balancing algorithms
of the entropy model of Van Zuylen [10]

V0) = 22 p(ijJ) . w(i,j) . X(l) . II(X(1) . p(i,j,l))
I

By including the turning ﬂows of intersections inside the observed network that
have already been estimated according to the algorithm in part one, new
equations can be set up which decrease the degree of under determination. The

entropy algorithm should converge reasonably well.

The method suggested by Keller and Ploss [8] has been incorporated in the
system ” MOTION ”, but this model is still in the experim ental stage. Many
improvements can be made by exploring the effects of multiple paths on the
estimation of dynamic weights w(i,j), and travel times t(i,j). The above method
considers the entropy model as the only way of finding the optimal O-D ﬂows
for a network and does not consider other optimization algorithms which may

be more efficient in 0D ﬂow estimation.

22

2.3.2 Kalman Filtering Algorithm

2.3.2.1 Estimating Time Varying O—D Trip Matrices From Trafﬁc
Counts [Okutani][7],[12,13]

This is an extension to a larger network of the Kalman ﬁlter formulation for
estimating O-D ﬂows at an intersection introduced by Cremer and Keller [3,4].
The model is formulated for a network with n O-D pairs and M links. Let f(i,t)
denote the ith O-D ﬂow during a time period t and y(j,t) denote the observed
traffic counts on link j during the same period t. Let f( t) be an n dimensional
vector with elements f(i,t) , then the next period O-D ﬂows can be expressed by

the following relation known as the state equation:

f(t+1) = 2 Hm(t)f(t-m) + w( t), summed over m=0,..,q (36)

where the input, or plant, noise w(t) is a zero-mean white-noise process, with

covariance
E! w( t) } = 0
cov {w(t) ,w(s) I = R515 5ts = 1 for t=s

6ts = 0 for tats

Setting q (the delay time) equal to zero and H0( t) equal to an identity matrix,

equation (36) can be rewritten as

f(t+1) = f(t) +w(t)

The link volume is expressed as a fraction of O-D ﬂows f( t) given by the linear

algebraic relation, known as the measurement equation

23

y( t) =2 P(m,t) . f(t-m) + 'v( t), summed over m= ,..,p
(37)

where the measurement noise v(t) is a zero-mean white noise process, with
13100)) = 0

covaa),v(s)) = R155 815 = 1 or t=s
ts = 0 for tats

where

y( t) = (y1(t),y2(t)....,yM(t))

In equation (37) P(m,t) is the proportion of O-D f(t-m) ﬂows which use link j
during period t. This is an (M x n) matrix. p represents delay. Some
transformation of equations (36) and (37) are necessary to show delay in O-D

ﬂows. Therefore, O-D ﬂows are redefined as

f(m,i,t) = f(i,(t-m)) (i=1,2,...,n, m=0,1,..,r)

1' = max(p,q)

Let f(m,t) denote a matrix with elements f(m,i,t), i.e.

f(m,t) = ( f1(m,t), f2(m,t),...., fn(m,t))’
where prime denotes a transpose. Let F(t) be the vector of dimension n(r+1)

F(t) = (f0(t),f1(t),...,f'(t))’
equations (36) and (37) can be rewritten as

F(t+1) = (b(t) P(t) + W(t)

Y(t) = P(t)F(t)+ v(t)
where

24

‘ H(t)
a...) = j" - -
I 0
d>(t) is an n(r+1) x n(r+1) matrix.
W( t) = (wt(t), 0)
W(t) is an n(r+1) x 1 matrix.
pa) = ((p0(t),p1(t),...,pr(t»
P(t) is an (M x n(r+1)) matrix.
H(t) = (H0(t),H1(t),....,Hr(t))
H( t) is an (n x n(r+1)) matrix.
H(m,t) = 0: (n x 11) matrix for m > q
P(m,t) = 0 (M x 11) matrix for m > p
The estimate of F( t), F( t) can be obtained from the following set of equations
17:04.1): «maﬁa-1) + K(t)[Y(t) - Panama-1)] (38)

K(t) is called the Kalman gain matrix with dimensions ( n(r+1) x M) and is
estimated by

1(a) = S(t)Pt(t)[P(t)S(t)Pt(t) + Kl]-1

S( t) is and ( n(r+1) x n(r+1)) estimation error covariance matrix called Apriori

variance. S(t) satisfies the following system of matrix difference equations
sa) = <I>(t-1)Q(t-1)d>t(t-1) + R (40)

Q(t) = [1- K(t)P(t)]S(t) (41)

Q(t) is called the A posteriori variance.

25

The initial conditions for HO), and 8(0) have to be selected before using the
above results for estimating O-D volumes.

The Kalman Filtering methodology was tested using a network with ﬁve and
eight links. Although the method performed well for a small network, it is still
in the early stage. Before definitive conclusions are drawn, extensive
experimentation must be carried out. The main computational difﬁculty of this
method is that the inversion of a large matrix is required in the algorithm.
Therefore, a way of reducing the dimensionality must be sought along with an
efficient algorithm to achieve this inversion when applying the method to a

network of practical size [Okutani] [7].

2.3.3 Generalized Least Square Method [Cascetta][1]

The relationship between link ﬂows and OD ﬂows is not static, but varies across

time intervals. The dynamic link O-D ﬂow equation is expressed as

"0d )1
v(l,h) = X Z p(r,t,l,h) . f(r,t),
r=1 t=1

where 00,11) is ﬂow on link 1 during interval h. f(r,t) represents O-D ﬂows for an
OD pair r that left the origin during some previous interval t and is on link 1
during h. p(r,t,l,h) is the fraction of O-D ﬂows contributing to the ﬂow on link 1
during interval h. The link O-D ﬂow equation can be expressed in term of path

ﬂows through the following equation

P(k,t) = d(r, t) . q(k,t) (42)

26

120,11) = a(k,t,l,h) . F(k,t), k=1,...,Kr
t=1,..h (43)

P(k,t) is the ﬂow between O-D pair r following path k that left the origin during
previous intervals t. q(k/t) is the probability of choosing path k conditional on
the departure interval t. a(k,t,l,h) is called the link-path incidence fraction which
is the fraction of path ﬂow P(k,t) contributing to link ﬂow v(l,h).

By combining the two equations (42) and (43) the following relationship is

obtained

Kr
p(r,t,l,h) = 2 (10:111.) . q(k/t) ,
k=1

Kr denotes a set of all paths connecting O-D pairs r. An estimate for p(r,t,l,h) can

be found by obtaining estimates for a(k,t,l,h) and q(k/t) represented by d(k,t,l,h)
and ﬁat/t).

q(k/t) can be found by a random utility model such as
q(k/t) = prob(C(k,t) + 0(k,t) .<_ C(m,t) + 0(m,t)) V m e Kr

where

C(k,t) is the cost associated with taking path k, conditional on departure time t
C(m,t) is the cost associated with taking path m, m at k, conditional on departure
time t
0(k,t) is the difference between the perceived cost and the actual cost of taking

path k

22

0(m,t) is the difference between the perceived cost and the actual cost of taking

path 171

A path choice model can be obtained by making different assumptions on the
joint distribution of error term 9(k,t) . The most popular specifications based on
zero variance, Weibull-Gumble and a multivariate normal distribution of

residuals are Logit and Probit models.

The link - incident fractions a(k,t,l,h) can be found based on the assumption that
path ﬂow F(k,t) is uniformly spread over the departure interval of length T and
remain equally distributed over the same time length while moving on the

network. The fractions a(k,t,l,h) can be estimated by

a(k,t,l,j) = 1 - (t(k,t,l)/T - ( j-1)) if (j-1)T < t(k,t,l) < ]T
= 1 - a(k,t,l,(j-1)) if (j-2)T < t(k,t,l) < (j-1)T
= 0 otherwise

t(k,t,l) is the arrival time at the head of the path ﬂow F(k,t) on link 1. Knowledge
of average link speed is required to calculate average travel time on each path k,

and related arrival time on a link of a path.

Since d(k,t,l,h) and 6 (k/t) are approximations of a(k,t,l,j) and q(k/,t), estimated

link ﬂows differ from the actual ones by a random error term

11 "ad
v(l,h)=2 )3 130,111.). f(r,t) + 50,11),
t=1 r=1

in matrix form the above equation can be written as

28

11

V01) = 2 P(t,h). f(t) + 50.), (44)
t=1

The system of equations set up this way can be solved using either simultaneous

or sequential generalized least square (GLS) estimation models.

2.3.3.1 Simultaneous Estimation

This method provides O-D estimates for all intervals in one step. Using GLS as

the desired optimization method the system is set up as

h

so.) = V(h) - ( Z P(t,h). 5(1)),
t=1

s(t) is the unknown O-D demand vector. Optimal estimates of O-D ﬂows for all

intervals can be computed by setting the first derivative of the quadratic
equation (45) with respect to O-D ﬂow estimates 1' 4.“) equal to zero.

h
(r ‘10.), r *2(n),..., 1' *ha.) ) = min 2‘. «5(1) im'V1tt)(s(t)-i(t))+5'hw15h
=1: :1: :1:
fl 20,f 220,...1' hZO i=1 (45)

f (t) is an O-D matrix obtained through a survey. V(h) is the variance -

covariance matrix of sampling errors. W(h) is the variance — covariance matrix
of assignment errors. f *1(h), f *2(h) ..... f a"b(h) are optimal estimates of O-D

ﬂows obtained by solving the GLS equation (45).

29

2.3.3.2 Sequential Estimation

In this case, an OD demand vector for O-D pair r is estimated for a single
interval )1 at each step. Equation (45) can be stated in a new form as
h
to.) = 2 (WM). 1' *(t)) + P(h,h). $01) + 5),,

i=1 .
f *( t) is the previous interval O-D estimate. The GIS formulation of the model is

as follows:
h
51. = $0.) . 2: (P(h,t). r *(t)) + P(h,h). s(h)
i=1
)1
r *(h) = min 2 «5(1) - i (1)) Wm (s(t) - i (t))+8’hW'1(h)8h
r *(n) 20 t=1

The sequential generalized least square estimation method suggested by
Cascetta was tested on a motorway which is part of the motorway system
located in northern Italy linking Turin and Venice. True O-D volumes were
available for comparison. Consistent and significant estimates of the true O—D
ﬂows over 15 minute intervals were obtained with reasonable computational
requirements. The sequential generalized least square estimation method was

not applied to a practical size network [Cascetta] [1].

30

2.4 Research Objective

Cascetta [1] suggests that further research is needed to assess the possibilities of
dynamic O—D estimation with dynamic assignment on a more complex network.
The network used by Cascetta is a linear network (Brescia-Vicenza-Padova
motorway) where there is only one possible path between an origin and
destination. The path choice is always the same; and therefore there is no need

for dynamic path assignment.

The assignment fractions used to test the model were taken to be identity
matrices. Dynamic assigrunent matrices were not considered. The objective of
the present research is to test the sequential GLS O-D ﬂow estimation suggested
by Cascetta for an urban network using time dependent traffic data obtained
from count stations on local streets. In this study path choice varies during each

time interval, and the link - incident fractions a(k,t,l,h) are time dependent.

The methods discussed so far have only been tried on small networks, such as
intersections, with synthetic data. None of the models have been tested on a
large network with real time trafﬁc data. So far only the approach suggested by
Cacsetta [1] was tested on a segment of a highway, with trafﬁc data being

collected for several time intervals.

In all the methods discussed so far, the shortest path between an OD pair is the
only path considered. The issue of multiple path choice has not been addressed
or considered. It is not clear from the description of these methods how they

would deal with a situation where the system is underdetermined.

31

The general approach for ﬁnding O-D estimates in this research is the Least
Square Estimation. This is the overall approach used by all the methods
described so far that were tested on networks. The method used in this study is
new both in its setup and in its implementation. In the setup the model is
similar to the Kalman Filtering system of equations. It consists of two sets of
equations: The observation equation, and the state equation. The observation
equation describes the relationship between link counts and O-D trips. The State
equation describes the relationship between the present interval O-D trips and
the previous interval O-D trips. Though this setup is similar to the Kalman

setup in concept, the State equation is a variation of the original.

The approach to finding time dependent O-D estimates in the first stage of the
implementation is to find the Ordinary Least Square Estimates (OLS) of O-D
trips. These estimates are by definition the best estimates given a particular data
set. The OLS, O—D estimates are used in finding both the observation error and

the error associated with the State equation.

The second stage of the implementation is to ﬁnd the Generalized Least Square
Estimates (GLS) of the OD trips. The O-D estimates obtained through the GLS
application are the unbiased minimum variance estimates. The variance
incorporated in the GLS process is a function of both the observation errors and
the errors associated with the State equation. This process is repeated
sequentially for all the intervals in the study period. This implementation of the

model is a new approach to ﬁnding the best O—D estimates for a network.

32

The contribution of this research is:

0 Introduction of a new model for estimating time

dependent O-D trips for a large network.

0 Implementing the model in an urban network using
existing traffic data. The study area chosen is the City of
Boston. Traffic data used is data that is available for .
every 15 minute interval on selected

links in the network.

Chapter 3

Model Development
3.1 Recursive Generalized Least Square Estimation

The objective of this study is to develop a method for estimating time dependent
O-D matrices. By time dependent it is meant that if a time period is divided into
smaller intervals, the OD matrices vary for each interval. The model
development relies on formulating link ﬂow as a fraction of O-D matrices during
each time interval plus an error term which reﬂects the deviation between the
estimated link ﬂow and the real link ﬂow. An O-D matrix for an interval is
expressed as a linear combination of the OD matrix of the previous interval plus

an error term.

The model is a linear system consisting of two equations, the link ﬂow equation
or measurement equation, and the O-D matrix equation. The objective is to ﬁnd
an estimated O-D matrix such that a quadratic measure is minimized. The
estimate which minimizes the quadratic expression will be called the Least
Square estimate. The procedure used to minimize the quadratic measure, or the
objective function, by adjusting the O-D matrix based on the link ﬂow estimate
error variances and the OD estimate error variance is the Generalized Least

Square Estimation method.
In this section some basic concepts of the recursive GLS algorithm are given. .

Start with the assumption that volume on any link 1 during interval j, z (I, j) is a

fraction h( r, t, l, j), of O - D volumes x( r, t), leaving the origin during either the

33

34

same or previous interval and are contributing to the ﬂow on link 1 during
interval j. In this formulation, r represents O-D pair r, and t represents the
departure time from the origin. Therefore, trafﬁc ﬂow on any link 1 during

interval j can be expressed by the following equation:

"0d f
z( 1, j) = 2 X h( r, t, l, j) . x( r, t) + 120, j), (46)
r=1 t =1

where "ad is the number of CD pairs, t represents intervals from 1 to interval j .

120, j) is a zero-mean observation error term,
E! v(l, j)} = 0

with covariance,

cov{v(l, j), 00, k)} = Vv(j)5jk (47)
In matrix form, equation (46) can be written as:

Z(j) = H(t, j) X(t) + V(j), 1:1,.) (48)

Due to the short duration of total travel times from an origin to a destination in
the proposed study area, only two analysis periods are considered in this study,
the present interval (j), and the previous interval (j-I). Equation (46) is then

modified as follows:

nod j
z(l,j) = 2 2 Mr, t, l,j) . x(r, t) + v(l,j),
r=1 t=j-1 (49)

35'

or
"0d "0d
z(l,j) = 2 h(r,j-1, I, j). £(r,j-1) + 2 h(r, j, I,j) x(r,j) + v(l, j) , (50)
r =1 r=1

where i (r, j-1) is the estimate of the previous interval O-D volume. Equation

(50) can be rewritten as:

"0d
20.)) = X h(r,f,l,j).x(r,j) +v(l,j), (51)
r=1
where
"0d
20.17 = 2(1,j)- 2 h(r,j-1,1,j). £0,131), (52)
r=1

the matrix form of equation (51) is written as:

2(1) = H(J'J)X(j) + W!)

Z is an (n) x 1) vector, H is an ( n) x nod) matrix, X is an ("ad x1) vector, and V is

an (111 x 1) vector. For j 2 2, :2 (j) can be written as

20') = Z(j) - H(j-1, j) 20-1) (53)

for j=1 2 (j) is equal to 20').

The O - D volume for interval j can be interpolated from the O - D estimates of

interval (j-l) as follows:

36

120-1) = 1(0) + W(j) (54)

where J? (j-l) is a vector of estimated O-D volumes for the previous interval; X( j)

is the present interval O—D vector; W is a (nod x 1 ) error or noise vector.
The noise W( j) is a zero-mean estimation error term,
Et W(j)} = o
with covariance
cov{W(j),W(k)} = mm). (55) .

we can adjoin the O- D estimates for interval X (j-l) with 2 (j) to obtain

Z*(j) = H*(j) 12(1) + V*(j) (56)
where
22(j) =1 20'). 220-1)}1‘
H*(j) = { H(j, j) , I)t
V*(j) =1 (V(j), I}, W(j))t

The augmented matrices have the following dimensions: 2* is a ((nj+n0d) x1)
vector, )2' is a ((n1+nod) x1) matrix, H * is an ((n)+n0d) x(n1+n0d)) matrix. V” is an
augmented matrix of the variance-covariance of the observation errors V0 with
the variance-covariance of O-D estimate errors Vw. The partitioned form of V"

can be written as

V*=(Vvl 0V I Vw’ 0w)

V0 is a (njxnj) matrix, 0v is an (n jxnod) matrix with zero entries, Vw is an

37

("0d xnod) matrix, 0W is an ("ad X11 1) matrix of zeros. The augmented matrix, V"

is an «111-mod) x(n1+nod)) matrix.

The objective is to select an estimate if (j) such that the quadratic measure

102(1)) = 1/2 (2*(j) - H*(j) )2(j))t V*'1 (2*(1') - H*(j) 12(1)) (57)

is minimized. Because minimization of I ( J? (j)) is an ordinary deterministic

minimization problem, the least square estimate is obtained by setting

d (1(Ji(j)))/d (12(1)) = 0 (58)

By using equation (57) for if (j) and carrying out the indicated partial

differentiation, we obtain

d (1026)» + d (12(1)) = H*t(j) V*'1(Z*((j)- H*(j) it)» = 0
(59)

The minimum variance unbiased estimate of the OD matrix for interval j is

expressed as:

12(1) = (114(1) V*’1H*(j))'1H*t(j) v.4 21(0) (60)

A description of how the fraction matrix H, and error variance-covariance

matrices V1,, Vw are derived are contained in a later section.

Two system of linear equations were defined. The ﬁrst equation defines a cauSal
relationship between link counts and OD trips. The second equation provides a

connection between the previous interval O-D trips and the present interval O-D

38

trips. This equation states that the previous interval O-D volumes are equal to
the present interval O-D volumes plus some deviation. The causal relationship
defined in the second equation is signiﬁcant since it allows the analyst to
compensate for the problem of rank deficiency, by converting the under-

determined system into a full rank system.

39

3.2 Evaluation of the Accuracy of the Sequential Generalized Least
Square Estimation Procedure

The test of the credibility of the results of the GIS model (estimated OD
volumes) is to ﬁnd out how close the estimated link counts are to the observed

link counts. Link volumes are estimated by the following eqautions:

i nod
2(l,j)= 2 z h(r.t,l.j).£(r.t),

t=j—lr=1
or
nod nod
2(1,j)= 2 h(r,j-1.l,j). i(r.j-1) + 2 Mr, j, 1, j) 526,1) , (61)
r=1 r=l

2 (l,j) is the estimated counts on link 1 during interval j. i( r, t) is the estimated
volume for O-D pair r during interval t. h(r, t, l, j) is an assignment matrix,
specifying the fraction of rth O-D pair volumes that left the origin during

interval t and are on link 1 during interval j .

The statistic used to measure the degree of closeness between predicted and
observed link counts is the root mean square error (rmse). The rmse measures
the degree of disagreement between two series of link ﬂow values (estimated

and observed) and it is given for each interval j by:

nl

rmse(j) = sqrt ({ 2 (z(l,j) - 20.1))21/711) (61)
1:1 '

%rmse (j) = (rmse(j) /2 (j)) * 100 (62)

4o

nl

z(j)=(1/n)) 2 2(1))
(:1

z(l,j) is the observed counts on link 1 during interval j.

Chapter 4

Parameter Estimation

4.1 Introduction

To apply the model it is necessary to estimate the value of the model parameters.
If z(l, j) denotes trafﬁc counts on any link 1 during an interval j, then z(l, j) can
be deﬁned as follows:
i nod
20, j) = Z 2 h(r, t, l,j) . x(r, t) + v(l, j), (63)

t=lr=1

In the above equation h(r, t, l,j) is the fraction of rth O-D volume that departed
the origin during interval t, and is on link 1 during interval j. The procedure for
estimating h( r, t, I,j) is adopted from Cascetta[1]. x(r, t) is the rth O-D trafﬁc
volume that left the origin during interval t and is contributing to flow on link 1

during interval j. 00, j) is the observation error on link 1 during interval j.

To estimate the fractions h( r, t, l,j), link traffic counts must be expressed in terms
of path ﬂows. Letting x( k, t) denote the traffic ﬂow following path k that left the

origin during period t, x( k, t) can be expressed as follows:

x(k, t) = x(r, t) . p(k/t) k e K, (64)

where p(k / t) is the probability of choosing path k going between an OD pair r
given that the trip left the origin during interval t. The observed link trafﬁc

counts can be rewritten in terms of the path ﬂow x( k, t) as follows:

41

42

i Kr
20, j) = Z 2 (a(k, t, I,j) . x(k, t)) + 120, j), (65)
t=j—1k=1

where a( k, t, I,j) is the fraction of path ﬂow x( k, t) contributing to link ﬂow

z(l, j). The assignment fraction h( r, t, I,j) can be expressed in terms of path-link
incidence fractions a(k, t, I,j), and path choice probabilities p( k / t) . Combining
equation (63), and equation (64), the assignment fraction h( r, t, I,j) is expressed
by the following relationship: .

Kr
h(r, t, I, j): 2 a(k, t, l, j).p(k/t), (66)
k=l

Let h (r,t,l,j) denote the estimate of h( r, t, I, j), and a(k, t, I, j), and ('5 (k / t) denote

estimates of a(k, t, l, j), and p( k / t) respectively. Equation (66), can be written in

terms of these estimates as follows:

If the link arrival time t is t=j, then the fraction of trafﬁc on link 1 during

interval j that arrived during j is:

Kr
z a(k, j, I, j). (3 (k /j), (67)
=1

28. :14) =
k

If the link arrival time t is t=j-1

~ K
h(r,j-1,1,j) = 2' d(k)-1, I, j). 6 (k /j-1), (68)
k = l .

In the next two sections, steps taken to develop estimates of the parameters

p(k / t), and a(k, t, I, j) are described.

43

4.2 Calculating the Probability of Choosing a Path p( k / t, r)

p(k / t,r) can be estimated by a multinomial logit model. The probability of

choosing any path k during any interval t can be expressed by:
p(k/ t, r) = Prob ( C(k,t) + q(k,t) < C(m,t)+q(m,t) } (69)

where, C(k,t) is the cumulative travel time for path k leaving the origin during.
interval t, and q(k,t) is the error term. The use of a multinomial logit model to
estimate choice probabilities requires the error terms to have the following

properties:

1) The error terms, q(k,t) are independent, and

2) identically distributed.

Based on the above assumption, the multinomial logit form [12] of the choice

probabilities is expressed as:

p(k /t, r) = exp(-u* (tt(r,k,t) + [3" dk)) / 2 exPHl " (tt(r,m,t) + [3" dm))
(70)

where u is a scale parameter, and tt(r,k,t) is the total travel time on any path k
for an O/ D pair r, given that departure time from the origin is during interval t.
The parameter m represents any other reasonable path . A brief explanation is
required before variables [3, and dk could be deﬁned. A network is a collection
of local streets; the only exceptions are expressways. Since expressways have no

traffic signals, it is reasonable to assume that the travel time for those paths

44

connecting origins and destinations that include an expressway should be
shorter than the alternative paths. It is not possible to calculate travel time on a
path that includes an expressway based on on-line traffic data, since no routine

on-line trafﬁc data for this type of roadway is available.

To compensate for the lack of on-line data to estimate travel time; each segment
of an expressway is assigned a speed. Travel time is then calculated based on
the assigned speed and distance. In addition, a bonus is assigned to each .
segment to denote savings in time due to the absence of traffic signals. Lets
denote, B< 0 to represent a bonus; implying savings in travel time in minutes
per mile and dk as the distance of path k on an expressway in miles. If a path
does not include the expressway, then B = 0. To use equation (70) to calculate
choice probabilities, three parameters need to be estimated, one is the total travel
time tt(r,k,t), another is the value of u; and once u is set; then for a given value of

u, [3 needs to be estimated .

4.2.1 Procedure For Finding Multiple Paths

The first step in OD estimation, whether it is static or dynamic estimation, is to
ﬁnd paths between O-D pairs. In all static O-D estimation procedures, a set of
shortest paths are found. In dynamic modeling, it is assumed that there is
usually more than one path choice between O-D pairs, and drivers sometimes
choose a path other than the shortest path if they anticipate delay. Some models
use what is called an all-or-nothing algorithm to ﬁnd a set of shortest paths and

ignore the possibility of multiple paths even though the modellers recognize

45

that in real life all drivers do not choose the shortest path to go from an origin to
a destination.

As was mentioned before, much of the past research on dynamic O-D estimation
has been based on freeway sections including ramps, and intersections that have
simple constrained geometry. Both of these conditions have only one possible
path between O-D pairs. Therefore in traffic assignment models all trips are
loaded onto this one path [Cassetta][1]; and if the network represents an urban
area, trips are loaded onto a set of shortest paths, one path per O-D pair. In this

study the issue of multiple path choices is dealt with as follows.

Normally the most desirable approach in finding multiple paths based on
variable travel time is a dynamic assignment model. In the absence of a
dynamic assignment algorithm, the paths are found through a static assignment
algorithm, and then the path choice probabilities are calculated. These two steps
are not simultaneous, but sequential. First, all reasonable paths are found by
running a static algorithm, and then the dynamic part (which is the probability
of choosing a particular path during a particular interval) is found through a

separate algorithm.

An equilibrium static assignment program was adopted to ﬁnd a set of
reasonable paths. A maximum of three most likely paths were identiﬁed. The
ﬁrst possible path is the shortest path obtained by running a ten iteration
equilibrium assignment algorithm of the UROAD program of the UTPS [14]
package. To find the second possible path, some of the major links (streets) were
deleted. The choice of which links to omit is based on tracking the shortest

routes for individual O-D pairs, and familiarity with the area. The equilibrium

46'

assignment program is run again to ﬁnd shortest path given the missing links.
To find the third path, the same process was repeated this time with additional
links missing from the second path.

4.2.2 Calculation of Path Total Travel Time

Calculation of link total travel times requires information on link speed, link -
delay, and time of departure from the origin. In developing a dynamic trip
table, the relevant data can be obtained by collecting traffic data for a number of
days. This information is usually received from probe vehicles transmitting to a
monitoring center, [Cremer,Putensen] [5,9] . Based on the sample collected,
general traffic characteristics including delay are determined. For signalized
arterial links, delay is calculated implicitly within travel time estimation by
looking at the variability in the link travel time experienced by individual probe
vehicles on an arterial. Total link travel time within a given period is split into 3
components; the travel time prior to reaching the tail of the over saturated
queue, the subsequent time required to traverse the over saturated queue, and

time to reach the tail of the uniform queue and clear the stop line.

The reliability of information from probe vehicles is heavily dependent on the
number of probe vehicles, their origin-destination, the path that the probe
vehicle is on, and the relationship between the departure time of the probe
vehicle with respect to other vehicles. Unless all vehicles are equipped and
information is received continuously, the information required to estimate travel
times must be derived from sample data. In this study an attempt has been

made to extract the required information from data received via inductive loops

47

that are installed on some arterial links in an urban area. This approach is
preferred over the other methods of collecting traffic information because trafﬁc

data from inductive loops is received daily and can be collected on a daily basis.

Calculation of path total travel time for each O-D pair requires two kinds of
information; link average speed, and delay per vehicle. The first step is to come
up with a reasonable speed for all the links in the network. Delay per vehicle is
measured by dividing total delay experienced by all vehicles having to stop
during the red light within time interval h by the total number of vehicles on

links during the same interval h.

4.2.3 Calculating Link Travel Times

Total travel time on a path is calculated by adding link travel times between the
origin and the destination. Link travel time is found by dividing distance by
link average speed. Since both speed and delay vary from interval to interval,

link travel times vary accordingly.

tt(r,k,t)= 2 (8(1)) *d(I) )1 (60) + 8(l,t) ) (71)
l ek

tt(r,k,t) = Total travel time on path k between O-D pair r, for vehicles
leaving the origin during interval t.

s(l,t) = Speed on link 1, during interval t .

d(l) = Distance of link I.
5(l,t) = Delay on link 1 during interval t.

48

Due to short distances between origins and destinations in the study network,
almost all

O-D pairs can be reached within one 15 minute interval. Therefore, to calculate
total travel time during an interval, it was assumed that drivers leave the origin

within the same interval.

4.2.2 Estimation of parameter )1

The choice of u is arbitrary, but the assumption that the error terms are
independent and identically distributed constrains all error terms to have the
same value of u. The fact that each error term has the same value of )1 implies
that the variances of the error terms are equal. A suitable value for p. was chosen
through an iterative process. Given that the choice probabilities must satisfy the

following axioms:

1) 0 <p(k/t.r)< 1

2) Zp(k/t,r)=1
k

The procedure for estimating a reasonable value for u is as follows: given an
initial value for 11, a sequential, 12 interval OLS O-D estimation procedure was
performed with observation error variances V = I, and OD estimation error
variance W = I . The estimated counts were obtained by multiplying each O-D
estimate x(r,t ) by the corresponding interval assignment fraction h( r,1,t) as

follows:

49

nod
2 (Lt) = 2 Mn“) *x(r,t)
r=1

in matrix form:

2(1) = H(t) . X(t)

where Z (t) is a (111 x 1) matrix of estimated link counts, H( t) is an (111 x nod)
assignment matrix, and X( t) is a (nod x 1) matrix of O-D estimates . 111 represents
the number of links on the O-D paths for which traffic counts are available. nod

is the number of O-D pairs.

The statistic used to measure the suitability of u was the root mean square error.

rmse = sqrt( Z{(z(l,t) - 2 (1,0)2 /n1})
1

where

i nod
£(l,j)= 2 Z h(r,t,l,j).i(r,t),

t=j—lr=l
or
twd um!
20, j): )3 h(r,j-1.I,j). £0.11) + )3 Mr, 1'. I, j) i(r,j) .
r=l r=l

z(l,t) is the observed count during interval t for link I, 2 (Lt) is the estimated
count during interval 1‘ for link 1. By comparing rmse values for any given

value of 11 the best value for 11 can be selected.

50

Keeping it constant, different values of B can be tested. The statistic used to
measure the suitability of B is the root mean square error and percent root mean

square error.

%rmse = (rmse ﬁ» * 100
where

nl

Z =(1/nl) 2 2(1)
[=1

By comparing rmse, and %rmse values, suitable values for B can be chosen.

51

4.3 Procedure for Finding Link-Path Incidence Fraction 01( k,t,l,j)

Trafﬁc ﬂow on any link 1 during any interval j is made up of a fraction of the
OD volume leaving the origin during intervals to j, and reaching link 1 during
interval j. To express this more formally, we deﬁne a quantity a(l, r) 6 [0,1], as

the fraction of the OD volume x(t,r) contributing to the ﬂow on link 1.

"0d h
z(l,j) = E Z 010, r).x(t,r)
r=1 t=1

The fractions a(l, r)s in general depend on link arrival times, which in turn
depend on departure time from the origin, total travel time on the preceeding
links, and identiﬁcation of the origin-destination pairs whose trips use that
particular link. To calculate 010, r), first a set of paths are chosen. The path set
is determined using a trafﬁc assignment procedure, and then the arrival time on

the link is found given a path and time of departure from the origin.

(10, r) should be calculated based on drivers path choice, departure time from
origin, and arrival time on links of a given path. To obtain this information in
real time, links on a given path between origin and destination could be
traversed by vehicles equipped with devices that transmit traffic related

information. This option is not yet practical.

In the literature reviewed, for time-dependent O-D estimation the fraction (10, r)
is taken to be (0,1). In this study the fraction of the OD volume that make up
ﬂow on any link I, is time dependent a(l,j,r), and is found based on the
multiplicative effect of the probability of choosing a path during any period h,

52

(which is a function of alternative paths total travel time, and departure time
from the origin), and the fraction of O-D ﬂows on a particular path that are on a
link during period h (which is a function of link arrival time, and departure

time from the origin).

The term a(k,t,l,j) is the fraction of trips following path k going from origin (0)
to destination (D), which left the origin during interval t, and are contributing to
ﬂow on link 1 during interval j. To estimate a(k,t,l,j) assume that the departure
flow is uniformly spread over the interval length T, and remains equally

uniform while moving in the network.

Based on the above assumptions, the fraction 01( k,t,l,j) can be estimated as
follows:

a(k,t,l,j) = 1 - 1/T(t(k,t,l) - T * ( j-1)) if (j-1)T < t(k,t,l) < jT
=1- a(k,t,l,(j-1)) 116-mu 1061.1) < (Jam
= 0 otherwise

where t(k,t,l) is the arrival time on link I of the first vehicle on path k which
departed the origin during interval t. T is the length of an interval and j is the

present interval.

The ﬁrst step in ﬁnding the link arrival time is to determine the links on any
path k that have detectors. This is done by matching the path ﬁle with a data ﬁle
that contains the A-node, and the B-node of the links with traffic detectors. The
links are found once the A-node, and the B-node of the links on a path match the
A-node, and the B-node of the links identiﬁed as having trafﬁc detectors. The

53

next step is to calculate the arrival time on these links for each O-D pair r on path
k. Link arrival time is calculated similar to total travel time, but the calculation
stops once the link is reached. If there are several links with detectors on a path,
then the algorithm calculates the cumulative arrival time on the following link
until the next link with a detector is reached. The outcome is a listing of links
with detectors for each O-D pair r on path k, and the corresponding link arrival
times.

Due to the short link travel times in the area used for this study, only two cases
are considered. Case one calculates 01( k,t,l,j) for link arrival time during the
present interval (j), given that trafﬁc left the origin some time during interval (1').
Case two calculates a(k,(j-1),l,j) for link arrival time of vehicles which departed
the origin during the previous interval (j-1) that are still on link I. The values of
a(k,(j-2),l,j),.., a(k,1,1,j), are considered to be zero.

A check of total travel times in the study area indicates that for the majority of
O-D pairs total travel times are contained within one interval. Only in a few
cases does total travel time exceed one interval, and then only slightly. For
example the maximum total travel time for an OD pair is found to be
approximately 17 minutes. Therefore, it is reasonable to assume that in order to
be on link 1 during interval (j), traffic must have left the origin during the same
interval (j) or one interval earlier ( j-1) . The value of h(r,t,l,j) which is the
fraction of the O-D volume that departs the origin during interval (j), and is on

link 1 during interval (j) is calculated as follows.

h(rl.lllj) = 2 a(k/ j] I; j) - P(k/ j! r)
kzk e Kr

54

The fraction of trafﬁc that left during interval (j-1) and is on link 1 during

interval (1') is calculated as:

111, (1:1), 1, j): 2 a(k.(j-1),l,j). 611/14, r)
kzkeKr

55

4.4 Calculation of Error Variances

The procedure used to calculate the error variances uses several steps. In the
ﬁrst step, the residuals from the link ﬂow estimates are used to calculate link
flow error variances for each interval by using the normal distribution to group
estimated link counts into low, medium, and high volume categories. The
corresponding residuals are then grouped into low, medium, and high

categories based on the grouping of the estimated link counts.

The logic of grouping data is that observations with smaller variance receive a
larger weight, and therefore have greater inﬂuence in the estimates obtained for
the next interval. The variance estimates are measured by RMSE, and %RMSE
statistics. As an initial grouping, it was decided to group data into low,
medium, and high categories. The error variance matrix is then deﬁned as:
2 _ V1 -
o g — V /(n 1)
8 8 8 (72)
V8: Vector ( 1 X111 ) of grouped observation errors
Vt = Vector of size ( n1 x 1 ) ; Transpose of V8

118 = Number of observation errors that fall into group g.

The observation error variance Vv is a diagonal matrix of size (nl x n1). The
diagonal entries are grouped observation error variances 02g . The off-diagonal
entries are zero. nl is the number of links in the network with detectors. A
similar procedure was used to calculate the O-D estimation error variance for
each interval. First, the O-D estimates for the present interval are grouped into

low, medium and high categories. Then, the residuals are placed into low,

56

medium, high groups based on the grouping of the OD estimates. The error

variance matrix is then deﬁned as:

12g = wgwtg /(n*g -1) (73)
W3 = Vector ( 1 xnod ) of grouped estimated O-D volume
errors
th = Vector of size ( nod x 1 ) ; Transpose of W8

”*8 = Number of O-D estimation errors that fall into group g.

The O—D estimates error variance Vw is a diagonal matrix of size (nod x nod).
The diagonal entries are grouped O-D estimates of the error variances (1.2g ).

nod is the number of O-D pairs.

4.4.1 Formulation of the Error Variance Calculation Procedure

The first step in calculating the error variances is to ﬁnd the OD pair volume

' estimates and the estimated link counts. The O-D pair volume estimates for each
of the 12 intervals are obtained through a sequential Ordinary Least Square
estimation procedure. In OLS both the observation error variance, and the
estimation error variance are identity matrices. The CD estimates are obtained

by the following equation [60]:

X0) = (H*t(j) V*’1H*(j))'1H*t(j) V*’1 2111;)

57

V"'1 is an augmented matrix of the inverse variance of the measurement noise
V0 with the variance of O-D estimates errors Vw. The partitioned form of W1

can be written as

V*'1 =(v-1,,,0 I v-lw, 0)

The link volumes are then estimated as follows:

. i
z(j)= 2 MM) . X(t),

t=j-l
i rwd
2(1,j)= Z 2 h(r,t,l,j).x(r,t),
t=j-lr=l
or
rwd de
2(1,j)= Z 2h(r,j-1,l,j). i(r,j-1) + 2 h(r,j,l,j) x(r,j) ,
r=1 r=1

In general the link counts during an interval can be expressed by the following

equations

1'
21;) = 2 H(t, j) 2.11) + V(j),
t=j-l

i de
z(l,j)= 2 z h(r, t, I,j).x(r, t)+v(l,j),
t=j-lr=l

58

or
nod nod

z(l,j)= 2 h(r,j-1,I,j). 52(r,j-1) + 2 h(r,j,l,j) x(r,j) +v(l,j) ,
r=l r=l

The observation error variance can be calculated by the following equation
or in matrix form as:

vm=zm-Zm

where V (j) is the observation error diagonal matrix of (n; x111), during interval j.
The off-diagonal entries are zero. Z( j) is an (111 xnj) diagonal matrix of
observed link counts during interval j, and i (j) is an (n1 xn) ) diagonal matrix
of the predicted counts resulting from the sequential Ordinary Least Square 0-
D estimation procedure. The maximum likelihood estimator of the observation

error variances is :

n1
02(v(j)) = 2 {(00, j) * v(l,j))/n1}, I=1,...,n1 (74)
1:1

The error variance matrix is obtained from equation (72), with diagonal entries
similar to equation (74) except that ungrouped link observation error v(l, j) is

replaced with grouped link observation error v30, j) replacing n8 for nl.

59:

4.4.2 Calculation of Error Variances for the Estimated Origin-
Destination Volumes

The O-D volume estimates during an interval can be expressed in a matrix form

as: .. .
X(j-1) =X(j) + W(j)

The estimation error can be calculated by subtracting the previous interval

O-D estimates from the present interval O-D estimates

W(j) =5t(j-1) -1t(j) (75)
The O-D estimate )2 is an (nod x 1) matrix. W( j) is the estimation error matrix of
size (nod x1). The maximum likelihood estimator of the error variance is

obtained through an equation similar to equation (74) as follows:

nod
12(w(j))= z {(w(r, j) *w(r,j)) /nod }, r=1,...,nod (76)

r=l

The estimated errors were grouped based on the grouping of CD estimates
during interval (j). The estimation error variances were calculated from
equation (73) with diagonal entries similar to equation (76) except that the
ungrouped O-D estimated error w( r, j) is substituted with grouped O-D

estimated error wg( r, j) substituting "*8 for nod.

Chapter 5

Model Validation

5.1 Introduction

To test the validity of the procedure adopted for the time-dependent O-D
estimation, the model was tested on a synthetic network. The network is shown
in Figure 1. The network consists of 7 nodes, and 7 links. The links are one
directional from node 1 to node 7. Link 7 is considered to be a dummy link with
distance equal to zero. There are 3 entry points (1,2,4), and 3 exit points (3,5,7).
There are 7 possible O-D pairs. The O-D pairs are (1-7), (2-7),

(2-3), (1-3), (1-5), (4-5), (4-7). Traffic counters are shown by a dashed line and are
located on 6 of the 7 links.

Data generated for this simulation study consists of volume on 6 links, plus
travel time on all the links. Data is provided for 15 intervals of 10 minutes
duration each. Data that needed to be calculated were link arrival times on each
link on a path between each O-D pairs, path choice probability for O-D pair (1-7)
( the rest of the O-D pairs have only one route choice). The parameter or, which
is the fraction of traffic that left the origin at some time t and is on link 1 during
interval h; the parameter p which is the probability of choosing a path that
includes link 1 during interval h; and subsequently the estimation of the
assignment matrix. The parameter p was estimated for the O-D pair (1-7) only.
p was set equal to 1 for the other O-D pairs. This was due to the unidirectional

links on the network.

60

61

5.2 Analysis of The Results

An alternate OLS-GL5 procedure was performed on the synthetic data. By an
alternate OLS- GLS procedure it is meant that for each interval an 015 estimated
O-D volumes and counts were found. These estimates were used to calculate the
error variances as described in Chapter 4.3. Next, a GLS estimate of O-D
volumes was obtained using the inverse of the error

variances as weights.

The alternate OLS-GLS procedure was performed for 2 days. Each day consists
of 15 intervals of 10 minutes duration each. The estimated link counts were
compared with the synthetic link counts. The measure of degree of closeness
was the %RMSEs. Table 1 shows the RMSEs and %RMSEs for 15 intervals of
both day 1 and day 2. Figures 2-6, show the distribution of the estimated and
synthetic link counts around the mean, for day 2. For this exercise, the true O-D
volumes were known. Comparison between the estimated O-D volumes, and

true O-D volumes indicated a close match.
5.3 Sensitivity Analysis

Different values for the constant parameter )1 were tried. The purpose of doing
this exercise was to ﬁnd out how responsive the model is to changes in the
values of the constant parameter u. The logit utility model for path choice was
calculated for interval 1 of day 2 sample data. The %RMSE were caculated to
determine the value of the constant parameter )1 that produced link estimates .
closest to the synthetic data. Table 2 shows the RMSE and %RMSE for different

values of the constant parameter )1 for the DIS run of interval 1 of day 2. As is

62

apparent from the RMSEs and %RMSEs, the model does not display a signiﬁcant
sensitivity to changes in the values of 11. This implies that path choice is highly

invariant to different values of 11. This is expected since only one O—D pair has a
choice of separate paths. [.1 = 1 produces slightly lower %RMSE. The results of

the sensitivity analysis are unique to this case study and are not generalizable.

63

Figure 1- Drawing, Depicting the Synthetic Network

FN‘

o

o 4'. ®

O

6
.‘at
\
\\ 1 C e\
\ ‘—’( . r, ,
@1' g
(7) i.
1
._ link
- detector
0 no...
0 point of entrance/

exit

Table 1

Comparison of Predicted Versus Observed Link Flows -
GLS runs - day 1 and day 2 - Sample Data

 

 

 

 

Day 1 Day2
Interval rmse %rmse rmse %rmse

1 58 1 8 66 20
2 22 7 36 1 1
3 43 1 1 36 9

4 55 12 50 1 1
5 60 1 1 64 1 1
6 66 12 57 1 0
7 83 1 3 71 1 1
8 93 1 4 87 1 3
9 1 01 1 6 95 ‘ 1 4
10 99 1 6 83 1 3
1 1 80 13 72 1 1
12 78 1 3 68 1 1
1 3 62 12 53 1 0
14 47 9 40 8

1 5 44 9 32 6

 

 

 

 

 

 

65

 

 

8m 8.. com 8m 00— o

 

 

 

 

 

  
 

 

 

 

 

 

 

 

a 5. . p 3:25
«:53 :5. 0225.3 .m> .aBua - mum . a 2:9“.

paieumsa

mum
Sn
now
com

 

 

«mm
03
3383mm

8v
mom
08
N F N
000
3m

.33

 

 

 

 

 

 

 

patewgtsa

 

 

 

 

 

 

 

 

 

 

 

 

 

Km mom

4... ..11....m.£6.mm m nnv now

a Em mom

>3 - n 328:. 2:38 :5. 335:3 .m> .258 . who - n 959". com mom
w _ v 03

p 8 8m

coEmEcmm _mau<

67

 

 

.386

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1% 08—
N
>3 . o .8225 8.53 x5. 385:3 .9, .388 . who . v 052...

paieumsa

0mm
03
m _ v
won
one

 

 

new
3385”.—

Sum
8m
2w
3m
80
now

383..

68

 

 

323
000— oom 08 00v CON 0

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

4. .
.4. .m .89.
3280 _ w
m, m
M. 88 D
e.
. ONO
. : . . 82 mg
N 0%
>3 - m 322:. 8:33 :5. 33:53 .9 330a . who . m 832“. m3
03
omm
noﬁEcmm

30
N3
00v
va
2K

333.

69'

 

 

320a

 

 

 

 

 

 

 

 

 

 

 

 

 

>a
- a. .8225 2550 x:.. 338.88 .m> 320a . who . m 059“.

paieumse

$0
30
ONV
_ Om
SR

 

 

mow
noueEEm

:6
Be
as
mmv
86
he
see...

70

Table 2

RMSE and %RMSE for Different Values of the Constant Parameter

 

 

u - interval 1 of day 2 - Sample Data

u RMSE %RMSE

0.3 42 13

0.8 41 13
1 41 12

1.5 43 13

2.0 43 13

3.0 44 13

4.0 45 13

 

 

 

 

 

Chapter 6

6.1 Representation of an Urban Area as a Network

Having determined that the model gives good results when all factors are
known, the next phase of the study is to see if it can be applied to an
underdetermined system with several estimated parameters. In order to test the
model, the ﬁrst step was to deﬁne an area of interest and to represent the area as
a network which is a graph (N ,A) of a set of N points which are known as
nodes, and a set A of ordered pairs of these nodes known as directed arcs. Arcs

represent the streets. Each arc has properties such as distance, and direction.

The next step was to identify the points that trafﬁc emanates from or tends
towards such as parking lots, residences, and places of work (in other words
trafﬁc collectors/ generators). Trafﬁc collectors/ generators are represented by a
set of nodes that are not part of the roadway network . These nodes are called
centroids. They are points on the network that represent the approximate
location of traffic collectors/ generators in the area. These centroids are either
origins or destinations. Trafﬁc collectors/ generators are connected to the

network by links with special attributes.

Once the network was coded it was necessary to match the location of traffic
counters with the corresponding links on the network. This was done using
special schematic maps that show the location of detectors on streets, their
proximity from intersections, the direction of trafﬁc covered by loops, and the
number of lanes covered by these detectors. Coding the location of traffic

counters on the network was done by identifying the arcs that represent the

71

corresponding streets on the schematic maps.

The inner study area consists of parts of the Beacon Hill, Back Bay, and
Kenmore neighborhoods, and Commonwealth Avenue to the Boston University
Bridge. The external areas include Cambridge, West end, Financial district,
Downtown crossing, Chinatown, Theater district, Prudential / Copley area,
Fenway, the edge of Brookline; and Brighton. The area is divided into 34
zones. The zoning scheme was adopted from the 1975 DPW (Department of
Public Works) zoning systems. The zones are chosen where there is either a
concentration of business activities or residents. Parking lots are chosen as

traffic collectors or centroids for each zone. (See Map of Boston).

The network representation of the area has 1124 links and 34 centroids. twelve
centroids represent the inner study areas, and 22 centroids represent outside

areas or externals.

There are a total of 77 trafﬁc detectors in this area. The majority of traffic
detectors cover only one lane, and are located in one side of a street only. In a
few cases, such as Cambridge Street, loops cover both directions, and two lanes
on each side. The 77 links that had detectors were identified and the links A-
node, and B-node were stored in a data file. Path files for all O-D pairs are
listings of A-nodes, and B-nodes of the links on paths that join origins to
destinations. The path file was matched with the 77 links file to ﬁnd out which
links on each of the possible paths had detectors. Of the 77 links, a total of 58

links fall on at least one paths that connects O—D pairs.

73

6.2 Data Collection

After coding the network and identifying the location of the loops, the next task
was to collect data for the analysis. Local street trafﬁc count data were provided
by the City of Boston Trafﬁc Control Center. In the Traffic Control Center, traffic
controllers monitor traffic by watching link data displayed on computer screens.
The location of the link on the network is displayed on a large screen where a '
series of ﬂashing lights display traffic signals. A ﬂashing arrow on a link signals
either uninterrupted ﬂow or congestion. A green arrow signals normal ﬂow,
and red congested ﬂow. If congestion occurs traffic controllers attempt to
correct the problem by adjusting the traffic signal timing. The data is not
ordinarily saved since no further analyses are done; but if requested, data can
be saved for any particular period, in 15 minute increments. A hard copy of the

traffic data was obtained from the Trafﬁc Control Center.

Each traffic detector is identified by a number. A set of detectors clustered in
one area are identified by their area number. Each report provides two blocks of
information; the ﬁrst block gives traffic statistics per detector such as volume,
percent occupancy, average speed, number of stops, delay, historical volume
(which is an hourly average based on the previous 5 weeks), and historical
occupancy (which is an average based on the previous 5 weeks). The second
block of information is a summary statistic for the section. Data for this block is

obtained by averaging the detector data in a section.

For this study, data were obtained for every 15 minute interval for the period

from 6:00 a.m. to 9:00 a.m. for six consecutive days from a Monday to the

74.

Monday of the following week not including the weekend. Data were collected
between August 3 and August 14, 1992. The data explicitly used in parameter
estimation of the recursive Generalized Least Square algorithm are; volume,

average speed, and delay.

6.3 Data Description

A brief description of each of the items on the report is given below:

0 Sensor number This is a reference number used for locating traffic
detectors on maps used by the traffic controllers. These maps have

detailed intersection schematics.

0 Traffic volume (VEH) per 15 minute interval. Counts are received and
stored during each traffic signal cycle, and then added together to get 15

minute counts.

0 Average speed (MPH) An average speed for a particular link is
calculated as follows: An algorithm sets an average vehicle length to be
equal to 20 ft. A detector is assumed to have 5 ft x 5 ft dimensions. As the
front of a vehicle passes over a loop, 5 ft. is added to the length of a
vehicle, and as the back of a vehicle crosses the loop another 5 ft. is
added to the length of a vehicle, making a vehicle 30 ft long. While a
loop is occupied by a vehicle, it sends a signal to the computer. The

duration of the signal is taken to be the time over the 100p.

75

The algorithm then calculates speed by dividing the vehicles length
(which is assumed to be a constant 30 ft.) by the duration of the signal.
During the green interval of each cycle vehicle speeds are calculated and
then divided by the number of vehicles that passed over a loop to obtain
an average speed for a link during that cycle.

The average speeds for each cycle are calculated and summed, and then
divided by the number of cycles in an interval to obtain an average speed
for a link for that interval. The section average speed is the arithmetic

average of the calculated average speeds of the sensors in a section.

- Stops (VEH) This is a derived parameter, assuming that some fraction
of vehicles that pass over a sensor during the yellow signal are assumed

to stop at the intersection for the red light.

0 Average stopped time delay (min) This is the number of stops per
cycle, multiplied by delay per vehicle stopped. This is the average delay
experienced by vehicles at the stop line due to the red light. Delay is
calculated per cycle summed over the number of cycles in a 15 minute

interval, and then averaged.

6.4 Estimating Speed and Delay For All The Links in The Network

The network for the study area consists of 1124 links. A total of 77 links
equipped with detectors fall within the study area. Of these 77 links, 58 links fell

on one or more of the minimum paths that connect origins to destinations.

76

Therefore, assumptions had to be made to estimate speed and delay for the
reminder of links in the network. To do this, the network was divided into 5
sections. The division followed the segmentation scheme used by the City of

Boston. Using these section boundaries, the following assumptions were made:

0 For each interval, all the links within the boundaries of a section are
assigned the average section speed.
0 For each interval, all the links within the boundaries of a section are '

assigned a constant section delay. This delay is calculated as follows:

Emu.)
8 = 131—— (77)

m
ZVUJI)
[=1

5 = Section delay
8(1)!) = Delay on link 1 during interval h. Only links that have

traffic detectors are considered .

v( 1,11) = Volume on link 1 during interval h.

6.5 Adjustment of the Apriori (1987) O-D Matrix

The historical trip matrix was derived from the 1987 Census survey data. The
O-D volumes were factored to update this table to be more consistent with the

observed link counts.

The following steps were taken to adjust the apriori O-D table:

77

1) A screen line was picked. In this case due to the particular geometry of
the area, and the availability of on-line count data from the streets that
cross the screen line Massachusetts avenue was chosen. The direction of
trafﬁc chosen was west to east.

2) The apriori (1987) CD, 3 hour volumes for those O-D pairs whose
paths cross the screen line were summed.

3) The on-line traffic counts on the links that crossed the screen line were
summed for the 12 intervals of day 1. i
4) This volume was then divided by the total observed volumes from the
links over the 3 hour period to get the adjustment factor.

The apriori O-D matrix was adjusted by the following ratio:

11 12 m
ratio: ( 2 x(i)) / ( X 2 (JUN)
i=1 h=1 i=1

where

x(i) = trips going from origin (0) to destination (D) that cross the screen line
11 = Number of x(i)s.

v(j,h) = Observed counts on link j during interval h, on those links on the O-D
paths that cross the screen line.

m = Number of v(j,h)s.

Using the above calculation, the adjustment factor was found to be equal to 1.36.

6.6 Choosing the Parameter u

The RMSE values of the OLS run of interval one of day one were calculated for
values of 11 between 0.3 and 4.0. Table 3 lists the value of 11 vs. RMSE. The

78

results show that RMSE is not very sensitive to the value of 11. Choosing u =1.5,
causes the shortest path to be the prominent path choice; but still assigns some
trips to the second and third paths deﬁned for each origin and destination. A
value of )1 =1 means that the probability of trips assigned to the minimum path is
not significantly higher than an alternate path. As 11 increases, the probability of
trips being assigned to the minimum path as opposed to an alternate path
increases. The minimum path update algorithm developed by the Bureau of
Public Roads (BPR) uses an update value of 1.5 for the exponent of v/ c, which-is

somewhat analogous, and supports the choice of u for this study.
6.7 Choosing the Bonus factor B

Using a )1 value of 1.5, different values of B from -0.2 to -1.0 were used to obtain
the RMSE values of the OLS run of interval one of day one. Table 4 shows the
values of B vs. RMSE. Since B is the travel time bonus used to modify the travel
time on Storrow Drive (an expressway), the value is constrained by
reasonableness. Due to the link lengths, the maximum travel time on any
segment of ”Storrow Drive” is 0.5 minutes, therefore a bonus factor B = -0.3

minutes was used.

6.8 Test of Model Responsiveness When Different Weights are
Assigned to the Count Data versus Prior O-D Estimates Data

To test how sensitive the model is to input data (Count data), a weighting factor
a was assigned to the counts and (1- a) was assigned to the prior O-D estimates.
The purpose of this exercise was to ﬁnd out if the link estimates improve by

assigning more weight to the counts and less weight to the prior O-D matrix.

79

The test was performed as follows: The inverse counts error variance Were
multiplied by or, and the inverse O-D estimates error variances were multiplied
by (l-a). The model was tested for several values of a = 0.05, 0.2, 0.6 for one day
only, without a prior day O-D estimate. The prior interval O-D estimate used
came from the historical" O-D matrix. The measure of how close the estimated
counts were to the actual counts was the RMSE values. Table 5 shows the
RMSEs for different values of 01.

Compare this Table with table 6 which contains the RMSE values of a 12 interval
GLS run when equal weights are assigned to the inverse error variance for the
counts and the OD estimates. The model produced better estimates of link
counts when almost all the weight was assigned to the counts. The implication
is that more count data would produce better link count estimates which in turn
implied better O-D estimates. Basically, the less underdetermined the system,

the better the estimating ability of the model.

These results indicate that the model acts rationally, since it produces better
estimates of link counts when more weight is assigned to the counts and less to
the prior O-D estimates. This weight assigning process though reasonable, could
not be adopted to the test network in the study because count data from all links
are not available. The obvious deduction from this exercise is that to get better
O-D estimates additional traffic count data is needed. However, since these data
are not available, and it can be assumed that successive O-D ﬂows are not
randomly distributed, but in fact are related to the previous interval ﬂows, a I
value of a = l was selected. Therefore, the weights assigned to the counts and

the OD estimates are as described earlier derived from observation error

80

variances and OD estimates error variances; and are not artiﬁcially augmented

by any factor a.

81

Table 3

RMSE for different values of the constant parameter

 

 

u - interval 1 of day 1

u RMSE
0.3 46
0.8 45

l 44
1.5 43

2 42

3 41

4 40

 

 

 

 

82

Table 4

RMSE and %RMSE for different values of the bonus factor
(B) - interval 1 of day 1

 

 

(3 RMSE %RMSE
41.2 63 240
-O.3 64 242
OS 63 243

 

 

 

 

 

83

Table 5

RMSEs For Different Values of the Weighting Factor

(I.

12 intervals - GLS Run

 

 

 

 

Interval a = 0.05 L a = 0.2 T a = 0.6
RMSE
l 20 23 31
2 17 21 29
3 20 23 29
4 27 so 36
5 37 42 50
6 39 44 51
7 43 50 61
8 52 61 74
9 64 75 93
10 69 80 98
ll 74 86 105
12 77 86 105

 

 

 

 

 

Table 6

Comparison of Actual Versus Estimated Counts -
12 intervals - GLS Run

 

Interval RMSE

32
31

59
72

95
l 02
l 02

 

 

 

ngaooouomtswm-o

 

Chapter 7
Analysis of the Results
7.1 Introduction

The intent of this chapter is to report the results of the application of the model
described in chapters 3 and 4 to estimate O-D trips for the study network using
the data described in chapter 6. The approach used in the analyses is depicted in
the ﬂow chart shown Figure 7. The results of the initial model application were
used to identify problems with the data. The data were then adjusted, and the

analyses were repeated.
7.2 The OLS Procedure

As shown in Figure 7, the model was initially run using the OLS procedure. The
OLS procedure will result in the O-D estimate that produces the lowest %RMSE
given a particular data set. The OLS procedure is identical to the GLS; except
that the error variance of the link counts and the prior O-D estimates are not
weighted. The OLS O-D estimation equation is similar to equation (60) of

chapter 3 with an the inverse variance matrix V" '1: I, an identity matrix.
)2 (j) = (1141;) 111-1141(1))4141t (j) w-1 21:11))

The OLS procedure was performed in a sequential manner for 12 consecutive)
intervals for each of 5 days. To check the validity of the OD estimates, these

estimates were used to calculate link count estimates for each interval using

85

86

equation 61 of chapter 3. The RMSE and %RMSE statistics measure how close
the link estimates were to the actual counts. The RMSEs and %RMSEs are
exhibited in Tables 7 through 11. A comparison of RMSE’s and %RMSE’s of the
12 intervals shows little change in RMSE values between day 4 and day 5. The
predicted and observed link ﬂows for intervals (1 ,3,6,9,12) of day 5 are shown in
Figures 9-13. A review of these charts shows:

1) A significant over-estimation of almost all links in interval 1 (54 of 58).
2) The identification of 3 outlier links (overestimated).

3) Most of the links being underestimated by interval 12 (39 of 58 links).

7.3 The GLS Procedure

The GLS procedure attempts to minimize the deviations of the estimated counts
from the observed counts while also weighting the magnitude of deviations
between the prior O-D estimates and the present O-D estimates. The O-D
estimates and the link count estimates of the OLS procedure for the same
interval of the same day are used in calculating errors and estimating the

variance as was described in section 4.3.

The GLS procedure was also run sequentially for 12 intervals of each day for 5
days. The O-D estimates were obtained using equation (60) of section 3.1

(shown here as a reminder)

)2 (j) = (11*t (j) V*'1H*(j))'1H*t (j) vr1 21(1))

87

The link counts were then estimated from the OD estimates for each interval.

The link counts are estimated using equation (50) of chapter 3
2(1, j) = 2 h(r, j-l, l, j) . 5H r, j-l) + E h( r, j, l, j) x(r,j) + 00, j) , r=1,..,nod
in matrix form the above equation is stated as

Z(j) = H(j-l) 12(1) + H(j) 12(1)

The measure of the goodness- of --ﬁt of the estimated link counts to the actual
link counts was the RMSE and %RMSE statistics . A comparison of RMSE’s
and %RMSE’s of the 12 intervals for 5 days are shown in Tables 12 through 16
show little change in the RMSEs and %RMSEs between day 4 and day 5. The
predicted and observed link ﬂows for intervals (1 ,3,6,9,12) of day 5 are shown in

Figures 14-18. A review of these charts show:

1) Most of the links are overestimated in interval 1, with 3 links identified as
outliers.

2) In interval 3 and 6, the majority of the links are underestimated, with both
high and low outliers.

3) In interval 9, there is nearly an equal number of links that are over and under
estimated. The low outliers are the most significant.

4) In interval 12, there is an equal number of links overestimated and
underestimated, with no evidence of convergence between the actual and

estimated link counts.

88

7.4 A Revision of the Assumptions

At the end of day 5 there is still a considerable deviation between the actual
and estimated counts for some of the links. This means that either the number of
trips assigned to the links by the assignment matrix is not correct, or there are
errors in the actual traffic counts. Since calculation of the assignment matrix is
dependent on travel times, which are a function of speed and delay, errors in

estimating these parameters could lead to assignment errors.

In an attempt to improve the assignment matrix, the assumptions made in
deriving average speed and delay were reviewed. The under-estimation of
many of the interval 12 day 5 link counts was alarming. One explanation for
under-estimation could be a low average speed on some links or in some
sections of the network which would lead to the link being assigned low
volumes. In checking the raw data obtained from the Trafﬁc Control Center,
some links were being recorded with an average speed as low as 5 mph. Since
this does not appear reasonable, the following procedure was adopted to

increase average speed and reduce average stopped delay.
In each section those detectors that showed speeds that were not within the
range of the level of service A, B, and C were removed from the calculation of an

average speed for the section.

The link average stopped delay was also recalculated considering only those ,

links used to calculate the section average speed.

The section average delay was then calculated from equation (77) of chapter 6

89

where the numerator is the sum of delay from detectors that show level of
service A, B, or C and the denominator is the sum of the volume of those

detectors that fall within these levels of service.

A review of the detector data used to represent "actual" counts on the links
indicated that in some cases the detectors only covered one lane of a multi-lane
street. In these locations, the detector counts were multiplied by the ratio of
total lanes on the links to the number of lanes covered by the counter to obtain

new link volumes.
7.5 Adjustment Procedure

The counts were adjusted and the speed and delay were revised as described
above and the OLS procedure was run again for 12 intervals of day 5. Table 17
shows the values of the RMSEs and %RMSEs for day 5. The %RMSE was
reduced from 64 to 45 by interval 12, and the distribution around the mean
appears to be better with the adjusted data, (Figures 19-23). Observations from

these charts are:

1) Almost all links are overestimated in interval 1 (similar to the original OLS

results).

9O
2) The trend of most links being overestimated in interval 1 and most links
being underestimated in interval 12 is retained.

3) The same three links that were overestimated in the initial OLS runs remain

outliers with the adjusted data.

The location of these three links and their proximity to a centroid is shown in
Figure 8. The over-estimation of these three links constitutes 15 ”/6 of the RMSE

values of the adjusted OLS estimates for interval 12.

Another indicator of how close the link estimates are to the actual counts is the
frequency with which links fall within a specified percent absolute difference

between the estimated and the actual counts for the ﬁnal interval of day 5:

1) 16 Percent of the links fall in interval { 0, 25]

2) 31 Percent of the links fall in interval {25, 50]
3) 27 Percent of the links fall in interval (50,100]
4) 26 Percent of the links fall in interval { >100 }

Comparison of the RMSEs and %RMSEs (Table 18 and Table 16) of the GIS runs
before and after data adjustment show lower RMSEs and %RMSEs for the
unadjusted GLS runs. This is a result of the weighting factors used to calculate
successive estimates. In the GLS estimation procedure, the inverse of the error
variance ( this includes link count error variance and O-D error variance) act as
weighting factors. These weights adjust for the differences between the
estimated link counts and the observed counts, and smooth out large variations

that might occur between the present interval and the previous interval O-D

91
estimates. Therefore, high weights are assigned to data with a low error

variance, and low weights are assigned to data with a high error variance.

In the adjusted GLS estimation procedure the weights assigned to the link counts
for intervals 4-12 are very small compared to the weights assigned to the link
counts in the unadjusted GIS runs. The weights assigned to prior O-D estimates
in the adjusted GLS estimation procedure are also smaller than the weights

assigned to prior O-D estimates in the unadjusted GLS estimation.

The effect of the high volume links (which are generally overestimated) being
assigned extremely low weights in the GLS process is an over-estimation of the
OD trips which use these links. This results in an increase in the volume
assigned to these links and consequently to even larger estimated link ﬂows in
the next interval. Comparison between the estimated and actual link counts for
the GLS runs using the adjusted data show some improvement in the
distribution around the mean (Figures 24-28). Reviewing these figures leads to

the following observations:

1) There is approximately an equal number of links overestimated and
underestimated in interval 1.

2) There are a number of outliers on both the high and low side in interval 1.
3) By interval 12, most of the links are overestimated, but outliers remain on
both the high and low side.

4) The three outliers from the previous runs are no longer identifiable as a

special case.

92
7.6 An alternate OLS-GLS approach

In an attempt to improve the distribution of link estimates around the mean, a
variation of the OLS - GLS procedure was tried. The procedure is as follows:
Interval one OLS O-D estimates are calculated. Subsequently interval one GLS
O-D estimates are calculated. For the second and each subsequent interval OLS
O-D estimation, the prior interval GLS O-D estimates were used as an apriori

O-D matrix.

A comparison of the RMSEs and %RMSEs of 12 intervals of 5 days are shown in
Tables 19-23, both the RMSE and %RMSE values increase towards the later
intervals. However, they are much lower than the GLS results shown in Table
18 when the sequential OLS - GLS was employed. The predicted and observed
link ﬂows for intervals (1, 3, 6, 9, 12) of day 5 are shown in Figures 29-33.
Comparison between the estimated and actual link counts for the alternate GLS
runs using the adjusted data show some improvement in the distribution around

the mean. Reviewing these figures leads to the following observations:

1) There is approximately an equal number of links overestimated and
underestimated in interval 1.

2) There are a number of outliers on both the high and low side in interval 1.
3) By interval 12, most of the links are overestimated, but outliers remain on

both the high and low side.

93

7.7 Possible Sources of Differences Between the Estimated and
Actual Link Counts

There are three outliers that appear in many of the ﬁgures showing link counts
versus estimated link volumes. In the alternating OLS-GL8 procedure they
represent 32% of the RMSE and 54% of the %RMSE terms in interval 12 day 5
results. Two of these links are on the one way street which provides the only
access to and from the centroid shown in Figure 8. Therefore, in the estimating
process, all paths going from any origin to this particular centroid (destination)
must include link 1. Likewise, all paths departing this origin must include link 2.
These links are also used by trips from other origins to other destinations.
Because path assignment must be continous, trips on a link must continue on the
adjacent links on the same paths. Thus overestimating link 2 results in

overestimating link 3.

The concentration of a signiﬁcant percent of the error on these three links is
thought to be a result of the zone structure. Using centroids with multiple access

links to the network should result in fewer outliers, and a lower RMSE.

Certain other assumptions or simpliﬁcations had to be made in the application of
the model structure to the data available. These items are perceived to contribute
to the error terms when applying the model to the study area network. First, the
selection of links to be eliminated in order to find the second and third

alternative path for the trip assignment process was heuristic, and constrained to

be on paths that included one of the 58 links where the detectors were located.

94
The use of a probabilistic assigrunent algorithm that would identify the three

alternate paths in one run might result in a different set of paths.

If the constraint that all paths must include one of the 58 links were relaxed (by
virtue of having more detectors), the vehicle trip distribution would be more
uniformly distributed over the network, thus reducing the probability of over

assignment on the links with detectors.

A second parameter that would change the number of trips assigned to various

paths in the network is the parameter B.

This parameter (as was described in chapter 4) represents savings in travel time
(min) for trips using expressway links. Thus B represents the drivers perception
of time saved by choosing a route that includes a freeway segment. In this
research, the value of B was selected as a maximum reasonable value for the

shortest freeway link in the network.

In fact, different values of B might apply to links other than freeway links. For
example, motorists may perceive that a one way street with good progression
would provide a higher average speed than a two way arterial with a high
density of trafﬁc signals. The use of B for freeway links, B2 for one-way links
and B3 for other arterial links may provide a better representation of driver

perception of the disutility of using links on various paths.

In addition, Bis considered to be time independent in the model formulation.

Thus, the motorist is assigned the same perceived bonus at 6:00 in the morning

95

and in the height of rush hour. A feature that should be considered in future
applications of the model would be to use a B(t) formulation, with B varying

according to the time of day.

Other assumptions or procedural steps used in the application of the model may
contribute to the high RMSE and %RMSE values include:

1) Grouping of the link counts and the estimated O-D volumes.

Link counts and O-D volumes were categorized into 3 groups of low, medium)
and high. Other grouping schemes such as clustering data into groups of 2 or 4
were not considered.

2) Adjustment of the prior O-D estimates.

During interval (j), the prior O-D estimate X(d,j-l), (d=day, j=interval) is

multiplied by a correction factor. The correction factor is:

if X(d-1,j-1) > 0 then X(d-Lj) / X(d-1,j-1)
if X(d-1,j-1) = 0 then X(d-1,j) / X(d-1,j-1) = 1.0

3) Problems with the On-line data. Observed link counts.

Volume data collected by the detectors in the streets often provides volume data
for only one or two lanes out of a multi - lane street. Therefore, the volume data
that is a major input data in the model is incomplete. To adjust for lack of data,
a correction factor based on the number of lanes was used.

4) Observed link speeds

The speed information obtained from traffic counters is an average speed of .
vehicles going over detectors during the green phase of traffic signals. This
speed does not represents the distribution of a vehicle’s speed while traveling

across a link. It only reﬂects the speed of the vehicles passing over the detectors,

96
and this may vary according to the distance from the counter location to the
intersection.
5) Observed link delay
Delay values obtained from the detectors represent delay due to stopping at an

intersection due to a red traffic signal. No on-line delay information is available

for random rnidblock delay on streets.

97

Figure 7

Flow Chart Depicting the overall OLS - GLS

Procedure

 

l

 

 

 

 

 

 

 

Data Preparation

 

 

 

 

 

Calculation of link
travel time

 

 

 

Calculation of path
total travel time

 

 

 

Calculation of the Assignment ‘

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(.____1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

‘ tnx
Interval t=1, day 1 only
Apriori 1987 O-D table #1 OLS OLS
O-D estimates
1...... trafﬁc
counts interval (t)
Starting from t>1 ) .
Prior interval (t-l) '-
O-D estimate Link Estimation
Procedure
Link counts
estimates interval (t)
Error Variance
O-D error Calculation Counts error
variance variance
Augmented
Inverse error
variance

 

 

 

<19

 

 

 

 

 

 

 

 

Interval t=l, day 1 only
requiresApriori

98

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1987 O-D table , GLS GLS
‘ O-D estimates
interval (t)
Link traffic
counts interval (t)
Starting from t>1requires . ,
Prior interval (t-l)
O-D estimates 1 )
Link Estimation
Procedure
1.1
Link counts
estimates interval (1)
Check the validity of
O-D estimates by
calculating RMSE and V
%RMSE for the estimated
link counts

 

 

 

 

=t+1

 

 

 

 

 

 

0“. 411')» 1 l \ ...”...ao . 1
.94.}. . 1.). : L ..M... .hu. eve-511.5%.
. . ...e o. _
. o. / .5 £31.. .. 1:0 : .H.._...> ...5 8.3.2.. HI:
.. {JR ’8". .03 )1) .7... /

 

ii... \\ ...-:3”. . [Uphill ....1..Biu...ﬂ. ... ..Wur.
ll. VmMMW-MWWnﬂmmﬂh Ell"
tohu1\ .EZmu \. on
mask. :28? v

      
 

. . 7 .... 2030.
a

.zoJmom

     

     

/ v. .. ..
../_... .3

 

l
W

 

In":

     

=-

Eozzmu m ,_
a 2 3:3 wmocwmimmuxx VBmEzmm—LEO 2: .o 5:80.— 9: 956380 .wEBED

w 935...

100

Table 7

Comparison of Predicted Versus Observed Link Flows - OLS run - Day 1

 

 

Intervals Xobs Xest RMSE %RMSE
1 1536 4184 64 242
2 1485 2995 43 167
3 2205 3592 46 121
4 3055 4660 57 108
5 3613 4985 61 98
6 3904 4990 57 85
7 4532 5658 65 83
8 5648 6391 70 72
9 6495 6958 81 72
10 6878 6903 83 70
11 6920 6930 80 67
12 7070 6688 80 65

 

101

Table 8

Comparison of Predicted Versus Observed Link Fows - OLS run - Day 2

 

 

Intervals Xobs Xest RMSE %RMSE
1 1509 3823 55 212
2 1638 2950 42 148
3 2569 3765 46 103
4 2928 4109 49 95
5 3757 4877 55 86
6 4348 5418 63 83
7 5071 5842 62 71
8 5928 6630 66 65
9 6760 7129 78 67
10 7076 7153 81 66
11 7172 7040 82 66
12 7384 7098 83 65

 

102

Table 9

Comparison of Predicted Versus Observed Link Flows - OLS run - Day 3

 

 

Intervals Xobs Xest RMSE %RMSE
1 1553 3935 58 217
2 1648 3021 45 157
3 2428 3602 44 106
4 3010 4204 50 96
5 3809 4841 60 91
6 4468 5576 63 82
7 5011 6026 65 75
8 S922 6762 73 72
9 6594 7215 80 70
10 7133 7616 84 69
11 7570 7816 88 67
12 7421 7436 85 66

 

103

Table 1 0

Comparison of Predicted Versus Observed Link Flows - OLS run - Day 4

 

 

Intervals Xobs Xest RMSE %RMSE
1 1532 3780 57 215
2 1622 2820 42 150
3 2431 3598 46 109
4 3058 4355 51 98
5 4060 5421 61 86
6 4490 5631 65 84
7 5134 6284 70 79
8 6150 7303 79 74
9 6754 7652 87 75

10 6661 6668 82 71
11 7524 6507 84 65
12 7419 6302 87 68

 

104

Table 11

Comparison of Predicted Versus Observed Link Flows - OLS run - Day 5

 

 

Intervals Xobs Xest RMSE %RMSE
l 1521 3788 54 205
2 1616 3055 43 153
3 2378 3485 42 102
4 2872 4128 50 102
5 3805 4890 54 83
6 4157 5132 57 79
7 4882 5986 65 77
8 5616 6248 67 69
9 6374 6973 78 71
10 6654 6705 73 64
11 7245 7422 83 66
12 7331 7249 81 64

 

105

OS

of

03

ON_

89

.0200

0m

00

ON

 

 

 

ill..- ow

 

 

 

 

 

 

ow

 

§

 

om—

 

OS

 

 

 

 

 

 

 

 

 

 

Oo—

 

 

m >00 - _ .282... ...—.500 0.5. 00.08.30 .»> .0200 - c2 30 - o 059“.

om—

pamwusa

106

8m

ow—

Oo—

03

ON.

.0200
8 .

ow

om

 

 

 

om

 

ll

 

 

 

ow

 

 

 

 

Ii

om.

 

ov—

 

oo—

 

 

 

 

 

 

 

 

 

 

 

om.

 

 

m >00 - n .0202. £5.00 03.00.08.210 9.0200 - c2 30 - 9 059.

SN

papwusa

107

0mm

8m

.03

.0200

on.

8.

 

 

 

 

 

..I

II

 

Ii

 

 

 

 

 

 

 

 

 

 

 

 

m >00 - o .0202. 3.500 0.5. 00.05.30 .»> .0200 - 2 05m:

-8—

00—

8m

0mm

8m

09..

palowusa

108

omv

0mm.

8m

0mm

.0200

on.

8.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 . o .0202. 2.500 0.5. 00.05.00 ....> .0200 - :2 20 - a. 050.“.

00.

on.

oow

0mm

com

0mm

80

Om.V

papwgse

169

090.

80

0mm

8m,

.0200

8m

8m

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 - a. .0202. 22:00 2.. 00.02.00 ....> .0200- :2 30 - n. 050.“.

8.

0m.

8m

Dom

8m

8m

80

on.0

pawwysa

110

Table 12

Comparison of Predicted Versus Observed Link Flows — GL5 run - Day 1

 

 

Intervals Xobs Xest RMSE %RMSE
1 1536 2787 51 194
2 1485 2269 43 166
3 2205 2335 43 113
4 3055 2439 49 93
5 3613 2445 57 91
6 3904 2475 58 85
7 4532 2623 67 86
8 5648 2746 80 82
9 6495 2935 95 85
10 6878 2945 101 85
11 6920 3057 102 85
12 7070 3131 100 82

 

Comparison of Predicted Versus Observed Link Flows - GLS run - Day 2

111

Table 13

 

 

Intervals Xobs Xest RMSE %RMSE
1 1509 2630 47 179
2 1638 2217 42 149
3 2569 2313 40 91
4 2928 2466 42 83
5 3757 2724 49 76
6 4348 2879 57 76
7 5071 3114 62 71
8 5928 3703 68 67
9 6760 4159 79 68

10 7076 4316 83 68
11 7172 4424 86 69
12 7384 4441 88 69

 

112

Table 14

Comparison of Predicted Versus Observed Link Flows - GL5 run - Day 3

 

 

Intervals Xobs Xest RMSE %RMSE
1 1553 2621 47 176
2 1648 2123 39 136
3 2428 2204 34 81
4 3010 2463 38 73
5 3809 2931 46 70
6 4468 3285 51 67
7 5011 3635 56 65
8 5922 4590 59 57
9 6594 5064 67 59
10 7133 5338 74 60
11 7570 5711 80 61
12 7421 5655 79 62

 

113

Table 15

Comparison of Predicted Versus Observed Link Flows - GL5 run - Day 4

 

 

Intervals Xobs Xest RMSE %RMSE
1 1532 2601 48 180
2 1622 1869 33 118
3 2431 2197 33 78
4 3058 2591 37 70
5 4060 3209 47 67
6 4490 3531 51 66
7 5134 3976 58 65
8 6150 5131 63 59
9 6754 5661 71 61
10 6661 5410 70 61
11 7524 5293 81 63
12 7419 5503 82 64

 

114

Table 16

Comparison of Predicted Versus Observed Link Flows - GLS run - Day 5

 

 

Intervals Xobs Xest RMSE %RMSE
1 1521 2769 49 186
2 1616 2020 33 119
3 2378 2195 30 73
4 2872 2583 35 71
5 3805 3659 51 77
6 4157 3933 55 77
7 4882 4423 58 69
8 5616 5512 60 62
9 6374 6387 71 64
10 6654 6967 75 65
11 7245 7558 86 69
12 7331 7662 87 69

 

115

ow.

.0200

 

 

 

 

 

 

 

 

 

 

oo. ow. ON. 8. 8 ow 0m 0 .
. M I O
I
I
H - om
- I

l I

I - I O?
I
I
W _. 8
I I
II 1

.. 8
oo.

.. .-
ON.

I
00.
00.
I I

ow.

 

 

 

 

 

 

 

 

 

 

m >00 - . .0202. £500 2.. 00.02.00 ....> .0200 .. :2 30 .. v. 050.“.

9310011133

116

Co.

00.

ON. .

CO.

.0200
on

00

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 - n .0202. «.2300 02.. 00.02.00 .2. .0200 - :2 30 - m. 050.“.

ON

00

0w

8.

ON.

00.

00.

pagownsa

117

own

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8m .8m 8m on. 8. om
II
I I I
I III I l
.....rl .
I I
I I I I
. ..
I
I .r
I
I
I I
‘
I
I
I

 

 

m >00 .. o .0202. 32:00 2.. 00.02.00 .»> .0200 .. :2 «.0 - o. 050.“.

8.

.8.

8m

8m

8m

omm

pamnsa

118

own

0mm

.0200
8m

00.

8.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 - o .0202. 02:00 2.. 00.02.00 .0> .0200 - :2 30 .. 2 059.

8.

8m

00m

119

omv

.0200

 

 

 

 

 

 

 

 

 

 

80 can . 8m 0mm 8m 0m. 8. on
I
I I IF I
I I I
I 5 I I
I I I I I
III
I + I
I I I
I ml? 7
I I
I I
I
I
III
I
I I
u
I
I

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 - a. .0202. 02:00 2.. 00.02.00 .0> .0200- :2 30 - o. 059“.

80

09.»

126

Table 17

Comparison of Predicted Versus Observed Link Flows - OLS run -

Adjusted Data - Day 5

 

 

Intervals Xobs Xest RMSE %RMSE
1 2917 5735 69 138
2 3109 4879 55 102
3 4580 6052 60 76
4 5483 7037 72 76
5 7331 8838 84 66
6 8052 9480 91 66
7 9452 10821 99 61
8 11072 12184 107 56
9 12213 13524 121 57
10 12844 12907 105 47
11 13949 13781 115 48
12 14271 13895 112 45

 

121

0mm

Dow

.0200
om .

8.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00

0 >00 00.3.0.0 .. — .0202. 2:300 V...: 00.02.00 .o> .0200- :2 30 .. o— 0.30...

122‘

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

80 0mm 8 0mm 8 on. 8. on o
I -‘ I.

I III
WW I.

I
5 I
I
I I
I

 

 

 

 

 

 

 

 

 

 

 

 

0.00
n >00 00—3—00 .. n .0202. «2:00 v...... 0208.00 £10200 .. :2 30 .. cw 059..

O

18.

on.

omw
8m
own

9910'”

123

8.

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00
m >00 000200 - o .0202. 2:00 2.. 00.02.00 ....> .0200 - :2 30 .. .a 050..

.8,

§§§

93100-11193

124

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00 m >00 000200 - o .0202. 02:00 2.. 00.02.00 .2, .0200 .5. 30 - «a 050..

-8.

§§§§§§§§§

.

93100-31139

125

8>

.0200

8m

8w

 

 

 

 

 

I

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00

0. >00 000200 .. a. .0202. 02:00 2.. 00.02.00 .0> .0200 - 2.. 30 - mm 050.“.

-8.

8w

§§§§§§§

WWII“

126

Table 18

Comparison of Predicted Versus Observed Link Flows - GLS run -

Adjusted Data - Day 5

 

 

Intervals Xobs Xest RMSE %RMSE
1 2917 2963 49 94
2 3109 2773 38 67
3 4580 3592 55 67
4 5483 5352 75 75
5 7331 7954 123 94
6 8052 8978 141 101
7 9452 11855 165 101
8 11072 18480 284 149
9 12213 23404 396 188
10 12844 26780 479 216
11 13949 29884 543 226
12 14271 32507 630 256

 

127'-

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

On. OO— 03 ON. 8. Co 00 00 ON 0
I I I I o
I I I I
I I “Illlll I I I-IIII ON
II
I qI I I I “WEIIIIIﬁ 00
I I
I I
I I “u 00
I II
. .I. 8
8.
I I
I ON.
I I
00.
I 00.
. . 8.
0.00

m >00 000200 - . .0202. 02:00 2.. 00.02.00 .0> .0200 :2 30 - cm 050.“.

Palm-””99

128

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

cow 8 on. 8. on 0
ill
I I IIIII I
E I II I
I IIII I I I I I I I III:
I I I I I
I I I I I II I
I I I I
I
I
I
I I
I
I
0.00m

>00 000200 - n .0202. 02:00 2.. 00.02.00 .m> .0200 - :2 30 - mm 050.“.

8.

on.

8N

0mm

pawwusa

129

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00 m
>00 000:.00 .. 0 .0202. 02:00 2.. 00.02.00 .0> .0200 .. :2 30 - ca 050.“.

§ § §
papwusa

§ §

130

.0200

8.

 

 

 

 

 

 

 

 

 

 

 

 

0.00

m >00 000:.00 - 0 .0202. 02:00 2.. 00.02.00 .0> .0200 .5. 30 .. R 050.“.

335

0010041130

é

8mm

131

8mm

.0200
8 .

8.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m >00 00.00.00 - a. .0202. «.2300 v.2. 00.02.00 .m> .0200 .. 30 - mm 0.30.“.

0.00

E

Palowllsa

132

Table 19

Comparison of predicted to observed link ﬂows -

Alternate GLS run - day 1

 

 

Intervals Xobs Xest RMSE %RMSE
1 1536 3517 58 219
2 1485 3556 61 238
3 2205 4145 67 177
4 3055 3838 58 109
5 3613 3959 58 93
6 3904 3732 47 69
7 4532 4441 54 69
8 5648 4945 61 62
9 6495 5766 68 61

10 6878 6201 69 58
11 6920 6549 70 59
12 7070 6739 67 58

 

133"

Table 20

Comparison of Predicted Versus Observed Link Flows

Alternate GLS run - day 2

 

 

Intervals Xobs Xest RMSE %RMSE
1 1509 3278 52 200
2 1638 3367 58 207
3 2569 3928 56 126
4 2928 3653 49 97
5 3757 4109 47 73
6 4348 4495 46 62
7 5071 5221 48 55
8 5928 5908 54 53
9 6760 7169 66 57

10 7076 8127 79 65
11 7172 8427 87 70
12 7384 8693 86 67

 

134

Table 21

Comparison of Predicted Versus Observed Link Flows -

Alternate GL5 run - day 3

 

 

hnuuvak; IXobs tht RLASE QGRIASE
1 1553 3176 51 192
2 1648 3240 56 196
3 2428 3722 51 123
4 3010 3687 47 91
5 3809 4453 50 76
6 4468 5453 75 98
7 5011 5889 78 91
8 5922 5873 94 92
9 6594 6976 68 6O
10 7133 7764 80 65
11 7570 3953 99 76
12 7421 4985 80 62

 

135

Table 22

Comparison of Predicted Versus Observed Link Flows -

Alternate GLS run - day 4

 

 

Intervals Xobs Xest RMSE %RMSE
1 1532 2601 47 179
2 1622 2724 51 181
3 2431 3092 45 107
4 3058 3143 38 73
5 4060 3895 43 61
6 4490 4888 49 63
7 5134 4699 70 79
8 6150 4285 67 64
9 6754 6162 67 57

10 6661 6690 72 62
11 7524 7293 109 84
12 7419 9776 203 160

 

136

Table 23

Comparison of Predicted Versus Observed Link Flows -

Alternate GLS run - day 5

 

 

Intervals Xobs Xest RMSE %RMSE
1 2917 3066 48 90
2 3109 3647 45 78
3 4580 4521 47 56
4 5483 5936 57 56
5 7331 8508 86 69
6 8052 11005 121 87
7 9452 8125 93 57
8 11072 15775 159 83
9 12213 22161 287 136

10 12844 24096 382 172
11 13949 20149 246 102
12 14271 19732 231 94

 

om. oo— . 03

ON.

.0200

ow

ov

 

 

 

 

 

 

 

137

 

 

 

 

 

 

 

 

 

D

 
  

 

 

 

 

 

 

 

00.0200 - F .0202. 02:00 2.. 00.02.20 .0> .0200 . 2:. 0.0 0.02.0.3 . 0a 050.".

0.00 0 >00

ov

oo—

on.

00—

patamusa

138

.0200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00 0 >00 - :0 u.
2.0 00 0. .
. 0 n .0202. 02:00 2.. 00.02.00 .0: .0200 .. 2:. 0.0 0.0 ..<
00.0: 0 -

 

paiaumsa

139

.0200

com com. 000 com 8m 8. o

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0 .00
00.0200 . m .0202. 02:00 2.. 00.02.00 .0> .0200 . 2:. 01.0 0.02.024 - 5 050.".

 

DGIBLUIISG

140

.0200
oom — coo —. com com 000 8m

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0.00 0 .00
00.0200 - 0 .0202. 02:00 2.. 00.02.00 .0> .0200 - 2:. 0.0 0.020.? - an 050.”.

 

paxewusa

141

333
83 009 . 000— 8m 8m 09.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

m E. 8.3.8
- up .9595 «:53 x5. $38.60 .m> 338a . E: mac 29:22 - an 2:9“.

 

DGIBWIISO

Chapter 8

Conclusion

This research started with two objectives in mind:

1) Development and validation of a new dynamic O-D
estimation method.

2) A determination of model accuracy when applied to an
urban area where full data is not available.

8.1 Model Development

A new method for estimating origin-destination trips in an urban area based on
dynamic traffic count data was proposed. The method is configured as an
optimization model consisting of two linear systems of equations. The first
equation treats the OD volumes as time-variant variables, which are dependent
on the time series of link flows. The second equation expresses a causal
relationship between the past and present O-D volumes. This causal
relationship is similar to the Kalman State equation, but varies from it in its

spatial dependency.

In the Kalman filtering setup the State equation defines the future interval O-D
volumes as a linear combination of the present interval O-D volumes. In the
proposed method, unlike the State equation, the previous interval O-D estimate
is defined in terms of the present interval O-D volumes. With this approach
additional information can be obtained, which is used to identify the structure

and size of the flows in urban road system.

142

143

8.2 Implementation

The optimization method used to derive estimates of GB volumes is the Least
Square Estimation. This approach is common in all time-dependent O-D
estimation models. However, each model uses the LS method in a specific
manner. For example, in the Kalman filtering approach the model starts the
optimization process with some initial values for the coefficient and variance
matrices, applies the LS estimation procedure, updates the coefficients and

variances for the next interval and repeats the process.

No initial values for the parameters are assumed except the prior O-D trips of
interval one of day one. All the parameters needed to calculate the assignment
(fraction) matrix are estimated based on specified relationships. The sequential
OLS application finds O-D estimates for the 12 intervals of each day of the study
period. The link count estimates are then derived from the estimated O-D

volumes.

The GLS estimates are based on an optimization process that finds an unbiased
minimum variance estimate. This is achieved through insertion of weights,
which are an inverse function of link error variances, and adjust for the
deviations from the observed counts. Prior O—D estimates are adjusted by
applying weights which are an inverse function of the O-D error variances to
control the magnitude of variations from the previous O-D estimate. TheGLS
estimation process is also applied sequentially for 12 intervals. During each
subsequent interval, the OD estimates of the previous interval are used as

apriori O-D trips.

144

8.3 Model Verification

The validity of the mathematical relationships was tested by applying the
process to a fully determined artificial network consisting of 7 links and 7 OD
pairs. Data such as volume on the links,and link total travel times were
obtained through a random number generator process. The true O-D volumes
were known. The objective was to replicate O—D volumes by applying the
model. The outcome of the model implementation showed close match between

estimated O-D volumes and real O-D volumes.

The conclusion drawn from implementing the model on synthetic data is that
the model is mathematically sound and given enough information can produce a

reliable outcome.
8.4 Application

The model was applied to a network representing a portion of the city of Boston.
In this network, ﬂow on only 58 of 1124 links was known. Travel speed and
delay were also only available on these same links, and no "true" O-D data

exists.
8.4.1 OLS Runs

The OLS was run sequentially for 12 intervals per day for five days. The
objective was to minimize the magnitude of the RMSE and %RMSE statistics.
Both RMSE and %RMSE stabilized by day five. The distribution of the

145

difference between the estimated link counts and the actual link counts

improved significantly over this analysis period.

8.4.2 GLS Runs

Similar to the OLS runs the RMSEs and the %RMSEs stabilize after day five.
Both RMSEs and %RMSEs remain high.

8.4.3 Alternate OLS-GL5 Runs

A comparison of the RMSEs and %RMSEs of 12 intervals of day 5, show the
RMSE and %RMSE values increase towards the later intervals. Comparison
between the estimated and actual link counts for the alternate GLS runs of day 5
using the adjusted data show some improvement in the distribution around the

mean .

8.5 Revision of Assumptions

In an attempt to improve the results, assumptions made on speed, delay, and
link counts were scrutinized. All three were revised. Speed was generally
raised, delay was adjusted accordingly to reﬂect changes in speed; and counts

were raised based on the number of lanes covered by detectors.

146

8.6 Listing of the Assumptions

The following assumptions were made in this study. These assumptions will
influence the results of the estimating procedure, and should be verified ( or
modified) prior to using the model.

1) The value of the constant p: This constant was set equal to 1.5 based on the
results of the sensitivity analysis, reasonableness of the (31.5 run RMSEs, and the
fact that the minimum path update algorithm developed by the Bureau of

Public Roads (BPR) used an update value of 1.5 for the exponent of v/c. This
value effects the choice of paths between O-D pairs, and thus link flow.

2) The value of the constant B: This constant was chosen to be equal to -O.3.
This choice was based on a reasonable expectation of time savings on the
expressway links contained in the network. This value effects the travel time on

paths which include an expressway, and thus the path choice and link ﬂow.

3) Speed was calculated based on the average speed of monitored links in a
section of the city. This average speed was adjusted by eliminating all the links
with what was considered to be unreasonably slow speeds. This speed was then
assigned to all the links that fell in that section of the city. This assumption was
made because all the individual link speeds were not available. This value
effects the travel time on paths passing through the section, and thus route
choice and link ﬂow.

4) Average delay was derived from total delay and total volume data for the

monitored links within each section of the city. All the links that fell in a

147

particular section were assigned the same delay value. Delay has the same effect

as a change in the link speed.

The OLS procedure was repeated based on the adjusted data. The distribution
of the difference between the estimated link counts and the actual link counts

was significantly improved.

Both the RMSEs and %RMSEs of the adjusted day 5 GLS estimates are higher
than the unadjusted RMSEs and %RMSEs. The error distribution is not
significantly better than the unadjusted GLS estimated link count distribution.

Overall, this dynamic optimization method is a valid mathematical model. Its
applicability to an urban network requires reliable traffic data. When tested on a
network with synthetic data, model gives reliable results. When the model was
applied to the City of Boston network, where full data were not available, and
the parameters had to be estimated from data samples, the model did not give

reliable results.

REFERENCES

1.

10.

11.

Cascetta, E., Inaudi, D., Marquis, G. : Dynamic Estimation of Origin-
Destination Matrices Using Traffic Counts, 1991.

Cascetta, E., Nguyen, S. : A Unified Framework For Estimating or
Updating Origin/ Destination Matrices From Traffic Counts,
Transportation Research-B, Vol. 22B, No. 6, pp. 437-455, 1988.

Cremer, M., Keller, H. : A New Class of Dynamic Methods For the
Identification of Origin-Destination Flows, Transportation Research-
B, Vol. 21B, No. 2, pp. 117-132, 1987.

Cremer, M., Keller, H. : Dynamic Identification of Flows From Traffic
Counts at Complex Intersections. Proc. 8th Int. Symp. on Transp. and
Traffic Theory, University of Toranto press, Toranto, Canada.

Cremer, M., Putensen, K.: Monitoring of Traffic Density Profiles Using
Measurement From Loop Detectors and Vehicle Trip Data. Proc. 11th Int.
Symp. on Transp. and Traffic Theory, Yokohama, Japan.

Keller, H., Ploss, G. : Real-Time Identification of CD Network Flows
From Counts for Urban Traffic Control. Transportation and Traffic
Theory (Gartner N. H. and Wilson N. H. M. eds.), Elsevier Science
Publishing Co., Inc.

Okutani, I. : The Kalman Filtering Approaches in Some Transportation
and Traffic Problems, Transportation and Traffic Theory, pp. 1-11, 1987.

Ploss, G., Philipps, P., Inaudi, D., Keller, H. : MOTION - A New Traffic
Control Concept Based on Real-Time Origin-Destination Information,
Transportation and Traffic Theory, 1990.

Van Arde, M., Hellinga, B., Yu, L., Rakha, H. : Vehicle Probes as Real-
Time ATMS Sources of Dynamic OD and Travel Time Data. CITE ’93
Conference Compendium. Edmonton, Alberta.

Van Zuylen, H. j.: Some Improvements in the Estimation of an O-D
matrix from Traffic Counts, Proc. of the 8th Int. symp. on Transportation
and Traffic Theory, Toronto, 1981.

Ben-Akiva, M., Lerman, S. : Discrete Choice Analysis : Theory and
Application to TravelDemand. MIT Press, 1985.

I48

HIGH A

to N STATE UNIV. LIBRARIES
1|“WWW"IWMWWIHIIIHIINWHWWI
31293014107324