
Lecture 1 - Trip Generation

The document outlines a lecture on the design of transportation infrastructure, focusing on trip generation as part of the four-stage travel demand model. It covers data collection methods, the importance of socio-economic factors, and the application of linear regression models for predicting trip production and attraction. Additionally, it discusses the process of validating data and the significance of regression analysis in transportation planning.

MTCC6021: DESIGN OF TRANSPORTATION INFRASTRUCTURE

Lecture 1
Trip Generation

Dr. Morsaleen Chowdhury


Civil Engineering & Quantity Surveying
Military Technological College

Lecture Outline

• The Four-Stage Travel Demand Model
  - Data collection and the four main steps of modelling.
• Introduction to Trip Generation
  - Basic definitions and regression analysis.
• Linear Regression Models
  - Single independent variable (Example 1).
  - Two independent variables (Example 2).
• Regression Analysis Using Microsoft Excel
  - Step-by-step analysis using MS Excel (Example 3).

The Four-Stage Travel Demand Model

Introduction
• Transport modelling (also called travel demand modelling) is the most important part of any transportation planning study.
• A transport model starts by defining a study area and dividing it into a number of zones.
• The database for each zone includes the current (base-year) transport networks, population, employment, shopping space, and educational and leisure facilities.
• Because transport models are applied to large systems, they require large amounts of data about the travelers in each zone.
• Data collection, data analysis, and model development may take years.

Data Collection
• Designing the data collection survey for a transportation project is not easy. It requires considerable experience, skill, and a sound understanding of the study area.

1. Survey Design:
   A. Information required from the data collection:
      - Socio-economic data: income, vehicle ownership, family size, etc.
      - Travel surveys: number of trips made by each member of the household.
      - Land use inventory: housing density in residential zones; establishments in commercial and industrial zones.
      - Network data: road networks, traffic signals, junctions, etc., together with inventories of the public and private transport networks.
   B. Defining the study area.
   C. Dividing the study area into zones.
   D. Transport network characteristics.

2. Household Data:
   A. Questionnaire design: number of household members, number employed and unemployed, number of cycles and cars, etc.
   B. Survey administration: conduct the actual survey, e.g. by telephone, mail-back questionnaire, or face-to-face interview.

3. Data Preparation:
   A. Data correction:
      - Household size correction.
      - Socio-demographic correction (differences in sex, age, etc. between the survey sample and the population).
      - Non-response correction (for people who were surveyed but did not respond).
   B. Sample expansion: scale up the survey data so that it represents the total population of the zone, usually by applying an expansion factor.
   C. Validation of results. Data needs to be validated by:
      1) field visits;
      2) computation checks (e.g. a recorded age of 150 years is unrealistic);
      3) consistency checks (e.g. a person under 18 years cannot hold a driving license).
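
The data preparation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` record, the age cut-off of 120 years, and the definition of the expansion factor as zone population divided by the number of valid survey records are all hypothetical choices, not taken from the lecture.

```python
from dataclasses import dataclass

MAX_PLAUSIBLE_AGE = 120  # assumed cut-off; the lecture only says 150 years is unrealistic
MIN_DRIVING_AGE = 18     # from the consistency check quoted in the lecture


@dataclass
class Person:
    age: int
    has_license: bool
    trips_per_day: int


def validation_errors(p: Person) -> list[str]:
    """Return the list of failed checks for one survey record."""
    errors = []
    if not 0 <= p.age <= MAX_PLAUSIBLE_AGE:
        errors.append("implausible age")              # computation check
    if p.has_license and p.age < MIN_DRIVING_AGE:
        errors.append("license below driving age")    # consistency check
    return errors


def expansion_factor(zone_population: int, valid_sample_size: int) -> float:
    """Assumed definition: people in the zone per valid surveyed person."""
    return zone_population / valid_sample_size


# Hypothetical survey records for one zone with 1000 residents.
survey = [
    Person(34, True, 4),
    Person(150, False, 2),  # fails the computation check
    Person(16, True, 3),    # fails the consistency check
    Person(45, True, 5),
]
valid = [p for p in survey if not validation_errors(p)]
factor = expansion_factor(zone_population=1000, valid_sample_size=len(valid))
expanded_trips = factor * sum(p.trips_per_day for p in valid)
print(len(valid), factor, expanded_trips)  # 2 500.0 4500.0
```

In practice the corrections listed under 3A would adjust the weights per household or per person rather than use one factor for the whole zone.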

Travel Demand Modelling

• After the data has been collected and corrected, it is ready to be used in modelling.
• Although various modelling approaches exist, we will discuss the classical transport model, known as the four-stage model:
  1. Trip Generation. (How many trips in total?)
  2. Trip Distribution. (Which zone to travel to?)
  3. Modal Split. (Which mode of transport to take?)
  4. Trip Assignment. (Which route to take?)
• We will discuss each of these stages in detail over the following weeks.

• Trip Generation: the socio-economic and land use data are used to estimate the total number of trips produced by (origin) and attracted to (destination) each zone.
• Trip Distribution: the travel surveys are used to distribute these trips between the zones of the study area. The output is a trip matrix giving the number of trips from each zone to every other zone.
• Modal Split: the trips are allocated to the different modes based on the modal attributes and the socio-economic data. The output is a trip matrix for each mode.
• Trip Assignment: finally, each trip matrix is assigned to the route network of that particular mode using the network data. This step gives the loading on each link of the network.
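
To make the intermediate outputs concrete, the sketch below builds a small trip matrix and splits it by mode. All numbers are hypothetical, and the fixed mode shares are a deliberate simplification assumed here for illustration; the lecture does not specify how the modal split is calculated.

```python
# Hypothetical 3-zone trip matrix: trips[i][j] = daily trips from zone i to zone j.
trips = [
    [0, 120, 80],
    [150, 0, 60],
    [90, 40, 0],
]

# Row sums are the trips leaving each zone, column sums the trips arriving.
# These are the totals that trip generation estimates before distribution.
productions = [sum(row) for row in trips]
attractions = [sum(col) for col in zip(*trips)]

# Modal split with assumed constant shares: one trip matrix per mode.
mode_share = {"car": 0.7, "bus": 0.3}
by_mode = {
    mode: [[share * t for t in row] for row in trips]
    for mode, share in mode_share.items()
}

print(productions)  # [200, 210, 130]
print(attractions)  # [240, 160, 140]
```

The total of the productions always equals the total of the attractions, because both count the same set of trips.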

Introduction to Trip Generation

• Trip generation is the first stage of the classical four-stage travel demand model:
  1. Trip Generation. (How many trips in total?)
  2. Trip Distribution. (Which zone to travel to?)
  3. Modal Split. (Which mode of transport to take?)
  4. Trip Assignment. (Which route to take?)
• The trip generation stage aims to predict the total number of trips produced by and attracted to each zone of the study area.
• Factors that affect trip generation:
  - Income.
  - Vehicle ownership.
  - Family size and composition.
  - Land use characteristics.
  - Distance of the zone from the town center.
  - Accessibility to the public transport system.
  - Employment opportunities.

• In this module, we will discuss the main modelling approach to trip generation: the linear regression model.
• However, before trip generation can be discussed in detail, it is necessary to explain some basic definitions.

Definitions
• Origin: the point at which a trip begins.
• Destination: the point at which a trip ends.
• Journey: an outward movement from a point of origin to a point of destination.
• Trip: an outward and return journey.
• Home-Based Trip: a trip for which either the origin or the destination is the home.
• Non-Home-Based Trip: a trip for which neither trip end is at the home.
• Production:
  - For home-based trips, the production is the home end of the trip, whether the home is the origin or the destination.
  - For non-home-based trips, the production is the origin of the trip.
• Attraction:
  - For home-based trips, the attraction is the end of the trip opposite the home, e.g. the workplace.
  - For non-home-based trips, the attraction is the destination of the trip.
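
The production and attraction rules can be expressed as a short function. The `TripRecord` structure and the zone labels below are hypothetical; the sketch assumes each record stores the origin zone, the destination zone, and the traveler's home zone when one end of the trip is the home (`None` otherwise).

```python
from collections import Counter
from typing import NamedTuple, Optional


class TripRecord(NamedTuple):
    origin: str
    destination: str
    home_zone: Optional[str]  # None marks a non-home-based trip


def production_and_attraction(trip: TripRecord) -> tuple[str, str]:
    """Return (production zone, attraction zone) for one trip record."""
    if trip.home_zone is None:
        # Non-home-based: production is the origin, attraction the destination.
        return trip.origin, trip.destination
    # Home-based: the home end is the production, the other end the attraction.
    other_end = trip.destination if trip.origin == trip.home_zone else trip.origin
    return trip.home_zone, other_end


records = [
    TripRecord("A", "B", home_zone="A"),   # home to work
    TripRecord("B", "A", home_zone="A"),   # work to home
    TripRecord("B", "C", home_zone=None),  # work to shop (non-home-based)
]

productions, attractions = Counter(), Counter()
for rec in records:
    p, a = production_and_attraction(rec)
    productions[p] += 1
    attractions[a] += 1

print(dict(productions))  # {'A': 2, 'B': 1}
print(dict(attractions))  # {'B': 2, 'C': 1}
```

Note that both legs of the commute are produced by home zone A and attracted to work zone B, even though the return leg physically starts in B.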

Regression Analysis
• In transport planning, regression analysis is used to estimate the relationship between independent and dependent variables.
• The most common form of regression model used for trip generation is a multiple linear function of the form:

  $$Y_i = a + b_1 X_1 + b_2 X_2 + \dots + b_r X_r$$

  - $X_1$ to $X_r$ are the independent variables (from the travel surveys), e.g. $X_1$ = total population in the zone and $X_2$ = average household income.
  - $Y_i$ is the dependent variable, which can be the attraction or production of zone $i$, measured in trips/day.
  - $a$ is the intercept constant, and $b_1$ to $b_r$ are the regression coefficients.
• The constants $a$ and $b_1, \dots, b_r$ are obtained by performing regression analysis; the solutions are tedious to obtain manually.
• We will first discuss regression analysis for the cases of one and two independent variables.
• Then we will investigate how to use Microsoft Excel to perform multiple linear regression analysis.

Linear Regression Model (Single Independent Variable)

Simple Regression Model

• Suppose that n observations of a single independent variable (X) and the corresponding dependent variable (Y) have been obtained from a travel survey and plotted as a scatter graph.
• The simple regression model for this case has the form:

  $$Y_e = a + bX$$

  - $Y_e$ is the estimated value of the dependent variable $Y$.
  - $a$ is the intercept constant and $b$ is the regression coefficient.
• The idea of regression analysis is to fit the best line through the observed data, in other words to find the best values of $a$ and $b$.
• Doing this requires some knowledge of statistics, as discussed next.

• To obtain the most accurate simple regression model, we need to define the following deviation terms, where $\bar{Y}$ is the average of the observed Y data and $y = Y - \bar{Y}$:
  - $y_d = Y - Y_e$, the deviation of the observed Y from the regression line ($Y_e$).
  - $y_e = Y_e - \bar{Y}$, the deviation of the estimated value $Y_e$ from the average $\bar{Y}$.
  - $\sum y^2 = \sum y_d^2 + \sum y_e^2$, the total sum of squares of the deviations.
• The regression model is fitted using the least squares criterion:
  - Fit a trial regression line through the data points.
  - Calculate the sum of the squares of the deviations from that line, $\sum y_d^2$.
  - Repeat the above steps for several different trial regression lines.
  - The best line is the one with the minimum sum of squared deviations.

• Minimizing the error leads to closed-form expressions for the constants $a$ and $b$. These require the following statistical parameters:
  - $\bar{X} = \dfrac{\sum X}{n}$, the average of the independent variable X data.
  - $\bar{Y} = \dfrac{\sum Y}{n}$, the average of the dependent variable Y data.
  - $x = X - \bar{X}$, the scatter of the independent variable X about its average $\bar{X}$.
  - $y = Y - \bar{Y}$, the scatter of the dependent variable Y about its average $\bar{Y}$.
• Regression coefficient:

  $$b = \frac{\sum xy}{\sum x^2}$$

• Intercept constant:

  $$a = \bar{Y} - b\bar{X}$$
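
For readers who want to see where these two expressions come from, a short derivation follows. It uses only the symbols defined above, plus $S(a, b)$ as a shorthand (introduced here) for the sum of squared deviations being minimized.

```latex
\begin{aligned}
S(a,b) &= \sum y_d^2 = \sum \left(Y - a - bX\right)^2 \\[4pt]
\frac{\partial S}{\partial a} = 0
  &\;\Rightarrow\; \sum \left(Y - a - bX\right) = 0
  \;\Rightarrow\; a = \bar{Y} - b\bar{X} \\[4pt]
Y - a - bX &= \left(Y - \bar{Y}\right) - b\left(X - \bar{X}\right) = y - bx \\[4pt]
\frac{\partial S}{\partial b} = 0
  &\;\Rightarrow\; \sum X\,(y - bx) = 0
  \;\Rightarrow\; \sum x\,(y - bx) = 0
  \;\Rightarrow\; b = \frac{\sum xy}{\sum x^2}
\end{aligned}
```

The step from $\sum X(y - bx)$ to $\sum x(y - bx)$ uses $\sum (y - bx) = 0$, so subtracting $\bar{X}\sum (y - bx)$ changes nothing.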

Accuracy of the Regression Model

• The coefficient of determination, $R^2$, indicates how well the regression line fits the data:

  $$R^2 = \frac{\sum y_e^2}{\sum y^2}, \qquad 0 \le R^2 \le 1$$

  - $R^2 = 1$: perfectly fitted line.
  - $R^2 = 0$: worst fitted line.
• The standard error of the estimate, $S_e$, is another measure of variability:

  $$S_e = \sqrt{\frac{\sum y_d^2}{n - k}}$$

  - $(n - k)$ is the number of degrees of freedom.
  - $n$ is the number of observations.
  - $k$ is the total number of variables, independent (X) and dependent (Y). For the simple regression model, $k = 2$.
• The standard deviation, $S_d$, measures the level of scatter of the data:

  $$S_d = \sqrt{\frac{\sum y^2}{n - (k - 1)}}$$

• If $S_e < S_d$, the model is regarded as a good fit.



• The regression coefficient $b$ is an estimate and therefore has an error. The standard error of the regression coefficient, $S_{eb}$, measures this error:

  $$S_{eb} = \sqrt{\frac{S_e^2}{\sum x^2}}$$

  - $S_e$ is the standard error of the estimate defined above.
  - $\sum x^2 = \sum (X - \bar{X})^2$ is the sum of the squares of the deviations of the X data.
• The t test is used to determine whether the regression coefficient $b$ is statistically significant:

  $$t = \frac{b}{S_{eb}}$$

  - $b$ is regarded as significant when the calculated t exceeds the tabulated t value for $(n - k)$ degrees of freedom at the chosen level of significance.

Example 1
• The following trip production data have been obtained from a travel survey of a particular zone.

  Average household size (X):     2    3    4    5    6
  Average total trips/day (Y):    5    7    8   10   10

• The value of the t statistic for 3 degrees of freedom (n − k) at the 5% level of significance is 2.353. A 5% level of significance is usually acceptable in statistics.
(a) Develop a simple regression model for the trip production.
(b) Calculate the following statistics to check the validity of the model:
  - Coefficient of determination, R².
  - Standard error of the estimate, Se.
  - Standard error of the regression coefficient, Seb.
  - The t value.
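
Example 1 (Solution)
(a) Tabulate the required calculations from the data, then apply the formulas:
• $n = 5$, the number of observations ($n - k = 3$ degrees of freedom).
• Averages: $\bar{X} = 20/5 = 4$ and $\bar{Y} = 40/5 = 8$.
• Sums of the deviations: $\sum xy = 13$, $\sum x^2 = 10$, $\sum y^2 = 18$.
• Regression coefficient: $b = \sum xy / \sum x^2 = 13/10 = 1.3$.
• Intercept constant: $a = \bar{Y} - b\bar{X} = 8 - 1.3(4) = 2.8$.
• Simple regression model: $Y_e = 2.8 + 1.3X$.

(b) Statistical parameters to measure the accuracy:
• $\sum y_d^2 = 1.1$ and $\sum y_e^2 = 16.9$.
• $R^2 = 16.9/18 = 0.939$.
• $S_e = \sqrt{1.1/3} = 0.606$.
• $S_{eb} = \sqrt{0.367/10} = 0.191$.
• $t = 1.3/0.191 = 6.79 > 2.353$.
• Therefore, the regression coefficient $b$ is significant at the 5% level of significance.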
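
The hand calculation for Example 1 can be checked with a short script. This is a sketch that applies the lecture's formulas directly to the Example 1 data; the function name `simple_regression` and the layout of the returned dictionary are choices made here for illustration.

```python
import math


def simple_regression(X, Y, k=2):
    """Fit Ye = a + b*X by least squares and return the lecture's statistics."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    x = [Xi - x_bar for Xi in X]  # scatter of X about its average
    y = [Yi - y_bar for Yi in Y]  # scatter of Y about its average
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)

    b = sum_xy / sum_x2
    a = y_bar - b * x_bar

    y_est = [a + b * Xi for Xi in X]
    sum_yd2 = sum((Yi - Ye) ** 2 for Yi, Ye in zip(Y, y_est))  # about the line
    sum_ye2 = sum((Ye - y_bar) ** 2 for Ye in y_est)           # line about the mean

    se = math.sqrt(sum_yd2 / (n - k))
    seb = math.sqrt(se ** 2 / sum_x2)
    return {
        "a": a,
        "b": b,
        "R2": sum_ye2 / sum_y2,
        "Se": se,
        "Sd": math.sqrt(sum_y2 / (n - (k - 1))),
        "Seb": seb,
        "t": b / seb,
    }


# Example 1 data: average household size (X) and average total trips/day (Y).
X = [2, 3, 4, 5, 6]
Y = [5, 7, 8, 10, 10]
stats = simple_regression(X, Y)
print({name: round(value, 3) for name, value in stats.items()})
# {'a': 2.8, 'b': 1.3, 'R2': 0.939, 'Se': 0.606, 'Sd': 2.121, 'Seb': 0.191, 't': 6.789}
```

Here Se (0.606) is well below Sd (2.121), and t exceeds the tabulated 2.353, so the fitted line passes both checks described in the lecture.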

Linear Regression Model (Two Independent Variables)
Regression Model for Two Independent Variables
• In reality, we may need to deal with more than one independent variable. The linear regression model for two independent variables has the form:

  $$Y_e = a + b_1 X_1 + b_2 X_2$$

  - $X_1$ and $X_2$ are the two independent variables.
  - $Y_e$ is the estimated value of the dependent variable.
  - $b_1$ and $b_2$ are the regression coefficients and $a$ is the intercept constant.
• The constants $a$, $b_1$ and $b_2$ can be obtained from the following:

  $$b_1 = \frac{\sum x_2^2 \sum x_1 y - \sum x_1 x_2 \sum x_2 y}{\sum x_1^2 \sum x_2^2 - \left(\sum x_1 x_2\right)^2}$$

  $$b_2 = \frac{\sum x_1^2 \sum x_2 y - \sum x_1 x_2 \sum x_1 y}{\sum x_1^2 \sum x_2^2 - \left(\sum x_1 x_2\right)^2}$$

  $$a = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2$$

  where $x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$ and $y = Y - \bar{Y}$.

Accuracy of the Regression Model

• Coefficient of determination, $R^2$:

  $$R^2 = \frac{\sum y_e^2}{\sum y^2}, \qquad 0 \le R^2 \le 1$$

• Standard error of the estimate, $S_e$:

  $$S_e = \sqrt{\frac{\sum y_d^2}{n - k}}$$

  - For the regression model with two independent variables ($X_1$, $X_2$) and one dependent variable ($Y$), $k = 3$.
• The standard error of the regression coefficient is now measured for each of the two coefficients $b_1$ and $b_2$:

  $$S_{b1} = \sqrt{\frac{S_e^2}{\sum x_1^2 \left(1 - r_{12}^2\right)}} \quad \text{and} \quad S_{b2} = \sqrt{\frac{S_e^2}{\sum x_2^2 \left(1 - r_{12}^2\right)}}$$

  - $r_{12}$ is the correlation coefficient between the two independent variables $X_1$ and $X_2$. We will discuss this coefficient again later.

  $$r_{12} = \frac{\sum x_1 x_2}{\sqrt{\sum x_1^2 \sum x_2^2}}$$
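
The two-variable formulas translate directly into code. The sketch below is illustrative only: the function name and the five-zone data set are hypothetical (they are not the Example 2 data), and the infinite t value returned for a perfect fit is a convention assumed here to avoid dividing by zero.

```python
import math


def two_variable_regression(X1, X2, Y):
    """Fit Ye = a + b1*X1 + b2*X2 using the closed-form expressions above."""
    n, k = len(Y), 3
    m1, m2, my = sum(X1) / n, sum(X2) / n, sum(Y) / n
    x1 = [v - m1 for v in X1]
    x2 = [v - m2 for v in X2]
    y = [v - my for v in Y]

    s11 = sum(u * u for u in x1)               # sum of x1^2
    s22 = sum(u * u for u in x2)               # sum of x2^2
    s12 = sum(u * v for u, v in zip(x1, x2))   # sum of x1*x2
    s1y = sum(u * v for u, v in zip(x1, y))    # sum of x1*y
    s2y = sum(u * v for u, v in zip(x2, y))    # sum of x2*y
    syy = sum(v * v for v in y)                # sum of y^2

    denom = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    a = my - b1 * m1 - b2 * m2

    y_est = [a + b1 * u + b2 * v for u, v in zip(X1, X2)]
    sum_yd2 = sum((obs - est) ** 2 for obs, est in zip(Y, y_est))
    sum_ye2 = sum((est - my) ** 2 for est in y_est)

    se = math.sqrt(sum_yd2 / (n - k))
    r12 = s12 / math.sqrt(s11 * s22)
    sb1 = math.sqrt(se ** 2 / (s11 * (1 - r12 ** 2)))
    sb2 = math.sqrt(se ** 2 / (s22 * (1 - r12 ** 2)))
    return {
        "a": a, "b1": b1, "b2": b2,
        "R2": sum_ye2 / syy, "Se": se, "r12": r12,
        "t1": b1 / sb1 if sb1 > 0 else math.inf,
        "t2": b2 / sb2 if sb2 > 0 else math.inf,
    }


# Hypothetical data for five zones (not the lecture's Example 2 table).
X1 = [2, 4, 5, 7, 9]
X2 = [3, 1, 4, 2, 6]
Y = [13, 12, 24, 20, 38]
result = two_variable_regression(X1, X2, Y)
print({name: round(value, 3) for name, value in result.items()})
```

The denominator shared by b1 and b2 approaches zero as r12 approaches ±1, which is why highly correlated independent variables make the coefficients unreliable.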

Example 2
• The following trip attraction data have been obtained from a travel survey conducted over five different zones.

• The value of the t statistic for 2 degrees of freedom at the 5% level of significance is 2.92. A 5% level of significance is usually acceptable in statistics.
(a) Develop a regression model for the trip attraction.
(b) Calculate the following statistics to check the validity of the model:
  - Coefficient of determination, R².
  - Standard error of the estimate, Se.
  - Standard errors of the regression coefficients, Sb1 and Sb2.
  - The t values.

Example 2 (Solution)

Linear Regression Analysis Using Microsoft Excel

Microsoft Excel: Regression Analysis

• In a realistic scenario, trip production is influenced by many factors, not just one or two as in the previous examples.
• Considering many independent variables (X1, X2, X3, …) greatly complicates the regression analysis.
• It is therefore useful to apply computational tools rather than hand calculations. In particular, we will discuss the application of Microsoft Excel (ME) for regression analysis.
• Regression analysis in ME can help answer some very important questions that simplify the problem:
  - Which factors (variables) have the most influence on the trip production/attraction?
  - Which factors (variables) can we ignore?
  - How do the factors (variables) relate to each other?

Example 3
• We will explore the application of ME for regression analysis through a numerical example. In this example, a land-use trip survey gives the employment information for sixteen zones:
  - Four categories of employment, i.e. four independent variables (X1, X2, X3, X4).
  - One dependent variable, Attraction (Y).

Example 3 (Solution)
• The main steps for regression analysis using ME:
1. First, ensure that the Analysis ToolPak is installed. If it is not:
   a. Go to File → Options, then click the Add-Ins tab on the left.
   b. At the bottom, next to Manage: Excel Add-ins, click Go.
   c. Check Analysis ToolPak, then click OK.

2. Develop the correlation matrix of all the dependent and independent variables. The correlation coefficient can be defined between two independent variables, $r_{x_i x_j}$, or between an independent variable and the dependent variable, $r_{x_i y}$:

   $$r_{x_i x_j} = \frac{\sum x_i x_j}{\sqrt{\sum x_i^2 \sum x_j^2}}, \qquad r_{x_i y} = \frac{\sum x_i y}{\sqrt{\sum x_i^2 \sum y^2}}$$

   a. Click the DATA tab, then click Data Analysis on the far right.
   b. In the Data Analysis dialog box, highlight the Correlation tool, then click OK.
   c. In the Correlation dialog box, click in the Input Range box to enter the data.
   d. Using the mouse, highlight all the columns of data, then press Enter.
   e. Repeat the same for the Output Range, this time highlighting an unused group of cells elsewhere on the sheet.
   f. Click OK.
   g. The correlation matrix is displayed in the Output Range that was selected.

3. Examine the correlation matrix in order to detect:
   a) High correlation between an independent variable and the dependent variable. Do the independent variables have a meaningful relation to the dependent variable?
      • Both X1 and X2 are highly correlated with Y, X3 is moderately correlated with Y, and X4 has a low correlation with Y.
   b) High correlation between any two independent variables (collinearity). If two independent variables are found to be highly correlated, eliminate one of them from the regression analysis, as it is not necessary to use both.
      • There is a high correlation between X1 and X2, hence we may eliminate one of these.
      • Why choose X2 to develop Models C and D? Because X2 is less correlated with X3 and X4 than X1 is.
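
The correlation matrix and the collinearity check of steps 2 and 3 can also be reproduced outside Excel. The sketch below uses hypothetical zone data (the Example 3 table is not reproduced here), and the 0.8 threshold for "highly correlated" is an assumption made for illustration; the lecture does not state a numeric cut-off.

```python
import math


def correlation(u, v):
    """Correlation coefficient r = sum(uv) / sqrt(sum(u^2) * sum(v^2)) on deviations."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    return sum(a * b for a, b in zip(du, dv)) / math.sqrt(
        sum(a * a for a in du) * sum(b * b for b in dv)
    )


def correlation_matrix(columns):
    """columns: dict mapping a variable name to its list of observations."""
    names = list(columns)
    return {p: {q: correlation(columns[p], columns[q]) for q in names} for p in names}


def collinear_pairs(matrix, independents, threshold=0.8):
    """Pairs of independent variables whose |r| reaches the assumed threshold."""
    return [
        (p, q)
        for i, p in enumerate(independents)
        for q in independents[i + 1:]
        if abs(matrix[p][q]) >= threshold
    ]


# Hypothetical data for six zones.
data = {
    "X1": [10, 20, 30, 40, 50, 60],
    "X2": [12, 19, 33, 41, 48, 62],
    "X3": [5, 3, 8, 1, 7, 2],
    "Y": [25, 45, 66, 85, 104, 126],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(r, 2) for r in row.values()])
print(collinear_pairs(matrix, ["X1", "X2", "X3"]))  # [('X1', 'X2')]
```

When a pair is flagged, one reasonable rule is to keep the variable that is less correlated with the remaining independent variables, which matches the reasoning given for choosing X2.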

4. Perform regression analysis with the selected set of independent variables:
   a. Go to the DATA tab and click Data Analysis.
   b. In the Data Analysis dialog box, highlight the Regression tool and click OK.
   c. Click in the Input Y Range box, use the mouse to highlight the Y column, and press Enter.
   d. Click in the Input X Range box, use the mouse to highlight the X1 column, and press Enter.
   e. Click in the Output Range box and highlight any unused group of cells on the sheet.
   f. Click OK.
   g. The results of the regression analysis are displayed as tables in the Output Range.
   h. Repeat steps (a) to (f) to determine the other required regression models.
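
For comparison with the spreadsheet output, a regression with any number of independent variables can be sketched in plain Python by solving the normal equations. This is not how Excel is implemented; it is a minimal illustration that should give the same kind of quantities (intercept, coefficients, R², standard error, t values). The function names and the data are hypothetical.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: check for collinear variables")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def multiple_regression(xs, y):
    """xs: list of independent-variable columns; y: dependent-variable column."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in xs] for i in range(n)]  # leading 1 = intercept
    k = len(rows[0])  # independent variables + 1, as in the lecture's k
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

    y_est = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / n
    sum_yd2 = sum((yi - ye) ** 2 for yi, ye in zip(y, y_est))
    sum_y2 = sum((yi - y_bar) ** 2 for yi in y)
    se = math.sqrt(sum_yd2 / (n - k))
    std_err = [se * math.sqrt(inv[i][i]) for i in range(k)]
    t = [c / s if s > 0 else math.inf for c, s in zip(coef, std_err)]
    return {"a": coef[0], "b": coef[1:], "R2": 1 - sum_yd2 / sum_y2,
            "Se": se, "t": t[1:]}


# Hypothetical data: two employment categories and attractions for six zones.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y_demo = [6.1, 7.4, 11.2, 12.3, 18.1, 17.4]
fit = multiple_regression([x1, x2], y_demo)
print(round(fit["a"], 3), [round(b, 3) for b in fit["b"]], round(fit["R2"], 3))
```

Running the function once per candidate set of columns mirrors step (h), where the regression is repeated for each model.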

5. Select the best regression model based on logic and statistics.
   a) What are the magnitudes of the statistical parameters?
      - Coefficient of determination (R²).
      - Standard error of the estimate (Se).
      - Standard error of the regression coefficients (Seb).
      - t-value of the regression coefficients.
   b) Do the regression coefficients (b1, b2, b3, …) have the correct signs, and are their magnitudes reasonable?
      - The relationship between X and Y should be positive: the larger the X, the larger the Y.
      - A negative b would give an illogical meaning.
   c) Are the regression coefficients (b1, b2, b3, …) statistically significant (t-value)?
      - Compare each t-value with the tabulated t-value for the model's degrees of freedom at the 1% level of significance.
   d) Is the magnitude of the intercept constant (a) reasonable?
      - It should not be very large and should not produce illogical values of Y.

Model A:
• R² = 0.992 (the model explains more than 99% of the variation in the data).
• t = 44 > 2.98 (14 d.f., 1% l.o.s.) → the regression coefficient b1 is significant.
• Intercept constant a = 62.2, which is reasonable (judging this requires experience).
• Regression coefficient b1 = 0.93 (positive value → logically correct).
Model B:
• R² > 0.912 (the model explains more than 91% of the variation in the data) → Model A is better; Se is also lower for Model A.
• t = 12.5 > 2.98 (14 d.f., 1% l.o.s.) → the regression coefficient b2 is significant.
• Intercept constant a = 507.7, relatively large compared to Model A.
• Regression coefficient b2 = 0.98 (positive value → logically correct). Eliminate Model B.

Model C:
• R² = 0.996, slightly higher than Model A, and Se = 194, which is lower than for Model A.
• t = (52, 18) > 3.01 (13 d.f., 1% l.o.s.) → the regression coefficients (b1, b2) are significant.
• Intercept constant a = 34, reasonable like Model A.
• Regression coefficients b1 = 0.89 and b2 = 1.26 (positive values → logically correct).
Model D:
• R² > 0.998, higher than Models A and C; Se is also lower than for Models A and C.
• t = (3.7, 1.1, 0.06) compared with 3.06 (12 d.f., 1% l.o.s.) → b2 is significant (3.7 > 3.06), but b3 and b4 are not, i.e. X3 and X4 are not significant in explaining the variation of Y.
• Regression coefficient b3 = −0.37 (negative value → logically incorrect). Model D is invalid → eliminate.

Which to choose between Models A and C?
• Statistically, Model C is only slightly better than Model A. However, Model C requires future values of X2, X3 and X4 in order to predict future attractions.
• Model A requires only X1, so the data collection effort is less. It is also easier to obtain information on total employment. Therefore choose Model A.
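
The selection logic can be written down as a simple screening routine. The R², t and critical values below are the ones reported for Models A to D; the `Model` structure, the 0.01 tolerance on R², and the rule "prefer the fewest variables among near-equal models" are simplifying assumptions introduced here to mimic the lecture's judgment, not a standard procedure.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    r2: float                        # reported coefficient of determination
    t_values: tuple                  # reported t-value of each coefficient
    t_critical: float                # tabulated t at 1% l.o.s.
    all_coefficients_positive: bool  # sign check from step 5(b)
    n_variables: int                 # number of independent variables needed


models = [
    Model("A", 0.992, (44,), 2.98, True, 1),
    Model("B", 0.912, (12.5,), 2.98, True, 1),
    Model("C", 0.996, (52, 18), 3.01, True, 2),
    Model("D", 0.998, (3.7, 1.1, 0.06), 3.06, False, 3),
]


def passes_screening(m: Model) -> bool:
    """Logical signs and every coefficient significant at the tabulated level."""
    return m.all_coefficients_positive and all(t > m.t_critical for t in m.t_values)


valid = [m for m in models if passes_screening(m)]

# Assumed tie-break: among models within 0.01 of the best R2,
# choose the one that needs the fewest independent variables.
R2_TOLERANCE = 0.01
best_r2 = max(m.r2 for m in valid)
near_best = [m for m in valid if best_r2 - m.r2 <= R2_TOLERANCE]
chosen = min(near_best, key=lambda m: m.n_variables)

print([m.name for m in valid])  # ['A', 'B', 'C']
print(chosen.name)              # A
```

The intercept check from step 5(d) is left to judgment here, since the lecture notes that assessing it requires experience.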

Summary
• On completion of the trip generation analysis, we are able to predict for the future:
  - Trip productions.
  - Trip attractions.
• These predictions are based on future values of the input data and are made for every zone.

End Lecture
