AICP Review Stats
765-285-5871
Presentation Goals
z Vocabulary
z Basic Statistical Methods
z Shotgun approach
– Brief overviews
Vocabulary
z Constant
– An unchanging value (pi, or π: 3.1416)
z Variable
– A value that can take on different
quantities
z Levels of Measurement
– Nominal, Ordinal, Interval, Ratio
Vocabulary
z Levels of Measurement
– Nominal
• Non-ordered categories (redhead, blonde)
– Ordinal
• Ordered categories (Excellent, moderate, poor)
– Interval
• Equal intervals, arbitrary zero (temperature)
– Ratio
• Zero based, used for proportions
Vocabulary
z Histogram
– A graphic depiction of the
distribution of a variable
z Frequency Distribution
– The pattern or shape
formed by the group of
measurements in a
distribution
Vocabulary
z Ogive Curve
– A frequency distribution curve
in which the frequencies are
cumulative
z Scatter Plot
– Also known as scattergram or
scatter diagram. A two
dimensional graph representing
a set of bi-variate data
Vocabulary
z Sample
– A subset of a population
• Random (probabilistic)
• Stratified (separate groups)
• Systematic (every kth element)
• Cluster (naturally occurring groups)
[Figure: a sample drawn from a population]
Vocabulary
z Parametric
– Assumes that the data is drawn from a
normal distribution
– the data is measured on an interval
scale
– parametric tests make use of
parameters such as the mean and
standard deviation
– T-test, ANOVA, regression, correlation
Vocabulary
z Non-parametric
– Not normally distributed data
– Nominal and ordinal data
– Chi2 (χ2)
Vocabulary – Central Tendency
z Mode
– The most frequently occurring value of
a group of values
z Mean
– is the sum of all values divided by the
number of observations
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
Vocabulary – Central Tendency
z Median
– the 50th percentile, such that half the
values are above the median and half
the values are below
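The three measures of central tendency can be sketched with Python's standard `statistics` module. The list of values is made up for illustration and is not from the presentation.

```python
import statistics

# Hypothetical household sizes; illustrative data only.
values = [2, 3, 3, 4, 5, 3, 8]

mode = statistics.mode(values)      # most frequently occurring value
mean = statistics.mean(values)      # sum of all values / number of observations
median = statistics.median(values)  # 50th percentile of the sorted values

print("mode:", mode, "mean:", mean, "median:", median)
```

Note how the single large value (8) pulls the mean above the median, which is why the median is often preferred for skewed data such as income.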
Vocabulary – Dispersion
z Range
– the maximum value minus the minimum value of a data set
z Variance
– Population
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
– Sample
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
z Standard Deviation
– the square root of the variance
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
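A minimal sketch of the dispersion measures, assuming the population form divides by n and the sample form divides by n − 1. The function names and the data are illustrative only.

```python
import math

def variance(values, sample=False):
    """Average squared deviation from the mean (n - 1 divisor for a sample)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return sum_sq / (n - 1 if sample else n)

def std_dev(values, sample=False):
    """Square root of the variance."""
    return math.sqrt(variance(values, sample))

# Hypothetical data set.
values = [2, 4, 4, 4, 5, 5, 7, 9]
data_range = max(values) - min(values)  # maximum minus minimum

print(data_range, variance(values), variance(values, sample=True), std_dev(values))
```

The sample variance is always a little larger than the population variance for the same data, because the divisor is smaller.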
Statistics – Basic Concepts
z Normal Distribution
– A theoretical frequency distribution for
a set of variable data, usually
represented by a bell-shaped curve
symmetrical about the mean
Statistics – Basic Concepts
z Cross-tabs
– Tables showing the joint distribution of
two or more categorical variables
• e.g., cell counts, cell percentages, expected
values, and residuals
– Various measures of association,
between the variables, can be obtained
– 2-way and 3-way
Statistics – Basic Concepts
z Cross-tabs
– 2-way
Sex * Town Government Crosstabulation

                                                  Town Government
Sex     Statistic                  Excellent      Good   Average      Fair      Poor     Total
Male    Count                             13        25        21         9        10        78
        % within Sex                   16.7%     32.1%     26.9%     11.5%     12.8%    100.0%
        % within Town Government       48.1%     45.5%     38.2%     39.1%     47.6%     43.1%
        % of Total                      7.2%     13.8%     11.6%      5.0%      5.5%     43.1%
Female  Count                             14        30        34        14        11       103
        % within Sex                   13.6%     29.1%     33.0%     13.6%     10.7%    100.0%
        % within Town Government       51.9%     54.5%     61.8%     60.9%     52.4%     56.9%
        % of Total                      7.7%     16.6%     18.8%      7.7%      6.1%     56.9%
Total   Count                             27        55        55        23        21       181
        % within Sex                   14.9%     30.4%     30.4%     12.7%     11.6%    100.0%
        % within Town Government      100.0%    100.0%    100.0%    100.0%    100.0%    100.0%
        % of Total                     14.9%     30.4%     30.4%     12.7%     11.6%    100.0%
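The three percentage rows in a 2-way cross-tab all come from the same cell counts with different denominators. The sketch below recomputes them from the Male and Female counts in the Sex * Town Government table; the helper name `pct` is illustrative.

```python
# Cell counts from the Sex * Town Government cross-tab.
ratings = ["Excellent", "Good", "Average", "Fair", "Poor"]
counts = {
    "Male":   [13, 25, 21, 9, 10],
    "Female": [14, 30, 34, 14, 11],
}

grand_total = sum(sum(row) for row in counts.values())
col_totals = [sum(col) for col in zip(*counts.values())]

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

for sex, row in counts.items():
    row_total = sum(row)
    print(sex, "% within Sex:", [pct(c, row_total) for c in row])
    print(sex, "% within Town Government:",
          [pct(c, t) for c, t in zip(row, col_totals)])
    print(sex, "% of Total:", [pct(c, grand_total) for c in row])
```

Row percentages answer "how do men rate town government?", column percentages answer "who gave the Excellent ratings?", and total percentages describe each cell's share of all respondents.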
Statistics – Basic Concepts
z Cross-tabs
– 3-way, used to test whether a 2-way cross-tab holds within each level of a third variable
Sex * Town Government * Age group Crosstabulation (two age groups shown)

                                                              Town Government
Age group   Sex     Statistic                  Excellent      Good   Average      Fair      Poor     Total
65 or over  Male    Count                              1        11         1         2         2        17
                    % within Sex                    5.9%     64.7%      5.9%     11.8%     11.8%    100.0%
                    % within Town Government       33.3%     50.0%     25.0%     40.0%     66.7%     45.9%
                    % of Total                      2.7%     29.7%      2.7%      5.4%      5.4%     45.9%
            Female  Count                              2        11         3         3         1        20
                    % within Sex                   10.0%     55.0%     15.0%     15.0%      5.0%    100.0%
                    % within Town Government       66.7%     50.0%     75.0%     60.0%     33.3%     54.1%
                    % of Total                      5.4%     29.7%      8.1%      8.1%      2.7%     54.1%
            Total   Count                              3        22         4         5         3        37
                    % within Sex                    8.1%     59.5%     10.8%     13.5%      8.1%    100.0%
                    % within Town Government      100.0%    100.0%    100.0%    100.0%    100.0%    100.0%
                    % of Total                      8.1%     59.5%     10.8%     13.5%      8.1%    100.0%
19 to 29    Male    Count                              1         3         3         3         1        11
                    % within Sex                    9.1%     27.3%     27.3%     27.3%      9.1%    100.0%
                    % within Town Government       50.0%     50.0%     23.1%     60.0%     50.0%     39.3%
                    % of Total                      3.6%     10.7%     10.7%     10.7%      3.6%     39.3%
            Female  Count                              1         3        10         2         1        17
                    % within Sex                    5.9%     17.6%     58.8%     11.8%      5.9%    100.0%
                    % within Town Government       50.0%     50.0%     76.9%     40.0%     50.0%     60.7%
                    % of Total                      3.6%     10.7%     35.7%      7.1%      3.6%     60.7%
            Total   Count                              2         6        13         5         2        28
                    % within Sex                    7.1%     21.4%     46.4%     17.9%      7.1%    100.0%
                    % within Town Government      100.0%    100.0%    100.0%    100.0%    100.0%    100.0%
                    % of Total                      7.1%     21.4%     46.4%     17.9%      7.1%    100.0%
Statistics – Basic Concepts
z Z-score
– A measure of the distance, in standard deviation
units, from the mean
– Used to determine the probability that something
would, or would not, happen
– Unit-less value!
$$z = \frac{x - \bar{x}}{s}$$
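A short sketch of the z-score calculation, followed by one way to turn it into a probability using the standard normal distribution from Python's `statistics` module. The commute-time numbers are hypothetical, and the probability step assumes the variable is normally distributed.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

# Hypothetical example: a commute of 130 minutes when the mean is 100
# and the standard deviation is 15.
z = z_score(130, 100, 15)

# Share of a normal distribution lying below this z-score.
prob_below = NormalDist().cdf(z)

print("z =", z, "P(value below x) =", round(prob_below, 4))
```

Because z is unit-less, the same lookup works whether the original variable was measured in minutes, dollars, or persons.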
Statistics – Basic Concepts
z Hypotheses
– Research hypothesis
• H1 – what we really want to prove
– Null hypothesis
• H0 – what we statistically use, and really want
to prove wrong
– It’s easier to statistically prove
something wrong, thus the null
hypothesis
Statistics – Basic Concepts
z Hypotheses
– Type 1 error
• Reject null hypothesis when it is actually
true
– Type 2 error
• Fail to reject null hypothesis when it is
actually false
Statistical Tests
z Chi2 (χ2)
– Test for a relationship (dependence)
between two nominal or ordinal based
variables.
– Examines the joint probabilities between
the two variables
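As a sketch, the standard Pearson chi-square statistic compares each observed cell count with the count expected if the two variables were independent (row total × column total / grand total). The function name is illustrative; the example counts are the Male and Female rows of the Sex * Town Government cross-tab.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Observed counts: rows are Male, Female; columns are Excellent ... Poor.
observed = [
    [13, 25, 21, 9, 10],
    [14, 30, 34, 14, 11],
]
stat, df = chi_square(observed)
print("chi-square:", round(stat, 3), "degrees of freedom:", df)
```

The statistic is then compared with a critical value from a chi-square table at the chosen significance level and these degrees of freedom; a larger statistic is evidence against independence.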
Statistical Tests
z Chi2 (χ2)
Statistical Tests
z Spearman Rank Correlation
– Correlate two ranked data sets
$$r_s = 1 - \frac{6 \sum_{i=1}^{N} D_i^2}{N (N^2 - 1)}$$
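A minimal sketch of the Spearman rank correlation, where D is the difference between the two ranks of each item and N is the number of items. The rankings are hypothetical, and this shortcut formula assumes there are no tied ranks.

```python
def spearman_rs(ranks_x, ranks_y):
    """Spearman rank correlation from two lists of ranks (no ties)."""
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical rankings of five neighborhoods on two measures.
income_rank = [1, 2, 3, 4, 5]
park_access_rank = [2, 1, 4, 3, 5]

rs = spearman_rs(income_rank, park_access_rank)
print("r_s =", rs)
```

Identical rankings give +1, exactly reversed rankings give −1, and values near 0 indicate little association between the two orderings.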
Statistical Tests
Spearman Rank Correlation example
Sum of D^2 is 74
– Squaring r gives us R2
• Calculates how much one variable explains the other
– R2 is an extremely useful measure
• Example
• r = -0.4410, so R2 = (-0.4410)^2 = 0.194; thus var1 accounts for 19.4% of the
variability of var2. That leaves 80.6% to be explained…
– But this still does not indicate causality!
Statistical Tests
z Regression
– Explore the nature of relationships
between variables
– Simple (two variables) and multiple
regression
– Always one dependent and 1 or more
independent variables
Statistical Tests
z Regression
z Issues:
– What is the association between Y and X?
– How can changes in Y be explained by
changes in X?
– What are the functional relationships
between Y and X?
A functional relationship is symbolically written as:
Y = f(X)
Statistical Tests
Simple Regression
Y = b0 + b1X
b0 is the intercept,
b1 is the slope.
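The slide does not say how b0 and b1 are estimated; the sketch below assumes ordinary least squares, the usual choice. The function name and the data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares intercept (b0) and slope (b1) for Y = b0 + b1*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data: X = households (thousands), Y = daily trips (thousands).
households = [1, 2, 3, 4]
trips = [3, 5, 7, 9]

b0, b1 = fit_line(households, trips)
print("intercept:", b0, "slope:", b1)
print("predicted Y at X = 5:", b0 + b1 * 5)
```

The slope is read as the change in Y for a one-unit change in X, and the intercept as the value of Y when X is zero.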
Statistical Tests
z Multiple Regression
– Just an extension of simple linear regression!
Y = b0 + b1X
Y = b0 + b1X1 + b2X2 + … + bnXn
Planning Models - Population
z Population Projections
– Historical
• Historical trends
– Component (Cohort Survival)
• Birth and death rates by cohorts
• Migration can be included
– Ratios
• Use larger unit (state, region) to ratio with
Planning Models - Population
z Population Projections – terms
– Estimates
• Current population levels
– Projections
• Future population levels
– Forecasts
• Selected projection (judgmental)
– Migration
• Migration1990-2000 = (Pop2000 – Pop1990) – Natural increase1990-2000
Planning Models - Population
z Population Projections - Historical
– Linear
$$P_{t+n} = P_t + b(n)$$
$$b = \frac{\sum_{t=2}^{m+1} (P_t - P_{t-1})}{m}$$
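A sketch of the linear projection: b is the average absolute change per period over m historical intervals, and the projection adds b for each of n future periods. The census counts are hypothetical.

```python
def average_change(pops):
    """Average absolute change per period across the historical series."""
    m = len(pops) - 1  # number of intervals between observations
    return sum(pops[t] - pops[t - 1] for t in range(1, len(pops))) / m

def project_linear(pops, n):
    """Project n periods past the last observation at a constant increment."""
    return pops[-1] + average_change(pops) * n

# Hypothetical populations at four consecutive censuses.
history = [1000, 1100, 1250, 1300]

print("b =", average_change(history))
print("projection 5 periods ahead:", project_linear(history, 5))
```

The period length in the projection must match the spacing of the historical observations (for example, decades with decennial census data).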
Planning Models - Population
z Population Projections - Historical
– Exponential
• r is rate of change
$$P_{t+n} = P_t (1 + r)^n$$
$$r = \frac{1}{m} \sum \frac{P_t - P_{t-1}}{P_{t-1}}$$
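A sketch of the exponential projection: r is the average of the period-to-period growth rates, and growth compounds for n future periods. The data are hypothetical.

```python
def average_rate(pops):
    """Average proportional change per period across the historical series."""
    rates = [(pops[t] - pops[t - 1]) / pops[t - 1] for t in range(1, len(pops))]
    return sum(rates) / len(rates)

def project_exponential(pops, n):
    """Project n periods past the last observation at a constant growth rate."""
    r = average_rate(pops)
    return pops[-1] * (1 + r) ** n

# Hypothetical populations growing 10% per period.
history = [1000, 1100, 1210]

print("r =", round(average_rate(history), 4))
print("projection 2 periods ahead:", round(project_exponential(history, 2), 1))
```

Because growth compounds, this method can produce implausibly large values over long horizons, which is the motivation for methods with a ceiling.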
Planning Models - Population
z Population Projections - Historical
– Modified Exponential
• Ceiling to growth (K)
• v is the share of unused capacity (K − P) remaining from one period to the next
$$P_{t+n} = K - [(K - P_t)(v)^n]$$
$$v = \frac{1}{m} \sum \frac{K - P_t}{K - P_{t-1}}$$
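A sketch of the modified exponential projection, assuming the projection for n periods ahead is K minus the remaining capacity (K − P) shrunk by v for each period. The ceiling and the data are hypothetical.

```python
def average_capacity_ratio(pops, ceiling):
    """Average ratio of unused capacity in one period to the period before."""
    ratios = [
        (ceiling - pops[t]) / (ceiling - pops[t - 1])
        for t in range(1, len(pops))
    ]
    return sum(ratios) / len(ratios)

def project_modified_exponential(pops, ceiling, n):
    """Project n periods ahead; population approaches but never exceeds K."""
    v = average_capacity_ratio(pops, ceiling)
    return ceiling - (ceiling - pops[-1]) * v ** n

# Hypothetical build-out ceiling (K) and population history.
K = 2000
history = [1000, 1200, 1360]

print("v =", round(average_capacity_ratio(history, K), 4))
print("projection 2 periods ahead:",
      round(project_modified_exponential(history, K, 2), 1))
```

The choice of K drives the result, so it should be justified separately, for example from zoning capacity or developable land.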
Planning Models - Population
z Population Projections – Historical
– Gompertz
• Similar to modified exponential but more rigorous
Planning Models - Population
z Population Projections – Historical
– Ratio your community to another larger known
quantity
Concept
Rearranging terms
Planning Models - Population
z Population Projections – Cohort
– Population increases with natural increase
and migration
– Pt+n = Pt + N + m
– Pop2006 = Pop2005 + (B-D) + net migration
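The balancing equation can be sketched in both directions: projecting forward from births, deaths, and net migration, or solving for net migration as the residual once both populations are known. All figures are hypothetical.

```python
def project_population(pop, births, deaths, net_migration):
    """Next-period population: start + natural increase + net migration."""
    return pop + (births - deaths) + net_migration

def residual_net_migration(pop_start, pop_end, births, deaths):
    """Net migration implied by population change and natural increase."""
    return (pop_end - pop_start) - (births - deaths)

# Forward projection for one year (hypothetical counts).
pop_2006 = project_population(pop=50000, births=700, deaths=450,
                              net_migration=120)

# Residual method over a decade (hypothetical counts).
migration = residual_net_migration(pop_start=48000, pop_end=50000,
                                   births=6500, deaths=4200)

print("Pop2006:", pop_2006)
print("Net migration:", migration)
```

A negative residual means the area lost residents to migration even though total population grew through natural increase.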
Planning Models - Population
z Population Projections – Cohort
– Assumptions
• New births: women 15-44 years of age
• Births divided by some ratio (m/f)
• Cohorts could be anything
– 0-1
– 1-5
– 5-9
– 10-14
– 15…
Planning Models - Location
z Gravity Models
– Based on attractiveness of a location as a
destination and difficulty getting there
F – attraction
K – regional index
P – mass
d – distance
$$F_{AB} = K \frac{P_A P_B}{d^2}$$
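A sketch of the basic gravity formula: attraction rises with the product of the two masses (populations, for instance) and falls with the square of the distance between them. The values, and the default K of 1.0, are assumptions for illustration.

```python
def gravity_attraction(mass_a, mass_b, distance, k=1.0):
    """Attraction between places A and B: K * P_A * P_B / d^2."""
    return k * mass_a * mass_b / distance ** 2

# Hypothetical towns of 1,000 and 2,000 people.
near = gravity_attraction(1000, 2000, distance=10)
far = gravity_attraction(1000, 2000, distance=20)

print("10 miles apart:", near)
print("20 miles apart:", far)
```

Doubling the distance cuts the attraction to one quarter, which is the defining behavior of the squared-distance term.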
Planning Models - Location
z Gravity Models
– Unconstrained model for generating trips
between two locations
$$p_{ij} = \frac{S_j^{\beta} / d_{ij}^{\lambda}}{\sum_j S_j^{\beta} / d_{ij}^{\lambda}}$$
Probability of a trip between zones i and j
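A sketch of the trip-probability calculation for one origin zone: each destination gets a weight of size^β / distance^λ, and the weights are normalized so the probabilities sum to 1. The exponent values and the data are assumptions for illustration; in practice β and λ are calibrated from observed trips.

```python
def trip_probabilities(sizes, distances, beta=1.0, lam=2.0):
    """Probability of a trip from one origin to each destination zone."""
    weights = [s ** beta / d ** lam for s, d in zip(sizes, distances)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical destinations: a small center 1 mile away and a center
# four times larger but 2 miles away.
sizes = [100, 400]
distances = [1, 2]

probs = trip_probabilities(sizes, distances)
print(probs)
```

With λ = 2, the larger center's size advantage is exactly offset by its distance, so the two destinations are equally likely.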
Planning Models - Location
z Gravity Models
– Residential Location
Planning Models - Location
z Gravity Models
– Lowry Model
• Classic application of a gravity model
• Assigns secondary employment and
population to zones
• Driving force is location of basic
employment
Planning Models - Location
z Gravity Models
– Lowry Model
• Step 1: Calculate population (N) dependent on basic
employment
• Step 2: Calculate non-basic employment (E) dependent
on population
• Step 3: Calculate population (N) dependent on non-
basic employment
• Step 4: Repeat steps 2 and 3 until minimum increase is
reached
$$N_j = \frac{E_i / d_{ij}^2}{\sum \sum E_i / d_{ij}^2} \qquad E^s_j = \frac{N_i / d_{ij}^3}{\sum \sum N_i / d_{ij}^3}$$
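The iteration in the Lowry steps can be sketched for a single zone, leaving out the spatial allocation by distance entirely. This is a simplification, not the full model: the two ratios (`pop_per_job`, `service_jobs_per_person`) and the stopping tolerance are hypothetical parameters, and the loop only converges when their product is below 1.

```python
def lowry_single_zone(basic_jobs, pop_per_job, service_jobs_per_person,
                      tol=0.5):
    """Iterate jobs -> population -> service jobs until increments are small."""
    population = 0.0
    service_jobs = 0.0
    new_jobs = float(basic_jobs)  # jobs whose workers are not yet housed
    while new_jobs > tol:
        # Steps 1 and 3: population supported by the latest round of jobs.
        new_pop = new_jobs * pop_per_job
        population += new_pop
        # Step 2: non-basic (service) jobs needed by that new population.
        new_jobs = new_pop * service_jobs_per_person
        service_jobs += new_jobs
    return population, service_jobs

population, service_jobs = lowry_single_zone(
    basic_jobs=1000, pop_per_job=2.0, service_jobs_per_person=0.2)

print("population:", round(population), "service jobs:", round(service_jobs))
```

Each round adds less than the one before, so the totals settle toward a limit. In the full model, every round would also distribute the new population and service jobs across zones using the distance-weighted shares.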
Planning Models - Economics
z Basic sector
– Those economic sectors that produce
goods for export from the region
– Brings in new $ to the region
z Nonbasic sector
– Those economic sectors that produce
goods for local consumption
Planning Models - Economics
z NAICS
– The North American Industry Classification System
(NAICS) has replaced the U.S. Standard Industrial
Classification (SIC) system
– 6 digit code hierarchy
– All industry is assigned to a consistent set of
categories
– http://www.naics.com/index.html
Planning Models - Economics
z Location Quotient (LQ)
– Estimates if an industry is a basic
industry
– Compares local industry employment to
national employment for the same sector
– If LQ is > 1.0 then industry is an exporter
– If LQ is < 1.0 then imports are implied
Planning Models - Economics
z Location Quotient (LQ)
– Simple ratios of employment
$$LQ = \frac{E_{Rj} / E_R}{E_{Nj} / E_N} \quad (> 1 \text{ or } < 1)$$
$$e^i_{t'} = (1 + R^i_{t-t'})\, e^i_t$$
$$R^i_{t-t'} = \frac{E^i_{t'} - E^i_t}{E^i_t}$$
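A sketch of the location quotient, plus a constant-share projection in which local employment in a sector grows at the same rate as that sector in the larger reference area. Reading lowercase e as local and uppercase E as reference-area employment is an assumption here, and all figures are hypothetical.

```python
def location_quotient(local_sector, local_total, nat_sector, nat_total):
    """Local share of employment in a sector divided by the national share."""
    return (local_sector / local_total) / (nat_sector / nat_total)

def growth_rate(ref_start, ref_end):
    """Proportional change in reference-area employment between t and t'."""
    return (ref_end - ref_start) / ref_start

def constant_share_projection(local_start, ref_start, ref_end):
    """Local employment at t' if it grows at the reference-area rate."""
    return (1 + growth_rate(ref_start, ref_end)) * local_start

# Hypothetical manufacturing employment.
lq = location_quotient(local_sector=2000, local_total=20000,
                       nat_sector=5_000_000, nat_total=100_000_000)
projected = constant_share_projection(local_start=200,
                                      ref_start=1000, ref_end=1100)

print("LQ:", round(lq, 2), "exporter" if lq > 1 else "importer")
print("projected local employment:", round(projected, 1))
```

An LQ of 2 means the sector's share of local jobs is twice its share nationally, which suggests part of its output is exported.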
Planning Models - Economics
z Employment Projections
– Shift-Share technique
• Assumes local economy does not keep
constant share of larger region
• Assumes local sectors grow faster, or more
slowly, so share changes
• Hence, employment shifts into, or out of, a
region
Planning Models - Economics
z Employment Projections
– Shift-Share technique
$$e^i_{t'} = (1 + R^i_{t-t'} + s^i_{t-t'})\, e^i_t$$
$$BM = \frac{e^T_t}{b^T_t}$$
Ratio of the total employment to the total basic employment
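A sketch of the shift-share projection and the base multiplier. The shift term is the amount by which the local sector's growth rate differs from the reference-area rate; how it is estimated is not covered here, so it is passed in as a hypothetical input.

```python
def shift_share_projection(local_emp, ref_rate, shift):
    """Local employment at t' given the reference rate plus a local shift."""
    return (1 + ref_rate + shift) * local_emp

def base_multiplier(total_emp, basic_emp):
    """Total employment supported per basic-sector job."""
    return total_emp / basic_emp

# Hypothetical sector: 200 local jobs, 10% reference-area growth,
# and a local sector growing 5 points faster than the reference area.
projected = shift_share_projection(local_emp=200, ref_rate=0.10, shift=0.05)

# Hypothetical region: 10,000 total jobs, of which 4,000 are basic.
bm = base_multiplier(total_emp=10000, basic_emp=4000)

print("projected local employment:", round(projected, 1))
print("base multiplier:", bm)
print("total jobs from 100 new basic jobs:", bm * 100)
```

A negative shift models a sector that is losing share, and a shift of zero reduces the formula to the constant-share case.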
Planning Models - Economics
z Input-Output Analysis
– A technique used in economics for tracing
resources and products within an economy
– The quantities of input and output for a given
time period, are entered into an input-output
matrix
• one can then analyze what happens within, and
across, various sectors of an economy where growth
and decline takes place and what effects various
subsidies may have
Planning Models - Economics
z Input-Output Analysis
– Example input-output matrix