Basic Concepts
• Econometrics
• Variables
• Population
• Sample
• Types of Variables
• Real Limits
• Measuring Variables
• 4 Types of Measurement Scales
• Experiments
• Data
• Sampling Error
Determining Scale of Measurement
• 3 steps
• Examples
• Try a few
Scales (levels) of Measurement
To determine level of measurement
• Observational unit?
– One school bus
• Place bus in category?
– No
• Measurement Unit?
– One student
• Absolute zero?
– Yes, bus could carry no students
• Level of measure?
– Ratio
What Level of Measurement?
Students Fail v. Pass
• Observational unit?
– One student
• Place student in category?
– Yes: fail or pass
• Are the categories ranked on some quality?
– Yes (underlying continuous variable: success in learning)
• Level of measurement?
– Ordinal (the categories are ranked, but the intervals between them are undefined)
Frequency Distributions
Frequency Distribution Tables
Frequency Distribution Tables (cont.)
• A third column can be used for the
proportion (p) for each category: p = f/N.
The sum of the p column should equal
1.00.
• A fourth column can display the
percentage of the distribution
corresponding to each X value. The
percentage is found by multiplying p by
100. The sum of the percentage column is
100%.
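The proportion and percentage columns described above can be sketched in Python; the sample scores below are hypothetical, not taken from the text:

```python
from collections import Counter

# Hypothetical set of N = 10 scores (illustration only)
scores = [5, 4, 4, 3, 3, 3, 2, 2, 1, 1]
N = len(scores)

freq = Counter(scores)                  # frequency f for each X value
table = []
for x in sorted(freq, reverse=True):
    f = freq[x]
    p = f / N                           # proportion: p = f/N
    table.append((x, f, p, 100 * p))    # columns: X, f, p, percentage

# The p column should sum to 1.00 and the percentage column to 100%
total_p = sum(row[2] for row in table)
total_pct = sum(row[3] for row in table)
```

The assertions at the end mirror the check suggested in the text: if the proportions do not sum to 1.00, a frequency was miscounted.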
Regular Frequency Distribution
Grouped Frequency Distribution
Frequency Distribution Graphs
Histograms
Positively and Negatively
Skewed Distributions
• In a positively skewed distribution, the
scores tend to pile up on the left side of
the distribution with the tail tapering off to
the right.
• In a negatively skewed distribution, the
scores tend to pile up on the right side and
the tail points to the left.
Percentiles, Percentile Ranks,
and Interpolation
• The relative location of individual scores
within a distribution can be described by
percentiles and percentile ranks.
• The percentile rank for a particular X
value is the percentage of individuals with
scores equal to or less than that X value.
• When an X value is described by its rank,
it is called a percentile.
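The definition of a percentile rank translates directly into code; the function name and the data set here are illustrative, not from the text:

```python
def percentile_rank(scores, x):
    """Percentage of individuals with scores equal to or less than x."""
    return 100 * sum(1 for s in scores if s <= x) / len(scores)

# Hypothetical distribution of 10 scores
data = [2, 3, 4, 4, 5, 6, 7, 8, 9, 10]

# 5 of the 10 scores are <= 5, so X = 5 has a percentile rank of 50%;
# equivalently, X = 5 is the 50th percentile of this distribution
rank = percentile_rank(data, 5)
```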
Interpolation
Central Tendency
Changing the Mean
The Median
The Mode
Bimodal Distributions
Central Tendency and the
Shape of the Distribution
• Because the mean, the median, and the
mode are all measuring central tendency,
the three measures are often
systematically related to each other.
• In a symmetrical distribution, for example,
the mean and median will always be
equal.
Central Tendency and the
Shape of the Distribution (cont.)
• If a symmetrical distribution has only one
mode, the mode, mean, and median will
all have the same value.
• In a skewed distribution, the mode will be
located at the peak on one side and the
mean usually will be displaced toward the
tail on the other side.
• The median is usually located between the
mean and the mode.
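These relationships can be checked numerically; the two small data sets below are invented for the illustration:

```python
import statistics

# Symmetrical distribution: mean and median coincide
sym = [1, 2, 3, 4, 5, 6, 7]
mean_sym = statistics.mean(sym)       # 4
median_sym = statistics.median(sym)   # 4

# Positively skewed distribution: the extreme score in the right tail
# pulls the mean toward the tail, so the mean exceeds the median
skewed = [1, 1, 2, 2, 3, 3, 20]
mean_sk = statistics.mean(skewed)     # 32/7, about 4.57
median_sk = statistics.median(skewed) # 2
```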
Variability
Measuring Variability
The Range
The Interquartile Range
The Standard Deviation
z-Scores and Location
Transforming back and forth
between X and z
• The basic z-score definition is usually
sufficient to complete most z-score
transformations. However, the definition
can be written in mathematical notation to
create a formula for computing the z-score
for any value of X.
z = (X – μ) / σ
Transforming back and forth
between X and z (cont.)
• Also, the terms in the formula can be
regrouped to create an equation for
computing the value of X corresponding to
any specific z-score.
X = μ + zσ
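Both directions of the transformation follow directly from the two formulas; the μ = 100, σ = 15 values below are illustrative:

```python
def z_from_x(x, mu, sigma):
    # z = (X - mu) / sigma
    return (x - mu) / sigma

def x_from_z(z, mu, sigma):
    # X = mu + z * sigma
    return mu + z * sigma

# With a hypothetical population mean of 100 and standard deviation of 15:
z = z_from_x(130, 100, 15)   # 2.0: two standard deviations above the mean
x = x_from_z(-1.0, 100, 15)  # 85.0: one standard deviation below the mean
```

Because the second formula is just the first rearranged, converting a score to z and back always recovers the original X.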
z-Scores as a Standardized
Distribution (cont.)
• Because z-score distributions all have the
same mean and standard deviation,
individual scores from different
distributions can be directly compared.
• A z-score of +1.00 specifies the same
location in all z-score distributions.
z-Scores and Samples
z-Scores and Samples (cont.)
• Thus, for a score from a sample,
z = (X – M) / s
• Using z-scores to standardize a sample also has
the same effect as standardizing a population.
• Specifically, the mean of the z-scores will be
zero and the standard deviation of the z-scores
will be equal to 1.00, provided the standard
deviation is computed using the sample formula
(dividing by n – 1 instead of n).
Other Standardized Distributions
Based on z-Scores
• Although transforming X values into z-
scores creates a standardized distribution,
many people find z-scores burdensome
because they consist of many decimal
values and negative numbers.
• Therefore, it is often more convenient to
standardize a distribution into numerical
values that are simpler than z-scores.
Other Standardized Distributions
Based on z-Scores (cont.)
• To create a simpler standardized
distribution, you first select the mean and
standard deviation that you would like for
the new distribution.
• Then, z-scores are used to identify each
individual's position in the original
distribution and to compute the individual's
position in the new distribution.
Other Standardized Distributions
Based on z-Scores (cont.)
• Suppose, for example, that you want to
standardize a distribution so that the new mean
is μ = 50 and the new standard deviation is σ =
10.
• An individual with z = –1.00 in the original
distribution would be assigned a score of X = 40
(below μ by one standard deviation) in the
standardized distribution.
• Repeating this process for each individual score
allows you to transform an entire distribution into
a new, standardized distribution.
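The two-step procedure (use z to locate each score in the original distribution, then place it in the new one) can be sketched as follows; the raw scores are hypothetical:

```python
import statistics

def standardize(scores, new_mu, new_sigma):
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population standard deviation
    # Step 1: z = (x - mu) / sigma locates each score
    # Step 2: X = new_mu + z * new_sigma places it in the new distribution
    return [new_mu + ((x - mu) / sigma) * new_sigma for x in scores]

raw = [2, 4, 6, 8, 10]                       # hypothetical raw scores
new = standardize(raw, new_mu=50, new_sigma=10)
# The transformed distribution has mean 50 and standard deviation 10,
# and a score one standard deviation below the original mean maps to 40
```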
Correlation
Correlations: Measuring and
Describing Relationships
• A correlation is a statistical method used to
measure and describe the relationship
between two variables.
• A relationship exists when changes in one
variable tend to be accompanied by
consistent and predictable changes in the
other variable.
Correlations: Measuring and
Describing Relationships (cont.)
• The direction of the relationship is
measured by the sign of the correlation (+
or -). A positive correlation means that the
two variables tend to change in the same
direction; as one increases, the other also
tends to increase. A negative correlation
means that the two variables tend to
change in opposite directions; as one
increases, the other tends to decrease.
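The sign convention can be illustrated with a small Pearson computation; the data pairs are invented for the example:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))  # sum of products
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 5, 4, 6])  # y tends to rise with x: r > 0
r_neg = pearson_r(x, [6, 5, 4, 3, 1])  # y tends to fall with x: r < 0
```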
The Spearman Correlation (cont.)
The calculation of the Spearman correlation requires:
1. Ranking the scores for each of the two variables separately.
2. Applying the Pearson correlation formula to the resulting ranks.
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
With either one or two dichotomous variables, the
calculation of the correlation proceeds as
follows:
1. Assign numerical values to the two categories of
the dichotomous variable(s). Traditionally, one
category is assigned a value of 0 and the other
is assigned a value of 1.
2. Use the regular Pearson correlation formula to
calculate the correlation.
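The two steps above can be sketched by applying the Pearson formula to 0/1-coded data; the group coding and scores here are hypothetical:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

# Step 1: assign 0/1 to the two categories (e.g., fail = 0, pass = 1)
group = [0, 0, 0, 1, 1, 1]

# Step 2: apply the regular Pearson formula to the coded variable
# and the numerical scores
scores = [3, 5, 4, 8, 9, 7]
r_pb = pearson_r(group, scores)  # point-biserial correlation
```

Swapping which category gets 0 and which gets 1 only flips the sign of r, not its magnitude.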
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• In situations where one variable is
dichotomous and the other consists of
regular numerical scores (interval or ratio
scale), the resulting correlation is called a
point-biserial correlation.
• When both variables are dichotomous, the
resulting correlation is called a phi-
coefficient.
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• The point-biserial correlation is closely related to
the independent-measures t test introduced in
Chapter 10.
• When the data consists of one dichotomous
variable and one numerical variable, the
dichotomous variable can also be used to
separate the individuals into two groups.
• Then, it is possible to compute a sample mean
for the numerical scores in each group.
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• In this case, the independent-measures t
test can be used to evaluate the mean
difference between groups.
• If the effect size for the mean difference is
measured by computing r² (the percentage
of variance explained), the value of r² will
be equal to the value obtained by squaring
the point-biserial correlation.
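This equivalence can be verified numerically: compute the point-biserial r on 0/1-coded data, run the independent-measures t test on the same two groups, and compare r² with t²/(t² + df). All of the numbers below are invented for the check.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sp / math.sqrt(sum((a - mx) ** 2 for a in x) *
                          sum((b - my) ** 2 for b in y))

g1 = [3, 5, 4, 6]    # numerical scores for the group coded 0
g2 = [8, 9, 7, 10]   # numerical scores for the group coded 1
codes = [0] * len(g1) + [1] * len(g2)
scores = g1 + g2

# Squared point-biserial correlation
r2_from_r = pearson_r(codes, scores) ** 2

# Independent-measures t test with pooled variance
n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2
ss1 = sum((x - m1) ** 2 for x in g1)
ss2 = sum((x - m2) ** 2 for x in g2)
df = n1 + n2 - 2
sp2 = (ss1 + ss2) / df                           # pooled variance
t = (m2 - m1) / math.sqrt(sp2 / n1 + sp2 / n2)

# Effect size from the t statistic: r^2 = t^2 / (t^2 + df)
r2_from_t = t ** 2 / (t ** 2 + df)
# r2_from_r and r2_from_t are equal, as the text states
```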