Unit 10 - More Multiple Regression - 1 Per Page
Unit 10 Outline
Example: Inference for 3+ Means – Bone Density
• Studies suggest a link between exercise and healthy bones
• A study of 30 rats examined the effect of jumping on the bone
density of growing rats
• Three treatment groups
– No jumping (10 rats - group 1)
– 30 cm jump (10 rats - group 2)
– 60 cm jump (10 rats - group 3)
• 10 jumps per day, 5 days per week for 8 weeks
• Bone density measured after 8 weeks
• Test to see if the jumping treatments affect bone density
(measured in mg/cm3)
Inference for 3+ Means Example – Bone density
• As always, first visualize the data:
[Figure: side-by-side boxplots of bone density (mg/cm3) for the three groups: 1 – No jumping, 2 – 30 cm jump, 3 – 60 cm jump]
Partial group summary recoverable from the slide (group, mean, SD): 2, 613, 19.3; 3, 639, 16.6
Connection to Classic ANOVA
• In classic ANOVA, the model is set up so that it looks at the group means directly (rather than modeling the differences between groups, as the regression with binary predictors does). This actually makes the algebra easier by hand (but more complicated to explain/understand).
• The F-test from classic ANOVA is mathematically equivalent to the F-test from a regression with binary predictors. It compares the variability between group means (the model) vs. the variability within groups (the error).
• It is less general than our binary predictor approach because it is difficult to include both binary and quantitative predictors.
• Anyway, an example from Stata is shown on the next slide…
ANOVA Results from Stata
. oneway bonedensity group
Analysis of Variance
Source    SS    df    MS    F    Prob > F
[ANOVA table values not recovered from the slide]
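The equivalence of the two F-tests can be checked numerically. Below is a minimal sketch using simulated stand-in data (the actual rat measurements are not reproduced on the slide), computing the between-group vs. within-group F by hand and comparing it to SciPy's one-way ANOVA:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated stand-in for the three bone-density groups (10 rats each);
# the means loosely echo the slide's summary, the data are invented.
groups = [rng.normal(600, 20, 10),
          rng.normal(613, 20, 10),
          rng.normal(639, 20, 10)]

# Hand-computed one-way ANOVA F: between-group MS over within-group MS.
all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
k, n = len(groups), len(all_obs)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
F_manual = (ss_between / (k - 1)) / (ss_within / (n - k))

# SciPy's one-way ANOVA produces the same statistic.
F_scipy, p_value = stats.f_oneway(*groups)
print(F_manual, F_scipy)
```

The same F (with df = k − 1 and n − k) is what a regression of bone density on two group dummies would report as its overall F-test.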
Building a Regression Model with Binary Predictors
• Below are the summary statistics from a survey given out during the
regular school year asking how many texts the student sends per day, split
across class year:
. tabulate year, summarize(text_day)
| Summary of text_day
year | Mean Std. Dev. Freq.
------------+------------------------------------
freshman | 68.074074 73.094736 27
junior | 28.645161 29.214778 31
senior | 21.4 19.043993 20
sophomore | 56.56044 134.71164 91
------------+------------------------------------
Total | 49.118343 104.87412 169
------------------------------------------------------------------------------
text_day | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
soph | -11.51363 22.91892 -0.50 0.616 -56.7658 33.73853
junior | -39.42891 27.53005 -1.43 0.154 -93.7855 14.92768
senior | -46.67407 30.85374 -1.51 0.132 -107.5931 14.24495
_cons | 68.07407 20.12676 3.38 0.001 28.33488 107.8133
------------------------------------------------------------------------------
b) Since all the coefficients for the group differences are negative, that
means the reference group, freshmen, send the most texts. Seniors
send the least since their coefficient is most negative.
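The reasoning in (b) can be verified arithmetically: each dummy coefficient is that group's mean minus the reference (freshman) mean, so intercept + coefficient recovers the group mean from the tabulate output. A quick check with the numbers from the two Stata tables above:

```python
# Group means from the tabulate output:
means = {"sophomore": 56.56044, "junior": 28.645161, "senior": 21.4}

# Regression estimates: _cons is the freshman (reference group) mean,
# and each dummy coefficient is a difference from that mean.
intercept = 68.07407
coefs = {"sophomore": -11.51363, "junior": -39.42891, "senior": -46.67407}

for grp, b in coefs.items():
    print(grp, round(intercept + b, 4), means[grp])
```

Every intercept-plus-coefficient sum matches the corresponding group mean, which is exactly why the most negative coefficient (seniors) marks the group sending the fewest texts.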
------------------------------------------------------------------------------
text_day | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
soph | -11.86717 23.09216 -0.51 0.608 -57.46763 33.73328
junior | -39.0642 27.73669 -1.41 0.161 -93.83627 15.70787
senior | -47.79541 32.14659 -1.49 0.139 -111.2758 15.68496
haircut | .1501553 .1961588 0.77 0.445 -.2372026 .5375133
_cons | 62.95767 21.34816 2.95 0.004 20.80112 105.1142
------------------------------------------------------------------------------
What is the model statement for this regression? What is the interpretation of the coefficient estimates (the b’s) in this model?
Contrasts
• After the omnibus F-test has shown overall significance in a multiple
regression, we can then investigate other comparisons of combinations of
parameters using contrasts
Note: the textbook’s definition of contrasts is for the ANOVA setting, not for multiple regression, so the formulas are completely different but have a similar feel. So do NOT refer to your text for this topic!!!
The multiple comparisons problem
• To test H0: μ1 = μ2 = . . . = μI, why not simply conduct multiple two-
sample t-tests instead of using the binary regression F-test?
• For example, with I = 9 there are 36 possible pair-wise
t-tests. What is the probability of rejecting a true H0 at least once?
– P(at least one Type I error) = 1 – (0.95)^36 ≈ 0.84
• This inflated Type I error is due to multiple comparisons: we have
looked at multiple tests at once (each with α = 0.05), and thus it will
lead to significant results that are not truly there (simply by chance).
• In general, we would like the probability of a type I error to be some
fixed value α (e.g., α = 0.05)
• This is accomplished using the overall F-test
• What happens when we start using multiple contrasts or looking at all
the different t-stats in one regression model?
• If we don’t have any contrasts pre-specified, we can just look at all the
pairwise two-sample t-tests to see a difference in groups, but adjust α.
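The 0.84 figure above can be reproduced in a few lines (treating the 36 tests as independent, as the slide's calculation does):

```python
from math import comb

alpha = 0.05
I = 9                        # number of groups
tests = comb(I, 2)           # 36 possible pairwise t-tests
# Probability of at least one false rejection across all tests,
# assuming the tests are independent:
p_any_type1 = 1 - (1 - alpha) ** tests
print(tests, round(p_any_type1, 2))
```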
The Bonferroni correction
• A solution to the multiple comparisons problem is the adjustment of
α levels using the Bonferroni correction
• This correction is a conservative solution
• Suppose we wish to perform all possible pairs of comparisons
among I groups
• There are C(I, 2) = I! / (2!(I − 2)!) = I(I − 1)/2 such comparisons
• The Bonferroni correction: to protect the overall level of α we must
perform each individual test at level α* = α / C(I, 2)
Example - Bonferroni correction
• Suppose we wish to perform pair-wise comparisons among 3
groups but still maintain an overall α = 0.05
• If I = 3, there are C(3, 2) = 3! / (2!·1!) = 3 possible comparisons
(Group 1 versus 2), (1 versus 3), and (2 versus 3)
• The Bonferroni correction says that if we want an overall α level
of < 0.05, then we do each of the 3 tests at the
α* = 0.05 / 3 = 0.0167 level
• Thus, with each test at level α* = 0.0167, this Bonferroni
correction gives an overall α < 0.05
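The same arithmetic as a short check:

```python
from math import comb

alpha = 0.05
I = 3
pairs = comb(I, 2)           # 3 pairwise comparisons among 3 groups
alpha_star = alpha / pairs   # per-test level under the Bonferroni correction
print(pairs, round(alpha_star, 4))
```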
Transformation of Variables
[Figure: histogram of residuals (roughly 0 to 1500, strongly right-skewed) and a residuals vs. fitted values plot (fitted values roughly 20–70) for the model on the original text_day scale]
cellphones: the number of different cellphones the student has ever owned
fastest_drive: the fastest the students has ever driven, in mph
haircut: how much the student’s last haircut cost ($)
senior: a 0/1 binary variable for whether or not the student is a senior
1. Checking the histogram of the y variable to see if it is skewed:
[Figure: histogram of text_day, ranging from 0 to about 1500]
• It is skewed-right, so we take the log of it:
gen log_text = log(text_day+1)
[Figure: histogram of log_text, ranging from 0 to about 8]
• Looks good, so that is what we will use going forward in all models
Note: the “+1” is just so that our computer does not vomit when we take the log of zero
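For those working outside Stata, the same transformation can be sketched in Python; the sample values below are invented for illustration. NumPy's `log1p` computes log(x + 1) directly and sidesteps log(0) = −∞ for students who send no texts:

```python
import numpy as np

# Hypothetical text_day values, including zeros (made-up data).
text_day = np.array([0, 5, 40, 250, 1000], dtype=float)

# Stata: gen log_text = log(text_day+1)
# np.log1p(x) is log(x + 1), so zeros map to 0 instead of -inf.
log_text = np.log1p(text_day)
print(log_text)
```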
2. Checking the histograms of log(y) vs. x
[Figures: scatterplots of log_text vs. cellphones and log_text vs. fastest_drive; histograms of cellphones vs. log_cellphones, with a scatterplot of log_text vs. log_cellphones; corresponding plots for haircut and log_haircut]
Note: haircut won’t ever get fixed completely since there are a lot of people all piled up at zero. But it is more symmetric and more linear.
3. Fit the Model
. regress log_text log_cellphones log_haircut fastest_drive senior
--------------------------------------------------------------------------------
log_text | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
log_cellphones | .8642687 .1891676 4.57 0.000 .4907163 1.237821
log_haircut | .1166675 .057846 2.02 0.045 .002438 .230897
fastest_drive | .0077078 .0025508 3.02 0.003 .0026707 .012745
senior | -.6377016 .2404366 -2.65 0.009 -1.112496 -.1629077
_cons | .9308021 .4171938 2.23 0.027 .1069629 1.754641
--------------------------------------------------------------------------------
[Figure: histogram of residuals (roughly −4 to 4, fairly symmetric) and residuals vs. fitted values plot (fitted values roughly 2–5) for the transformed model]
a) What is the formula for this regression model?
ŷ = 0.9308 + 0.864(x1) + 0.117(x2) + 0.0077(x3) − 0.6377(x4)
where x1 = log_cellphones, x2 = log_haircut, x3 = fastest_drive, x4 = senior
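To illustrate using the fitted equation, here is a prediction for a hypothetical student (all input values are invented, and this assumes log_haircut was built with the same “+1” convention as log_text):

```python
from math import log, exp

# Hypothetical student: 3 cellphones ever owned, $20 haircut,
# fastest drive 100 mph, not a senior.
x1 = log(3)        # log_cellphones
x2 = log(20 + 1)   # log_haircut (assumed +1 convention)
x3 = 100           # fastest_drive
x4 = 0             # senior dummy

# Predicted log_text from the fitted equation above:
yhat = 0.9308 + 0.864 * x1 + 0.117 * x2 + 0.0077 * x3 - 0.6377 * x4

# Back-transform to the texts-per-day scale (undoing log(y + 1)):
texts = exp(yhat) - 1
print(round(yhat, 2), round(texts, 1))
```

Note that the back-transformed value is a rough point estimate on the original scale; predictions from a log-scale model should generally be interpreted on the log scale.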
Unit 10: Main Points
• When you are trying to compare means across 3 or more groups,
this should be done via a regression with (I – 1) binary predictors
Note: for a categorical response this would be done via a chi-sq test