
Unit 10: More Regression

1
Unit 10 Outline

• More Multiple Regression Topics


– Binary Predictors to compare 3 or more group means
– Contrast Testing
– Multiple comparisons (& the Bonferroni correction)
– Transformation of Variables

2
Example: Inference for 3+ Means – Bone Density
• Studies suggest a link between exercise and healthy bones
• A study of 30 rats examined the effect of jumping on the bone
density of growing rats
• Three treatment groups
– No jumping (10 rats - group 1)
– 30 cm jump (10 rats - group 2)
– 60 cm jump (10 rats - group 3)
• 10 jumps per day, 5 days per week for 8 weeks
• Bone density measured after 8 weeks
• Test to see if the jumping treatments affect bone density
(measured in mg/cm3)

3
Inference for 3+ Means Example – Bone density
• As always, first visualize the data:

[Side-by-side plot of bone density (mg/cm³), roughly 550 to 700, for the
three groups]

Groups, means, and SD's:

Group             Mean    SD
1 – No jumping     601   27.4
2 – 30 cm jump     613   19.3
3 – 60 cm jump     639   16.6

• We'd like to do a t-test, but there's no t-test formula for 3 groups

• Solution: Regression with Binary Predictors
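
A minimal Stata sketch of this first look (assuming the variables are named
bonedensity and group, as in the oneway output shown later):

. tabstat bonedensity, by(group) statistics(mean sd n)   // means & SD's by group
. graph box bonedensity, over(group)                     // side-by-side boxplots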
4
The F-test in a Binary Regression Model
• A naive application of regression here might use the codes
no jump = 1, low jump = 2, high jump = 3 in a single predictor,
but this is mathematically incorrect: it forces the group means to be
equally spaced along a single line.
• A correct way to apply regression here is to create (I – 1) binary
variables (variables coded 0 or 1, sometimes called dummy
variables) that recreate the groups: here, lowjump = 1 for the 30 cm
group and highjump = 1 for the 60 cm group, with the no-jump group
as the reference.
• Then to determine if there are any differences among the 3
groups, we can just perform the F-test from this regression model,
as sketched below.
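
A minimal sketch of this setup in Stata for the rat data (the names lowjump
and highjump match the contrast examples later in this unit; group is
assumed coded 1/2/3 as above):

. gen lowjump = (group == 2)     // 1 for the 30 cm group, 0 otherwise
. gen highjump = (group == 3)    // 1 for the 60 cm group, 0 otherwise
. regress bonedensity lowjump highjump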

5
Connection to Classic ANOVA
• In classic ANOVA, the model is set up in such a way that it
looks at the group means directly (rather than modeling the
differences between groups like the regression with binary
predictors does). This actually makes the algebra easier by
hand (but more complicated to explain/understand).
• The F-test from classic ANOVA is mathematically
equivalent to the F-test from a regression with binary
predictors. It compares the variability between group means
(the model) vs. the variability within groups (the error).
• It is less general than our binary-predictor approach, because
in classic ANOVA it is difficult to include both binary and
quantitative predictors.
• Anyway, an example from Stata is shown on the next slide…
ANOVA Results from Stata
. oneway bonedensity group

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      7433.86667      2   3716.93333       7.98     0.0019
Within groups       12579.5        27    465.907407
------------------------------------------------------------------------
    Total           20013.3667     29    690.116092

Bartlett's test for equal variances:  chi2(2) =  2.3353  Prob>chi2 = 0.311
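
As a check, the F statistic is just the ratio of the two mean squares in
the table: F = 3716.93 / 465.91 ≈ 7.98, which on (2, 27) degrees of freedom
gives p = 0.0019, so there is evidence that the jumping treatments affect
mean bone density.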

7
Building a Regression Model with Binary Predictors
• Below are the summary statistics from a survey given out during the
regular school year asking how many texts the student sends per day, split
across class year:
. tabulate year, summarize(text_day)
| Summary of text_day
year | Mean Std. Dev. Freq.
------------+------------------------------------
freshman | 68.074074 73.094736 27
junior | 28.645161 29.214778 31
senior | 21.4 19.043993 20
sophomore | 56.56044 134.71164 91
------------+------------------------------------
Total | 49.118343 104.87412 169

• If we were to run a regression with the binary x-variables of just soph,
junior, and senior, what would be the resulting estimated regression
model (aka, the formula for the regression model)? Hint: which group
is the reference group?

ŷ = 68.07 − 11.51(X_soph) − 39.43(X_jr) − 46.67(X_sr)
Solution from Stata:
. regress text_day soph junior senior

Source | SS df MS Number of obs = 169


-------------+------------------------------ F( 3, 165) = 1.31
Model | 43101.4669 3 14367.1556 Prob > F = 0.2718
Residual | 1804660.17 165 10937.3343 R-squared = 0.0233
-------------+------------------------------ Adj R-squared = 0.0056
Total | 1847761.63 168 10998.5811 Root MSE = 104.58

------------------------------------------------------------------------------
text_day | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
soph | -11.51363 22.91892 -0.50 0.616 -56.7658 33.73853
junior | -39.42891 27.53005 -1.43 0.154 -93.7855 14.92768
senior | -46.67407 30.85374 -1.51 0.132 -107.5931 14.24495
_cons | 68.07407 20.12676 3.38 0.001 28.33488 107.8133
------------------------------------------------------------------------------

a) Is there any evidence that these groups send a different number of
texts per day, on average?
b) From this model, which group sends the most text messages? Which
group sends the fewest?
c) What is the estimate of the st.dev. of texts sent within the groups?
Solution:
a) H0: β1 = β2 = β3 = 0
   HA: at least one βi ≠ 0
   α = 0.05
   F = (SSM / df_M) / (SSE / df_E) = 1.31
   p-value = 0.272
Since our p-value is not less than 0.05, we cannot reject the null. The 4
groups may send about the same number of texts per day on average.

b) Since all the coefficients for the group differences are negative, that
means the reference group, freshmen, send the most texts. Seniors
send the fewest since their coefficient is most negative.

c) root MSE = s_e = 104.58


Expanding Regression: combo of binary
and quantitative predictors
• Mathematically, there is no reason we cannot have both binary
and quantitative predictors
• It’s simple to do in software (Stata): just include both types in a
regression model
• This allows us to compare group means while controlling for the
effect of a quantitative predictor.
• This also allows us to look at interactions of effects, which we
will not cover in this course. An interaction effect is where the
effect of one variable differs across different groups
• For example (weight vs. height): every extra inch may add 5
lbs. for men while adding only 3 lbs. for women, on average
• Visual: this allows for non-parallel lines (see the sketch below)
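
A minimal sketch of that height/weight example in Stata (weight, height,
and male are hypothetical variables, not from the course dataset):

. gen male_height = male * height           // hypothetical interaction term
. regress weight height male male_height    // allows a different slope per sex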
Combining Binary and Quantitative
Predictors
. regress text_day soph junior senior haircut

Source | SS df MS Number of obs = 167


-------------+------------------------------ F( 4, 162) = 1.07
Model | 47339.9435 4 11834.9859 Prob > F = 0.3750
Residual | 1798015.69 162 11098.8623 R-squared = 0.0257
-------------+------------------------------ Adj R-squared = 0.0016
Total | 1845355.63 166 11116.6002 Root MSE = 105.35

------------------------------------------------------------------------------
text_day | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
soph | -11.86717 23.09216 -0.51 0.608 -57.46763 33.73328
junior | -39.0642 27.73669 -1.41 0.161 -93.83627 15.70787
senior | -47.79541 32.14659 -1.49 0.139 -111.2758 15.68496
haircut | .1501553 .1961588 0.77 0.445 -.2372026 .5375133
_cons | 62.95767 21.34816 2.95 0.004 20.80112 105.1142
------------------------------------------------------------------------------

What is the model statement for this regression? What is the
interpretation of each coefficient estimate (b) in this model?
Unit 10 Outline

• More Multiple Regression Topics


– Binary Predictors to compare 3 or more group means
– Contrast Testing
– Multiple comparisons (& the Bonferroni correction)
– Transformation of Variables

13
Contrasts
• After the omnibus F-test has shown overall significance in a multiple
regression, we can then investigate other comparisons of combinations of
parameters using contrasts

• This makes a lot of sense in the binary regression setting, where a
combination of parameters can lead to a useful comparison. For the rats
example, what might be an interesting comparison of the 3 groups
involved (no jump, low jump, high jump)?
• We could compare the control (group 1) versus the 2 treatment groups
combined (groups 2 and 3).
• What would this mean in terms of the model β’s?
H0: β1 + β2 = 0
• We may want to compare just the 2 levels of jumping to see if there is
an effect of height (group 2 versus group 3).
• What would this mean in terms of the model β’s?
H0: β1 = β2
14
Results in Stata
After first fitting a regression in Stata, contrasts are easy to do using
the test command (make sure you run the appropriate regress first):

To test whether the combined effect of the two treatment groups is
different from zero, H0: β1 + β2 = 0:

. test lowjump + highjump == 0

 ( 1)  lowjump + highjump = 0

       F(  1,    27) =    8.59
            Prob > F =    0.0068

To test whether the two treatment groups are equal to each other,
H0: β1 = β2:

. test lowjump == highjump

 ( 1)  lowjump - highjump = 0

       F(  1,    27) =    7.37
            Prob > F =    0.0114

Note: the textbook's definition of contrasts is for the ANOVA setting, not
for multiple regression, so the formulas are completely different but have a
similar feel. So do NOT refer to your text for this topic!!!
Unit 10 Outline

• More Multiple Regression Topics


– Binary Predictors to compare 3 or more group means
– Contrast Testing
– Multiple comparisons (& the Bonferroni correction)
– Transformation of Variables

16
The multiple comparisons problem
• To test H0: μ1 = μ2 = . . . = μI, why not simply conduct multiple two-
sample t-tests instead of using the binary regression F-test?
• For example, with I = 9 there are 36 possible pair-wise
t-tests, each at α = 0.05. What is the probability of rejecting a true H0
at least once?
– P(at least one Type I error) = 1 − (0.95)^36 ≈ 0.84
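
This number is easy to verify in Stata:

. display 1 - 0.95^36    // ≈ 0.84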
• This inflated Type I error is due to multiple comparisons: we have
looked at many tests at once (each with α = 0.05), and this will lead to
some significant results that are not truly there (simply by chance).
• In general, we would like the probability of a type I error to be some
fixed value α (e.g., α = 0.05)
• This is accomplished using the overall F-test
• What happens when we start using multiple contrasts or looking at all
the different t-stats in one regression model?
• If we don’t have any contrasts pre-specified, we can just look at all the
pairwise two-sample t-tests to see a difference in groups, but adjust α.
17
The Bonferroni correction
• A solution to the multiple comparisons problem is the adjustment of
α levels using the Bonferroni correction
• This correction is a conservative solution
• Suppose we wish to perform all possible pairs of comparisons
among I groups

• There are (I choose 2) = I! / (2!(I − 2)!) = I(I − 1)/2 such comparisons

• The Bonferroni correction: to protect the overall level of α, we must
perform each individual test at level

      α* = α / (I choose 2)

18
Example - Bonferroni correction
• Suppose we wish to perform pair-wise comparisons among 3
groups but still maintain an overall α = 0.05
• If I = 3, there are

      (3 choose 2) = 3! / (2! · 1!) = 3 possible comparisons:

(Group 1 versus 2), (1 versus 3), and (2 versus 3)
• The Bonferroni correction says that if we want an overall α level
of < 0.05, then we do each of the 3 tests at the
α* = 0.05 / 3 = 0.0167 level
• Thus, with each test at level α* = 0.0167, this Bonferroni
correction gives an overall α < 0.05
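
As an aside, Stata's oneway command can produce all Bonferroni-adjusted
pairwise comparisons directly (shown here for the rat data from earlier):

. oneway bonedensity group, bonferroni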

19
Unit 10 Outline

• More Multiple Regression Topics


– Binary Predictors to compare 3 or more group means
– Contrast Testing
– Multiple comparisons (& the Bonferroni correction)
– Transformation of Variables

20
Transformation of Variables

• Way back in Unit 2, we mentioned the importance of linearity
and normality of residuals in any regression model.
• A violation of this can lead to incorrect conclusions for
hypothesis tests in regression: we may fail to reject the
null hypothesis of no relationship when one is clearly there,
just not linearly (or vice versa)
• How to correct this? Non-linear transformations… like logging,
taking the square root, or raising to a power
• This works just fine since these functions are all increasing
functions (which just means an increase on the converted scale
means an increase on the original scale). The order of
observations is preserved.
• Let’s go back to the text messaging data (y = texts, x = class year)
Histogram of residuals, and scatterplot of residuals vs. fitted

[Left panel: histogram of the residuals (density vs. residuals, 0 to about
1500). Right panel: residuals vs. fitted values, with fitted values ranging
from about 20 to 70.]

Why is the residual-vs.-fitted plot just 4 vertical bars? Does
that violate Regression's Assumptions?

What is a cause for concern in the above graphs?
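
A minimal Stata sketch for reproducing these diagnostic plots:

. regress text_day soph junior senior
. predict r, residuals     // save the residuals
. histogram r              // histogram of residuals
. rvfplot                  // residuals vs. fitted values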


Fixing non-linearity and non-normality

• When attempting to fix non-linearity or non-normality, follow
these steps:
1. Check the histogram of the y-variable for symmetry
   a. If right-skewed, consider logging or taking the square root
   b. If left-skewed, consider raising to the second power
   c. Use the symmetric version of y in all future analyses
2. After making the y-variable symmetric, look at the
   scatterplot of [converted] y vs. x (or multiple x's)
   a. If not linear, consider transforming the x in a similar
      fashion as the steps for the y-variable above
   b. Continue for all x-variables considered for the model
Example: Predicting Text Messages

• We want to create a regression to predict the number of text
messages (text_day) a person sends per day. The candidate
predictors are:

cellphones: the number of different cellphones the student has ever owned
fastest_drive: the fastest the student has ever driven, in mph
haircut: how much the student's last haircut cost ($)
senior: a 0/1 binary variable for whether or not the student is a senior

• We want to begin our model building process by first making sure
everything will work linearly, and then fit the model…
1. Making the y-variable symmetric

• We first check a histogram of the y-variable to see if it is skewed:

[Histogram of text_day: density vs. texts per day (0 to 1500), strongly
right-skewed]

• It is skewed-right, so we take the log of it:

gen log_text = log(text_day+1)

• And check the histogram of log_text:

[Histogram of log_text: density vs. log_text (0 to 8), roughly symmetric]

• Looks good, so that is what we will use going forward in all models

Note: the "+1" is just so that our computer does not vomit when we take
the log of zero
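
For reference, the Stata commands behind this step (histogram draws the
plots; the gen line repeats the one above):

. histogram text_day                 // right-skewed
. gen log_text = log(text_day+1)     // +1 guards against log(0)
. histogram log_text                 // now roughly symmetric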
2. Checking the scatterplots of log(y) vs. each x

[Four scatterplots of log_text (0 to 8) against each candidate predictor:
log_text vs. cellphones (0 to 20), log_text vs. fastest_drive (0 to 200),
log_text vs. haircut (0 to 250), and log_text vs. senior (0 or 1)]

Which of these look linear-ish? Which of these could benefit
from a transformation of x?
Fixing Symmetry (and Linearity) in the x's

[Top row (cellphones): histogram of cellphones (0 to 20); histogram of
log_cellphones (0 to 3); scatterplot of log_text vs. log_cellphones]

[Bottom row (haircut): histogram of haircut (0 to 250); histogram of
log_haircut (0 to 5); scatterplot of log_text vs. log_haircut]

Note: haircut won't ever get fixed completely since there are a lot of people
all piled up at zero. But it is more symmetric and more linear
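
A sketch of the corresponding Stata commands (the +1 offsets are an
assumption, following the log(text_day+1) convention used for y; for
haircut some offset is needed because of the pile-up at zero):

. gen log_cellphones = log(cellphones+1)
. gen log_haircut = log(haircut+1)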
3. Fit the Model
. regress log_text log_cellphones log_haircut fastest_drive senior

Source | SS df MS Number of obs = 167


-------------+------------------------------ F( 4, 162) = 11.05
Model | 40.7124798 4 10.1781199 Prob > F = 0.0000
Residual | 149.24163 162 .921244633 R-squared = 0.2143
-------------+------------------------------ Adj R-squared = 0.1949
Total | 189.95411 166 1.14430187 Root MSE = .95981

--------------------------------------------------------------------------------
log_text | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
log_cellphones | .8642687 .1891676 4.57 0.000 .4907163 1.237821
log_haircut | .1166675 .057846 2.02 0.045 .002438 .230897
fastest_drive | .0077078 .0025508 3.02 0.003 .0026707 .012745
senior | -.6377016 .2404366 -2.65 0.009 -1.112496 -.1629077
_cons | .9308021 .4171938 2.23 0.027 .1069629 1.754641
--------------------------------------------------------------------------------
[Post-fit diagnostics: histogram of residuals (roughly −4 to 2) and
scatterplot of residuals vs. fitted values (fitted roughly 2 to 5)]
Interpreting the Fitted Model
a) What is the formula for this regression model?
ŷ = 0.9308 + 0.864(x1) + 0.117(x2) + 0.0077(x3) − 0.6377(x4)
where x1 = log_cellphones, x2 = log_haircut, x3 = fastest_drive, x4 = senior

b) What is the interpretation of the coefficient for senior in this
model?

Since the sign of the coefficient is negative, seniors send fewer
text messages per day than non-seniors, controlling for the other
3 predictors. In fact, comparing a senior to a non-senior, we
expect a multiplicative change of e^(−0.6377) = 0.529 in # texts sent
(almost half as many), controlling for the other 3 variables in the
model.
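
The multiplicative factor can be pulled straight from the fitted model in
Stata (run after the regress above):

. display exp(_b[senior])    // ≈ 0.529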

29
Unit 10: Main Points
• When you are trying to compare means across 3 or more groups,
this should be done via a regression with (I – 1) binary predictors
Note: for a categorical response this would be done via a chi-sq test

• If there is evidence of a difference among groups, then an a priori
hypothesis can be tested via a contrast F-test

• Care must be taken when doing many different hypothesis tests so
as not to inflate the Type I error (use the Bonferroni correction).

• Fixing linearity can be done with a non-linear transformation
of the y-variable or x-variable (or both); symmetric variables
usually work best in regression. Log-transforming usually
works best, but it only works for right-skewed variables, and it
makes interpretation more difficult.
30
