Sociology: Intermediate Quantitative Research Method

Extensions to Linear Regression

Part 2 of Week 7

Aida Parnia

U of T Sociology

October 15, 2024

SOC252H1F

Extensions to Linear regression

Linear regression as explaining variation
→ $R^2$ and adjusted $R^2$
Hypothesis testing in regression
→ t distribution and p value
Multiple linear regression
→ Categorical variables as multiple indicator variables
→ Interaction term
Model selection criteria


Today’s example: UN data

Our investigation concerns the determinants of life expectancy
str(UN)
tibble [193 × 8] (S3: tbl_df/tbl/data.frame)
$ country : chr [1:193] "Afghanistan" "Albania" "Algeria" "Angola" ...
$ region : chr [1:193] "Asia" "Europe" "Africa" "Africa" ...
$ group : chr [1:193] "other" "other" "africa" "africa" ...
$ fertility : num [1:193] 5.97 1.52 2.14 5.13 2.17 ...
$ ppgdp : num [1:193] 499 3677 4473 4322 9162 ...
$ lifeExpF : num [1:193] 49.5 80.4 75 53.2 79.9 ...
$ pctUrban : num [1:193] 23 53 67 59 93 64 47 89 68 52 ...
$ infantMortality: num [1:193] 124.5 16.6 21.5 96.2 12.3 ...
- attr(*, "na.action")= 'omit' Named int [1:20] 4 6 21 35 38 54 67 75 77 78 ...
..- attr(*, "names")= chr [1:20] "4" "6" "21" "35" ...


Explained Variation and R-Squared

Definition and Interpretation of R-Squared
→ Formula: $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$
→ Proportion of variance explained by the model

Adjusted R-Squared for Multiple Regression Models
→ As we increase the number of variables in the model, the proportion explained never decreases, even when the added variables are unrelated to the outcome
→ R-squared can be adjusted for the number of predictors
→ Formula: $\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$, where $n$ is the number of observations and $k$ is the number of predictors
→ We can use this in model comparison
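
As a minimal sketch of the two formulas above, the snippet below computes R-squared and adjusted R-squared by hand and compares them with the values that `summary()` stores. It uses R's built-in `cars` data instead of the UN data, purely as an assumption so that the example runs on its own; the names `ss_res`, `ss_tot`, `r2` and `adj_r2` are illustrative.

```r
# Illustrative only: the built-in `cars` data stands in for the lecture's UN data
fit <- lm(dist ~ speed, data = cars)

ss_res <- sum(residuals(fit)^2)                 # residual sum of squares
ss_tot <- sum((cars$dist - mean(cars$dist))^2)  # total sum of squares
r2     <- 1 - ss_res / ss_tot

n <- nrow(cars)              # number of observations
k <- length(coef(fit)) - 1   # number of predictors (intercept excluded)
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Both hand-computed values should agree with the model summary
all.equal(r2, summary(fit)$r.squared)
all.equal(adj_r2, summary(fit)$adj.r.squared)
```

Because the penalty factor (n − 1)/(n − k − 1) is greater than 1 whenever there is at least one predictor, adjusted R-squared is always slightly below R-squared.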


Explained variation of life expectancy by GDP or log-GDP

library(modelsummary)  # provides modelsummary()
fit_gdp <- lm(lifeExpF ~ ppgdp,
              data = UN)
fit_loggdp <- lm(lifeExpF ~ log(ppgdp),
                 data = UN)
modelsummary(list("GDP" = fit_gdp,
                  "log(GDP)" = fit_loggdp),
             statistic = "conf.int",
             fmt = 3,
             gof_map = c("nobs", "r.squared"))

              GDP                 log(GDP)
(Intercept)   68.072              29.257
              [66.598, 69.546]    [24.154, 34.359]
ppgdp         0.000
              [0.000, 0.000]
log(ppgdp)                        5.090
                                  [4.494, 5.687]
Num.Obs.      193                 193
R2            0.311               0.597

In the model with GDP as the predictor, only about 31% of the variation in female life expectancy is explained by the model, while with log GDP as the predictor, about 59.7% of the variation is explained by the model.


Hypothesis Testing of the Regression Coefficients

The Null Hypothesis
$H_0: \beta_k = 0$
We know the point estimate, but we want to account for the uncertainty in the data.

If the assumptions of the model are met, then we can get the probability distribution of the estimated coefficient. It turns out that the estimated coefficients, once standardized by their standard errors, follow a t distribution with n − p degrees of freedom, where p is the number of estimated parameters.


The t distribution is a modified normal distribution


Hypothesis Testing of the Regression Coefficients

t-Tests for Individual Coefficients

→ Formula: $t = \frac{\hat{\beta} - 0}{SE(\hat{\beta})}$
→ The calculated t value is then located on this t distribution, and we calculate the probability of observing a value at least as extreme if the null hypothesis were true (the p-value).
sum_loggdp <- summary(fit_loggdp)
sum_loggdp


Hypothesis Testing of the Regression Coefficients

Call:
lm(formula = lifeExpF ~ log(ppgdp), data = UN)

Residuals:
Min 1Q Median 3Q Max
-25.885 -2.908 1.396 3.982 12.402

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 29.2566 2.5870 11.31 <2e-16 ***
log(ppgdp) 5.0901 0.3025 16.83 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.479 on 191 degrees of freedom

Multiple R-squared: 0.5972, Adjusted R-squared: 0.5951
F-statistic: 283.2 on 1 and 191 DF, p-value: < 2.2e-16
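
The output above can be checked by hand. The sketch below copies the estimate, standard error and degrees of freedom for `log(ppgdp)` from that output and recomputes the t value, the two-sided p-value and the adjusted R-squared; the variable names are illustrative, and `pt()` is base R's t distribution function.

```r
# Values copied from the summary output above
estimate  <- 5.0901   # coefficient for log(ppgdp)
std_error <- 0.3025   # its standard error
df        <- 191      # residual degrees of freedom: 193 observations - 2 parameters

t_value <- (estimate - 0) / std_error
p_value <- 2 * pt(-abs(t_value), df = df)   # two-sided p-value

round(t_value, 2)   # compare with the "t value" column
p_value < 2e-16     # compare with "Pr(>|t|)", reported as <2e-16

# Adjusted R-squared from the reported R-squared, n = 193 and k = 1 predictor
r2     <- 0.5972
adj_r2 <- 1 - (1 - r2) * (193 - 1) / (193 - 1 - 1)
round(adj_r2, 4)    # compare with "Adjusted R-squared"
```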


Multiple Linear Regression

For our example the model is the following

Life Expectancy = β0 + β1 ⋅ log(GDP) + β2 ⋅ fertility + ε

library(dplyr)  # provides mutate() and %>%
UN <- UN %>% mutate(log_gdp = log(ppgdp))
# Multiple linear regression model
fit_ml <- lm(lifeExpF ~ log_gdp + fertility,
             data = UN)


Multiple Linear Regression as a regression plane

[Figure: Regression Plane for Life Expectancy (pred_le)]


More on non-linear relationships

In our model we assume that the fertility rate has a linear relationship with life expectancy. This means that an increase in fertility from 1 to 2 is associated with the same change in life expectancy as an increase from 5 to 6.
But this may not be true, so we can categorize our fertility measure to allow a different relationship in each range. Here we use quartiles.

# Break points: minimum, the three quartiles, and maximum
fertility_b <- c(min(UN$fertility),
                 quantile(UN$fertility, 0.25),
                 quantile(UN$fertility, 0.5),
                 quantile(UN$fertility, 0.75),
                 max(UN$fertility))
# right = FALSE gives left-closed intervals [a, b);
# include.lowest = TRUE then keeps the maximum in the last interval
UN <- UN %>% mutate(
  fertility_q = cut(fertility,
                    breaks = fertility_b,
                    include.lowest = TRUE,
                    right = FALSE))
UN %>% count(fertility_q)

# A tibble: 4 × 2
  fertility_q     n
  <fct>       <int>
1 [1.13,1.75)    48
2 [1.75,2.26)    48
3 [2.26,3.7)     48
4 [3.7,6.92]     49

More on non-linear relationships: categorical variables

Dummy variable coding separates a categorical predictor into multiple dummy (indicator) variables, with one category defined as the reference category.
fit_ml2 <- lm(lifeExpF ~ log_gdp + fertility_q,
              data = UN)

summary(fit_ml2)
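
To make the dummy coding concrete, here is a small sketch that uses a hypothetical four-level factor (`fert`, with made-up levels `q1` to `q4`) in place of `fertility_q`. Base R's `model.matrix()` displays the indicator columns that `lm()` builds behind the scenes.

```r
# Hypothetical four-level factor standing in for fertility_q
fert <- factor(c("q1", "q2", "q3", "q4", "q2", "q1"),
               levels = c("q1", "q2", "q3", "q4"))

# model.matrix() shows the indicator columns lm() would create;
# the first level ("q1") is the reference category and gets no column
X <- model.matrix(~ fert)
X
colnames(X)
```

Rows belonging to the reference category have zeros in every indicator column, which is why each coefficient is read as a difference from that category.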

More on non-linear relationships: categorical variables
Call:
lm(formula = lifeExpF ~ log_gdp + fertility_q, data = UN)

Residuals:
Min 1Q Median 3Q Max
-21.6867 -1.6328 0.2597 2.9554 12.4939

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.2092 3.6933 14.136 < 2e-16 ***
log_gdp 2.9235 0.3801 7.691 7.93e-13 ***
fertility_q[1.75,2.26) -1.2362 1.1532 -1.072 0.285
fertility_q[2.26,3.7) -5.6775 1.2696 -4.472 1.34e-05 ***
fertility_q[3.7,6.92] -11.8377 1.5434 -7.670 9.01e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.646 on 188 degrees of freedom

More on non-linear relationships: categorical variables

                          (1)
(Intercept)               52.21
                          [44.92, 59.49]
log_gdp                   2.92
                          [2.17, 3.67]
fertility_q[1.75,2.26)    -1.24
                          [-3.51, 1.04]
fertility_q[2.26,3.7)     -5.68
                          [-8.18, -3.17]
fertility_q[3.7,6.92]     -11.84
                          [-14.88, -8.79]

Interpretation of coefficients
β0 or the intercept
The predicted average life expectancy of females in countries with a log_GDP of 0 and a fertility rate of less than 1.75 is 52.21 years.
β1 or the coefficient for log_GDP
When the fertility category is held constant, the average life expectancy of females in countries with one percent higher GDP is about 0.03 years higher than in others.
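
The 0.03-year figure comes from the fact that the predictor is the natural log of GDP: a one percent increase in GDP raises log(GDP) by log(1.01). A short sketch of the arithmetic, using the unrounded coefficient 2.9235 from the regression output (the doubling case is an added illustration, not from the slides):

```r
b_log_gdp <- 2.9235   # coefficient for log_gdp from the regression output

# A 1% increase in GDP raises log(GDP) by log(1.01), about 0.00995
change_1pct <- b_log_gdp * log(1.01)

# Doubling GDP raises log(GDP) by log(2), about 0.693
change_double <- b_log_gdp * log(2)

round(change_1pct, 3)
round(change_double, 2)
```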

More on non-linear relationships: categorical variables

                          (1)
(Intercept)               52.21
                          [44.92, 59.49]
log_gdp                   2.92
                          [2.17, 3.67]
fertility_q[1.75,2.26)    -1.24
                          [-3.51, 1.04]
fertility_q[2.26,3.7)     -5.68
                          [-8.18, -3.17]
fertility_q[3.7,6.92]     -11.84
                          [-14.88, -8.79]

Interpretation of coefficients
β2 or the coefficient for fertility [1.75,2.26)
When GDP is held constant, the average life expectancy of females in countries with fertility rates of 1.75 to just below 2.26 is 1.24 years lower than in countries with fertility rates of less than 1.75 children per woman.
β3 or the coefficient for fertility [2.26,3.7)
When GDP is held constant, the average life expectancy of females in countries with fertility rates of 2.26 to just below 3.7 is 5.68 years lower than in countries with fertility rates of less than 1.75 children per woman.
β4 or the coefficient for fertility [3.7,6.92]
When GDP is held constant, the average life expectancy of females in countries with fertility rates of 3.7 to 6.92 is 11.84 years lower than in countries with fertility rates of less than 1.75 children per woman.

More on non-linear relationships: categorical variables

library(ggplot2)  # provides ggplot() and the geoms
# Add predictions
UN %>% mutate(ml_pred = predict(fit_ml2)) %>%
  # Visualize the model
  ggplot(aes(y = ml_pred, x = log_gdp, colour = fertility_q)) +
  geom_line() +
  geom_point(aes(y = lifeExpF, x = log_gdp, colour = fertility_q)) +
  theme_light(base_size = 20)

Interaction in Regressions
What if we allow the slope of one variable to change with the values of another?
In our example, what if the relationship of life expectancy with GDP depended on the values of the fertility rate?
This is the concept of an interaction effect.
In the previous example the predicted values change additively with the values of the fertility rate, but with an interaction this change is multiplicative.

Life Expectancy = β0 + β1 ⋅ log(GDP) + β2 ⋅ fertility + β3 ⋅ log(GDP) × fertility + ε

Interaction in Regressions
# Note the change from + to * in the formula
fit_ml3 <- lm(lifeExpF ~ log_gdp * fertility_q,
              data = UN)

summary(fit_ml3)
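
In an R formula, `a * b` is shorthand for `a + b + a:b`, that is, both main effects plus their interaction. The sketch below checks this with base R's `terms()`; `y`, `a` and `b` are placeholder names and no data are needed.

```r
# `a * b` expands to the two main effects plus the interaction term `a:b`
short <- attr(terms(y ~ a * b), "term.labels")
long  <- attr(terms(y ~ a + b + a:b), "term.labels")

short
identical(short, long)
```

This is why the interaction model's output lists `log_gdp`, the `fertility_q` indicators, and the `log_gdp:fertility_q` terms.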


Interaction in Regressions
Call:
lm(formula = lifeExpF ~ log_gdp * fertility_q, data = UN)

Residuals:
Min 1Q Median 3Q Max
-21.6893 -1.4094 0.2941 3.2680 12.3541

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.40179 7.99615 6.803 1.37e-10 ***
log_gdp 2.69213 0.83931 3.208 0.00158 **
fertility_q[1.75,2.26) -0.01215 9.94127 -0.001 0.99903
fertility_q[2.26,3.7) -8.58005 10.38544 -0.826 0.40978
fertility_q[3.7,6.92] -19.63143 9.90168 -1.983 0.04889 *
log_gdp:fertility_q[1.75,2.26) -0.13312 1.04587 -0.127 0.89886
log_gdp:fertility_q[2.26,3.7) 0.31927 1.16941 0.273 0.78515
log_gdp:fertility_q[3.7,6.92] 1.06005 1.19859 0.884 0.37762
---

Interaction in Regressions
# Add predictions
UN %>% mutate(ml_pred = predict(fit_ml3)) %>%
  # Visualize the model
  ggplot(aes(y = ml_pred, x = log_gdp, colour = fertility_q)) +
  geom_line() +
  geom_point(aes(y = lifeExpF, x = log_gdp, colour = fertility_q)) +
  theme_light(base_size = 20)


Interpreting interaction effects: Interpretation

Interpretation of an interaction is often hard to state clearly, so visualization or choosing specific predicted values is often more intuitive.
Alternatively, we can define partial effects by calculating the coefficient for each category of the categorical variable.
The coefficient for log_GDP when fertility is below 1.75 (reference group) is
$\beta_{\text{log\_gdp}}$
The coefficient for log_GDP when fertility is [1.75,2.26) is
$\beta_{\text{log\_gdp}} + \beta_{\text{log\_gdp:fertility\_q[1.75,2.26)}}$
The coefficient for log_GDP when fertility is [2.26,3.7) is
$\beta_{\text{log\_gdp}} + \beta_{\text{log\_gdp:fertility\_q[2.26,3.7)}}$
The coefficient for log_GDP when fertility is [3.7,6.92] is
$\beta_{\text{log\_gdp}} + \beta_{\text{log\_gdp:fertility\_q[3.7,6.92]}}$


Interpreting interaction effects: Interpretation

log_gdp_fer1 <- fit_ml3$coefficients["log_gdp"]

log_gdp_fer2 <- fit_ml3$coefficients["log_gdp"] + fit_ml3$coefficients["log_gdp:fertility_q[1.7

log_gdp_fer3 <- fit_ml3$coefficients["log_gdp"] + fit_ml3$coefficients["log_gdp:fertility_q[2.2

log_gdp_fer4 <- fit_ml3$coefficients["log_gdp"] + fit_ml3$coefficients["log_gdp:fertility_q[3.7

c(log_gdp_fer1,log_gdp_fer2,log_gdp_fer3,log_gdp_fer4)
 log_gdp  log_gdp  log_gdp  log_gdp
2.692128 2.559012 3.011393 3.752180
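
The same four slopes can be reproduced from the printed coefficient table alone. This sketch types in the rounded estimates from the interaction model's output (so it agrees with the values above only up to rounding); the vector names are illustrative.

```r
# Rounded estimates copied from the interaction model's coefficient table
b_log_gdp <- 2.69213
b_inter   <- c("[1.75,2.26)" = -0.13312,
               "[2.26,3.7)"  =  0.31927,
               "[3.7,6.92]"  =  1.06005)

# Slope of log_gdp within each fertility category:
# reference-group slope, then reference slope plus each interaction term
slopes <- c("<1.75" = b_log_gdp, b_log_gdp + b_inter)
round(slopes, 3)
```

Read this way, the association between log GDP and life expectancy is estimated to be steepest in the highest-fertility group, although none of the interaction terms is individually distinguishable from zero in the output.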


Model Selection Criteria

So which model is best?
There are a number of criteria for evaluating the goodness of fit of a model.
R-squared is one, but it is not always the most useful.
Two very useful model selection criteria are
→ Akaike Information Criterion (AIC)
→ Bayesian Information Criterion (BIC)
There are other methods too, such as cross-validation. The purpose of all of them is to find the best model without over-fitting the data.
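
Both criteria combine the model's log-likelihood with a penalty for the number of estimated parameters k: AIC = 2k − 2·logLik and BIC = k·log(n) − 2·logLik, and lower values are preferred. As a minimal sketch, the code below computes both by hand and compares them with base R's `AIC()` and `BIC()`. It uses the built-in `cars` data as a stand-in (an assumption made so the example is self-contained), and the `_manual` names are illustrative.

```r
# Illustrative only: the built-in `cars` data stands in for the UN data
fit <- lm(dist ~ speed, data = cars)

ll <- logLik(fit)
k  <- attr(ll, "df")   # estimated parameters: coefficients plus residual variance
n  <- nobs(fit)

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Both should match the built-in functions
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))
```

Because log(n) exceeds 2 once n is 8 or more, BIC penalizes each extra parameter more heavily than AIC in most practical samples.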

Model Selection Criteria

modelsummary(list("Fertility" = fit_ml, "Fertility_q" = fit_ml2,
                  "Interaction" = fit_ml3),
             fmt = 2,
             statistic = "conf.int",
             gof_map = c("r.squared", "aic", "bic"),
             stars = TRUE)

Model Selection Criteria

                                   Fertility         Fertility_q        Interaction
(Intercept)                        63.06***          52.21***           54.40***
                                   [55.52, 70.60]    [44.92, 59.49]     [38.63, 70.18]
log_gdp                            2.45***           2.92***            2.69**
                                   [1.77, 3.14]      [2.17, 3.67]       [1.04, 4.35]
fertility                          -4.18***
                                   [-4.96, -3.39]
fertility_q[1.75,2.26)                               -1.24              -0.01
                                                     [-3.51, 1.04]      [-19.62, 19.60]
fertility_q[2.26,3.7)                                -5.68***           -8.58
                                                     [-8.18, -3.17]     [-29.07, 11.91]
fertility_q[3.7,6.92]                                -11.84***          -19.63*
                                                     [-14.88, -8.79]    [-39.17, -0.10]
log_gdp × fertility_q[1.75,2.26)                                        -0.13
                                                                        [-2.20, 1.93]
log_gdp × fertility_q[2.26,3.7)                                         0.32
                                                                        [-1.99, 2.63]
log_gdp × fertility_q[3.7,6.92]                                         1.06
                                                                        [-1.30, 3.42]
R2                                 0.745             0.699              0.701
AIC                                1186.6            1222.8             1227.4
BIC                                1199.6            1242.4             1256.8
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
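
One way to read the fit statistics in this table is sketched below, using only the printed AIC and BIC values. Lower is better for both, so the code picks the minimum of each. It also checks that the gap BIC − AIC matches k·(log(n) − 2) with n = 193; the parameter counts in `k` (coefficients plus one residual variance) are an assumption about how the statistics were computed, not something stated in the slides.

```r
# AIC and BIC values copied from the table above
aic <- c(Fertility = 1186.6, Fertility_q = 1222.8, Interaction = 1227.4)
bic <- c(Fertility = 1199.6, Fertility_q = 1242.4, Interaction = 1256.8)

# Lower is better for both criteria
best_aic <- names(which.min(aic))
best_bic <- names(which.min(bic))
c(best_aic, best_bic)

# Assumed parameter counts: coefficients plus the residual variance
k <- c(Fertility = 4, Fertility_q = 6, Interaction = 9)
n <- 193

# BIC - AIC should equal k * (log(n) - 2), up to rounding in the table
gap_table   <- bic - aic
gap_formula <- k * (log(n) - 2)
round(rbind(gap_table, gap_formula), 1)
```

Both criteria favour the model with continuous fertility, and the extra interaction terms raise AIC and BIC even though R2 increases slightly from 0.699 to 0.701.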


Choosing the Best Model

1. We first map out the model based on our theoretical understanding
2. If and when we don’t have a good grasp of the model, we can follow other methods available to us

Stepwise Regression Methods (Forward, Backward, and Stepwise Selection)
→ Adding/removing predictors based on a criterion (often the p-value)
Comparing Models Using Selection Criteria
→ Evaluating AIC, BIC, and adjusted R-squared; or cross-validation techniques
Practical Considerations in Model Selection
→ Interpretability and simplicity


Summary of Key Points

Research design
→ cycle of research, hierarchy of evidence, experimental vs. non-experimental methods
Regression is about explaining variation (R-Squared)
Hypothesis testing in regression and p values
Multiple linear regression with categorical variables
Interaction effect
Model selection criteria

We have scratched the surface of regression analysis. But be proud of yourselves, as these are not easy topics.

Next week: Introduction to Causal Inference and Directed Acyclic Graphs
