Assignment_02

This assignment requires students to conduct regression analyses using provided data on GMAT scores and GPAs, as well as simulate data for linear regression models. Students must interpret coefficients, predict values, and assess interaction effects among various predictors related to starting salaries. Additionally, the assignment involves generating data with varying noise levels and analyzing the impact on model fit and confidence intervals.

BU.510.650 Data Analytics
Dr. Ruxian Wang, Johns Hopkins Carey Business School

Assignment #2

Attention: Please prepare two files for each homework assignment: a .docx or .pdf file with your
answers to each question, including figures, and an .R file with your R script. File names should
be “LastName FirstName number.docx” and “LastName FirstName number.R”. All assignments
should be submitted via our course website.

1. The grade point averages (GPA) of 12 graduating MBA students and their GMAT scores, taken
before entering the MBA program, are given below. Use the GMAT scores as a predictor of
GPA, and conduct a regression of GPA on GMAT scores.

x=GMAT y=GPA
560 3.20
540 3.44
520 3.70
580 3.10
520 3.00
620 4.00
660 3.38
630 3.83
550 2.67
550 2.75
600 2.33
537 3.75

(a) Obtain and interpret the coefficient of determination R².
(b) Calculate the fitted value for the second person.
(c) Test whether GMAT is an important predictor variable (use significance level 0.05).
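
One possible way to set up the quantities asked for in parts (a) to (c) is sketched below in R. The variable names (gmat, gpa, fit, and so on) are illustrative choices, not part of the assignment, and the sketch only shows where each quantity can be read from a fitted model; the interpretation is left to the written answer.

```r
# Data from the table above (12 graduating MBA students)
gmat <- c(560, 540, 520, 580, 520, 620, 660, 630, 550, 550, 600, 537)
gpa  <- c(3.20, 3.44, 3.70, 3.10, 3.00, 4.00, 3.38, 3.83, 2.67, 2.75, 2.33, 3.75)

# Simple linear regression of GPA on GMAT
fit <- lm(gpa ~ gmat)
fit_summary <- summary(fit)

# (a) Coefficient of determination
r_squared <- fit_summary$r.squared

# (b) Fitted value for the second person (GMAT = 540)
fitted_second <- fitted(fit)[2]

# (c) Two-sided t-test of H0: slope = 0, compared with the 0.05 level
slope_p_value <- coef(fit_summary)["gmat", "Pr(>|t|)"]
is_significant <- slope_p_value < 0.05

print(fit_summary)
```

Printing fit_summary shows the coefficient table and R² together, which may be convenient for the written report.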

2. Suppose we have a data set with five predictors, X1 = GPA, X2 = IQ, X3 = Gender (1
for Female and 0 for Male), X4 = Interaction between GPA and IQ, and X5 = Interaction
between GPA and Gender. The response is starting salary after graduation (in thousands of
dollars). Suppose we use least squares to fit the model, and get β̂0 = 50, β̂1 = 20, β̂2 = 0.07,
β̂3 = 35, β̂4 = 0.01, β̂5 = −10.

(a) Which answer is correct, and why?

i. For a fixed value of IQ and GPA, males earn more on average than females.
ii. For a fixed value of IQ and GPA, females earn more on average than males.
iii. For a fixed value of IQ and GPA, males earn more on average than females provided
that the GPA is high enough.
iv. For a fixed value of IQ and GPA, females earn more on average than males provided
that the GPA is high enough.
(b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.
(c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there
is very little evidence of an interaction effect. Justify your answer.
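
As a starting point for parts (a) and (b), it may help to write out the fitted model implied by the estimates given above. The block below only substitutes the stated coefficients; the notation \hat{y} for predicted salary (in thousands of dollars) is an assumption of this sketch.

```latex
\hat{y} = 50 + 20\,\mathrm{GPA} + 0.07\,\mathrm{IQ} + 35\,\mathrm{Gender}
          + 0.01\,(\mathrm{GPA}\times\mathrm{IQ}) - 10\,(\mathrm{GPA}\times\mathrm{Gender})

% Difference between a female (Gender = 1) and a male (Gender = 0)
% with the same GPA and IQ: only the Gender terms remain.
\hat{y}_{\mathrm{female}} - \hat{y}_{\mathrm{male}} = 35 - 10\,\mathrm{GPA}
```

The sign of the second expression depends on GPA, which is the comparison part (a) asks about.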

3. In this exercise you will create some simulated data and will fit simple linear regression
models to it. Make sure to use the command set.seed(1) prior to starting part (a) to ensure
consistent results. (Hint: rnorm(n, mean = a, sd = b) generates n random values with
mean a and standard deviation b, e.g., rnorm(100, mean = 10, sd = 5) returns a vector with
100 values, each of which follows a normal distribution with mean 10 and standard deviation
5.)
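
A minimal illustration of the hint, using the example values it quotes; the name z is a placeholder chosen for this sketch.

```r
# Fix the random seed so repeated runs give the same draws
set.seed(1)

# 100 draws from a normal distribution with mean 10 and standard deviation 5
z <- rnorm(100, mean = 10, sd = 5)

length(z)  # number of draws
mean(z)    # sample mean, should be near 10
sd(z)      # sample standard deviation, should be near 5
```

Note that the third argument of rnorm() is a standard deviation, not a variance.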

(a) Using the rnorm() function, create a vector, x, containing 100 observations drawn from
a N (0, 1) distribution. This represents a feature, X.
(b) Using the rnorm() function, create a vector, ε, containing 100 observations drawn from
a N (0, 0.25) distribution, i.e., a normal distribution with mean zero and variance 0.25.
(c) Using x and ε, generate a vector y according to the model

Y = −1 + 0.5X + ε. (1)

What is the length of the vector y? What are the values of β0 and β1 in this linear
model?
(d) Create a scatterplot displaying the relationship between x and y. Comment on what
you observe.
(e) Fit a least squares linear model to predict y using x. Comment on the model obtained.
How do β̂0 and β̂1 compare to β0 and β1?
(f) Now fit a polynomial regression model that predicts y using x and x². Is there evidence
that the quadratic term improves the model fit? Explain your answer.
(g) Repeat (a)-(f) after modifying the data generation process in such a way that there is
less noise in the data. The model (1) should remain the same. You can do this by
decreasing the variance of the normal distribution used to generate the error term in
(b). Describe your results.
(h) Repeat (a)-(f) after modifying the data generation process in such a way that there is
more noise in the data. The model (1) should remain the same. You can do this by
increasing the variance of the normal distribution used to generate the error term in
(b). Describe your results.
(i) What are the confidence intervals for β0 and β1 based on the original data set, the noisier
data set, and the less noisy data set? Comment on your results.
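
The workflow in parts (a) to (i) could be organized roughly as follows. This is a sketch under stated assumptions: the helper simulate_fit() and its argument names are hypothetical, and the noise levels 0.1 and 1.0 used for the less noisy and noisier data sets are arbitrary illustrative choices, since the assignment does not fix them. Because rnorm() takes a standard deviation, a variance of 0.25 corresponds to sd = 0.5.

```r
# Hypothetical helper: simulate data from model (1) and fit both models
simulate_fit <- function(noise_sd, n = 100, seed = 1) {
  set.seed(seed)
  x   <- rnorm(n)                            # (a) feature drawn from N(0, 1)
  eps <- rnorm(n, mean = 0, sd = noise_sd)   # (b) error term
  y   <- -1 + 0.5 * x + eps                  # (c) model (1)

  linear_fit <- lm(y ~ x)                    # (e) least squares line
  quad_fit   <- lm(y ~ x + I(x^2))           # (f) adds the quadratic term

  list(
    x = x, y = y,
    linear_fit = linear_fit,
    quad_fit   = quad_fit,
    quad_test  = anova(linear_fit, quad_fit), # F-test for the quadratic term
    conf_int   = confint(linear_fit)          # (i) 95% intervals for the coefficients
  )
}

original   <- simulate_fit(noise_sd = 0.5)   # variance 0.25, as in part (b)
less_noise <- simulate_fit(noise_sd = 0.1)   # (g) assumed smaller noise level
more_noise <- simulate_fit(noise_sd = 1.0)   # (h) assumed larger noise level

# (d) Scatterplot with the fitted line for the original data
plot(original$x, original$y, xlab = "x", ylab = "y")
abline(original$linear_fit)

# (i) Compare the confidence intervals across the three data sets
original$conf_int
less_noise$conf_int
more_noise$conf_int
```

Reusing the same seed in every call keeps x identical across the three data sets, so differences in the fits come only from the noise level.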
