
HW 1

This document provides instructions for 4 homework problems involving statistical analysis methods. Problem 1 involves analyzing height and weight data using linear regression. Problem 2 examines price data using linear and log transformations. Problem 3 predicts heights using linear regression. Problem 4 analyzes the properties of least squares estimates in linear regression models.

Uploaded by

Sanjana Sambana

Statistical Methods - II, Homework 1

Due: 11:59 p.m. Sunday, August 9

1. In the Htwt data, ht = height in centimeters and wt = weight in kilograms for a sample of n = 10
18-year-old girls. Interest is in predicting weight from height.

(a) Identify the predictor and response.

(b) Draw a scatterplot of wt on the vertical axis versus ht on the horizontal axis. On the basis of this
plot, does a simple linear regression model make sense for these data? Why or why not?

(c) Show that x̄ = 165.52, Ȳ = 59.47, Sxx = 472.08, Syy = 731.96 and Sxy = 274.79. Compute
estimates of the slope and the intercept for the regression of Y on x. Draw the fitted line on your
scatterplot.
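
The estimates in part (c) follow from b1 = Sxy/Sxx and b0 = Ȳ − b1 x̄, the same least-squares formulas stated in Problem 4. A minimal R sketch of that arithmetic, assuming only the summary statistics quoted above (the variable names are illustrative, not part of the assignment):

```r
# Summary statistics quoted in part (c)
xbar <- 165.52
ybar <- 59.47
Sxx  <- 472.08
Sxy  <- 274.79

# Least-squares estimates for the regression of Y on x
b1 <- Sxy / Sxx          # slope
b0 <- ybar - b1 * xbar   # intercept

print(round(c(intercept = b0, slope = b1), 4))
```

Once the scatterplot from part (b) is on screen, `abline(a = b0, b = b1)` should add the fitted line to it.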

2. This problem uses the UBSprices data set.

(a) Draw the plot of Y = bigmac2009 versus x = bigmac2003, the prices of a Big Mac hamburger in
2009 and 2003, respectively. Give a reason why fitting simple linear regression to the data in this
plot is not likely to be appropriate.

(b) Plot log(bigmac2009) versus log(bigmac2003) and explain why this graph is more sensibly
summarized with a linear regression.

(c) Without using the R function lm(), find the least-squares fit regressing log(bigmac2009) on
log(bigmac2003) and add the line in the plot in (b).
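
One possible way to approach part (c) is to compute Sxx and Sxy directly from their definitions. The sketch below is an assumption-laden illustration: `ls_fit` is a hypothetical helper name, and the price vectors are made-up numbers, not values from the UBSprices data.

```r
# Least-squares fit of y on x from the definitions of Sxx and Sxy (no lm()).
# ls_fit is a hypothetical helper written for this illustration.
ls_fit <- function(x, y) {
  xbar <- mean(x)
  ybar <- mean(y)
  Sxx <- sum((x - xbar)^2)
  Sxy <- sum((x - xbar) * (y - ybar))
  b1 <- Sxy / Sxx
  b0 <- ybar - b1 * xbar
  c(intercept = b0, slope = b1)
}

# Made-up prices for illustration only; NOT the UBSprices values
p2003 <- c(20, 35, 50, 80, 120, 150)
p2009 <- c(18, 30, 55, 70, 110, 160)

fit <- ls_fit(log(p2003), log(p2009))
print(fit)
```

With the real data, and assuming it is loaded as a data frame named `UBSprices`, the same helper could be called on `log(UBSprices$bigmac2003)` and `log(UBSprices$bigmac2009)`. After that, `abline(a = fit[["intercept"]], b = fit[["slope"]])` would draw the line on the plot from (b).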

3. This problem uses the Heights data set. Interest is in predicting dheight by mheight.
(a) Use the R function lm() to fit the regression of the response on the predictor. Draw a scatterplot
of the data and add your fitted regression line.
(b) Compute the (Pearson) correlation coefficient rxy. What does the value of rxy imply about the
relationship between dheight and mheight?
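
The workflow in Problem 3 can be sketched with base R's `lm()`, `abline()`, and `cor()`. The data below are simulated stand-ins that only borrow the column names mheight and dheight; they are not the Heights data, and the coefficients used to generate them are arbitrary assumptions.

```r
# Simulated stand-in data; NOT the Heights data set
set.seed(1)
mheight <- rnorm(50, mean = 62, sd = 2.5)
dheight <- 30 + 0.5 * mheight + rnorm(50, sd = 2)
toy <- data.frame(mheight, dheight)

# (a) Fit the regression of the response on the predictor and draw it
fit <- lm(dheight ~ mheight, data = toy)
plot(dheight ~ mheight, data = toy)
abline(fit)   # adds the fitted line to the current scatterplot

# (b) Pearson correlation coefficient
r_xy <- cor(toy$mheight, toy$dheight)
print(coef(fit))
print(r_xy)
```

In simple linear regression the slope and the correlation are linked by r_xy = b1 · sd(x) / sd(y), so the sign of r_xy always matches the sign of the fitted slope.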

4. We are now given data on n observations (xi, Yi), i = 1, ..., n. Assume we have a linear model, so
that E(Yi) = β0 + β1 xi, and let b1 = Sxy/Sxx and b0 = Ȳ − b1 x̄ be the least-squares estimates given in
lecture.
(a) Show that E(Sxy) = β1 Sxx and E(Ȳ) = β0 + β1 x̄, and use this to conclude that E(b1) = β1 and
E(b0) = β0. In other words, these are unbiased estimators.
(b) The fitted values Ŷi = b0 + b1 xi are used as estimates of E(Yi), and the residuals ei = Yi − Ŷi are
used as surrogates for the unobservable errors εi = Yi − E(Yi). By assumption, E(εi) = 0. Show
that the residuals satisfy a similar property, namely, ∑_{i=1}^{n} ei = 0.
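
One way to organize the argument for part (b), offered as a sketch rather than as the intended solution. It uses only the definitions Ŷi = b0 + b1 xi and b0 = Ȳ − b1 x̄ given above:

```latex
\begin{align*}
\sum_{i=1}^{n} e_i
  &= \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 x_i \right) \\
  &= n\bar{Y} - n b_0 - n b_1 \bar{x} \\
  &= n \left( \bar{Y} - b_1 \bar{x} - b_0 \right) \\
  &= 0, \qquad \text{since } b_0 = \bar{Y} - b_1 \bar{x}.
\end{align*}
```

Unlike E(εi) = 0, which is an assumption about expected values, this identity holds exactly for every data set because it follows from how b0 is defined.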
