
Problem Set 1hgfy

This document provides 9 questions for a problem set on regression and time series models. It asks the student to: 1) Generate random samples from different distributions and plot histograms to match the shapes. 2) Generate exponential samples, calculate means, and see if the distribution of means resembles normal. 3) Generate a normal sample, calculate a confidence interval for the mean, and check if the true mean falls within it repeatedly. 4) Fit a linear regression model to predict diamond price from carat size using provided data. 5) Fit a linear model to predict ice cover from year using other provided data. 6) Check if residuals from two other data sets are normally distributed after linear

Uploaded by

Rakesh Patel

Regression and Time Series Models

Problem Set 1

1. Generate a random sample of size 1000 from the following distributions:

(a) exponential (use mean = 0.1)
(b) normal (use mean = 2, standard deviation = 1)
(c) continuous uniform (use range (3, 5))
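
One possible way to draw the three samples is sketched below, using only Python's standard library. The seed and the variable names are arbitrary choices made for this illustration and are not part of the problem set. Note that `random.expovariate` is parameterised by the rate, which is the reciprocal of the mean.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only so that a run can be reproduced
N = 1000

# expovariate takes the rate lambda = 1 / mean, not the mean itself.
exp_sample = [random.expovariate(1 / 0.1) for _ in range(N)]
# gauss takes (mean, standard deviation).
norm_sample = [random.gauss(2, 1) for _ in range(N)]
# uniform draws from the interval between 3 and 5.
unif_sample = [random.uniform(3, 5) for _ in range(N)]

for name, sample in [("exponential", exp_sample),
                     ("normal", norm_sample),
                     ("uniform", unif_sample)]:
    print(f"{name:12s} mean={statistics.mean(sample):.3f} "
          f"sd={statistics.stdev(sample):.3f}")
```

The printed means should land close to 0.1, 2 and 4 respectively, which is a quick sanity check on the parameterisation.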

2. Plot histograms for the samples generated in Question 1 and match them with the shapes of
the original sampling distributions.
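
A plotting library would normally be used here. As a dependency-free sketch, the hypothetical helper `text_histogram` below bins a sample and prints one bar per bin; the bin count and bar width are arbitrary. An exponential sample should show bars that decay from the left, a normal sample a bell shape, and a uniform sample roughly equal bars.

```python
import random

def text_histogram(sample, bins=10, width=40):
    """Print one '#' bar per bin and return the list of bin counts.

    Assumes the sample contains at least two distinct values.
    """
    lo, hi = min(sample), max(sample)
    step = (hi - lo) / bins
    counts = [0] * bins
    for value in sample:
        # Clamp so that the maximum value falls into the last bin.
        index = min(int((value - lo) / step), bins - 1)
        counts[index] += 1
    peak = max(counts)
    for i, count in enumerate(counts):
        left_edge = lo + i * step
        print(f"{left_edge:7.3f} | {'#' * round(width * count / peak)}")
    return counts

random.seed(1)  # arbitrary seed for reproducibility
exp_counts = text_histogram([random.expovariate(1 / 0.1) for _ in range(1000)])
```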

3. Generate a random sample of size n = 10 from the exponential distribution with mean 3.
Calculate the mean of the generated sample. Repeat this process 100 times and in each case
record the sample mean. Plot the histogram of these sample means. Does this plot resemble
a normal distribution? If the sample size is changed from n = 10 to n = 150, what
do you observe? Can you think of a theoretical result that supports this observation?
Further, perform the same exercise by replacing the exponential distribution with a Poisson
distribution with mean equal to 31.
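
The repeated-sampling experiment for the exponential case might be sketched as follows. The helper name `sample_means` and the seed are assumptions of this illustration. The sketch also prints the standard deviation that the central limit theorem suggests for the sample mean, namely 3 / sqrt(n), because an exponential distribution with mean 3 also has standard deviation 3. The Poisson part is left out because Python's `random` module has no Poisson sampler.

```python
import random
import statistics

def sample_means(n, repeats=100, mean=3.0):
    """Means of `repeats` exponential samples, each of size n."""
    return [
        statistics.mean([random.expovariate(1 / mean) for _ in range(n)])
        for _ in range(repeats)
    ]

random.seed(2)  # arbitrary seed for reproducibility
means_small = sample_means(10)
means_large = sample_means(150)

for n, means in [(10, means_small), (150, means_large)]:
    print(f"n={n:3d}  mean of means={statistics.mean(means):.3f}  "
          f"sd of means={statistics.stdev(means):.3f}  "
          f"theoretical sd={3 / n ** 0.5:.3f}")
```

With n = 150 the sample means should cluster much more tightly around 3 than with n = 10, and their histogram should look closer to a bell curve.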

4. Generate a random sample of size n = 50 from a standard normal distribution. Calculate
the 95% confidence interval for the mean. Does the mean of the sample lie in this confidence
interval? Repeat the previous steps 100 times and see for yourself how many times the
true mean lies in the confidence interval. Does your experiment agree with the concept of
a confidence interval?
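
A minimal sketch of the coverage experiment follows. It assumes a large-sample interval built from the normal quantile (about 1.96) and the sample standard deviation; a t-based interval with 49 degrees of freedom would be slightly wider. The function name `confidence_interval` and the seed are choices made for this illustration.

```python
import random
import statistics

# 97.5% quantile of the standard normal distribution, about 1.96.
Z = statistics.NormalDist().inv_cdf(0.975)

def confidence_interval(sample):
    """Approximate 95% interval for the mean, using the normal quantile."""
    centre = statistics.mean(sample)
    half_width = Z * statistics.stdev(sample) / len(sample) ** 0.5
    return centre - half_width, centre + half_width

random.seed(3)   # arbitrary seed for reproducibility
TRUE_MEAN = 0.0  # mean of the standard normal distribution
hits = 0
for _ in range(100):
    sample = [random.gauss(TRUE_MEAN, 1) for _ in range(50)]
    low, high = confidence_interval(sample)
    if low <= TRUE_MEAN <= high:
        hits += 1
print(f"{hits} of 100 intervals contain the true mean")
```

The count should be near 95 but will vary from run to run, which is the point of the exercise: the 95% refers to the long-run share of intervals that capture the true mean.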

5. Data about the carat size of diamonds and their corresponding price is given in the file
diamond.csv. Fit a linear regression model to predict the price of a diamond given its
carat size.
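
The least-squares fit itself can be sketched without any library. The numbers below are made up purely for illustration and are not the contents of diamond.csv, whose column names are not given here. The helper `fit_line` is a hypothetical name.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up values for illustration only; NOT taken from diamond.csv.
carat = [0.2, 0.3, 0.4, 0.5, 0.6]
price = [400.0, 700.0, 1000.0, 1300.0, 1600.0]

intercept, slope = fit_line(carat, price)
predicted = intercept + slope * 0.45  # prediction at a new carat value
print(f"price = {intercept:.1f} + {slope:.1f} * carat")
print(f"predicted price at 0.45 carat: {predicted:.1f}")
```

The same two steps, estimating the coefficients and then evaluating the fitted line at a new x value, apply to any of the prediction questions in this set.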

6. Global warming is an important environmental issue in the contemporary world. Data about
the cover of ice on earth and its corresponding year is provided in the file ice data.csv.
Fit a linear regression model to predict the cover of ice in the year 2017.

7. Load the data given in the file data 1.csv. Fit a linear regression model for this data and
compute the residuals. Can you say the residuals are normally distributed?

8. Load the data given in the file data 2.csv. Fit a linear regression model for this data and
compute the residuals. Can you say the residuals are normally distributed?
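
One informal way to judge normality of residuals is a normal Q-Q comparison: sort the residuals, pair them with standard normal quantiles, and check how close the pairs are to a straight line. The sketch below summarises that with a correlation coefficient. The residuals here are simulated stand-ins, not values from the data files, and `qq_correlation` is a hypothetical helper. A formal test such as Shapiro-Wilk would be a stronger alternative.

```python
import random
import statistics

def qq_correlation(residuals):
    """Correlation between sorted residuals and standard normal quantiles."""
    n = len(residuals)
    observed = sorted(residuals)
    normal = statistics.NormalDist()
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    o_bar, t_bar = statistics.mean(observed), statistics.mean(theoretical)
    cov = sum((o - o_bar) * (t - t_bar) for o, t in zip(observed, theoretical))
    var_o = sum((o - o_bar) ** 2 for o in observed)
    var_t = sum((t - t_bar) ** 2 for t in theoretical)
    return cov / (var_o * var_t) ** 0.5

random.seed(4)  # arbitrary seed for reproducibility
# Simulated stand-ins for residuals: one normal set, one right-skewed set.
normal_like = [random.gauss(0, 1) for _ in range(200)]
skewed = [random.expovariate(1.0) - 1.0 for _ in range(200)]
r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print(f"normal residuals: r = {r_normal:.3f}")
print(f"skewed residuals: r = {r_skewed:.3f}")
```

A value very close to 1 is consistent with normal residuals; a noticeably lower value points to skewness or heavy tails.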

9. Consider the following simple linear regression model

$$y = \beta_0 + \beta_1 x + \epsilon$$

where $\epsilon \sim N(0, \sigma^2)$. It is known that the explanatory variable x affects the
response variable y via the linear relationship y = 3 + x. The measurements of the response
variable y are collected using four different instruments. Each instrument has a different level
of accuracy. It should be noted that there is no measurement error in measuring the
explanatory variable. The data for each instrument is stored in the files
instrument 1.csv, instrument 2.csv, instrument 3.csv and instrument 4.csv.
Fit a linear regression model to each of the data sets. Which instrument do you think is the
most reliable, and why?
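
One way to compare instruments is to look at how close the fitted coefficients are to the known values (3 and 1) and at the residual standard deviation, which estimates the noise level of each instrument. The sketch below uses simulated data with hypothetical noise levels, since the contents of the instrument files are not shown here; the helper names are also assumptions.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def fit_summary(xs, ys):
    """Return (intercept, slope, residual standard deviation)."""
    intercept, slope = fit_line(xs, ys)
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    # Divide by n - 2 because two coefficients were estimated.
    return intercept, slope, (sse / (len(xs) - 2)) ** 0.5

random.seed(5)  # arbitrary seed for reproducibility
xs = [i / 10 for i in range(100)]
results = {}
for sigma in (0.1, 0.5, 1.0, 2.0):  # hypothetical instrument noise levels
    ys = [3 + x + random.gauss(0, sigma) for x in xs]
    results[sigma] = fit_summary(xs, ys)
    b0, b1, s = results[sigma]
    print(f"sigma={sigma:3.1f}  intercept={b0:.3f}  slope={b1:.3f}  "
          f"residual sd={s:.3f}")
```

In this simulation the instrument with the smallest residual standard deviation also recovers the coefficients most closely, which is the kind of evidence the question asks for.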
