Assignment 2
1. A survey recorded the number of children 18 years old or younger who lived in 800
households. The data are stored in Children2024A.csv.
(a) Test, at α = 0.05, whether the data set does not come from a Poisson distribution.
(b) Based on the Poisson model found in part (a), find the probability that a randomly
selected household has at least two children 18 years old or younger.
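As a rough illustration of part (b), for X ~ Poisson(λ) the required probability is P(X ≥ 2) = 1 − e^(−λ)(1 + λ). The sketch below assumes λ is estimated by the sample mean; the function name and the sample counts are hypothetical and are not taken from Children2024A.csv.

```python
import math


def poisson_at_least_two(lam: float) -> float:
    """P(X >= 2) for X ~ Poisson(lam), via the complement 1 - P(0) - P(1)."""
    return 1.0 - math.exp(-lam) * (1.0 + lam)


# Hypothetical counts of children per household (NOT the assignment data).
sample_counts = [0, 1, 2, 0, 3, 1, 0, 2, 1, 0]

# The maximum likelihood estimate of the Poisson mean is the sample mean.
lam_hat = sum(sample_counts) / len(sample_counts)

print("estimated lambda:", lam_hat)
print("P(X >= 2):", poisson_at_least_two(lam_hat))
```

With the real data, only `sample_counts` would change; the probability is only meaningful if the test in part (a) does not reject the Poisson model.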
2. A local insurance company randomly selected 150 claim amounts (in thousands of
dollars) from its car insurance policies. The claim amounts are stored in
CarInsurance2024A.csv.
(a) Determine which distribution (normal, lognormal, or exponential) is the best choice
to model the above data.
(b) Plot graphs to show how well the data set fits the best model found in part (a).
(c) What are the estimated parameters of the best model found in part (a)?
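One possible way to compare the three candidates in Question 2 is to fit each by maximum likelihood and compare AIC values (smaller is better). This is only a sketch of one criterion; the assignment may intend a different method such as probability plots or a goodness-of-fit test. The function names and the claim values are hypothetical.

```python
import math


def normal_loglik(data):
    """Maximised log-likelihood of a normal model (MLE variance uses divisor n)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(data):
    """Maximised log-likelihood of a lognormal model; data must be positive."""
    logs = [math.log(x) for x in data]
    # Lognormal density = normal density of log(x) divided by x.
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(data):
    """Maximised log-likelihood of an exponential model with rate 1 / mean."""
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - n


def aic_table(data):
    """AIC = 2k - 2 log L, with k = number of estimated parameters."""
    return {
        "normal": 2 * 2 - 2 * normal_loglik(data),
        "lognormal": 2 * 2 - 2 * lognormal_loglik(data),
        "exponential": 2 * 1 - 2 * exponential_loglik(data),
    }


# Hypothetical claim amounts in thousands of dollars (NOT the assignment data).
claims = [0.8, 1.2, 1.5, 2.1, 2.4, 3.0, 3.9, 5.2, 7.5, 12.0]
scores = aic_table(claims)
best = min(scores, key=scores.get)
print("AIC by model:", scores)
print("lowest AIC:", best)
```

The fitted parameters asked for in part (c) are the quantities computed inside each function: the mean and variance (of the data or of its logarithms) and the exponential rate.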
3. You are given the data set SingTel2024A.csv, which records the stock price (in the
AdjClose column) of Singapore Telecommunications Limited (SingTel, code Z74.SI)
from 6 December 2022 to 5 December 2023. Suppose an investment holds 50,000
SingTel shares on 5 December 2023 at a price of $2.30 per share.
(a) Assuming the daily return rate is normally distributed, calculate the one-day 98.5%
VaR for the investment.
(b) Based on the historical approach without any assumption of daily return rate
distribution, calculate the one-day 98.5% VaR for the investment.
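The two VaR approaches in Question 3 can be sketched as follows. The sketch assumes simple (not log) daily returns, reports VaR as a positive loss, and uses a basic order-statistic convention for the historical quantile; other interpolation conventions give slightly different values. The function names and the price series are hypothetical and are not taken from SingTel2024A.csv.

```python
from statistics import NormalDist, mean, stdev


def daily_returns(prices):
    """Simple daily return rates from a chronological price series."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def parametric_var(returns, position_value, confidence):
    """One-day VaR assuming normally distributed daily returns."""
    z = NormalDist().inv_cdf(confidence)
    return position_value * (z * stdev(returns) - mean(returns))


def historical_var(returns, position_value, confidence):
    """One-day VaR from the empirical lower tail of the observed returns."""
    ordered = sorted(returns)
    # Assumed convention: skip floor((1 - confidence) * n) of the worst returns.
    index = int((1 - confidence) * len(ordered))
    return -ordered[index] * position_value


# Hypothetical adjusted closing prices (NOT the assignment data).
prices = [2.40, 2.38, 2.41, 2.35, 2.37, 2.33, 2.36, 2.30]
returns = daily_returns(prices)

position_value = 50_000 * 2.30  # shares times price per share, from the question
print("parametric VaR:", parametric_var(returns, position_value, 0.985))
print("historical VaR:", historical_var(returns, position_value, 0.985))
```

With a full year of prices the historical estimate depends on only the few worst days, so it can differ noticeably from the normal-based figure.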
4. Based on the data set SingTel2024A.csv, we would like to develop a simple linear
regression model to predict the daily price range (difference between daily high price
and daily low price), based on the size of daily trade volume. Justify whether we should
apply this simple linear regression model to this data set at the 0.05 level of significance.
If not, provide your justifications. If yes, construct the simple linear regression model.
5. The HDB2024A.csv data set consists of a random sample of 60 recently sold HDB
flats in Singapore. The selling price Price and the assessed value Value in dollars were
recorded in the data set. Suppose that we wanted to develop a model to predict selling
price based on assessed value (each house had been assessed at full value one year prior
to the study).
(a) Estimate the simple linear regression equation by the least squares method.
(b) Construct a scatter plot with the estimated simple linear regression equation.
(c) Perform a residual analysis on your results and evaluate the regression assumptions.
(d) At the 0.05 level of significance, is there evidence of a linear relationship between
the assessed value and the selling price?
(e) Construct a 95% confidence interval estimate of the population slope.
(f) Construct a 95% prediction interval of the selling price for the assessed value of 700
thousand dollars.
(g) Construct a 95% confidence interval estimate of the average selling price for the
assessed value of 750 thousand dollars.
(h) Is it appropriate to use the model to predict the selling price for the assessed value
of 1 million dollars? Justify your answer.
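The calculations behind parts (a), (d), (e), (f) and (g) of Question 5 can be sketched with the standard simple linear regression formulas. The sketch assumes the critical value `t_crit` (n − 2 degrees of freedom) is looked up from a t table or statistical software; the function names and the data are hypothetical and are not taken from HDB2024A.csv.

```python
import math


def fit_simple_ols(x, y):
    """Least squares fit of y = b0 + b1 * x, with the usual summary quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s_yx = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx,
        "slope": slope, "intercept": intercept,
        "s_yx": s_yx,                       # standard error of the estimate
        "se_slope": s_yx / math.sqrt(sxx),  # standard error of the slope
        "residuals": residuals,             # inputs to a residual analysis
    }


def slope_t_statistic(fit):
    """t statistic for H0: slope = 0, compared against t with n - 2 df."""
    return fit["slope"] / fit["se_slope"]


def slope_interval(fit, t_crit):
    """Confidence interval for the population slope."""
    margin = t_crit * fit["se_slope"]
    return fit["slope"] - margin, fit["slope"] + margin


def interval_at(fit, x0, t_crit, prediction=False):
    """Confidence interval for the mean response at x0, or a prediction
    interval for an individual response when prediction=True."""
    h = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    margin = t_crit * fit["s_yx"] * math.sqrt(h + (1 if prediction else 0))
    centre = fit["intercept"] + fit["slope"] * x0
    return centre - margin, centre + margin


# Hypothetical data (NOT the assignment data); n = 5, so df = 3.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_simple_ols(x, y)
t_crit = 3.182  # two-sided 95% critical value for 3 df, from a t table

print("slope, intercept:", fit["slope"], fit["intercept"])
print("t statistic:", slope_t_statistic(fit))
print("95% CI for slope:", slope_interval(fit, t_crit))
print("95% CI for mean at x0=3:", interval_at(fit, 3, t_crit))
print("95% PI at x0=3:", interval_at(fit, 3, t_crit, prediction=True))
```

The prediction interval is always wider than the confidence interval at the same x0, and both widen as x0 moves away from the sample mean of x, which is relevant to part (h).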
-END-