problem_set_1_with_solution
Question 1
Let $x = (x_1, x_2, \ldots, x_n)$ be an independent sample from the distribution with density
\[
f(x \mid \theta) = \frac{x^4}{24\theta^5} \exp\left(-\frac{x}{\theta}\right), \quad x > 0,
\]
where $\theta > 0$ is the parameter of the distribution. Given that the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 10$, find the MLE of $\theta$.
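Solution: The likelihood function is the joint density of the data, which is
\[
L(\theta) = p(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)
= \prod_{i=1}^{n} \frac{x_i^4}{24\theta^5} \exp\left(-\frac{x_i}{\theta}\right).
\]
The log-likelihood is
\[
\ell(\theta) = \log L(\theta) = -5n\log(\theta) - \frac{n}{\theta}\bar{x} + C,
\]
where $C$ does not depend on $\theta$. Setting the derivative $\ell'(\theta) = -\frac{5n}{\theta} + \frac{n\bar{x}}{\theta^2}$ to zero gives the MLE $\hat{\theta} = \bar{x}/5 = 10/5 = 2$.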
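As a numerical sanity check on the log-likelihood above, the following Python sketch maximises $\ell(\theta)$ over a grid. The sample size `n = 50`, the grid bounds, and the helper name `grid_argmax` are assumptions made for illustration only: the question does not give $n$, and the maximiser does not depend on it. Since $\ell'(\theta) = 0$ at $\theta = \bar{x}/5$, the grid maximiser should land close to 2.

```python
import math

def log_likelihood(theta, n, xbar):
    """Log-likelihood up to the constant C: -5 n log(theta) - n xbar / theta."""
    return -5 * n * math.log(theta) - n * xbar / theta

def grid_argmax(f, lo, hi, steps):
    """Return the grid point in [lo, hi] at which f is largest (hypothetical helper)."""
    best = lo
    for k in range(steps + 1):
        t = lo + (hi - lo) * k / steps
        if f(t) > f(best):
            best = t
    return best

n = 50        # assumed sample size, not given in the question
xbar = 10.0   # sample mean given in the question

# Grid with spacing 0.001 over an assumed search range [0.5, 10].
theta_hat = grid_argmax(lambda t: log_likelihood(t, n, xbar), 0.5, 10.0, 9500)
print("grid maximiser:", round(theta_hat, 3))
print("closed form xbar / 5:", xbar / 5)
```

A grid search is used here only because it needs no external library; any one-dimensional optimiser would serve the same purpose.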
Question 2
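Let $x = (x_1, \ldots, x_n)$ be an independent sample from the distribution with probability function
\[
p(x) = C(x)\,\theta^{r}(1-\theta)^{x},
\]
where $r > 0$ is a given integer, $\theta \in (0,1)$ is the unknown model parameter, and $C(x)$ is a constant independent of $\theta$. Suppose that $r = 5$ and $\bar{x} = 5$; find the MLE of $\theta$.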
Answer: The likelihood function is
\[
L(\theta \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i)
= \prod_{i=1}^{n} C(x_i)\,\theta^{r}(1-\theta)^{x_i}
= \theta^{nr}(1-\theta)^{\sum_{i=1}^{n} x_i} \prod_{i=1}^{n} C(x_i).
\]
The log-likelihood is
\[
\ell(\theta \mid x_1, \ldots, x_n) = \log L(\theta \mid x_1, \ldots, x_n)
= nr\log(\theta) + \log(1-\theta)\sum_{i=1}^{n} x_i + \log\prod_{i=1}^{n} C(x_i)
\]
\[
= nr\log(\theta) + n\bar{x}\log(1-\theta) + \log\prod_{i=1}^{n} C(x_i).
\]
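The derivative is
\[
\frac{\partial \ell(\theta \mid x_1, \ldots, x_n)}{\partial\theta} = \frac{nr}{\theta} - \frac{n\bar{x}}{1-\theta},
\]
and setting it to zero gives
\[
\hat{\theta} = \frac{r}{r+\bar{x}} = \frac{5}{5+5} = 0.5.
\]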
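The result can be checked with a short Python sketch that evaluates the score (the derivative of the log-likelihood) at $\hat{\theta} = r/(r+\bar{x})$ and compares the log-likelihood at nearby values. The sample size `n = 20` is an assumption for illustration; it is not given in the question and cancels out of the estimate. The term $\log\prod_i C(x_i)$ is omitted because it does not depend on $\theta$.

```python
import math

def log_likelihood(theta, n, r, xbar):
    """Log-likelihood without the theta-free term log prod C(x_i)."""
    return n * r * math.log(theta) + n * xbar * math.log(1 - theta)

def score(theta, n, r, xbar):
    """Derivative of the log-likelihood with respect to theta."""
    return n * r / theta - n * xbar / (1 - theta)

n = 20              # assumed sample size, not given in the question
r, xbar = 5, 5.0    # values given in the question

theta_hat = r / (r + xbar)
print("theta_hat:", theta_hat)
print("score at theta_hat:", score(theta_hat, n, r, xbar))

# The log-likelihood should be lower on either side of theta_hat.
for theta in (theta_hat - 0.05, theta_hat + 0.05):
    print(theta, log_likelihood(theta, n, r, xbar) < log_likelihood(theta_hat, n, r, xbar))
```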
Question 3
The manager of the purchasing department of a large company would like to develop a regression model to predict the average amount of time it takes to process a given number of invoices. The following model was fit to the data: $Y = \beta_0 + \beta_1 x + e$, where $Y$ is the processing time (in hours) and $x$ is the number of invoices. Utilizing the output from the fit of this model provided below, complete the following tasks.
(a) Find a 95% confidence interval for the start-up time, i.e. $\beta_0$.
Given a large data size $n$, the critical value $t_{n-2,0.025} \approx z_{0.025} = 1.96$.
(b) Suppose that a best practice benchmark for the average processing time for an additional invoice is 0.01 hours. Test the null hypothesis $H_0: \beta_1 = 0.01$ against a two-sided alternative. Calculate the test statistic $t_{\text{stat}}$ for this test.
. reg Time Invoices
------------------------------------------------------------------------------
        Time |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    Invoices |   .0112916   .0008184    13.80   0.000     -------   ---------
       _cons |   .6417099   .1222707     5.25   0.000     -------    --------
------------------------------------------------------------------------------
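Part (a): The 95% confidence interval for $\beta_0$ is $\hat{\beta}_0 \pm t_{n-2,0.025} \times \text{std. err. of } \hat{\beta}_0$, which is approximated by
\[
\hat{\beta}_0 \pm z_{0.025} \times \text{std. err. of } \hat{\beta}_0.
\]
Note that $\hat{\beta}_0 = 0.6417$ and the std. err. of $\hat{\beta}_0$ is 0.1223. Hence the 95% CI is (0.4020, 0.8814).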
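Part (b): In the linear regression model $Y = \beta_0 + \beta_1 x + e$, $\beta_1$ is the average processing time for an additional invoice. We have $\hat{\beta}_1 = 0.0113$, whose std. err. is 0.0008. The $t_{\text{stat}}$ for testing $H_0$ is
\[
t_{\text{stat}} = \frac{\hat{\beta}_1 - 0.01}{\text{std}(\hat{\beta}_1)} = 1.6250.
\]
This value uses the rounded estimates; with the unrounded output, $(0.0112916 - 0.01)/0.0008184 \approx 1.578$, and the conclusion is the same. Since $|t_{\text{stat}}|$ is below the critical value $t_{n-2,0.025} \approx z_{0.025} = 1.96$, we cannot reject $H_0$ at the 5% level. That is, the data are consistent with the benchmark of 0.01 hours for the average processing time of an additional invoice.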
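The arithmetic in parts (a) and (b) can be reproduced with the Python sketch below. It uses the unrounded coefficients and standard errors from the regression output and the normal approximation $z_{0.025} = 1.96$ stated in the question; the variable names are illustrative. Because the inputs are unrounded, the last digit of the results can differ slightly from the hand calculation with rounded values.

```python
# Estimates and standard errors copied from the regression output.
b0, se_b0 = 0.6417099, 0.1222707     # _cons (start-up time)
b1, se_b1 = 0.0112916, 0.0008184     # Invoices (time per additional invoice)

z = 1.96            # normal approximation to t_{n-2, 0.025} for large n
benchmark = 0.01    # H0: beta1 = 0.01

# Part (a): approximate 95% confidence interval for beta0.
ci_lower = b0 - z * se_b0
ci_upper = b0 + z * se_b0

# Part (b): t-statistic for H0: beta1 = benchmark, two-sided test at the 5% level.
t_stat = (b1 - benchmark) / se_b1
reject_h0 = abs(t_stat) > z

print("95% CI for beta0:", (round(ci_lower, 4), round(ci_upper, 4)))
print("t-statistic:", round(t_stat, 4))
print("reject H0:", reject_h0)
```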
Question 4
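Suppose that $(x_1, y_1), \ldots, (x_n, y_n)$ are $n$ observations. Consider the simple linear regression model,
\[
y_i = \beta_0 + \beta_1 (x_i - \bar{x}) + e_i, \quad i = 1, \ldots, n,
\]
with the $e_i$ independent, where $\bar{x}$ is the sample mean of the $x_i$. Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the least squares estimates of $\beta_0$ and $\beta_1$. Let $\hat{\epsilon}_i$ be the $i$th residual, i.e. $\hat{\epsilon}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 (x_i - \bar{x})$.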
(b) $\sum_{i=1}^{n} (x_i - \bar{x}) = -1$.
(c) $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.
2. Let $\hat{e}$ be the mean of the residuals. Please choose the correct answer:
(a) $\hat{e} = n$.
(b) $\hat{e} = 0$.
(c) $\hat{e} = -n$.
(d) None of the above.
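Since the regressor $z_i = x_i - \bar{x}$ has sample mean $\bar{z} = 0$, the least squares estimate of the intercept is
\[
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{z} = \bar{y}.
\]
Hence
\[
\hat{e} = \frac{1}{n}\sum_{i} \hat{\epsilon}_i = \frac{1}{n}\sum_{i} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 (x_i - \bar{x}) \right) = \bar{y} - \hat{\beta}_0 = 0.
\]
The correct answer is (b).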
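The two facts used in this argument, that the centred regressor sums to zero and that the residuals therefore average to zero, can be checked numerically. The Python sketch below fits the centred model by least squares on a small made-up data set; the data values and the function name `fit_centered` are hypothetical and serve only as an illustration.

```python
def fit_centered(xs, ys):
    """Least squares fit of y_i = b0 + b1 * (x_i - xbar) + e_i."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    z = [x - xbar for x in xs]      # centred regressor
    zbar = sum(z) / n               # zero up to floating-point rounding
    sxy = sum((zi - zbar) * (y - ybar) for zi, y in zip(z, ys))
    sxx = sum((zi - zbar) ** 2 for zi in z)
    b1 = sxy / sxx
    b0 = ybar - b1 * zbar           # reduces to ybar because zbar = 0
    residuals = [y - b0 - b1 * zi for zi, y in zip(z, ys)]
    return b0, b1, residuals

# Hypothetical data, chosen only to exercise the formulas.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, residuals = fit_centered(xs, ys)
ybar = sum(ys) / len(ys)
sum_centered = sum(x - sum(xs) / len(xs) for x in xs)
mean_residual = sum(residuals) / len(residuals)

print("sum of (x_i - xbar):", sum_centered)
print("b0 - ybar:", b0 - ybar)
print("mean residual:", mean_residual)
```

All three printed quantities should be zero up to floating-point rounding.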
3. Please choose the correct answer from (a)–(d) for the least squares estimate $\hat{\beta}_0$ of $\beta_0$.
(a) 0
(b) $\bar{y}$, the sample mean of the $y_i$.
(c) $y_1 + y_2 + y_3$.
(d) None of the above.