SFB Exam
EXERCISE 1
You have the following sample of 6 observations drawn from a certain population:
5 6 6 7 9 52
d) [2 POINTS] Do you prefer to summarise these data using the mean or the median? Explain why.
e) [3 POINTS] Build the frequency distribution table for these data and produce a graphical
representation for it.
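The quantities asked for in this exercise can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the text bar chart is just one possible graphical representation.

```python
from collections import Counter
from statistics import mean, median

# The six observations given in the exercise.
sample = [5, 6, 6, 7, 9, 52]

sample_mean = mean(sample)      # pulled upwards by the extreme value 52
sample_median = median(sample)  # average of the two middle values, 6 and 7

# Frequency distribution: absolute and relative frequency of each value.
freq = Counter(sample)
n = len(sample)

print(f"mean = {sample_mean:.3f}, median = {sample_median}")
print("value  freq  rel_freq")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]:>4}  {freq[value] / n:>8.3f}")

# A crude text bar chart, one '#' per observation.
for value in sorted(freq):
    print(f"{value:>5} | {'#' * freq[value]}")
```

Because a single observation (52) lies far from the others, the mean and the median differ substantially here, which is the point that question d) is probing.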
EXERCISE 2
[4 POINTS] Discuss what the covariance is, and how it differs from the correlation, both in terms
of the information that they give, and the range of values they can have.
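One way to illustrate the contrast is to rescale one variable and see which measure changes. The sketch below uses a small hypothetical data set (the numbers are invented for illustration and are not part of the exam) and hand-written helper functions for the sample covariance and correlation.

```python
from math import sqrt

def covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Express x in different units (for example metres -> centimetres).
x_rescaled = [100 * v for v in x]

cov_xy = covariance(x, y)
cov_rescaled = covariance(x_rescaled, y)
corr_xy = correlation(x, y)
corr_rescaled = correlation(x_rescaled, y)

print(cov_xy, cov_rescaled)    # the covariance scales with the units
print(corr_xy, corr_rescaled)  # the correlation is unchanged
```

The covariance is multiplied by 100 under the change of units, while the correlation stays the same and always lies between -1 and 1.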
EXERCISE 3
Explain the difference between each pair of terms:
EXERCISE 4
You have a standard deck of 52 cards and randomly draw two cards sequentially with replacement.
b) [2 POINTS] Focusing on the first card only, compute the probability that the first card is either
red (event A) or an ace (event B).
c) [2 POINTS] Are events A (the first card is red) and B (the first card is an ace) in the previous
question independent? Explain, showing the formulas you use.
d) [2 POINTS] Now imagine we select the two cards sequentially without replacement. Event A is
“the first card is red” and event B is “the second card is an ace”. Are the two events independent?
Explain why in words (not using formulas).
e) [2 POINTS] How would your answer to the previous point change if event A was “the first card
is a 9” and event B was “the second card is an ace”?
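The probabilities in points b) to e) can be verified by brute-force enumeration. The sketch below is one possible check, not a model answer: it builds a deck, counts favourable outcomes with exact fractions, and tests independence through the product rule P(A and B) = P(A) P(B). All helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_ace(card):
    return card[0] == "A"

def is_nine(card):
    return card[0] == "9"

def prob(event, outcomes):
    """Exact probability of an event over equally likely outcomes."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

# b) and c): one card only.
p_red = prob(is_red, DECK)
p_ace = prob(is_ace, DECK)
p_red_and_ace = prob(lambda c: is_red(c) and is_ace(c), DECK)
p_red_or_ace = p_red + p_ace - p_red_and_ace  # addition rule
single_card_independent = p_red_and_ace == p_red * p_ace

# d) and e): all ordered pairs of distinct cards (no replacement).
PAIRS = list(permutations(DECK, 2))

def independent(first_event, second_event):
    """Product-rule check for an event on card 1 and an event on card 2."""
    p_a = prob(lambda p: first_event(p[0]), PAIRS)
    p_b = prob(lambda p: second_event(p[1]), PAIRS)
    p_ab = prob(lambda p: first_event(p[0]) and second_event(p[1]), PAIRS)
    return p_ab == p_a * p_b

print(p_red_or_ace, single_card_independent)
print(independent(is_red, is_ace), independent(is_nine, is_ace))
```

Enumerating all 52 × 51 ordered pairs avoids any hand-derived conditional probabilities, so the check does not depend on the reasoning it is meant to confirm.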
EXERCISE 5
[6 POINTS] An Econometrics professor needs to compare the marks of students who did and did
not take a preliminary Statistics exam. For a random sample of 114 students who took the Statistics
exam, she found a mean Econometrics mark of 25.02 and a standard deviation of 0.64. For an
independent random sample of 123 students who did not take the Statistics exam, the mean
Econometrics mark was 25.10 and the standard deviation 0.56.
Test the null hypothesis that the two population means are equal against a two-sided alternative using
the p-value approach.
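A minimal numerical sketch of this test follows. It assumes that, with samples this large, the difference of sample means is approximately normal and that the variances are not pooled; a course that expects a pooled-variance t-test would obtain a slightly different statistic. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample summaries given in the exercise.
n1, mean1, sd1 = 114, 25.02, 0.64  # students who took the Statistics exam
n2, mean2, sd2 = 123, 25.10, 0.56  # students who did not

# Standard error of the difference between the two sample means
# (unpooled variances: an assumption of this sketch).
standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)

# Test statistic under H0: mu1 - mu2 = 0.
z = (mean1 - mean2) / standard_error

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

The decision rule is then to reject the null hypothesis only if the p-value falls below the chosen significance level.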
EXERCISE 6
a) [1 POINT] Explain how a “Normal distribution” differs from a “Standard Normal distribution”.
b) [2 POINTS] Explain the difference between a Binomial and a Bernoulli distribution.
EXERCISE 7
The marketing manager of a new chain of stores wants to evaluate the effects of the size of the
exhibition space in square meters (X) on the daily sales in thousands of euros (Y). For this purpose,
a random sample of 12 stores is drawn producing the following results:
a) [2 POINTS] Write the estimated linear regression model for predicting the daily sales as a
function of the size of the exhibition space.
b) [2 POINTS] Provide an estimate of the variance of the error component for the linear model
considered above.
c) [5 POINTS] Do you think it is reasonable to use the size of the exhibition space to predict the
daily sales? To answer, perform an appropriate test at a significance level α of 0.01. Show
the detailed steps of your analysis.
d) [2 POINTS] Based on the estimated model, provide the 99% prediction interval for the daily sales
of a single store that has an exhibition space of 450 square meters.
EXERCISE 8
The marketing manager of a chain of stores wants to predict the spending made by their customers
at the chain’s stores. To this end, she selects a random sample of 1000 customers, for whom she
collects the following quantities:
Using these data, the company estimates two linear regression models whose results are shown
below:
MODEL 1
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.6996
R Square 0.4894
Adjusted R Square 0.4889
Standard Error 687.0649
Observations 1000
ANOVA
             df   SS            MS            F          Significance F
Regression    1   451615196.5   451615196.5   956.6940   7.4954E-148
Residual    998   471114028.6   472058.1449
Total       999   922729225.1
MODEL 2
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.8114
R Square 0.6584
Adjusted R Square 0.6574
Standard Error 562.5277
Observations 1000
ANOVA
             df   SS            MS            F          Significance F
Regression    3   607557577.7   202519192.6   639.9977   9.558E-232
Residual    996   315171647.4   316437.397
Total       999   922729225.1
a) [1 POINT] Provide the interpretation of the estimated coefficient for the Salary variable in
MODEL 1.
b) [1 POINT] Which of the two models above would you suggest using to predict the
amount spent by the company’s customers? Provide a justification for your answer.
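When comparing the two models, it can help to see how the summary statistics follow from the ANOVA tables. The sketch below recomputes R Square, Adjusted R Square, the F statistic and the Standard Error from the sums of squares and degrees of freedom reported above; the function name and dictionary keys are illustrative.

```python
from math import sqrt

def regression_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Recompute summary statistics from the ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_square = ss_regression / ss_total
    return {
        "r_square": r_square,
        # Penalises R Square for the number of explanatory variables.
        "adj_r_square": 1 - (1 - r_square) * df_total / df_residual,
        "f": ms_regression / ms_residual,
        # Estimated standard deviation of the error term.
        "standard_error": sqrt(ms_residual),
    }

# Values copied from the two ANOVA tables.
model_1 = regression_summary(451615196.5, 471114028.6, 1, 998)
model_2 = regression_summary(607557577.7, 315171647.4, 3, 996)

for name, model in (("MODEL 1", model_1), ("MODEL 2", model_2)):
    print(name, {key: round(value, 4) for key, value in model.items()})
```

Because the two models use a different number of explanatory variables, Adjusted R Square and the Standard Error are the more natural quantities to compare than R Square alone.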