Sample Questions

Question one
The following data represent the revenues (in thousand $) recorded for a sample of
patients in one of the hospitals.
28 29 32 37 33 25 29 32 41 34

29 31 34 32 34 30 31 32 35 33
1) Does the sample contain any extreme values? Justify your answer with a suitable
test. Comment on the results.
2) According to your conclusion in part (1), calculate the best central tendency measure
and the best absolute dispersion measure. Comment on the results.
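
A minimal Python sketch of one way to approach part (1), assuming the "suitable test" is the 1.5 IQR fence rule from the formula sheet, with quartiles located at 1/4 (n+1) and 3/4 (n+1). The helper name `quartile` is hypothetical, and choosing the mean and SD when no extreme values are found (median and IQR otherwise) is a common convention rather than something the question states.

```python
from statistics import mean, median, stdev

revenues = [28, 29, 32, 37, 33, 25, 29, 32, 41, 34,
            29, 31, 34, 32, 34, 30, 31, 32, 35, 33]


def quartile(sorted_values, fraction):
    """Formula-sheet rule: location = fraction * (n + 1),
    value = start + ratio * distance."""
    n = len(sorted_values)
    location = fraction * (n + 1)
    position = int(location)            # whole part of the location (1-based)
    if position < 1 or position >= n:
        raise ValueError("location falls outside the data")
    ratio = location - position         # fractional part of the location
    start = sorted_values[position - 1]
    distance = sorted_values[position] - start
    return start + ratio * distance


ordered = sorted(revenues)
q1 = quartile(ordered, 0.25)
q3 = quartile(ordered, 0.75)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr
outliers = [x for x in ordered if x < lower_bound or x > upper_bound]

# Assumed convention: mean and SD without extreme values, median and IQR with them.
if outliers:
    central, dispersion = median(ordered), iqr
else:
    central, dispersion = mean(ordered), stdev(ordered)

print("Q1, Q3, IQR:", q1, q3, iqr)
print("bounds:", lower_bound, upper_bound)
print("outliers:", outliers)
print("central, dispersion:", central, dispersion)
```

The comparison against the two bounds is what justifies the answer, so the printed bounds are worth quoting alongside the smallest and largest observations.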

Question two
The salaries (in thousand $) of finance department managers in two companies
that work in the oil & gas field are as follows.

Company A: 21 20 29 30 31 41 20 51 40 22

Company B: 78 70 63 67 60 71 76 67 65 62

For company A, the mean salary is 30.50 thousand $, the median is 29.50
thousand $, and the SD is 10.58 thousand $.

Do you think the salaries in each company are symmetric? Support your answer with a test
for skewness.
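
As a rough check, the skewness coefficient from the formula sheet, 3(mean - median)/SD, can be computed for both companies. This is a sketch that assumes the sample standard deviation (n - 1 in the denominator); the function name `pearson_skewness` is hypothetical.

```python
from statistics import mean, median, stdev

company_a = [21, 20, 29, 30, 31, 41, 20, 51, 40, 22]
company_b = [78, 70, 63, 67, 60, 71, 76, 67, 65, 62]


def pearson_skewness(values):
    """Skewness coefficient = 3 * (mean - median) / SD (sample SD)."""
    return 3 * (mean(values) - median(values)) / stdev(values)


skew_a = pearson_skewness(company_a)
skew_b = pearson_skewness(company_b)

print("Company A:", mean(company_a), median(company_a), round(stdev(company_a), 2), round(skew_a, 2))
print("Company B:", mean(company_b), median(company_b), round(stdev(company_b), 2), round(skew_b, 2))
```

A coefficient of zero indicates symmetry; the sign gives the direction of the skew and the magnitude its strength.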

Question Three
A real-estate company has operated in Egypt since 2005 and has several projects. A
descriptive statistical analysis was carried out on the prices of the units built in several
zones (A, B, C, D and E), as shown in the table below:

Table (1)

Zone   Mean    SD      CoefVar   Minimum   Q1     Median   Q3      Maximum   IQR     Skewness
A      7.75    9.03    116.52    0.22      1.23   3.24     9.89    29.17     (a)     1.31
B      10.19   13.58   (b)       0.19      1.02   3.59     24.53   54.52     23.51   1.58
C      7.08    9.52    134.54    0.14      (c)    2.54     6.16    31.13     5.28    1.43
D      7.86    9.28    118.07    0.13      0.84   2.59     10.61   27.63     9.78    1.15
E      7.88    9.47    120.25    0.18      1.16   2.47     19.67   28.03     18.51   (d)

Graph (1)

Boxplot of prices (vertical axis: prices; horizontal axis: Zone, A to E)

You have been hired to answer the following questions:

1) Find the missing values of (a), (b), (c) and (d).

2) Comment on the mean, SD, Q1, Q3, IQR, median and range of the prices for zone A.

3) Do you think that the prices in the zones are skewed? Justify your answer with a proper
measure from table (1).

4) Based on the boxplot of the prices by zone in graph (1), comment on the graph in
terms of the existence of outliers, skewness, and which zone(s) have higher outliers.
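
For part (1), the missing entries follow from the other columns of table (1). The sketch below assumes IQR = Q3 - Q1, CoefVar = SD / Mean x 100, and, for (d), the formula-sheet skewness coefficient 3(mean - median)/SD. The skewness values printed for the other zones may come from a different (software) definition, so (d) need not be on the same scale as those entries.

```python
# Values copied from Table (1); variable names are illustrative only.

# (a) IQR for zone A = Q3 - Q1
a_iqr = round(9.89 - 1.23, 2)

# (b) Coefficient of variation for zone B = SD / Mean * 100
b_coef_var = round(13.58 / 10.19 * 100, 2)

# (c) Q1 for zone C = Q3 - IQR
c_q1 = round(6.16 - 5.28, 2)

# (d) Skewness for zone E, ASSUMING the formula-sheet coefficient
#     3 * (mean - median) / SD
d_skewness = round(3 * (7.88 - 2.47) / 9.47, 2)

print("(a) =", a_iqr)
print("(b) =", b_coef_var)
print("(c) =", c_q1)
print("(d) =", d_skewness)
```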
Question Four
The owner of Maumee Ford-Volvo wants to study the relationship between the age
of a car and its selling price (in thousand $). Listed below is a random sample of used
cars sold at the dealership during the last year.

Table (2)

Car price (in thousand $):   11  10   4   5   8  12   9  10   9   4  11   9   7   6
Car age (in years):           7   5  10  14   6   5   7  11  10  14   4   4  12   8

The scatter plot of price vs. age is as follows:

Graph (2)

Scatterplot of Price vs Age (vertical axis: Price; horizontal axis: Age)

1) Determine both dependent and independent variables, and explain the expected
relationship between them.
2) Based on graph (2), what do you think about the relationship between both
variables?
3) Calculate the correlation coefficient and comment on the results. Does your answer
match scatter plot in graph (2)?

4) Estimate the simple regression model and comment on the estimated coefficients.

5) Predict the price of a car whose age is 13 years. Comment on
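
Parts (3) to (5) can be checked with a short Python sketch that applies the formula-sheet expressions for r, the slope and the intercept directly to table (2). It assumes age is the independent variable (x) and price the dependent variable (y); the variable names are illustrative.

```python
import math

age = [7, 5, 10, 14, 6, 5, 7, 11, 10, 14, 4, 4, 12, 8]     # x, in years
price = [11, 10, 4, 5, 8, 12, 9, 10, 9, 4, 11, 9, 7, 6]    # y, in thousand $

n = len(age)
sum_x = sum(age)
sum_y = sum(price)
sum_xy = sum(x * y for x, y in zip(age, price))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in price)

# Shared numerator of r and the slope: n*sum(xy) - sum(x)*sum(y)
numerator = n * sum_xy - sum_x * sum_y

# Correlation coefficient
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Slope and intercept of the fitted line: price = b0 + b1 * age
b1 = numerator / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n

# Predicted price (in thousand $) for a 13-year-old car
predicted_price_13 = b0 + b1 * 13

print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("r =", round(r, 4))
print("b0 =", round(b0, 4), "b1 =", round(b1, 4))
print("predicted price at age 13 =", round(predicted_price_13, 2))
```

An age of 13 years lies inside the observed range of ages (4 to 14), so the prediction is an interpolation rather than an extrapolation.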

Question Five
The dataset in the table below represents the 4-month profits (in million $) of a real estate
company that operates in Egypt over the period 2014-2018.
Year   I      II     III
2014   15.6   20.4   29.4
2015   13.8   23.2   35
2016   17.8   19.4   30.6
2017   21.4   24.8   33.6

Answer the following questions:

1) Assuming that the values have seasonality, find the seasonal indices for the
company's profits and comment on the results.

2) Find the deseasonalized values of the company's profits and compare them to the
actual values numerically. Comment on the results.

3) Assuming the linear trend regression model is as follows:

$$\widehat{\text{Deseasonalised Profit}} = 21.10 + 0.4824\,\text{Time}$$

predict the expected value of the company's profit in summer 2019, including the effects
of all the components of the time series, and comment on the results.
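
The sketch below illustrates the three parts under stated assumptions. The seasonal indices use the simple-average method (period mean divided by the overall mean), which may not be the method behind the trend equation given in part (3). Time is assumed to run 1, 2, 3, ... from period I of 2014, and "summer 2019" is assumed to be period II of 2019, which gives Time = 17. The forecast multiplies the given trend value by the seasonal index.

```python
profits = {
    2014: [15.6, 20.4, 29.4],
    2015: [13.8, 23.2, 35.0],
    2016: [17.8, 19.4, 30.6],
    2017: [21.4, 24.8, 33.6],
}
periods = ["I", "II", "III"]

all_values = [value for row in profits.values() for value in row]
grand_mean = sum(all_values) / len(all_values)

# 1) Seasonal index per period = period mean / overall mean (simple-average method)
seasonal_index = [
    (sum(row[k] for row in profits.values()) / len(profits)) / grand_mean
    for k in range(len(periods))
]

# 2) Deseasonalized value = actual value / seasonal index of its period
deseasonalized = {
    year: [value / index for value, index in zip(row, seasonal_index)]
    for year, row in profits.items()
}


def trend(time):
    """Linear trend given in the question (deseasonalised profit, million $)."""
    return 21.10 + 0.4824 * time


# 3) ASSUMPTION: summer 2019 is period II of 2019, i.e. Time = 17
time_summer_2019 = 17
forecast = trend(time_summer_2019) * seasonal_index[1]

for name, index in zip(periods, seasonal_index):
    print(name, round(index, 4))
for year, row in deseasonalized.items():
    print(year, [round(v, 2) for v in row])
print("forecast:", round(forecast, 2))
```

With three periods per year the indices should sum to 3; an index below 1 marks a period that runs below the yearly average, and one above 1 a period that runs above it.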
Formula Sheet

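Mean:
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}$$

Standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}}$$

Coefficient of variation:
$$CV = \frac{s}{\bar{X}} \times 100$$

Correlation coefficient:
$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$

Regression coefficients:
$$\hat{\beta}_1 = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$$
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

First quartile location = 1/4 (n+1)
Third quartile location = 3/4 (n+1)
Value of quartile = start + ratio * distance
Lower bound = Q1 – 1.5 IQR
Upper bound = Q3 + 1.5 IQR

Skewness coefficient:
$$\text{Skewness coefficient} = 3\left(\frac{\text{mean} - \text{median}}{SD}\right)$$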
