Sample Questions
Question One
The following data represent the revenues (in thousand $) recorded for a sample of
patients in one of the hospitals.
28 29 32 37 33 25 29 32 41 34
29 31 34 32 34 30 31 32 35 33
1) Does the sample contain any extreme values? Justify your answer with a suitable
test. Comment on the results.
2) According to your conclusion in part (1), calculate the best central and the
best absolute dispersion measure. Comment on the results.
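As a rough sketch of the extreme-value check in part (1), the quartile location rule and the 1.5 IQR bounds from the formula sheet could be applied as below. The function name `quartile` and the variable names are illustrative assumptions; other quartile conventions give slightly different values.

```python
# Revenues (in thousand $) from Question One.
revenues = [28, 29, 32, 37, 33, 25, 29, 32, 41, 34,
            29, 31, 34, 32, 34, 30, 31, 32, 35, 33]


def quartile(sorted_values, fraction):
    """Quartile by the (n + 1) location rule: value = start + ratio * distance."""
    n = len(sorted_values)
    location = fraction * (n + 1)          # e.g. 1/4 (n + 1) for Q1
    position = int(location)               # whole part: 1-based rank of "start"
    ratio = location - position            # fractional part of the location
    position = min(max(position, 1), n)    # keep the rank inside the data
    next_position = min(position + 1, n)
    start = sorted_values[position - 1]
    distance = sorted_values[next_position - 1] - start
    return start + ratio * distance


ordered = sorted(revenues)
q1 = quartile(ordered, 0.25)
q3 = quartile(ordered, 0.75)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Any value outside [lower_bound, upper_bound] is flagged as extreme.
outliers = [x for x in ordered if x < lower_bound or x > upper_bound]

print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
print("Bounds:", lower_bound, "to", upper_bound)
print("Extreme values:", outliers)
```

Whether the list of flagged values is empty then guides the choice of measures in part (2): the mean and standard deviation when there are no extreme values, or the median and IQR when there are.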
Question Two
The salaries (in thousand $) for managers of the finance department in two companies
that work in the field of oil and gas are as follows.
Company A: 21 20 29 30 31 41 20 51 40 22
Company B: 78 70 63 67 60 71 76 67 65 62
where the mean for the salaries in company A is 30.50 thousand $, Median is 29.50
thousand $ and SD is 10.58 thousand $.
Do you think the salaries in each company are symmetric? Support your answer with a test
for skewness.
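A minimal sketch of the skewness test, assuming the coefficient on the formula sheet, 3(mean - median)/SD, and the sample standard deviation (which reproduces the 10.58 quoted for company A). The helper name `pearson_skewness` is an illustrative assumption.

```python
import statistics

# Salaries (in thousand $) from Question Two.
company_a = [21, 20, 29, 30, 31, 41, 20, 51, 40, 22]
company_b = [78, 70, 63, 67, 60, 71, 76, 67, 65, 62]


def pearson_skewness(values):
    """Skewness coefficient = 3 * (mean - median) / SD, with the sample SD."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    return 3 * (mean - median) / sd


skew_a = pearson_skewness(company_a)
skew_b = pearson_skewness(company_b)

print("Company A skewness:", round(skew_a, 3))
print("Company B skewness:", round(skew_b, 3))
```

A coefficient near zero suggests an approximately symmetric distribution; a positive value suggests a longer right tail and a negative value a longer left tail.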
Question Three
A real estate company has operated in Egypt since 2005 and has several projects. A
descriptive statistical analysis has been done on the prices of the units built in several zones
(A, B, C, D and E), as shown in the table below:
Table (1)
Zone  Mean   SD     CV (%)  Min   Q1    Median  Q3     Max    IQR    Skewness
A     7.75   9.03   116.52  0.22  1.23  3.24    9.89   29.17  (a)    1.31
B     10.19  13.58  (b)     0.19  1.02  3.59    24.53  54.52  23.51  1.58
C     7.08   9.52   134.54  0.14  (c)   2.54    6.16   31.13  5.28   1.43
D     7.86   9.28   118.07  0.13  0.84  2.59    10.61  27.63  9.78   1.15
E     7.88   9.47   120.25  0.18  1.16  2.47    19.67  28.03  18.51  (d)
Graph (1): Boxplot of prices by zone (horizontal axis: Zone, A to E; vertical axis: prices, ticks from 10 to 60)
2) Comment on the mean, SD, Q1, Q3, IQR, median and range of the prices for zone A.
3) Do you think that the prices in the zones are skewed? Justify your answer with a proper
measure from table (1).
4) Based on the boxplot of the prices by zone in graph (1), comment on the graph in
terms of the existence of outliers, skewness, and which zone(s) have the highest outliers.
Question Four
The owner of Maumee Ford-Volvo wants to study the relationship between the age
of a car and its selling price (in thousand $). Listed below is a random sample of used
cars sold at the dealership during the last year.
Table (2)
Car price (in thousand $): 11 10  4  5  8 12  9 10  9  4 11  9  7  6
Car age (in years):         7  5 10 14  6  5  7 11 10 14  4  4 12  8
Graph (2): Scatter plot of price against age (horizontal axis: Age; vertical axis: Price)
1) Determine both dependent and independent variables, and explain the expected
relationship between them.
2) Based on graph (2), what do you think about the relationship between both
variables?
3) Calculate the correlation coefficient and comment on the results. Does your answer
match the scatter plot in graph (2)?
5) Predict the price of a car whose age is 13 years. Comment on
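The correlation and prediction parts could be worked through with the sums that appear in the formula sheet, as in the sketch below. It assumes price is the dependent variable (y) and age the independent variable (x); the variable names are illustrative.

```python
import math

# Data from Table (2): price in thousand $, age in years.
price = [11, 10, 4, 5, 8, 12, 9, 10, 9, 4, 11, 9, 7, 6]   # y (assumed dependent)
age = [7, 5, 10, 14, 6, 5, 7, 11, 10, 14, 4, 4, 12, 8]    # x (assumed independent)

n = len(age)
sum_x = sum(age)
sum_y = sum(price)
sum_xy = sum(x * y for x, y in zip(age, price))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in price)

# Shared numerator of r and the slope: n*sum(xy) - sum(x)*sum(y).
numerator = n * sum_xy - sum_x * sum_y

# Correlation coefficient.
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Least-squares slope and intercept.
slope = numerator / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - slope * (sum_x / n)

# Predicted price (in thousand $) for a 13-year-old car.
predicted_price = intercept + slope * 13

print("r =", round(r, 3))
print("price =", round(intercept, 3), "+", round(slope, 3), "* age")
print("Predicted price at age 13:", round(predicted_price, 2))
```

An age of 13 lies inside the observed range of 4 to 14 years, so the prediction is an interpolation rather than an extrapolation.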
Question Five
The dataset in the table below represents the 4-month profits (in million $) of a real estate
company operating in Egypt over the period 2014-2018.
I II III
1) Assuming that the values have seasonality, find the seasonal indices for the
2) Find the deseasonalized values of the company’s profits and compare them to the actual
3) Predict the expected value of the company’s profit in summer 2019, including the effects
of all the components of the time series, and comment on the results.
Formula Sheet
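Correlation coefficient:
r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}}

Regression coefficients:
\hat{\beta}_1 = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}

Quartiles:
First quartile location = 1/4 (n+1)
Third quartile location = 3/4 (n+1)
Value of quartile = start + ratio * distance

Outlier bounds:
Lower bound = Q1 - 1.5 IQR
Upper bound = Q3 + 1.5 IQR

Skewness coefficient:
Skewness coefficient = 3 \left( \frac{mean - median}{SD} \right)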