Finalexamcorrection 1

The document presents a detailed statistical analysis of gas consumption and salary among households and executives. It includes calculations for averages, variances, quartiles, and correlations, as well as graphical representations like histograms and scatter plots. Additionally, it discusses the reliability of a diagnostic test for a disease using conditional probabilities.

Name:                    Group:

Exercise 1. A household survey of gas consumption yielded the following results:

Gas consumption | Frequency (f_i) | Class midpoint (x_i) | Cum. freq. | Rel. cum. freq.
0-9             | 1               | 4.5                  | 1          | 0.01
10-19           | 2               | 14.5                 | 3          | 0.03
20-29           | 1               | 24.5                 | 4          | 0.04
30-39           | 5               | 34.5                 | 9          | 0.09
40-49           | 8               | 44.5                 | 17         | 0.17
50-59           | 16              | 54.5                 | 33         | 0.33
60-69           | 19              | 64.5                 | 52         | 0.52
70-79           | 20              | 74.5                 | 72         | 0.72
80-89           | 17              | 84.5                 | 89         | 0.89
90-99           | 11              | 94.5                 | 100        | 1.00
(1)

1. Complete the above statistical table.

2. What is the average gas consumption of these households? Calculate the variance, standard deviation and coefficient of variation.
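Since the data are grouped into 10 classes with midpoints $x_i$, we have:
\[
\mu = \bar{X} = \frac{\sum_{i=1}^{10} f_i x_i}{\sum_{i=1}^{10} f_i} = \frac{6650}{100} = 66.5 \qquad (0.25)
\]

\[
\sigma^2 = \mathrm{Var}(X) = \frac{1}{\sum_{i=1}^{10} f_i - 1} \sum_{i=1}^{10} f_i (x_i - \mu)^2 = \frac{37800}{99} = 381.82 \qquad (0.5)
\]
and
\[
\sigma = \sqrt{\frac{1}{\sum_{i=1}^{10} f_i - 1} \sum_{i=1}^{10} f_i (x_i - \mu)^2} = \sqrt{381.82} = 19.54. \qquad (0.25)
\]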
The coefficient of variation is given by
\[
CV = \frac{\sigma}{\mu} \times 100 = \frac{19.54}{66.5} \times 100 = 29.383\% \qquad (0.25)
\]
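The grouped-data formulas above can be checked numerically. The following is a minimal sketch, assuming the class midpoints and frequencies from the table and the n - 1 divisor used in the correction; the variable names are illustrative only.

```python
import math

# Class midpoints x_i and frequencies f_i taken from the gas-consumption table.
midpoints = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [1, 2, 1, 5, 8, 16, 19, 20, 17, 11]

n = sum(freqs)                                                     # total number of households
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n            # weighted mean
sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))  # weighted squared deviations
variance = sum_sq / (n - 1)                                        # divisor n - 1, as in the correction
std_dev = math.sqrt(variance)
cv_percent = 100 * std_dev / mean                                  # coefficient of variation in %

print(n, mean, sum_sq)
print(round(variance, 2), round(std_dev, 2), round(cv_percent, 2))
```

The coefficient of variation computed without intermediate rounding may differ from 29.383% in the last digit, because the correction rounds the standard deviation to 19.54 first.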

3. Give $Q_1$, $Q_3$, $D_5$ and $P_5$.


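The first quartile $Q_1$ is located in the class 50-59 and the third quartile $Q_3$ is in 80-89. With $L_1$ the lower limit of the class containing the quantile, $Cf$ the cumulative frequency before that class, $f$ the frequency of that class and $w = 10$ the class width, they are given by
\[
Q_1 = L_1 + \frac{\frac{n}{4} - Cf_{q_1}}{f_{q_1}} \, w = 50 + \frac{\frac{100}{4} - 17}{16} \times 10 = 55 \qquad (0.5)
\]
and
\[
Q_3 = L_1 + \frac{\frac{3n}{4} - Cf_{q_3}}{f_{q_3}} \, w = 80 + \frac{\frac{300}{4} - 72}{17} \times 10 = 81.765. \qquad (0.5)
\]
Also, $D_5$ and $P_5$ are located in 60-69 and 30-39 respectively, and are given by
\[
D_5 = L_1 + \frac{\frac{5n}{10} - Cf_{d_5}}{f_{d_5}} \, w = 60 + \frac{\frac{500}{10} - 33}{19} \times 10 = 68.947 \qquad (0.5)
\]
and
\[
P_5 = L_1 + \frac{\frac{5n}{100} - Cf_{p_5}}{f_{p_5}} \, w = 30 + \frac{\frac{500}{100} - 4}{5} \times 10 = 32. \qquad (0.5)
\]
Notice that $D_5 = Q_2 =$ the median $\tilde{X}$.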
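The four quantiles all follow the same interpolation rule, which can be sketched as one helper. This is an illustration under the correction's conventions (lower class limits 0, 10, ..., 90 and width 10); the function name `grouped_quantile` and its signature are hypothetical.

```python
def grouped_quantile(k, parts, lowers, freqs, width):
    """Return the k-th quantile out of `parts` equal parts for grouped data.

    Interpolates linearly inside the class where the cumulative
    frequency first reaches k * n / parts.
    """
    n = sum(freqs)
    target = k * n / parts
    cum_before = 0  # cumulative frequency before the current class
    for lower, f in zip(lowers, freqs):
        if cum_before + f >= target:
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k must satisfy 0 < k <= parts")


lowers = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]   # lower class limits
freqs = [1, 2, 1, 5, 8, 16, 19, 20, 17, 11]        # class frequencies

q1 = grouped_quantile(1, 4, lowers, freqs, 10)     # first quartile
q3 = grouped_quantile(3, 4, lowers, freqs, 10)     # third quartile
d5 = grouped_quantile(5, 10, lowers, freqs, 10)    # fifth decile (median)
p5 = grouped_quantile(5, 100, lowers, freqs, 10)   # fifth percentile

print(q1, round(q3, 3), round(d5, 3), p5)
```

Passing the quantile as `k` out of `parts` keeps the target rank an exact multiple of `n / parts` for these data, which avoids rounding surprises from decimal fractions such as 0.05.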
4. Plot the histogram and the ogive of the above distribution, then determine the mode and median of the series graphically.

[Figure: histogram of the distribution] (1)
[Figure: ogive of the distribution] (1)

5. Give the modal class, then calculate the mode value.


Clearly, the highest frequency is for the class 70-79, to which the mode $\hat{X}$ belongs, and it is given by
\[
\hat{X} = L_{mo} + \frac{\Delta_1}{\Delta_1 + \Delta_2} \, w = 70 + \frac{(20 - 19)}{(20 - 19) + (20 - 17)} \times 10 = 72.5 \qquad (0.75)
\]
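The mode interpolation can be reproduced in a few lines. This sketch assumes the modal class is not the first or last class, so that both neighbouring frequencies exist; the helper name `grouped_mode` is hypothetical.

```python
def grouped_mode(lower, width, f_modal, f_prev, f_next):
    """Interpolated mode of grouped data from the modal class and its neighbours."""
    delta1 = f_modal - f_prev   # excess over the preceding class
    delta2 = f_modal - f_next   # excess over the following class
    return lower + delta1 / (delta1 + delta2) * width


lowers = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
freqs = [1, 2, 1, 5, 8, 16, 19, 20, 17, 11]

# Index of the modal class (highest frequency); assumed to be an interior class.
i = max(range(len(freqs)), key=lambda j: freqs[j])
mode = grouped_mode(lowers[i], 10, freqs[i], freqs[i - 1], freqs[i + 1])

print(lowers[i], mode)
```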
Exercise 2. A statistician carries out a study on 200 executives of a company and seeks to study the link between the age of the executives and the monthly salary (in thousands of DA) that they receive. He presents his results in a contingency table, where the statistical variable X represents age and the statistical variable Y represents monthly salary.
X \ Y    | [18, 22[ | [22, 26[ | [26, 30[ | [30, 34[ | [34, 38[ | n_i.
[20, 28[ | 12       | 5        | 4        | 3        | 0        | 24
[28, 36[ | 7        | 11       | 9        | 4        | 1        | 32
[36, 44[ | 4        | 9        | 10       | 9        | 8        | 40
[44, 52[ | 1        | 6        | 12       | 11       | 15       | 45
[52, 60[ | 1        | 4        | 15       | 17       | 22       | 59
n_.j     | 25       | 35       | 50       | 44       | 46       | 200
1. Give the percentage of executives whose salary is greater than or equal to 26,000 DA.
\[
\frac{n_{\cdot 3} + n_{\cdot 4} + n_{\cdot 5}}{N} \times 100\% = \frac{50 + 44 + 46}{200} \times 100\% = 70\% \qquad (0.5)
\]

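2. Among executives aged at least 36, give the percentage of those who receive less than 30,000 DA.
\[
\frac{\sum_{i=3}^{5} \sum_{j=1}^{3} n_{ij}}{N} \times 100\% = \frac{4 + 9 + 10 + 1 + 6 + 12 + 1 + 4 + 15}{200} \times 100\% = 31\% \qquad (0.5)
\]
This is the share among all 200 executives. Taking the 144 executives aged at least 36 (40 + 45 + 59) as the reference group instead gives 62/144 ≈ 43.06%.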
3. Give the conditional distribution of the age variable knowing that the salary received is in class [26, 30[.
X | Y ∈ [26, 30[ | [20, 28[ | [28, 36[ | [36, 44[ | [44, 52[ | [52, 60[
Frequency        | 4        | 9        | 10       | 12       | 15
(0.5)
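The first three questions of this exercise are all sums over parts of the contingency table, so they can be checked with a short sketch. The table is stored as a list of rows (age classes) by columns (salary classes); the variable names are illustrative only. For question 2 the sketch prints the count both as a share of all 200 executives and as a share of those aged at least 36.

```python
# Contingency table: rows are age classes, columns are salary classes.
table = [
    [12, 5, 4, 3, 0],     # age [20, 28[
    [7, 11, 9, 4, 1],     # age [28, 36[
    [4, 9, 10, 9, 8],     # age [36, 44[
    [1, 6, 12, 11, 15],   # age [44, 52[
    [1, 4, 15, 17, 22],   # age [52, 60[
]

total = sum(sum(row) for row in table)
col_totals = [sum(col) for col in zip(*table)]   # marginal totals n_.j

# Question 1: salary >= 26 (thousand DA) means salary columns 3 to 5.
pct_salary_ge_26 = 100 * sum(col_totals[2:]) / total

# Question 2: age >= 36 (rows 3 to 5) and salary < 30 (columns 1 to 3).
count_q2 = sum(sum(row[:3]) for row in table[2:])
n_aged_36_plus = sum(sum(row) for row in table[2:])
pct_of_all = 100 * count_q2 / total
pct_of_aged_36_plus = 100 * count_q2 / n_aged_36_plus

# Question 3: distribution of age given salary in [26, 30[ (third column).
age_given_salary = [row[2] for row in table]
age_given_salary_rel = [f / col_totals[2] for f in age_given_salary]

print(pct_salary_ge_26)                                  # 70.0
print(count_q2, pct_of_all, round(pct_of_aged_36_plus, 2))
print(age_given_salary, age_given_salary_rel)
```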

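4. Plot the scatter plot between age class [20, 28[ and age class [52, 60[. Comment.

[Figure: scatter plot of the frequencies of age class [20, 28[ against those of age class [52, 60[] (1)

The scatter plot shows a relatively strong negative linear relation between the two classes. (0.5)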

5. Calculate the correlation coefficient and give the regression line between the two classes, then plot it and give the value of the coefficient of determination.
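X_1: n_ij of X ∈ [20, 28[ | 12 | 5 | 4  | 3  | 0
X_2: n_ij of X ∈ [52, 60[ | 1  | 4 | 15 | 17 | 22
The above scatter plot suggested the presence of a linear correlation; hence, we choose Pearson's coefficient of correlation to measure it.
We have
\[
\sum x_{1i} = 24, \quad \sum x_{2i} = 59, \quad \sum (x_{1i} - \bar{X}_1)^2 = 78.8, \quad \sum (x_{2i} - \bar{X}_2)^2 = 318.8, \quad \sum (x_{1i} - \bar{X}_1)(x_{2i} - \bar{X}_2) = -140.2.
\]
\[
\rho(X_1, X_2) = \rho(X_2, X_1) = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_{X_1} \sigma_{X_2}} = \frac{\sum (x_{1i} - \bar{X}_1)(x_{2i} - \bar{X}_2)}{\sqrt{\sum (x_{1i} - \bar{X}_1)^2} \sqrt{\sum (x_{2i} - \bar{X}_2)^2}} = \frac{-140.2}{\sqrt{78.8} \sqrt{318.8}} = -0.88456 \qquad (1)
\]
which indicates the existence of a strong negative linear correlation between the two variables.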
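Set
\[
X_1 = a X_2 + b + \varepsilon \quad \text{and} \quad X_2 = a' X_1 + b' + \varepsilon';
\]
we have
\[
a = \frac{\mathrm{Cov}(X_1, X_2)}{\mathrm{Var}(X_2)} = \frac{\sum (x_{1i} - \bar{X}_1)(x_{2i} - \bar{X}_2)}{\sum (x_{2i} - \bar{X}_2)^2} = \frac{-140.2}{318.8} = -0.43977,
\]
\[
b = \bar{X}_1 - a \bar{X}_2 = \frac{24}{5} - (-0.43977) \frac{59}{5} = 9.9893
\]
and
\[
a' = \frac{\mathrm{Cov}(X_1, X_2)}{\mathrm{Var}(X_1)} = \frac{-140.2}{78.8} = -1.7792,
\]
\[
b' = \bar{X}_2 - a' \bar{X}_1 = \frac{59}{5} - (-1.779) \frac{24}{5} = 20.339;
\]
thus
\[
X_1 = -0.43977 X_2 + 9.9893 + \varepsilon \quad \text{and} \quad X_2 = -1.7792 X_1 + 20.339 + \varepsilon' \qquad (1.5)
\]
The coefficient of determination in both cases is
\[
R^2 = \frac{V_E(X_1)}{\mathrm{Var}(X_1)} = \frac{V_E(X_2)}{\mathrm{Var}(X_2)} = \rho(X_1, X_2)^2 = 0.78245 \qquad (0.5)
\]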
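The correlation coefficient, both regression lines and the coefficient of determination all derive from three sums of squares and products. The sketch below recomputes them from the two frequency rows using only the standard library; the variable names are illustrative only. Results computed without intermediate rounding can differ from the hand-rounded values in the last digit (for example, the second intercept comes out near 20.340).

```python
import math

x1 = [12, 5, 4, 3, 0]     # frequencies of age class [20, 28[ across salary classes
x2 = [1, 4, 15, 17, 22]   # frequencies of age class [52, 60[ across salary classes

n = len(x1)
mean1 = sum(x1) / n
mean2 = sum(x2) / n

# Sums of squared deviations and of cross products.
s11 = sum((a - mean1) ** 2 for a in x1)
s22 = sum((b - mean2) ** 2 for b in x2)
s12 = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))

rho = s12 / math.sqrt(s11 * s22)   # Pearson correlation coefficient

# Regression of X1 on X2: X1 = slope_1 * X2 + intercept_1
slope_1 = s12 / s22
intercept_1 = mean1 - slope_1 * mean2

# Regression of X2 on X1: X2 = slope_2 * X1 + intercept_2
slope_2 = s12 / s11
intercept_2 = mean2 - slope_2 * mean1

r_squared = rho ** 2               # coefficient of determination

print(round(rho, 5), round(r_squared, 5))
print(round(slope_1, 5), round(intercept_1, 4))
print(round(slope_2, 4), round(intercept_2, 3))
```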
Exercise 3. I. A diagnostic test for a disease is such that it (correctly) detects the disease in 90% of the individuals who actually have the disease. Also, if a person does not have the disease, the test will report that he or she does not have it with probability 0.9. Only 1% of the population has the disease in question. If a person is chosen at random from the population and the diagnostic test indicates that she has the disease,
1. what is the conditional probability that she does, in fact, have the disease?
Set H: "a person has the disease", so P(H) = 1% = 0.01, and E: "the test shows a positive result".
Thus P(E | H) = 90% = 0.9 and P(E | H^c) = 1 - 0.9 = 10% = 0.1.
We want to find P(H | E); using Bayes' theorem, we have
\[
P(H \mid E) = \frac{P(E \mid H) P(H)}{P(E \mid H) P(H) + P(E \mid H^c) P(H^c)} = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.1 \times 0.99} = 0.083333 \qquad (1.5)
\]
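The Bayes computation can be written as a small function of the three quantities given in the exercise. This is a minimal sketch; the function name `posterior` and the parameter names (prevalence, sensitivity, specificity) are labels chosen here, not terms used in the exercise.

```python
def posterior(prevalence, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    false_positive_rate = 1 - specificity
    true_positives = sensitivity * prevalence                 # P(E | H) P(H)
    false_positives = false_positive_rate * (1 - prevalence)  # P(E | H^c) P(H^c)
    return true_positives / (true_positives + false_positives)


p_disease_given_positive = posterior(prevalence=0.01, sensitivity=0.9, specificity=0.9)
print(round(p_disease_given_positive, 6))
```

Calling the same function with a larger prevalence, for example `posterior(0.2, 0.9, 0.9)`, shows how strongly the result depends on how common the disease is.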

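2. Are you surprised by the answer? Would you call this diagnostic test reliable?
Surprisingly, the test does not appear to be reliable: even with a positive result, the probability of having the disease is only about 8.3%. The disease is rare, so the false positives among the 99% of healthy people (0.1 × 0.99 = 0.099) far outnumber the true positives (0.9 × 0.01 = 0.009).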
II. X and Y are independent, discrete random variables whose probability functions are given in the tables below:
x        | 1   | 2   | 3
P(X = x) | 1/3 | 1/2 | 1/6

y        | 1   | 2   | 3
P(Y = y) | 2/3 | 1/6 | 1/6

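1. Compute the probability P(X + Y = 4).

\[
\begin{aligned}
P(X + Y = 4) &= P(X = 1, Y = 3) + P(X = 2, Y = 2) + P(X = 3, Y = 1) \\
&= P(X = 1) P(Y = 3) + P(X = 2) P(Y = 2) + P(X = 3) P(Y = 1) \quad \text{(X and Y are independent)} \\
&= \frac{1}{3} \cdot \frac{1}{6} + \frac{1}{2} \cdot \frac{1}{6} + \frac{1}{6} \cdot \frac{2}{3} = 0.25 \qquad (1.5)
\end{aligned}
\]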
2. Compute the conditional probability P(X ≤ 2 | X + Y = 4).

\[
P(X \le 2 \mid X + Y = 4) = \frac{P(X \le 2, X + Y = 4)}{P(X + Y = 4)} = \frac{P(X = 1, Y = 3) + P(X = 2, Y = 2)}{P(X + Y = 4)} = \frac{\frac{1}{3} \cdot \frac{1}{6} + \frac{1}{2} \cdot \frac{1}{6}}{0.25} = 0.556 \qquad (1)
\]
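Both probabilities can be verified exactly by enumerating the joint distribution, which is the product of the marginals because X and Y are independent. The sketch below uses `fractions.Fraction` to avoid rounding; the dictionary names are illustrative only.

```python
from fractions import Fraction

# Marginal probability functions from the tables.
p_x = {1: Fraction(1, 3), 2: Fraction(1, 2), 3: Fraction(1, 6)}
p_y = {1: Fraction(2, 3), 2: Fraction(1, 6), 3: Fraction(1, 6)}

# Independence: the joint probability is the product of the marginals.
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

p_sum_4 = sum(p for (x, y), p in joint.items() if x + y == 4)
p_x_le_2_and_sum_4 = sum(p for (x, y), p in joint.items() if x + y == 4 and x <= 2)
p_conditional = p_x_le_2_and_sum_4 / p_sum_4

print(p_sum_4, p_conditional, round(float(p_conditional), 3))
```

Working with exact fractions shows that the rounded value 0.556 is 5/9.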

3. Give the expectation of $g(X, Y) = X^2 Y$.
Since X and Y are independent, we have $E(g(X, Y)) = E(X^2 Y) = E(X^2) E(Y)$ with
\[
E(X^2) = 1^2 \cdot \frac{1}{3} + 2^2 \cdot \frac{1}{2} + 3^2 \cdot \frac{1}{6} = 3.8333
\]
\[
E(Y) = 1 \cdot \frac{2}{3} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} = 1.5 \qquad (1.5)
\]
Therefore, $E(g(X, Y)) = 5.75$.
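The factorisation E(X²Y) = E(X²)E(Y) can be checked against a direct sum over the joint distribution. A minimal sketch with exact fractions follows; the variable names are illustrative only.

```python
from fractions import Fraction

p_x = {1: Fraction(1, 3), 2: Fraction(1, 2), 3: Fraction(1, 6)}
p_y = {1: Fraction(2, 3), 2: Fraction(1, 6), 3: Fraction(1, 6)}

e_x_squared = sum(x ** 2 * p for x, p in p_x.items())   # E(X^2)
e_y = sum(y * p for y, p in p_y.items())                # E(Y)
e_product = e_x_squared * e_y                           # uses independence

# Direct computation over the joint distribution, for comparison.
e_direct = sum(x ** 2 * y * p_x[x] * p_y[y] for x in p_x for y in p_y)

print(e_x_squared, e_y, e_product, float(e_product))
```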

4. Give the CDF of $\min(X, Y)$.

Set $Z = \min(X, Y)$; then the CDF of Z is given by
\[
\begin{aligned}
F_Z(z) &= P(Z \le z) = 1 - P(Z > z) \\
&= 1 - P(\min(X, Y) > z) \\
&= 1 - P(X > z \text{ and } Y > z) \qquad (1.5) \\
&= 1 - P(X > z) P(Y > z) \\
&= 1 - (1 - P(X \le z)) (1 - P(Y \le z)) \\
&= 1 - (1 - F_X(z)) (1 - F_Y(z))
\end{aligned}
\]
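The general formula can be evaluated at the support points z = 1, 2, 3 using the two probability tables. This sketch goes one step beyond the correction, which stops at the formula; the helper name `cdf` and the variable names are hypothetical.

```python
from fractions import Fraction

p_x = {1: Fraction(1, 3), 2: Fraction(1, 2), 3: Fraction(1, 6)}
p_y = {1: Fraction(2, 3), 2: Fraction(1, 6), 3: Fraction(1, 6)}


def cdf(pmf, z):
    """F(z) = P(V <= z) for a discrete variable V with probability function `pmf`."""
    return sum(p for value, p in pmf.items() if value <= z)


# F_Z(z) = 1 - (1 - F_X(z)) (1 - F_Y(z)) at each support point.
cdf_min = {z: 1 - (1 - cdf(p_x, z)) * (1 - cdf(p_y, z)) for z in (1, 2, 3)}

# Cross-check by enumerating the joint distribution of the independent pair.
cdf_min_direct = {
    z: sum(p_x[x] * p_y[y] for x in p_x for y in p_y if min(x, y) <= z)
    for z in (1, 2, 3)
}

print(cdf_min)
```

Between support points the CDF is constant, with F_Z(z) = 0 for z < 1 and F_Z(z) = 1 for z ≥ 3.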
