ISO Module 4 BCS301
MODULE-4
Statistical Inference 2: Sampling of variables, central limit theorem and confidence limits for unknown
mean. Test of significance for means of two small samples, Student's t-distribution, and Chi-square
distribution as a test of goodness of fit. F-Distribution.
Sampling of variables: Each member of the population gives a value of the variable, and the population is a
frequency distribution of the variable.
Thus, drawing a random sample of size $n$ from the population is the same as selecting $n$ values of the variable from those of the
distribution.
Sampling distribution: The probability distribution of a statistic is called a sampling distribution.
Sampling distribution of the mean: The probability distribution of 𝑋̅ is called the sampling distribution of the
mean.
The first important sampling distribution to be considered is that of the mean $\bar{X}$. Suppose that a random sample
of $n$ observations is taken from a normal population with mean $\mu$ and variance $\sigma^2$.
Each observation $X_i$, $i = 1, 2, \ldots, n$, of the random sample will then have the same normal distribution as
the population being sampled.
Hence, we conclude that $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$ has a normal distribution with mean
$\mu_{\bar{X}} = \frac{1}{n}(\underbrace{\mu + \mu + \cdots + \mu}_{n\ \text{terms}}) = \mu.$
Similarly, since the observations are independent, the variance is
$\sigma_{\bar{X}}^2 = \frac{1}{n^2}(\underbrace{\sigma^2 + \sigma^2 + \cdots + \sigma^2}_{n\ \text{terms}}) = \frac{\sigma^2}{n}.$
Therefore, if a population is distributed normally with mean $\mu$ and standard deviation $\sigma$, then the means of all
possible random samples of size $n$ are also distributed normally with mean $\mu$ and standard error $\sigma/\sqrt{n}$.
If we are sampling from a population with unknown distribution, either finite or infinite, the sampling
distribution of $\bar{X}$ will still be approximately normal with mean $\mu$ and variance $\sigma^2/n$, provided that the sample size
is large.
This amazing result is an immediate consequence of the following theorem, called the Central Limit Theorem.
Central limit theorem: If $\bar{x}$ is the mean of a random sample of size $n$ drawn from a population (not necessarily normal) with mean $\mu$ and standard deviation $\sigma$,
then the limiting distribution of $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ as $n \to \infty$ is the standard normal distribution.
In practice, this theorem holds good for a sample of 25 or more, which is regarded as large.
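The theorem can be illustrated numerically. The following is a minimal sketch, assuming an exponential population with mean 1 and standard deviation 1 (a choice made here purely for illustration, not taken from these notes); the function name `simulate_sample_means` is hypothetical. It draws many samples of size 25 and checks that the sample means have mean close to $\mu$ and standard deviation close to $\sigma/\sqrt{n} = 0.2$.

```python
import random
import statistics

def simulate_sample_means(n, trials, seed=0):
    """Return the means of `trials` random samples of size n.

    Assumption: the population is exponential with rate 1,
    so mu = 1 and sigma = 1 (clearly non-normal).
    """
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(trials)]

means = simulate_sample_means(n=25, trials=20000)
# The mean of the sample means should be close to mu = 1
print("mean of sample means:", round(statistics.fmean(means), 3))
# Their standard deviation should be close to sigma / sqrt(n) = 0.2
print("S.D. of sample means:", round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` would show the bell shape, even though the population itself is strongly skewed.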
1. An electrical firm manufactures light bulbs that have a length of life that is approximately normally
distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a
random sample of 16 bulbs will have an average life of less than 775 hours.
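Solution: Given that $\mu = 800$, $\sigma = 40$ and $n = 16$.
The sampling distribution of $\bar{x}$ will be approximately normal,
with $\mu_{\bar{X}} = 800$ and standard error $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 40/\sqrt{16} = 10$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 800}{10}.$
The probability that a random sample of 16 bulbs will have an average life of less than 775 hours is
$P(\bar{x} < 775) = P(z < -2.5) = 0.5 - A(2.5) = 0.5 - 0.4938 = 0.0062,$
where $A(z)$ denotes the area under the standard normal curve between 0 and $z$.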
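The table lookup in this example can be checked with a short script. This is a sketch using Python's standard `statistics.NormalDist`; the helper name `prob_mean_below` is an assumption introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(x_bar, mu, sigma, n):
    """Return (z, P(sample mean < x_bar)) for a sample mean that is
    normally distributed with mean mu and standard error sigma/sqrt(n)."""
    standard_error = sigma / sqrt(n)
    z = (x_bar - mu) / standard_error
    return z, NormalDist().cdf(z)

z, p = prob_mean_below(775, mu=800, sigma=40, n=16)
print(z, round(p, 4))   # -2.5 0.0062
```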
2. The mean of a certain normal population is equal to the standard error of the mean of samples of 100 from
that distribution. Find the probability that the mean of the sample of 25 from the distribution will be negative.
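No solution to Problem 2 is given here. One way to reason about it is sketched below, under the assumption that "the standard error of the mean of samples of 100" means $\sigma/\sqrt{100}$. Then $\mu = \sigma/10$, the standard error for samples of 25 is $\sigma/5$, and $z = (0 - \mu)/(\sigma/5) = -0.5$ whatever the value of $\sigma$. The helper name `prob_negative_mean` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_negative_mean(sigma, n_reference=100, n=25):
    """P(mean of a sample of size n is negative) when the population mean
    equals the standard error of the mean for samples of size n_reference."""
    mu = sigma / sqrt(n_reference)      # population mean, by the problem statement
    standard_error = sigma / sqrt(n)    # standard error for samples of size n
    z = (0 - mu) / standard_error
    return z, NormalDist().cdf(z)

# The answer does not depend on sigma; two arbitrary values are tried here.
for sigma in (1.0, 7.5):
    z, p = prob_negative_mean(sigma)
    print(round(z, 4), round(p, 4))   # -0.5 0.3085 each time
```

In table notation this corresponds to $P(z < -0.5) = 0.5 - A(0.5) = 0.5 - 0.1915 = 0.3085$.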
Review questions:
1. Traveling between two campuses of a university in a city via shuttle bus takes, on average, 28 minutes with a
standard deviation of 5 minutes. In a given week, a bus transported passengers 40 times. What is the probability
that the average transport time was more than 30 minutes? Assume the mean time is measured to the nearest
minute.
Solution: Given that $\mu = 28$, $\sigma = 5$ and $n = 40$.
The sampling distribution of $\bar{x}$ will be approximately normal,
with standard error $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 5/\sqrt{40} = 0.7906$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 28}{0.7906}.$
Since the time is measured on a continuous scale to the nearest minute, an $\bar{x}$ greater than 30 is equivalent to
$\bar{x} \ge 30.5$.
The probability that the average of the 40 transport times was more than 30 minutes is
$P(\bar{x} > 30.5) = P(z > 3.1622) = 0.5 - A(3.1622) = 0.5 - 0.4992 = 0.0008.$
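2. A sample of 900 members is found to have a mean of 3.4 cm. Can it be reasonably regarded as a truly random
sample from a large population with mean 3.25 cm and standard deviation 1.61 cm?
Solution: Here $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{3.4 - 3.25}{1.61/\sqrt{900}} = 2.795 > 2.58$.
The deviation of the sample mean from the population mean is significant, and hence the sample cannot be regarded as a random sample from this population.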
3. An important manufacturing process produces cylindrical component parts for the automotive industry. It is
important that the process produce parts having a mean diameter of 5.0 millimeters. The engineer involved
conjectures that the population mean is 5.0 millimeters. An experiment is conducted in which 100 parts
produced by the process are selected randomly and the diameter measured on each. It is known that the
population standard deviation is σ = 0.1 millimeter. The experiment indicates a sample average diameter of 𝑥 =
5.027 millimeters. Does this sample information appear to support or refute the engineer’s conjecture?
Solution: H: The sample data $\bar{x} = 5.027$ support the conjecture $\mu = 5.0$.
Given that $\mu = 5$, $\sigma = 0.1$ and $n = 100$.
The sampling distribution of $\bar{x}$ will be approximately normal,
with standard error $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.1/\sqrt{100} = 0.01$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{0.027}{0.01} = 2.7 > 2.58.$
Therefore, the data do not support the conjecture that $\mu = 5.0$.
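The two significance checks above (the sample of 900 members and the cylindrical parts) follow the same recipe, which can be sketched as below. The function name `z_statistic` is an assumption for illustration; 2.58 is the critical value used in these notes.

```python
from math import sqrt

def z_statistic(x_bar, mu, sigma, n):
    """z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))

# Cylindrical parts: mu = 5.0, sigma = 0.1, n = 100, sample mean 5.027
z_parts = z_statistic(5.027, 5.0, 0.1, 100)
# Sample of 900 with mean 3.4; population mean 3.25, S.D. 1.61
z_sample = z_statistic(3.4, 3.25, 1.61, 900)

for name, z in (("parts", z_parts), ("sample of 900", z_sample)):
    verdict = "significant" if abs(z) > 2.58 else "not significant"
    print(f"{name}: z = {z:.3f}, {verdict}")
```

Both values exceed 2.58, in agreement with the conclusions stated for the two problems.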
Review questions:
1. For what values of the difference between the sample mean and the mean of the population is the null hypothesis
rejected at the 0.05 level of significance, if the standard deviation of the population is 5 and the sample size is 100?
2. For what values of the difference between the sample mean and the mean of the population is the null hypothesis
accepted at the 0.01 level of significance, if the standard deviation of the population is 5 and the sample size is 100?
3. If the standard error of the mean over all possible random samples is 1 and the variance of the population is 25, what is the size of
the sample?
4. If a population is distributed normally with mean $\mu$, then $\bar{X}$ has a normal distribution with mean?
5. For what values of the difference between the sample mean and the mean of the population is the hypothesis that the sample
mean is more than the population mean rejected at the 0.05 level of significance, if the standard deviation of the
population is 5 and the sample size is 100?
6. For what values of the difference between the sample mean and the mean of the population is the hypothesis that the sample
mean is less than the population mean accepted at the 0.05 level of significance, if the standard deviation of the
population is 5 and the sample size is 100?
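Confidence limits: $|z| < z_0 \implies \left|\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right| < z_0 \implies \mu - z_0\dfrac{\sigma}{\sqrt{n}} < \bar{x} < \mu + z_0\dfrac{\sigma}{\sqrt{n}}$ and $\bar{x} - z_0\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_0\dfrac{\sigma}{\sqrt{n}}$.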
1. A soft-drink machine is regulated so that the amount of drink dispensed averages 240 milliliters with a
standard deviation of 15 milliliters. Periodically, the machine is checked by taking a sample of 40 drinks and
computing the average content. If the mean of the 40 drinks is a value within the interval 𝜇𝑥̅ ± 2𝜎𝑥̅ , the
machine is thought to be operating satisfactorily; otherwise, adjustments are made. The company official found
the mean of 40 drinks to be 𝑥 = 236 milliliters and concluded that the machine needed no adjustment. Was this
a reasonable decision?
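Solution: Given that $\mu_{\bar{x}} = \mu = 240$, $\sigma = 15$,
$n = 40$, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 2.3717$ and $\bar{x} = 236$.
The machine is considered satisfactory if the mean of the 40 drinks lies within the interval $\mu_{\bar{x}} \pm 2\sigma_{\bar{x}}$,
that is, $\mu_{\bar{x}} - 2\sigma_{\bar{x}} \le \bar{x} \le \mu_{\bar{x}} + 2\sigma_{\bar{x}} \implies 235.26 \le \bar{x} \le 244.74$.
Since $\bar{x} = 236$ lies within these limits, the decision was reasonable.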
2. Traveling between two campuses of a university in a city via shuttle bus takes, on average, 28 minutes with a
standard deviation of 5 minutes. In a given week, a bus transported passengers 40 times. Find the confidence
limits of the average transport time at the 5% level of significance.
Solution: Given that $\mu = 28$, $\sigma = 5$ and $n = 40$.
The sampling distribution of $\bar{x}$ will be approximately normal,
with standard error $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 5/\sqrt{40} = 0.7906$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 28}{0.7906}.$
The confidence limits of the average transport time are $\mu - z_0\dfrac{\sigma}{\sqrt{n}} < \bar{x} < \mu + z_0\dfrac{\sigma}{\sqrt{n}}$.
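The limits are left in symbolic form here. A minimal sketch of the arithmetic follows, assuming the two-tailed 5% value $z_0 = 1.96$ quoted elsewhere in these notes; the helper name `confidence_limits` is hypothetical.

```python
from math import sqrt

def confidence_limits(centre, sigma, n, z0):
    """Return (lower, upper) = centre -/+ z0 * sigma / sqrt(n)."""
    margin = z0 * sigma / sqrt(n)
    return centre - margin, centre + margin

# Shuttle bus: mu = 28, sigma = 5, n = 40, z0 = 1.96 (5% level, two-tailed)
low, high = confidence_limits(28, 5, 40, z0=1.96)
print(round(low, 2), round(high, 2))   # 26.45 29.55
```

So, under that assumption, the average transport time lies between about 26.45 and 29.55 minutes.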
Review questions:
1. If $z < 0$ for the difference between the sample mean and the mean of the population, is the hypothesis that the sample
mean is more than the population mean rejected or accepted?
2. If $z > 0$ for the difference between the sample mean and the mean of the population, is the hypothesis that the sample
mean is less than the population mean rejected or accepted?
3. If $z < -2.33$, is the hypothesis that the sample mean is less than the population mean rejected or accepted at the
1% level?
4. If $|z| > 2.33$, is the alternate hypothesis that the sample mean is less than the population mean accepted?
5. If $|z| > 1.645$, is the alternate hypothesis that the sample mean is more than the population mean rejected?
If $|z| > 2.58$, the difference is significant at the 1% level: the sampling is not simple, or the samples are not drawn from the same population.
If $|z| > 1.96$, the difference is significant at the 5% level of significance.
If independent samples of size $n_1$ and $n_2$ are drawn at random from two different populations, discrete or
continuous, with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling distribution of the
difference of means, $\bar{x}_1 - \bar{x}_2$, is approximately normally distributed with mean $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
and standard error $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
If the means of the two populations are the same, then $z = \dfrac{|\bar{x}_1 - \bar{x}_2|}{e}$.
Examples:
1. Two independent experiments are run in which two different types of paint are compared. Eighteen
specimens are painted using type A, and the drying time, in hours, is recorded for each. The same is done with
type B. The population standard deviations are both known to be 1. Assuming that the mean drying time is
equal for the two types of paint, find the probability that the difference 𝑥1 − 𝑥2 in the sample is at least 15
minutes, where 𝑥1 and 𝑥2 are average drying times for samples of size 18 for type A and B respectively.
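Solution: Given that $n_1 = n_2 = 18$, $\sigma_1 = \sigma_2 = 1$ and $\mu_1 = \mu_2$.
Standard error $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{1}{18} + \dfrac{1}{18}} = \dfrac{1}{3}$.
Since 15 minutes is 0.25 hour,
$P(\bar{x}_1 - \bar{x}_2 > 0.25) = P(z > 0.75) = 0.5 - A(0.75) = 0.5 - 0.2734 = 0.2266.$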
2. The means of simple samples of sizes 1000 and 2000 are 67.5 and 68.0 cm respectively. Can the samples be regarded as
drawn from the same population of S.D. 2.5 cm?
Solution: H: The samples are drawn from the same population of S.D. 2.5 cm.
$e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 2.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}} = 0.0968$,
$z = \dfrac{|\bar{x}_1 - \bar{x}_2|}{e} = \dfrac{0.5}{0.0968} = 5.16 > 2.58$. The difference is significant, and the hypothesis is rejected.
Hence the samples are not drawn from the same population.
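3. A sample of heights of 6400 soldiers has a mean of 67.85 inches and S.D. 2.56 inches, while a sample of
heights of 1600 sailors has a mean of 68.55 inches and S.D. 2.52 inches. Do the data indicate that sailors are on
the average taller than the soldiers? Test the hypothesis at the 1% level of significance.
Solution: $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{2.56^2}{6400} + \dfrac{2.52^2}{1600}} = 0.0707$
and $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e} = \dfrac{0.7}{0.0707} = 9.9010 > 2.33$ (1% level of significance in a one-tailed test).
Hence the data indicate that sailors are, on the average, taller than soldiers.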
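The standard error and $z$ for the soldiers and sailors data can be recomputed as follows. This is a sketch only; the function name `two_sample_z` is an assumption. Without rounding $e$ to 0.0707 first, $z$ comes out near 9.91 rather than 9.9010, which does not affect the conclusion.

```python
from math import sqrt

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Return (e, z) with e = sqrt(sd1^2/n1 + sd2^2/n2), z = (mean2 - mean1)/e."""
    e = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return e, (mean2 - mean1) / e

# Soldiers: mean 67.85, S.D. 2.56, n = 6400; sailors: mean 68.55, S.D. 2.52, n = 1600
e, z = two_sample_z(67.85, 2.56, 6400, 68.55, 2.52, 1600)
print(round(e, 4), round(z, 2))   # 0.0707 9.91
print(z > 2.33)                   # True: significant at the 1% level (one-tailed)
```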
Review questions:
1. To test the two samples of same population, $\mu_1 - \mu_2 = ?$
2. To test the two samples of different populations with same mean, $\mu_1 - \mu_2 = ?$
3. If the standard error $e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, then $z = ?$
4. If the standard error $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$, then $z = ?$
5. If $\mu_1 - \mu_2 = 0.5$, $e = 1$ and $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e} = 2.3$, is the null hypothesis rejected or accepted at the 5% level?
6. If $\mu_1 - \mu_2 = 0.5$, $e = 0.5$ and $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e} = 3$, is the null hypothesis rejected or accepted at the 1% level?
1. The television picture tubes of manufacturer A have a mean lifetime of 6.5 years and a standard deviation of
0.9 year, while those of manufacturer B have a mean lifetime of 6.0 years and a standard deviation of 0.8 year.
What is the probability that a random sample of 36 tubes from manufacturer A will have a mean lifetime that is
at least 1 year more than the mean lifetime of a sample of 49 tubes from manufacturer B?
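Solution: Given that
Manufacturer    A                   B
Mean            $\mu_1 = 6.5$       $\mu_2 = 6.0$
S.D.            $\sigma_1 = 0.9$    $\sigma_2 = 0.8$
Sample size     $n_1 = 36$          $n_2 = 49$
Here $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 = 0.5$ and $e = \sqrt{\dfrac{0.9^2}{36} + \dfrac{0.8^2}{49}} = 0.1886$, so $\bar{x}_1 - \bar{x}_2 = 1$ corresponds to $z = \dfrac{1 - 0.5}{0.1886} = 2.6511$.
The probability that the mean lifetime for 36 tubes from manufacturer A will be at least 1 year longer than the
mean lifetime for 49 tubes from manufacturer B is
$P(\bar{x}_1 - \bar{x}_2 \ge 1) = P(z \ge 2.6511) = 0.5 - A(2.6511) = 0.5 - 0.4960 = 0.0040.$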
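When the two population means differ, the difference $\mu_1 - \mu_2$ has to be subtracted before dividing by the standard error. The sketch below redoes the picture tube calculation with the exact normal distribution from Python's standard library; the helper name `prob_difference_at_least` is hypothetical. It gives a probability of about 0.004.

```python
from math import sqrt
from statistics import NormalDist

def prob_difference_at_least(d, mu1, sd1, n1, mu2, sd2, n2):
    """Return (z, P(x1_bar - x2_bar >= d)) for independent samples."""
    e = sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
    z = (d - (mu1 - mu2)) / e             # standardise about mu1 - mu2
    return z, 1 - NormalDist().cdf(z)

# Manufacturer A: 6.5, 0.9, n = 36; manufacturer B: 6.0, 0.8, n = 49
z, p = prob_difference_at_least(1.0, 6.5, 0.9, 36, 6.0, 0.8, 49)
print(round(z, 2), round(p, 3))   # 2.65 0.004
```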
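2. Two independent experiments are run in which two different types of paint are compared, with 18 specimens
of each type as in the earlier paint example. Suppose someone repeated the experiment 10,000 times under the condition that $\mu_1 = \mu_2$. If the population standard deviations are both known
to be 1.0, in how many of those 10,000 experiments would there be a difference $\bar{x}_1 - \bar{x}_2$ that was as large as (or
larger than) 1.0?
Solution: Clearly $\mu_1 - \mu_2 = 0$. The figure 10,000 is the number of repetitions, not the sample size, so with $n_1 = n_2 = 18$ the standard error is $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{1}{18} + \dfrac{1}{18}} = \dfrac{1}{3}$.
The probability that the difference $\bar{x}_1 - \bar{x}_2$ is as large as (or larger than) 1.0 is
$P(\bar{x}_1 - \bar{x}_2 \ge 1) = P(z \ge 3) = 0.5 - A(3) = 0.5 - 0.4987 = 0.0013.$
Hence, only about $10{,}000 \times 0.0013 = 13$ of the 10,000 experiments would be expected to show a difference $\bar{x}_1 - \bar{x}_2$ of 1.0 or more.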
3. The mean score for freshmen on an aptitude test at a certain college is 540, with a standard deviation of 50.
Assume the means to be measured to any degree of accuracy. What is the probability that two groups selected at
random, consisting of 32 and 50 students, respectively, will differ in their mean scores by (a) more than 20
points? (b) an amount between 5 and 10 points.
Solution: Given that $\mu = 540$, $\sigma = 50$, $\therefore \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 = 0$.
$n_1 = 32$, $n_2 = 50$, $e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 50\sqrt{\dfrac{1}{32} + \dfrac{1}{50}} = 11.3192$.
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{e} = \dfrac{\bar{x}_1 - \bar{x}_2}{11.3192}.$
a) Probability that two groups will differ in their mean scores by more than 20 points
$= P(\bar{x}_1 - \bar{x}_2 > 20) + P(\bar{x}_2 - \bar{x}_1 > 20)$
$= 2P(z > 1.7669) = 2(0.5 - A(1.7669)) = 2(0.5 - 0.4616) = 0.0768.$
DEPARTMENT OF SCIENCE & HUMANITIES /C.E.C.
MATHEMATICS FOR COMPUTER SCIENCE (BCS301) 2024
b) Probability that two groups will differ in their mean scores by between 5 and 10 points
$= P(5 \le |\bar{x}_1 - \bar{x}_2| \le 10)$
$= 2P(0.4417 \le z \le 0.8835) = 2(A(0.8835) - A(0.4417)) = 2(0.3106 - 0.1700) = 0.2812.$
Review questions:
1. To test the two samples of different populations with same mean, $z = ?$
2. To test the two samples of different populations, if $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e}$, then the standard error $e = ?$
3. If the standard error $e = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$, then $z = ?$
4. If $\mu_1 - \mu_2 = 1$, $e = 1$ and $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e} = 3$, is the null hypothesis rejected or accepted at the 1% level?
5. If $\mu_1 - \mu_2 = 0.5$, $e = 0.5$ and $z = \dfrac{\bar{x}_2 - \bar{x}_1}{e} = 4$, is the null hypothesis rejected or accepted at the 1% level?
1. If all possible samples of size 16 are drawn from a normal population with mean equal to 50 and standard
deviation equal to 5, what is the probability that a sample mean $\bar{x}$ will fall in the interval from
$\mu - 1.9e$ to $\mu - 0.4e$? Assume that the sample means can be measured to any degree of accuracy.
Solution: Given that $\mu = 50$, $\sigma = 5$ and $n = 16$, so that $e = \sigma/\sqrt{n} = 1.25$.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 50}{1.25}.$
Hence $P(\mu - 1.9e < \bar{x} < \mu - 0.4e) = P(-1.9 < z < -0.4) = A(1.9) - A(0.4) = 0.4713 - 0.1554 = 0.3159.$
Probability that a random sample of 36 of these resistors will have a combined resistance of more than 1458
ohms $= P\left(\bar{x} > \dfrac{1458}{36}\right) = P(\bar{x} > 40.5) = P(z > 1.5) = 0.5 - A(1.5) = 0.5 - 0.4332 = 0.0668.$
Lecture-7 Test of Significance for means of two small samples. Student's t-distribution, as a test of
goodness of fit.
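Student's t-Distribution: Consider a small sample of size $n$, drawn from a normal population with
mean $\mu$ and S.D. $\sigma$. If $\bar{x}$ and $\sigma_s$ are the sample mean and S.D., then the statistic $t$ is defined as
$t = \dfrac{\bar{x} - \mu}{\sigma_s}\sqrt{n - 1}$, where $\nu = n - 1$ denotes the degrees of freedom of $t$.
Here $\sigma_s$ is the sample S.D. computed with divisor $n$. When $\sigma_s^2$ is computed with divisor $n - 1$, the equivalent form is $t = \dfrac{\bar{x} - \mu}{\sigma_s}\sqrt{n}$.
The sampling distribution of $t$, called Student's t-distribution, is given by $y = \dfrac{y_0}{\left(1 + \frac{t^2}{\nu}\right)^{(\nu + 1)/2}}$,
where $y_0$ is a constant such that the area under the curve is unity.
The probability $P$ that the value of $t$ will exceed $t_0$ is $P = \int_{t_0}^{\infty} y \, dt$.
[Figure: the t-curve compared with the normal curve, plotted against $t$ and centred at 0.]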
Significance test of a sample mean: Given a random sample $x_1, x_2, x_3, \ldots, x_n$ from a normal population, we
have to test the hypothesis that the mean of the population is $\mu$.
For this, we first calculate $t = \dfrac{\bar{x} - \mu}{\sigma_s}\sqrt{n}$,
where $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$ and $\sigma_s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$.
(If the sample S.D. is instead given with divisor $n$, use $\sqrt{n - 1}$ in place of $\sqrt{n}$.)
Then find the value of $P$ for the given d.f. from the table.
If the calculated $|t| > t_{0.05}$, the difference between $\bar{x}$ and $\mu$ is said to be significant at the 5% level of significance.
If $|t| > t_{0.01}$, the difference between $\bar{x}$ and $\mu$ is said to be significant at the 1% level of significance.
If $|t| < t_{0.05}$, the data are said to be consistent with the hypothesis.
Examples:
1. A certain stimulus administered to each of 12 patients resulted in the following increases of blood pressure:
5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6. Can it be concluded that the stimulus will in general be accompanied by an
increase in blood pressure.
Solution: $H_1$: The stimulus will increase the blood pressure.
$H_0$: The stimulus does not change the B.P.
Taking the population to be normal with mean $\mu = 0$ and S.D. $\sigma$.
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{31}{12} = 2.5833$,
$\sigma_s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
$= \dfrac{1}{11}[5.8404 + 0.3402 + 29.3406 + 12.84 + 0.1736 + 6.6734 + 21.0066 + 2.5068 + 5.8404 + 6.6734 + 2.0070 + 11.6738]$
$= 9.5378$
Or, $\sigma_s^2 = \dfrac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)} = \dfrac{12 \times 185 - 31^2}{12 \times 11} = 9.5379$.
$\therefore \sigma_s = 3.0883$
Since $\sigma_s^2$ was computed with divisor $n - 1$, $t = \dfrac{\bar{x} - \mu}{\sigma_s}\sqrt{n} = \dfrac{2.5833 - 0}{3.0883}\sqrt{12} = 2.8976$.
For $\nu = 11$, from the table $t_{0.05} = 1.8$ for a single-tailed test.
Since $t > t_{0.05}$, the difference between $\bar{x}$ and $\mu$ is said to be significant at the 5% level of significance.
Therefore $H_0$ is rejected; that is, the stimulus will, in general, increase the B.P.
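The blood pressure computation can be reproduced from the raw data. This sketch uses the form $t = (\bar{x} - \mu)/(s/\sqrt{n})$, where $s$ is the sample S.D. with divisor $n - 1$ (which is what `statistics.stdev` returns); the function name `one_sample_t` is an assumption. It gives $t \approx 2.90$, which exceeds the one-tailed value 1.8.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), s computed with divisor n - 1."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))

increases = [5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6]
t = one_sample_t(increases, mu0=0)
print(round(t, 2))   # 2.9
print(t > 1.8)       # True: significant at the 5% level (one-tailed, 11 d.f.)
```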
Review questions:
1. Find 𝑡0.025 for 𝜈 = 8 in one tailed test.
2. Find 𝑡0.05 for 𝜈 = 30 in one tailed test.
3. Find 𝑡0.1 for 𝜈 = 9 in one tailed test.
4. Find 𝑡0.005 for 𝜈 = 12 in one tailed test.
5. The t-curve is symmetrical about the line ______.
$t = \dfrac{\bar{x} - \mu}{\sigma_x}\sqrt{n - 1} = \dfrac{49.1111 - 47.5}{2.6196}\sqrt{8} = 1.7395.$
For $\nu = 8$, $t_{0.05} = 2.31$; since $t < t_{0.05}$, the value of $t$ is not significant at the 5% level of significance. Thus
the test provides no evidence against the population mean being 47.5.
2. Ten individuals are chosen at random from a population and their heights in inches are found to be 63, 63,
66, 67, 68, 69, 70, 70, 71, 71. Test the hypothesis that the mean height of the universe is 66 inches.
(For d.f. 9, 𝑡0.05 = 2.262 )
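Solution: $H$: $\bar{x} = \mu = 66$.
$\sum_{i=1}^{n} x_i = 678$, $\sum_{i=1}^{n} x_i^2 = 46050$, $\therefore \bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = 67.8$,
$\sigma_s^2 = \dfrac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)} = \dfrac{10 \times 46050 - 678^2}{10 \times 9} = 9.0667.$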
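The solution stops at $\sigma_s^2 = 9.0667$. One way the test could be completed is sketched below, working from the sums $\sum x_i$ and $\sum x_i^2$ and using $t = (\bar{x} - \mu)/\sqrt{\sigma_s^2/n}$, since $\sigma_s^2$ here has divisor $n - 1$. The function name `t_from_sums` is hypothetical.

```python
from math import sqrt

def t_from_sums(n, sum_x, sum_x2, mu0):
    """Return (mean, variance with divisor n - 1, t) from the raw sums."""
    x_bar = sum_x / n
    variance = (n * sum_x2 - sum_x**2) / (n * (n - 1))
    return x_bar, variance, (x_bar - mu0) / sqrt(variance / n)

heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
x_bar, variance, t = t_from_sums(len(heights), sum(heights),
                                 sum(h * h for h in heights), mu0=66)
print(round(x_bar, 1), round(variance, 4), round(t, 4))   # 67.8 9.0667 1.8904
print(abs(t) < 2.262)   # True
```

Since $t \approx 1.89$ is less than $t_{0.05} = 2.262$ for 9 d.f., the data would be consistent with a mean height of 66 inches.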
3. A machinist is making engine parts with axle diameter of 0.7 inch. A random sample of 10 parts shows mean
diameter 0.742 inch with a S.D 0.04 inch. On the basis of this sample, would you say that the work is inferior?
Solution: $H_1$: $\bar{x} \ne \mu = 0.7$ (The work is inferior)
$H_0$: $\bar{x} = \mu = 0.7$ (The work is not inferior),
i.e. there is no significant difference between $\bar{x}$ and $\mu$.
Given that $\bar{x} = 0.742$, $\mu = 0.7$, $\sigma_s = 0.04$, $n = 10$.
$t = \dfrac{\bar{x} - \mu}{\sigma_s}\sqrt{n - 1} = \dfrac{0.742 - 0.7}{0.04}\sqrt{9} = 3.15.$
For $\nu = 9$, $t_{0.05} = 2.262$; since $t > t_{0.05}$, the value of $t$ is significant at the 5% level of significance.
This implies that $\bar{x}$ differs significantly from $\mu$, and the null hypothesis is rejected. Hence the work is inferior.
4. A random sample of 16 values has a mean of 41 inches, and the sum of the squares of the deviations from the
mean is 135 inches$^2$. Estimate the 95% confidence limits for the mean of the population.
($t_{0.05} = 2.13$ for $\nu = 15$)
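No solution is given for Problem 4. A possible sketch, assuming the limits are $\bar{x} \pm t_{0.05}\, s/\sqrt{n}$ with $s^2$ equal to the sum of squared deviations divided by $n - 1$, is shown below; the helper name `t_confidence_limits` is hypothetical.

```python
from math import sqrt

def t_confidence_limits(x_bar, sum_sq_dev, n, t0):
    """Return x_bar -/+ t0 * s / sqrt(n), where s^2 = sum_sq_dev / (n - 1)."""
    s = sqrt(sum_sq_dev / (n - 1))
    margin = t0 * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# n = 16, mean 41, sum of squared deviations 135, t0.05 = 2.13 for 15 d.f.
low, high = t_confidence_limits(41, 135, 16, t0=2.13)
print(round(low, 2), round(high, 2))   # 39.4 42.6
```

Here $s = \sqrt{135/15} = 3$, so the margin is $2.13 \times 3/4 = 1.5975$.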
Review questions:
1. The $t$-test is applicable to samples for which $n$ is ______.
2. The t-curve attains its maximum value at ______.
3. Find 𝑡0.0005 for 𝜈 = 13 in one tailed test.
4. Find 𝑡0.0005 for 𝜈 = 30 in one tailed test.
5. Find 𝑡0.1 for 𝜈 = 15 in one tailed test.
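Significance test of difference between sample means: $t = \dfrac{|\bar{x} - \bar{y}|}{e}$, where $e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, with $\nu = n_1 + n_2 - 2$ degrees of freedom.
Here $\bar{x} = \dfrac{\sum_{i=1}^{n_1} x_i}{n_1}$, $\bar{y} = \dfrac{\sum_{i=1}^{n_2} y_i}{n_2}$,
$\sigma^2 = \dfrac{1}{n_1 + n_2 - 2}\left\{\sum_{i=1}^{n_1}(x_i - \bar{x})^2 + \sum_{i=1}^{n_2}(y_i - \bar{y})^2\right\} = \dfrac{1}{n_1 + n_2 - 2}\left\{\dfrac{n_1\sum x_i^2 - \left(\sum x_i\right)^2}{n_1} + \dfrac{n_2\sum y_i^2 - \left(\sum y_i\right)^2}{n_2}\right\}.$
When the sample standard deviations $\sigma_x$ and $\sigma_y$ are given, $\sigma^2 = \dfrac{1}{n_1 + n_2 - 2}\left\{(n_1 - 1)\sigma_x^2 + (n_2 - 1)\sigma_y^2\right\}$.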
1. Two horses A and B were tested according to the time (in seconds) taken to run a particular race, with the
following results:
Horse A 28 30 32 33 33 29 34
Horse B 29 30 30 24 27 29
Test whether you can discriminate between the two horses. (For $\nu = 11$, $t_{0.05} = 2.2$, $t_{0.02} = 2.72$)
Solution: $\sum_{i=1}^{n_1} x_i = 219$, $\sum_{i=1}^{n_2} y_i = 169$, $\bar{x} = \dfrac{219}{7} = 31.2857$, $\bar{y} = \dfrac{169}{6} = 28.1667$,
$e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 1.2804.$
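The solution gives $e = 1.2804$ but not the intermediate $\sigma$ or the final $t$. A sketch of the full pooled calculation is given below, using the formulas of this section; the function name `pooled_t` is an assumption.

```python
from math import sqrt
from statistics import mean

def pooled_t(x, y):
    """Return (sigma, e, t) for the two-sample t-test with pooled variance."""
    n1, n2 = len(x), len(y)
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of both samples about their own means
    ss = sum((v - x_bar)**2 for v in x) + sum((v - y_bar)**2 for v in y)
    sigma = sqrt(ss / (n1 + n2 - 2))
    e = sigma * sqrt(1 / n1 + 1 / n2)
    return sigma, e, abs(x_bar - y_bar) / e

horse_a = [28, 30, 32, 33, 33, 29, 34]
horse_b = [29, 30, 30, 24, 27, 29]
sigma, e, t = pooled_t(horse_a, horse_b)
print(round(sigma, 4), round(e, 4), round(t, 3))   # 2.3014 1.2804 2.436
```

With $t \approx 2.436$, the difference would be significant at the 5% level ($t_{0.05} = 2.2$) but not at the 2% level ($t_{0.02} = 2.72$).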
2. A group of boys and girls were given an intelligence test. The mean score, S.D.s and numbers in each group
are as follows.
Boys Girls
Mean 124 121
S.D. 12 10
𝑛 18 14
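Solution: $\sigma^2 = \dfrac{(n_1 - 1)\sigma_x^2 + (n_2 - 1)\sigma_y^2}{n_1 + n_2 - 2} = \dfrac{17 \times 12^2 + 13 \times 10^2}{30} = 124.9333$,
$\therefore \sigma = 11.1774$ and $e = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 11.1774\sqrt{\dfrac{1}{18} + \dfrac{1}{14}} = 3.9830$.
$t = \dfrac{|\bar{x} - \bar{y}|}{e} = \dfrac{124 - 121}{3.9830} = 0.7532 < 2.04 = t_{0.05}$ (for $\nu = 30$).
The null hypothesis is accepted. The mean score of the boys is not significantly different from that of the girls.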
3. Eleven school boys were given a test in drawing. Further they were given a month’s tuition and a second test
of equal difficulty was held at the end of it. Do the marks give the evidence that students have benefitted by
extra coaching? ( For d.f. 𝜈 =10, 𝑡0.05 = 1.812 for one tailed test)
Boys 1 2 3 4 5 6 7 8 9 10 11
I-test 23 20 19 21 18 20 18 17 23 16 19
II-test 24 19 22 18 20 22 20 20 23 20 17
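Solution: Let $d = x_2 - x_1$ be the difference between the II-test and I-test marks of each boy.
$x_2$   $x_1$   $d = x_2 - x_1$
24      23      1
19      20      -1
22      19      3
18      21      -3
20      18      2
22      20      2
20      18      2
20      17      3
23      23      0
20      16      4
17      19      -2
$\sum d = 11$, $\sum d^2 = 61$
$\bar{d} = \dfrac{\sum d}{n} = 1$,
$\sigma_s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2 = \dfrac{n\sum d_i^2 - \left(\sum d_i\right)^2}{n(n-1)} = \dfrac{11 \times 61 - 11^2}{11 \times 10} = 5$,
$\therefore \sigma_s = 2.2361$.
$H_1$: Students have benefitted by extra coaching, $\bar{d} > 0$, or $\bar{x}_2 > \bar{x}_1$.
$H_0$: Students have not benefitted by extra coaching, $\bar{d} \le 0$.
Under $H_0$ the mean of the differences between the marks is taken as $\mu = 0$.
$t = \dfrac{\bar{d} - \mu}{\sigma_s}\sqrt{n} = \dfrac{\sqrt{11}}{2.2361} = 1.4832 < t_{0.05} = 1.812.$
Hence the difference is not significant. Accept $H_0$, reject $H_1$.
There is no evidence that the students have benefitted by extra coaching.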
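The paired computation for the eleven boys can be checked directly from the two rows of marks. This is a sketch only; the variable names are introduced here for illustration.

```python
from math import sqrt
from statistics import mean, stdev

test_1 = [23, 20, 19, 21, 18, 20, 18, 17, 23, 16, 19]
test_2 = [24, 19, 22, 18, 20, 22, 20, 20, 23, 20, 17]

# Differences d = second test - first test, one per boy
d = [b - a for a, b in zip(test_1, test_2)]
# t = (mean of d - 0) / (s / sqrt(n)), with s computed with divisor n - 1
t = (mean(d) - 0) / (stdev(d) / sqrt(len(d)))
print(sum(d), sum(v * v for v in d), round(t, 4))   # 11 61 1.4832
print(t < 1.812)   # True: not significant at the 5% level (one-tailed, 10 d.f.)
```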
Review questions:
1. If the two samples are of same size then expected value of difference is?
2. If the two samples are of same size, and are of different standard deviations, then 𝜎 2 =?
3. If the two samples are of same size 𝑛, then 𝜎 2 =?
4. If the two samples are of same size $n$, to test $\bar{x}_1 = \bar{x}_2$ the degree of freedom is?
5. If the two samples are of same size 𝑛, to test mean of differences is zero, the degree of freedom is?
CHI-SQUARE ($\chi^2$) TEST: If $O_i$ and $E_i$ are the observed and expected frequencies for $i = 1, 2, \ldots, n$,
then $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$ with $n - 1$ degrees of freedom.
The equation of the $\chi^2$ curve is $y = y_0\, e^{-\chi^2/2}\, (\chi^2)^{(\nu - 1)/2}$, where $\nu = n - 1$.
Goodness of fit: The value of $\chi^2$ is used to test whether the deviations of the observed frequencies from
theoretical frequencies are significant or not.
Examples:
1. In experiments on pea breeding, the following frequencies of seeds were obtained.
Round and yellow   Wrinkled and yellow   Round and green   Wrinkled and green   Total
315                101                   108               32                   556
Theory predicts that the frequencies should be in proportions 9 : 3 : 3 : 1. Examine the correspondence between
theory and experiment.
Solution: Theoretical frequencies are $\frac{9}{16} \times 556$, $\frac{3}{16} \times 556$, $\frac{3}{16} \times 556$, $\frac{1}{16} \times 556$,
i.e. 313, 104, 104, 35 (to the nearest integer).
$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{4}{313} + \dfrac{9}{104} + \dfrac{16}{104} + \dfrac{9}{35} = 0.5103.$
For d.f. $\nu = n - 1 = 3$, $\chi^2_{0.05} = 7.815$. Since the calculated value of $\chi^2$ is much less than $\chi^2_{0.05}$, there is a very
high degree of agreement between theory and experiment.
2. A set of five similar coins is tossed 320 times and the result is
No. of heads 0 1 2 3 4 5
𝑓 6 27 72 112 71 32
Test the hypothesis that the data follow a binomial distribution.
Solution: In the binomial distribution with $n = 5$ and $p = q = \frac{1}{2}$, $P(x) = {}^{n}C_x\, p^x q^{n-x} = \dfrac{{}^{5}C_x}{32}$.
No. of heads                                  0    1    2     3     4    5
$O_x$                                         6    27   72    112   71   32
$E_x = \dfrac{{}^{5}C_x}{32} \times 320$      10   50   100   100   50   10
$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{16}{10} + \dfrac{529}{50} + \dfrac{784}{100} + \dfrac{144}{100} + \dfrac{441}{50} + \dfrac{484}{10} = 78.68.$
For d.f. $\nu = n - 1 = 5$, $\chi^2_{0.05} = 11.07$. Since the calculated value of $\chi^2$ is much greater than $\chi^2_{0.05}$, the
hypothesis that the data follow the binomial distribution is rejected.
3. The following table gives the number of aircraft accidents that occurred during the various days of the week.
Find whether the accidents are uniformly distributed over the week.
Days               Sun   Mon   Tue   Wed   Thu   Fri   Sat   Total
No. of accidents   14    16    8     12    11    9     14    84
(For d.f. $\nu = 6$, $\chi^2_{0.05} = 12.59$)
Solution: Let $O_i$ and $E_i$ be the observed and expected frequencies for $i = 1, 2, \ldots, 7$.
Under the hypothesis of a uniform distribution, $E_i = 84/7 = 12$ for each $i$.
Then $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{4 + 16 + 16 + 0 + 1 + 9 + 4}{12} = 4.1667 < 12.59.$
Since the calculated value of $\chi^2$ is much less than $\chi^2_{0.05}$, the hypothesis that the accidents are uniformly distributed over the week is accepted.
4. A machine is supposed to mix peanuts, hazelnuts, cashews, and pecans in the ratio 5:2:2:1. A can containing
500 of these mixed nuts was found to have 269 peanuts, 112 hazelnuts, 74 cashews, and 45 pecans. At the 0.05
level of significance, test the hypothesis that the machine is mixing the nuts in the ratio 5:2:2:1.
Solution: The ratio of peanuts, hazelnuts, cashews, and pecans is 5:2:2:1, i.e. 50%, 20%, 20% and 10%
respectively.
Therefore, the expected numbers of peanuts, hazelnuts, cashews, and pecans out of 500 are 250, 100, 100 and 50
respectively. The observed values are 269, 112, 74, and 45 respectively.
$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{361}{250} + \dfrac{144}{100} + \dfrac{676}{100} + \dfrac{25}{50} = 10.144 > 7.815$ ($\chi^2_{0.05}$ for $\nu = 3$).
Since the calculated value of $\chi^2$ is more than $\chi^2_{0.05}$, reject the hypothesis that the machine is mixing the nuts in the
ratio 5:2:2:1.
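All four goodness-of-fit examples use the same sum, so they can be checked together. The sketch below is illustrative: the function name `chi_square` and the dictionary layout are assumptions, while the observed and expected frequencies and the critical values are those quoted in the examples.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e)**2 / e for o, e in zip(observed, expected))

# name: (observed, expected, critical value at the 5% level)
examples = {
    "peas":      ([315, 101, 108, 32], [313, 104, 104, 35], 7.815),
    "coins":     ([6, 27, 72, 112, 71, 32], [10, 50, 100, 100, 50, 10], 11.07),
    "accidents": ([14, 16, 8, 12, 11, 9, 14], [12] * 7, 12.59),
    "nuts":      ([269, 112, 74, 45], [250, 100, 100, 50], 7.815),
}

for name, (obs, exp, critical) in examples.items():
    value = chi_square(obs, exp)
    decision = "reject" if value > critical else "accept"
    print(name, round(value, 4), decision)
```

The printed values, 0.5103, 78.68, 4.1667 and 10.144, match the four worked solutions.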
Review questions:
1. If the standard deviation of a $\chi^2$ distribution is 10, then its degree of freedom is ______.
2. The mean and variance of the $\chi^2$ distribution with 8 degrees of freedom are ______ and ______ respectively.
3. For the degree of freedom 1, the $\chi^2$ distribution reduces to the ______ distribution.
Lecture-11 F-Distribution.
F-Distribution: Let $x_1, x_2, x_3, \ldots, x_{n_1}$ and $y_1, y_2, y_3, \ldots, y_{n_2}$ be two independent random samples from normal
populations with equal standard deviation $\sigma$. Let $\bar{x}$ and $\bar{y}$ be the sample means, and let
$s_1^2 = \dfrac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x})^2$ and $s_2^2 = \dfrac{1}{n_2 - 1}\sum_{i=1}^{n_2}(y_i - \bar{y})^2$ be the sample variances.
Then define $F = \dfrac{s_1^2}{s_2^2}$ or $F = \dfrac{s_2^2}{s_1^2}$, according as $s_1^2 > s_2^2$ or $s_2^2 > s_1^2$ respectively.
This gives the F-distribution (or variance ratio distribution). Clearly the F-distribution depends only on $\nu_1$ and $\nu_2$, the degrees of freedom of the numerator and the denominator (each is the corresponding sample size minus one).
$F_\alpha(\nu_1, \nu_2)$ is the value of $F$ for $\nu_1$ and $\nu_2$ such that the area to the right of $F_\alpha$ is $\alpha$.
The F-distribution is useful for testing the equality of population means by comparing the sample variances.
Examples:
1. Two samples of sizes 9 and 8 give the sum of squares of deviations from their respective means equal to 160
and 91 respectively. Can these be regarded as drawn from the same normal population?
Solution: Given that $\sum_{i=1}^{9}(x_i - \bar{x})^2 = 160$ and $\sum_{i=1}^{8}(y_i - \bar{y})^2 = 91$,
$s_1^2 = \dfrac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x})^2 = \dfrac{160}{8} = 20$, $s_2^2 = \dfrac{1}{n_2 - 1}\sum_{i=1}^{n_2}(y_i - \bar{y})^2 = \dfrac{91}{7} = 13$.
$F = \dfrac{s_1^2}{s_2^2} = \dfrac{20}{13} = 1.5385.$
From the table, $F_{0.05}(8, 7) = 3.73$. Since the calculated value of $F < F_{0.05}$, the population variances are not
significantly different. Hence, the samples can be regarded as drawn from the same normal population.
Examine whether the samples have been drawn from normal populations having the same variance.
Given that 𝐹0.05 (6, 5) = 4.95 and 𝐹0.05 (5, 6) = 4.39 .
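Solution: Let the hypothesis be that the samples have been drawn from normal populations having the same variance.
$\bar{x} = \dfrac{\sum_{i=1}^{n_1} x_i}{n_1} = \dfrac{28 + 30 + 32 + 33 + 33 + 29 + 34}{7} = \dfrac{219}{7} = 31.2857$
$s_1^2 = \dfrac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x})^2 = \dfrac{n_1\sum x_i^2 - \left(\sum x_i\right)^2}{n_1(n_1 - 1)} = \dfrac{7 \times 6883 - 219^2}{7 \times 6} = 5.2381$
$\bar{y} = \dfrac{\sum_{i=1}^{n_2} y_i}{n_2} = \dfrac{29 + 30 + 30 + 24 + 27 + 29}{6} = \dfrac{169}{6} = 28.1667$
$s_2^2 = \dfrac{1}{n_2 - 1}\sum_{i=1}^{n_2}(y_i - \bar{y})^2 = \dfrac{n_2\sum y_i^2 - \left(\sum y_i\right)^2}{n_2(n_2 - 1)} = \dfrac{6 \times 4787 - 169^2}{6 \times 5} = 5.3667$
$F = \dfrac{s_2^2}{s_1^2} = \dfrac{5.3667}{5.2381} = 1.0245.$
Given that $F_{0.05}(5, 6) = 4.39$. Since the calculated value of $F < F_{0.05}$, the samples can be regarded as
drawn from the same normal population.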
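The variance ratio for these two samples can be recomputed from the data values that appear in the calculation of $\bar{x}$ and $\bar{y}$. This is a sketch; the function name `variance_ratio` is an assumption. It puts the larger sample variance in the numerator and reports the matching degrees of freedom.

```python
from statistics import variance

def variance_ratio(x, y):
    """Return (F, (nu1, nu2)) with the larger sample variance in the numerator.

    statistics.variance uses divisor n - 1, as in the definition of s^2.
    """
    s1, s2 = variance(x), variance(y)
    if s1 >= s2:
        return s1 / s2, (len(x) - 1, len(y) - 1)
    return s2 / s1, (len(y) - 1, len(x) - 1)

sample_1 = [28, 30, 32, 33, 33, 29, 34]
sample_2 = [29, 30, 30, 24, 27, 29]
F, dof = variance_ratio(sample_1, sample_2)
print(round(F, 4), dof)   # 1.0245 (5, 6)
print(F < 4.39)           # True: compare with F0.05(5, 6) = 4.39
```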
Review questions:
1. If $s_2^2 > s_1^2$, then $F =$ ______.
2. If $\dfrac{S_1^2}{S_2^2} = 0.4$, then $F =$ ______.
3. If $n_1 = 7$, $n_2 = 5$ and $\dfrac{S_1^2}{S_2^2} > 1$, is $F_\alpha(6, 4)$ or $F_\alpha(4, 6)$ used to test the hypothesis?
4. If $n_1 = 9$, $n_2 = 7$ and $\dfrac{S_1^2}{S_2^2} < 1$, is $F_\alpha(8, 6)$ or $F_\alpha(6, 8)$ used to test the hypothesis?
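Prove that $\sigma^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$.
Proof: $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i^2 - 2x_i\bar{x} + \bar{x}^2)$
$= \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} 2x_i\bar{x} + \sum_{i=1}^{n} \bar{x}^2$
$= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + \bar{x}^2\sum_{i=1}^{n} 1$
$= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2$ (since $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$, $\sum_{i=1}^{n} x_i = n\bar{x}$ and $\sum_{i=1}^{n} 1 = n$)
$= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$
$= \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$ (since $\bar{x}^2 = \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n^2}$)
$= \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n}$
$\therefore \sigma^2 = \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$.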
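If the two samples are of the same size, $n_1 = n_2 = n$, then
$\sigma^2 = \dfrac{1}{n_1 + n_2 - 2}\left\{\sum_{i=1}^{n_1}(x_i - \bar{x})^2 + \sum_{i=1}^{n_2}(y_i - \bar{y})^2\right\} = \dfrac{1}{n_1 + n_2 - 2}\left\{\dfrac{n_1\sum x_i^2 - \left(\sum x_i\right)^2}{n_1} + \dfrac{n_2\sum y_i^2 - \left(\sum y_i\right)^2}{n_2}\right\}$
$= \dfrac{n\sum x_i^2 - \left(\sum x_i\right)^2 + n\sum y_i^2 - \left(\sum y_i\right)^2}{2n(n-1)}$.
If the two samples are of the same size and are of different standard deviations, then
$\sigma^2 = \dfrac{1}{n_1 + n_2 - 2}\left\{(n_1 - 1)\sigma_x^2 + (n_2 - 1)\sigma_y^2\right\}$
$= \dfrac{1}{2n - 2}\left\{(n - 1)\sigma_x^2 + (n - 1)\sigma_y^2\right\}$
$= \dfrac{1}{2}\left\{\sigma_x^2 + \sigma_y^2\right\}$.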