Statistics Handout
Basic Probability
1. A class contains 8 boys and 7 girls. The teacher selects 3 of the children at random and without
replacement. Calculate the probability that the number of boys selected exceeds the number
of girls selected.
2. A and B throw a die alternately. The player who first throws a ‘two’ wins the game. If A starts
the game, find the probability that B wins.
3. What is the probability of having 53 Sundays in a leap year?
4. A mathematics problem is given to three students A, B and C, whose chances of solving it
are 1/2, 1/3 and 1/4 respectively. What is the probability that the problem will be solved?
Bayes’ theorem is an interesting result that establishes the relationship between two conditional
probabilities. If A1, A2, ..., An are mutually exclusive and exhaustive events of a sample space S
and B is any arbitrary event of S, then
\[ P(A_i \mid B) = \frac{P(A_i \cap B)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)} \]
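As a minimal sketch of the theorem stated above, the following Python snippet computes the posterior probabilities P(Ai | B) from the priors P(Ai) and the likelihoods P(B | Ai). The three events and all of their numerical values are hypothetical and chosen only for illustration; they do not come from this handout.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    # Total probability of B: sum over all j of P(A_j) * P(B | A_j)
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Hypothetical values: three mutually exclusive and exhaustive events
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]        # P(A_i), sums to 1
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]  # P(B | A_i)

posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)
```

Exact fractions are used so that the posteriors can be checked to sum to exactly 1, which is a quick sanity test for any hand calculation as well.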
Decision making under uncertainty is called Bayesian decision theory. Without going into depth, we
can establish an analogy with the statement of Bayes’ theorem itself. Every decision-making model
operates under some states of its environment. These states act as constraints that are mutually
exclusive and exhaustive. We may consider them as the A’s, and the decision that we want to take
may be taken as B. Our objective is to determine the chances of decision B, that is, to calculate
P(B) when P(Ai) and P(B | Ai) are known.
Similarly, Bayes’ theorem is applicable in game theory, where we would like to find the
probability of a strategy being adopted by Player 1 given the different strategies of the
other player.
Probability Distributions
Random Variables
A random variable is a function that maps each outcome of an experiment to a real value. For
example, if you toss two coins and count the heads obtained, the number of heads is a random
variable, i.e. the value of the random variable will be 0, 1 or 2.
You may establish an analogy with a frequency distribution. Here, the frequency is the probability
of an outcome, and the sum of all the probabilities is 1.
Recall the frequency distribution and find expressions for the mean, variance and moments of a
discrete probability distribution.
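For the two-coin example, the mean and variance can be computed from the probability distribution in the same way as from a frequency distribution, with probabilities in place of relative frequencies. The following is a minimal sketch, assuming both coins are fair:

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of tossing two fair coins
outcomes = list(product("HT", repeat=2))

# Probability distribution of X = number of heads
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, len(outcomes))

mean = sum(x * p for x, p in pmf.items())                    # sum of x * P(X = x)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # sum of (x - mean)^2 * P(X = x)

print(pmf, mean, variance)
```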
Binomial Distribution
Consider the experiment of tossing 3 coins, where getting a head is a success and getting a tail
is a failure. It is clear that the probability of success in each toss is 0.5.
The same can be done for the occurrence of 0, 1 and 3 heads. To see how many such arrangements
arise for 2 heads:
Total number of trials (3 tosses of the coin): 3
Total number of successes (2 heads): 2
Total number of ways of achieving 2 successes out of 3 trials: 3C2 = 3
Hence we can deduce the probability that the random variable X takes the value x in n trials
or experiments as
\[ P(X = x) = {}^{n}C_{x}\, p^{x} (1 - p)^{n - x} \]
where p is the probability of success in each trial.
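The formula can be checked against the coin-tossing experiment. The sketch below evaluates it for n = 3 tosses of a fair coin; the function name `binomial_pmf` is an illustrative choice, not something defined in this handout.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 3, Fraction(1, 2)  # three tosses of a fair coin
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
print(distribution)
```

For x = 2 this gives 3 × (1/2)² × (1/2) = 3/8, matching the count of 3 ways of obtaining 2 heads out of the 8 equally likely outcomes.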
We know that the binomial distribution and the Poisson distribution are discrete probability
distributions, whereas the normal distribution is a continuous probability distribution.
Before we start with the normal distribution, we need to know what a continuous probability
distribution is and what its characteristics are.
Definitions:
Continuous variate: A variate that is not discrete, i.e., one that can take an infinite number of
values in a given interval a ≤ x ≤ b, is called a continuous variate.
Probability density function: Let X be a continuous random variable and let the probability of X
falling in the interval (x − ½dx, x + ½dx) be expressed by f(x)dx, where f(x) is a continuous
function of X and satisfies the following two conditions:
(i) f(x) ≥ 0 for all x ∈ R,
(ii) \( \int_{R} f(x)\,dx = 1 \),
where R is the collection of all points in the entire range of the variable X.
Then the function f(x) is called the probability density function and the continuous curve
y = f(x) is called the probability curve. The probability that X lies between a and b is
\[ P(a \le x \le b) = \int_{a}^{b} f(x)\,dx \]
The integral also represents the area under the probability curve y = f(x) bounded by the
ordinates x = a and x = b and the x-axis, i.e., probability may be understood as area under
the probability curve.
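The interpretation of probability as area can be illustrated numerically. The sketch below uses a hypothetical density, f(x) = 2x on the interval 0 ≤ x ≤ 1 (not taken from this handout), and approximates the integrals with a simple midpoint rule.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

total_area = integrate(f, 0, 1)  # area over the whole range, should be close to 1
prob = integrate(f, 0.5, 1)      # P(0.5 <= X <= 1), exact value is 0.75
print(total_area, prob)
```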
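(i) Mean:
\[ \bar{x} = \int_{-\infty}^{\infty} x\, f(x)\,dx \]
(ii) Variance:
\[ \sigma^{2} = \int_{-\infty}^{\infty} (x - \bar{x})^{2} f(x)\,dx \]
(iii) The r-th moment about a point a and the r-th moment about the mean are
\[ \mu'_{r} = \int_{-\infty}^{\infty} (x - a)^{r} f(x)\,dx, \qquad \mu_{r} = \int_{-\infty}^{\infty} (x - \bar{x})^{r} f(x)\,dx \]
(iv) Mean deviation from the mean, for the above continuous probability distribution, is
\[ \int_{-\infty}^{\infty} |x - \bar{x}|\, f(x)\,dx \]
(v) The median M_d divides the area under the probability curve into two equal parts:
\[ \int_{-\infty}^{M_d} f(x)\,dx = \int_{M_d}^{\infty} f(x)\,dx = \frac{1}{2} \]
(vi) The mode is the value of x at which f(x) is maximum, so that
\[ \frac{d}{dx} f(x) = 0 \quad \text{and} \quad \frac{d^{2}}{dx^{2}} f(x) < 0 \]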
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^{2} / (2\sigma^{2})}, \qquad -\infty < x < \infty \]
where μ, the mean of the normal distribution, and σ, the standard deviation, are also known as
the parameters of the normal distribution.
The probability distribution with the density function given above is called the normal
distribution or the Gaussian distribution. x is called the normal variate with mean μ and
standard deviation σ and is denoted by x : N(μ, σ).
Normal Curve:-
[Figure: the normal curve, f(x) plotted against x]
The graph of the normal distribution as shown above is called the normal curve. It is symmetrical
about the line x = μ, where the ordinate has its maximum value. Also, the mean, median and mode
coincide in the normal curve. The line x = μ divides the area under the normal curve above the
x-axis into two equal parts. Thus the median also coincides with the mean and mode. The area under
the normal curve between any two given points x = x1 and x = x2 represents the probability of
values falling into the given interval. The total area under the normal curve above the x-axis is 1.
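Standard Normal Variate: If x is a normal variate with mean μ and standard deviation σ, then
z = (x − μ)/σ is called the standard normal variate. It has mean = 0 and standard deviation = 1.
After putting these values of the parameters in the density function, we obtain
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \qquad -\infty < z < \infty \]
Since the curve is symmetrical about its mean,
\[ P(-\infty < z \le 0) = P(0 \le z < \infty) = \frac{1}{2} \]
With z = (x − μ)/σ,
\[ P(-\infty < z < \infty) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^{2}/2}\,dz = 1, \]
whereas
\[ P(z \le 0) = P(z \ge 0) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-z^{2}/2}\,dz = \frac{1}{2}. \]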
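\[ P(x_1 \le x \le x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-(x - \mu)^{2} / (2\sigma^{2})}\,dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-z^{2}/2}\,dz \]
where z1 = (x1 − μ)/σ and z2 = (x2 − μ)/σ.
Then
\[ P(x_1 \le x \le x_2) = \frac{1}{\sqrt{2\pi}} \int_{0}^{z_2} e^{-z^{2}/2}\,dz - \frac{1}{\sqrt{2\pi}} \int_{0}^{z_1} e^{-z^{2}/2}\,dz = P_2(z) - P_1(z) \]
where P1(z) and P2(z) denote the areas under the standard normal curve from 0 to z1 and from
0 to z2 respectively.
If x1 lies on the right side of the line of mean of the normal curve, then we may also
conclude that
\[ P(x \ge x_1) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-z^{2}/2}\,dz - \frac{1}{\sqrt{2\pi}} \int_{0}^{z_1} e^{-z^{2}/2}\,dz = \frac{1}{2} - P_1(z) \]
and
\[ P(x \le x_1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-z^{2}/2}\,dz + \frac{1}{\sqrt{2\pi}} \int_{0}^{z_1} e^{-z^{2}/2}\,dz = \frac{1}{2} + P_1(z) \]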
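The reduction to the standard normal variate can be sketched in Python with the standard library's `statistics.NormalDist`. The mean, standard deviation and interval end points below are hypothetical values chosen for illustration, and `area_from_zero` is an illustrative helper standing in for a table of areas from 0 to z.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical mean and standard deviation
x1, x2 = 110.0, 130.0     # hypothetical end points, both to the right of the mean

# Standardize the end points: z = (x - mu) / sigma
z1 = (x1 - mu) / sigma
z2 = (x2 - mu) / sigma

std = NormalDist()  # standard normal variate: mean 0, standard deviation 1

def area_from_zero(z):
    """Area under the standard normal curve between 0 and z."""
    return std.cdf(z) - 0.5

p_between = area_from_zero(z2) - area_from_zero(z1)  # P(x1 <= x <= x2)
p_above = 0.5 - area_from_zero(z1)                   # P(x >= x1)
p_below = 0.5 + area_from_zero(z1)                   # P(x <= x1)
print(round(p_between, 4), round(p_above, 4), round(p_below, 4))
```

The same probabilities can be obtained without standardizing, for example from `NormalDist(mu, sigma).cdf(x2) - NormalDist(mu, sigma).cdf(x1)`, which is a convenient cross-check.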
Question 3: In a normal distribution, 31% of the items are under 45 and 8% are over 64.
Find the mean and standard deviation of the distribution.
Test of Hypothesis
Types of Hypothesis
There are two types of hypotheses:
Null Hypothesis
Alternative Hypothesis
A null hypothesis is a claim or statement about a population parameter that is assumed to be true
until it is declared false. An alternative hypothesis is a claim about a population parameter that
will be true if the null hypothesis is false.
Hypothesis Building
Example 1: In the past a machine has produced washers having a mean thickness of 0.050 inch.
To determine whether the machine is in proper working order a sample of 10 washers is chosen
for which the mean thickness is 0.053 inch and the standard deviation is 0.003 inch. Test the
hypothesis that the machine is in proper working order.
The null hypothesis states that a given claim about a population parameter is true. In the given
example the population parameter is the mean. The claim to be tested, that the machine is in
proper working order, may or may not be true. The claim is true when the mean is 0.050 inch.
Therefore the null hypothesis is that the mean is 0.050 inch, and the alternative hypothesis is
that the mean is not 0.050 inch. We write this as
H0: µ = 0.050
H1 or Ha: µ ≠ 0.050
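Once the hypotheses are set up, the test statistic for Example 1 can be computed as in the sketch below. Several choices here are assumptions and are not stated in the example: the test is carried out at the 5% level, the quoted standard deviation is treated as the sample standard deviation with the n − 1 divisor (some textbooks instead divide by √(n − 1) in the statistic, which gives a slightly smaller value), and the critical value 2.262 is the usual two-tailed table value of t for 9 degrees of freedom.

```python
from math import sqrt

mu0 = 0.050     # hypothesized mean thickness in inches, from H0
n = 10          # sample size
x_bar = 0.053   # sample mean in inches
s = 0.003       # sample standard deviation in inches (assumed n - 1 divisor)

# Test statistic: t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 = 9 degrees of freedom
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Assumed 5% level, two-tailed: table value of t for 9 degrees of freedom
t_critical = 2.262

reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 3), reject_h0)
```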
Example 2: Consider the percentage of people who prefer a specific seat on the plane when they
fly. A survey shows that 61% of adults prefer a window seat, 38% prefer an aisle seat, and only
1% prefer the middle seat. These results are based on a sample of 806 adults. Suppose that these
results were true for the population of such adults at the time of the survey and that we want to
check whether the current percentage of all adults who prefer the window seat when they fly is still 61%.
Suppose we take a random sample of 1000 adults and ask them which seat is their favorite when
they fly. Of them, 640 say that they prefer a window seat.
H0: p = 0.61
H1 or Ha: p ≠ 0.61
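For Example 2, a large-sample z statistic for a single proportion can be sketched as follows. The use of the normal approximation and of a two-tailed p-value are assumptions made for this illustration; the example itself does not specify a significance level.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.61         # hypothesized proportion, from H0
n = 1000          # sample size
p_hat = 640 / n   # sample proportion preferring a window seat

# Standard error of the sample proportion, computed under H0
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 3), round(p_value, 4))
```

The resulting z is roughly 1.95, which lies very close to the usual two-tailed 5% cut-off of 1.96, so the conclusion depends on the significance level chosen.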
Example 3: The lapping process which is used to grind certain silicon wafers to the proper
thickness is acceptable only if σ, the population standard deviation of the thickness of dice cut
from the wafers, is at most 0.50 mil. Use the 0.05 level of significance to test the claim, if the
thickness of 15 dice cut from such wafers have a standard deviation of 0.64 mil.
H0: σ = 0.50
H1 or Ha: σ > 0.50
Types of Errors
Type I Error: A type I error occurs when a true null hypothesis is rejected. This error is denoted
by ‘α’. The value of ‘α’ represents the probability of committing this error; that is
α = P (H0 is rejected | H0 is true). The value of α represents the significance level of the test.
Type II Error: A type II error occurs when a false null hypothesis is not rejected. This error is
denoted by ‘β’. The value of ‘β’ represents the probability of committing this error; that is β = P
(H0 is not rejected | H0 is false). The value of 1 - β is called the power of the test. It represents the
probability of not making a Type II error.
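                        Actual Situation
Decision                H0 is True          H0 is False
Do not reject H0        Correct decision    Type II error (β)
Reject H0               Type I error (α)    Correct decision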
Tails of the Test: A two-tailed test has rejection regions in both tails, a left-tailed test has the
rejection region in the left tail, and a right-tailed test has the rejection region in the right tail of the
distribution curve.
A few more questions
Example 1: In the past a machine has produced washers having a mean thickness of 0.050 inch.
To determine whether the machine is in proper working order a sample of 10 washers is chosen
for which the mean thickness is 0.053 inch and the standard deviation is 0.003 inch. Test the
hypothesis that the machine is in proper working order.
Example 2: The mayor of a large city claims that the average net worth of families living in this
city is at least $300,000. A random sample of 25 families selected from this city produced a mean
net worth of $288,000. Assume that the net worths of all families in this city have a normal
distribution with the population standard deviation of $80,000. Using the 2.5% significance level,
can you conclude that the mayor’s claim is false?
Example 3: A potential buyer of fluorescent lamps bought 50 lamps of each of two brands, viz.,
National lamps and Indian lamps. Upon testing these lamps, he found that the brand ‘National’
had a mean life of 1,282 hours with a standard deviation of 80 hours, whereas the brand ‘Indian’
had a mean life of 1,208 hours with a standard deviation of 94 hours. At the 5% level of
significance, can the buyer conclude that both brands have the same mean life?
Example 4: To compare two kinds of bumper guards, six of each kind were mounted on a certain
make of a compact car. Then each car was run into a concrete wall at 5 miles per hour, and the
following are the costs of the repairs.
Test at 0.01 level of significance whether the difference between the means of these two samples
is significant.
Parametric Tests
9. The results of polls conducted 2 weeks and 4 weeks before an election are shown in the
following table:
                   Two weeks before   Four weeks before   Total
For Candidate A           99                 112           211
For Candidate B          101                  88           189
Total                    200                 200           400
10. Fit a Poisson distribution to the following data and test the goodness of fit
x 0 1 2 3 4
f 112 73 30 4 1
11. As part of the investigation of the collapse of the roof of a building, a testing laboratory is
given all the available bolts that connected the steel structure at 3 different positions on the
roof. The forces required to shear each of these bolts (coded values) are as follows:
Position 1 90 82 79 98 83 91
Position 2 105 89 93 104 89 95 86
Position 3 83 89 80 94
Perform an analysis of variance to test at the 0.05 level of significance whether the
differences among the sample means at the 3 positions are significant.
Practice Questions
1. The table below shows the number of absences, x, in a Calculus course and the
final exam grade, y, for 7 students. Find the correlation coefficient and interpret
your result. Find the regression line of y on x.
x 1 0 2 6 4 3 3
y 85 80 70 55 90 90 95
2. The time x in years that an employee spent at a company and the employee’s
hourly pay, y, for 5 employees are listed in the table below. Calculate and interpret
the correlation coefficient r. Find the line of regression of y on x.
x 5 3 4 10 15
y 25 20 21 35 38
3. Let x be the number of hours that 10 persons studied for a French test and y
their scores on the test. Given Σx = 100, Σy = 564, Σx² = 1376, Σy² = 36562 and
Σxy = 6945, find the equation of the least squares line that approximates the regression
of the test scores on the number of hours studied. Also find the correlation
coefficient between the two.