Sample Tests
Points to be covered:
(A) Test of Hypothesis-I
• Introduction and procedure
• Null and alternative hypothesis
• Simple and composite hypothesis
• Standard error, level of significance, one-tailed and two-tailed tests.
(B) Test of Hypothesis-II
• Difference between large and small sample test
• Degree of freedom
• Conditions for applying T-Test.
(C) Application of T-Test:
• Test of significance of mean
• Test of significance of difference of two means
• Paired T-Test
(D) Large Sample Test (Z-Test):
• Test of significance of mean
• Test of significance of difference of two means
• Test of significance between two S.D.
(A) Testing of Hypothesis-I
Introduction:
The main objective of taking a sample from a
population is to get reliable information about the
population.
From the information obtained from the sample,
conclusions are drawn about the population.
This is called statistical inference.
It may consist of two parts:
(1) Estimation of parameters
(2) Test of statistical hypothesis
Parameters and Statistics:
A parameter is a number that describes the
population. A parameter is a fixed number, but
in practice we do not know its value because
we cannot examine the entire population.
A statistic is a number that describes a sample.
The value of a statistic is known when we have
taken a sample, but it can change from sample
to sample. We often use a statistic to estimate
an unknown parameter.
Statistical Hypothesis:
A statistical hypothesis is an assumption about a
parameter of the population or about the nature of
the population. For example:
1) The population mean μ = 25.
2) The average weights of students of college A and
college B are the same.
3) 20% of the students of a college are non-vegetarians.
4) The given population is a binomial population.
Null Hypothesis:
A statistical hypothesis which is taken up for possible
acceptance is called a null hypothesis, and it is denoted
by H0.
The neutral attitude of the decision maker, before the
sample observations are taken, is the keynote of the null
hypothesis.
For example:
1) The mean of the population is 60: H0: μ = 60.
2) The means of both populations are equal: H0: μ1 = μ2.
3) The proportions of drinkers in the two cities are equal:
H0: P1 = P2.
Alternative Hypothesis:
A hypothesis complementary to the null hypothesis
is called an alternative hypothesis, and it is
denoted by H1.
For example:
(1) H1: μ ≠ 60
(2) H1: μ1 ≠ μ2
(3) H1: P1 ≠ P2
are alternative hypotheses.
Simple Hypothesis:
A simple hypothesis is one where the value of
the parameter under Ho is a specified constant
and the value of the parameter under H1 is a
different specified constant.
For example, if you test
Ho: μ = 0 vs H1: μ = 10
then you have a simple hypothesis test.
Here you have a particular value for Ho and a
different particular value for H1.
Composite Hypothesis:
Most problems involve more than a single alternative.
Such hypotheses are called composite hypotheses.
Examples of composite hypotheses:
Ho: μ = 0 vs H1: μ ≠ 0
which is a two-sided H1.
A one-sided H1 can be written as
Ho: μ = 0 vs H1: μ > 0
or
Ho: μ = 0 vs H1: μ < 0
All of these hypotheses are composite because they
include more than one value for H1.
Standard Error:
The standard deviation of a sample statistic is
called the standard error of the statistic.
For example, if different samples of the same
size n are drawn from a population, we get
different values of sample mean (x-bar).
The S.D. of (x-bar) is called standard error of
(x-bar).
Standard error of (x-bar) will depend upon the
size of the sample and the variability of the
population.
It can be derived that S.E.(x̄) = σ / √n.
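The relation S.E.(x̄) = σ/√n can be checked empirically. The sketch below (the population values are illustrative assumptions, not from the notes) draws many samples of size n from a known normal population and compares the S.D. of the sample means with σ/√n:

```python
import random
import statistics

# Illustrative check of S.E.(x-bar) = sigma / sqrt(n): draw many samples
# of size n from a known population and measure the S.D. of the means.
random.seed(1)
mu, sigma, n = 50.0, 10.0, 25

# 5000 samples of size n; record each sample mean.
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(5000)]

empirical_se = statistics.stdev(means)   # S.D. of the sample means
theoretical_se = sigma / n ** 0.5        # sigma / sqrt(n) = 10 / 5 = 2

print(round(empirical_se, 2), theoretical_se)
```

The empirical standard error comes out close to the theoretical value of 2, and shrinks if n is increased, as the formula predicts.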
Standard errors of some well-known statistics:

No.  Statistic                                     S.E.
1.   Mean x̄                                        σ/√n
2.   Difference between two means x̄1 − x̄2          √(σ1²/n1 + σ2²/n2)
3.   Sample proportion p                           √(PQ/n)
4.   Difference between two proportions p1′ − p2′  √(P1Q1/n1 + P2Q2/n2)
Uses of S.E.:
1) To test whether a given value of a statistic differs
significantly from the assumed population parameter,
i.e. whether the difference between the value of the
sample statistic and the population parameter is
significant or may be attributed to chance.
2) To test the randomness of a sample, i.e. to test whether
the given sample can be regarded as a random sample
from the population.
3) To obtain a confidence interval for the parameter of the
population.
4) To determine the precision of the sample estimate,
because the precision of a statistic
= 1 / S.E. of the statistic.
Type I and type II Errors:
In testing of a statistical hypothesis the following situations
may arise:
1) The hypothesis may be true but it is rejected by the test.
2) The hypothesis may be false but it is accepted by the test.
3) The hypothesis may be true and is accepted by the test.
4) The hypothesis may be false and is rejected by the test.
(3) and (4) are the correct decisions while (1) and (2) are
errors.
The error committed in rejecting a hypothesis which is true
is called Type-I error and its probability is denoted by α
The error committed in accepting a hypothesis which is
false is called Type-II error and its probability is denoted by
β.
             Accept H0          Reject H0
H0 is true   Correct decision   Type-I error
H0 is false  Type-II error      Correct decision
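The meaning of α as the probability of a Type-I error can be illustrated by simulation (an illustrative sketch; the population values and sample size are assumptions, not from the notes): sampling repeatedly from a population in which H0 is true and rejecting whenever |z| exceeds the 5% two-tailed critical value should reject in roughly 5% of trials.

```python
import random

# H0 is true here (the population mean really is mu0), so every
# rejection is a Type-I error; the rejection rate estimates alpha.
random.seed(2)
mu0, sigma, n, trials = 100.0, 15.0, 36, 10000

rejections = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu0, sigma) for _ in range(n)) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)   # test statistic under H0
    if abs(z) > 1.96:                       # two-tailed critical value at 5%
        rejections += 1

type1_rate = rejections / trials
print(type1_rate)   # close to alpha = 0.05
```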
[Figure: two-tailed test on the Z axis — acceptance region of probability
0.95 in the middle, rejection regions of 0.025 in each tail]
In this example, the rejection region probabilities
are equally split between the two tails, thus the
reason for the label as a two-tailed test.
This procedure allows the possibility of rejecting
the null hypothesis, but does not specifically
address, in the sense of statistical significance,
the direction of the difference detected.
There may be situations when it would be
appropriate to consider an alternative hypothesis
where the directionality is specifically addressed.
That is we may want to be able to select between
a null hypothesis and one that explicitly goes in
one direction. Such a hypothesis test can best be
expressed as:
H0: μ = 170 vs H1: μ > 170
The expression is sometimes given as:
H0: μ ≤ 170 vs H1: μ > 170
The difference between the two has to do with how
the null hypothesis is expressed and the implication
of this expression.
The first expression above is the more theoretically
correct one and carries with it the clear connotation
that an outcome in the opposite direction of the
alternative hypothesis is not considered possible.
This is, in fact, the way the test is actually done.
The process of testing the above hypothesis is
identical to that for the two-tailed test except that all
the rejection region probabilities are in one tail.
For a test, with α = 0.05, the acceptance region would
be, for example, the area from the extreme left up to
the point below which lies 95% of the area.
The rejection region would be the 5% area in the
upper tail.
Critical values at important levels of significance
are given below.

                 1%     5%     10%
Two-tailed test  2.58   1.96   1.645
One-tailed test  2.33   1.645  1.282
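The z critical values in the table can be reproduced from the inverse CDF of the standard normal distribution; the sketch below uses Python's standard-library `statistics.NormalDist` as one way to do this (an illustration, not part of the notes).

```python
from statistics import NormalDist

# For a two-tailed test at level alpha, the critical value cuts off
# alpha/2 in each tail; a one-tailed test puts all of alpha in one tail.
z = NormalDist()  # standard normal: mean 0, sd 1

for alpha in (0.01, 0.05, 0.10):
    two_tailed = z.inv_cdf(1 - alpha / 2)
    one_tailed = z.inv_cdf(1 - alpha)
    print(f"alpha={alpha}: two-tailed {two_tailed:.3f}, one-tailed {one_tailed:.3f}")
```

The printed values match the table (2.58 and 2.33 are the table's roundings of 2.576 and 2.326).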
(B) Testing Of Hypothesis -2:
Introduction:
• The value of a statistic obtained from a large
sample is generally close to the parameter of the
population.
• But there are situations when one has to take a
small sample. E.g. if a new medicine is to be
introduced, a doctor cannot test the new medicine
by giving it to many patients.
• Thus he takes a small sample.
• Generally a sample having number of
observations less than or equal to 30 is regarded
as a small sample.
Difference between Large and Small Samples

Sr. No.  Large sample                             Small sample
1.       The sample size is greater than 30.      The sample size is 30 or less.
2.       The value of a statistic obtained from   The value of a statistic obtained from
         the sample can be taken as an estimate   the sample cannot be taken as an estimate
         of the population parameter.             of the population parameter.
Degrees of freedom of the sample variance S² = Σ(xᵢ − x̄)²/(n − 1): d.f. = n − 1.
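The n − 1 divisor can be checked against a library implementation; in the sketch below (illustrative data, not from the notes) the hand computation agrees with `statistics.variance`, which also divides by n − 1.

```python
import statistics

# The unbiased sample variance divides the sum of squared deviations
# by n - 1 (the degrees of freedom), not by n.
x = [12, 15, 11, 18, 14]
n = len(x)
xbar = sum(x) / n

s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)   # divisor n - 1
print(s2, statistics.variance(x))                  # the two agree
```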
Student’s t-distribution:
• This very important distribution was given by
W.S. Gosset in 1908.
• He published his work under the pen-name "Student".
• Hence the distribution is known as Student's t
distribution.
• If x1, x2,…..,xn is a random sample of n observations
drawn from a normal population with mean μ and S.D.
σ then the distribution of
t = (x̄ − μ) / (s / √(n − 1))
is defined as the t distribution with n − 1 degrees of freedom.
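A small numerical sketch of this statistic (the data and the hypothesised mean are illustrative assumptions, not from the notes), using the notes' convention that s is computed with the n divisor:

```python
import math

# Illustrative small sample: test H0: mu = 50.
x = [52, 48, 55, 51, 49, 53, 50, 54]
mu0 = 50.0
n = len(x)

xbar = sum(x) / n
s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / n)   # S.D. with n divisor
t = (xbar - mu0) / (s / math.sqrt(n - 1))              # t on n - 1 = 7 d.f.
print(round(t, 3))
```

At 7 degrees of freedom the 5% two-tailed table value is 2.365, so a t of about 1.732 would not be significant at the 5% level.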
• The probability density function of the t
distribution is
f(t) = 1 / (√n · B(1/2, n/2)) · (1 + t²/n)^(−(n+1)/2)
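This density can be written with gamma functions, since B(1/2, n/2) = Γ(1/2)Γ(n/2)/Γ((n+1)/2). The sketch below evaluates it and checks a known special case: with 1 degree of freedom the t distribution is the Cauchy distribution, whose density at 0 is 1/π.

```python
import math

def t_pdf(t, n):
    """Density of Student's t with n degrees of freedom:
    1 / (sqrt(n) * B(1/2, n/2)) * (1 + t^2/n) ** (-(n+1)/2),
    with the Beta function written via math.gamma."""
    beta = math.gamma(0.5) * math.gamma(n / 2) / math.gamma((n + 1) / 2)
    return (1 + t * t / n) ** (-(n + 1) / 2) / (math.sqrt(n) * beta)

# For n = 1 the t distribution is the Cauchy distribution: f(0) = 1/pi.
print(round(t_pdf(0.0, 1), 4), round(1 / math.pi, 4))
```

As the degrees of freedom grow, t_pdf(0, n) approaches the standard normal density at 0, about 0.3989, matching the property that the t distribution tends to the normal.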
Assumption of t-distribution:
1) The population from which the sample is drawn is
normal.
2) The sample is random.
3) The population S.D. σ is not known.
Properties of t-distribution:
1) The probability curve of the t-distribution is symmetrical.
2) The tails of the curve are asymptotic to the x-axis.
3) When n → ∞, the t-distribution tends to the normal
distribution.
4) The form of the t-distribution varies with the degrees of
freedom.
Uses of t-distribution:
1) For testing the significance of the difference
between sample mean and population mean.
2) For testing the difference between means of
two samples.
3) For testing significance of the observed
correlation co-efficient.
4) For testing the significance of observed
regression co-efficient.
[Figure: density curve of the Student t distribution with 1 degree of
freedom, plotted over −4 ≤ t ≤ 4 (repeated slide panels)]
x̄ = 49, Σ(xᵢ − x̄)² = 52.
(B) Test of Significance of difference between
Means of Two Small Samples:
• Suppose two independent small samples of
size n1 and n2 are drawn from two normal
populations and the means of the samples are
x1-bar and x2-bar respectively.
• Under the assumption that both populations have
the same variance,
t = (x̄1 − x̄2) / (S √(1/n1 + 1/n2)) = ((x̄1 − x̄2)/S) · √(n1 n2 / (n1 + n2)),
with n1 + n2 − 2 degrees of freedom,
• where
S² = {Σ(x1i − x̄1)² + Σ(x2i − x̄2)²} / (n1 + n2 − 2).
Examples:
(1) Horse B 29 30 30 24 27 29
(2) Below are given the gain in weights (in lbs) of cows fed on
two diets X and Y. Test at 5% level whether the two diets
differ as regards their effects on mean increase in weight.
Diet X 25 32 30 32 24 14 32
Diet Y 24 34 22 30 42 31 40 30 32 35
(3) For two independent samples the following
information is available.
Sample Size Mean S.D
I 10 15 3.5
II 15 16.5 4.5
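The pooled formula above can be applied to the summary figures of example (3); the sketch assumes the quoted S.D. values use the n divisor, so that Σ(xᵢ − x̄)² = n · (S.D.)².

```python
import math

# Summary data of example (3).
n1, xbar1, sd1 = 10, 15.0, 3.5
n2, xbar2, sd2 = 15, 16.5, 4.5

# Pooled variance: {sum of squares 1 + sum of squares 2} / (n1 + n2 - 2),
# assuming sum((x - xbar)^2) = n * sd^2 for each sample.
S2 = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / (n1 + n2 - 2)
S = math.sqrt(S2)

t = (xbar1 - xbar2) / (S * math.sqrt(1 / n1 + 1 / n2))  # n1 + n2 - 2 = 23 d.f.
print(round(t, 3))
```

The computed |t| of about 0.85 is well below the 5% two-tailed table value at 23 degrees of freedom (about 2.07), so on these assumptions the two means do not differ significantly.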
Paired T-Test:
t = (d̄ √(n − 1)) / S,
where d̄ = (1/n) Σdᵢ and S² = (1/n) Σ(dᵢ − d̄)².
• The value of t is computed from the given data
and compared with the table value at n − 1 degrees
of freedom.
Examples:
1. The sales data of an item in six shops before and after
a special promotion campaign are as under:
shops A B C D E F
Before campaign 53 28 32 48 50 42
After campaign 58 32 30 50 56 45
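The paired formula can be applied directly to the data of example 1; a sketch:

```python
import math

# Sales in six shops before and after the promotion campaign (example 1).
before = [53, 28, 32, 48, 50, 42]
after = [58, 32, 30, 50, 56, 45]

d = [a - b for a, b in zip(after, before)]            # difference per shop
n = len(d)
dbar = sum(d) / n
S = math.sqrt(sum((di - dbar) ** 2 for di in d) / n)  # S.D. with n divisor

t = dbar * math.sqrt(n - 1) / S                       # t on n - 1 = 5 d.f.
print(round(t, 3))
```

The resulting t of about 2.598 slightly exceeds the 5% two-tailed table value of 2.571 at 5 degrees of freedom, so the campaign's effect on sales would be judged significant at the 5% level.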