Statistics 1
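Arithmetic Mean (A.M.): The arithmetic mean is the sum of all the observations divided
by the number of observations. A.M. is measured in the same units as the
observations.
Let x1, x2, ..., xn be ‘n’ observations; then the A.M. is computed from the formula:
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$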
Median: The median is the middlemost item that divides the distribution into two
equal parts when the items are arranged in ascending order of magnitude.
If the number of observations is odd, then the median is the middle value after
the observations are arranged in order. In the case of an even number of
observations, there are two middle terms and the median is the average of these two terms.
Mode: The mode is the value which occurs most frequently in a set of observations.
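The three measures of central tendency described above can be illustrated with a short Python sketch. The observations used here are hypothetical, chosen only so that the results are easy to check by hand; they are not taken from these notes.

```python
import statistics

# Hypothetical observations, chosen only for illustration.
odd_sample = [3, 5, 5, 6, 8]    # odd number of observations
even_sample = [3, 5, 6, 8]      # even number of observations

# Arithmetic mean: sum of the observations divided by their number.
am = sum(odd_sample) / len(odd_sample)

# Median: the middle value (odd n) or the average of the two middle values (even n).
median_odd = statistics.median(odd_sample)
median_even = statistics.median(even_sample)

# Mode: the value that occurs most frequently.
mode = statistics.mode(odd_sample)

print(am, median_odd, median_even, mode)
```

For the odd-sized sample the median is the third value after sorting, while for the even-sized sample it is the average of the second and third values.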
Measures of Dispersion:
Dispersion means scattering of the observations among themselves or from a
central value (Mean/Median/Mode) of the data. We study dispersion to get an
idea of the variation in the data.
Consider two crop varieties whose mean yield is 50 kg in both cases. We cannot say
that the performances of the two varieties are the same: there is greater uniformity of
yields in the first variety, whereas there is more variability in the yields of the
second variety. The first variety may be preferred since it is more consistent in yield
performance.
Types of dispersion:
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation and Variance
5. Coefficient of Variation
6. Standard Error
Range: It is the difference between maximum value and minimum value.
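Standard Deviation: It is defined as the positive square root of the
arithmetic mean of the squares of the deviations of the given values from
their arithmetic mean. It is denoted by $\sigma$ and is computed from the
formula
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
or S.D. ($\sigma$) = $\sqrt{\frac{1}{n}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right]}$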
Example:
Calculate S.D. for the values 5, 6, 7, 7, 9, 4, 5.
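Here n = 7, $\sum x_i = 43$ and $\sum x_i^2 = 281$.
S.D. = $\sqrt{\frac{1}{7}\left[281 - \frac{43^2}{7}\right]} = \sqrt{\frac{16.857}{7}} = \sqrt{2.408}$
= 1.55 kg.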
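The worked example can be checked with a few lines of Python. This is a minimal sketch that takes the S.D. as the square root of the arithmetic mean of the squared deviations (divisor n), which is the form that reproduces the value 1.55; the variable names are illustrative.

```python
import math

values = [5, 6, 7, 7, 9, 4, 5]
n = len(values)

# Range: difference between the maximum and the minimum value.
value_range = max(values) - min(values)

# Standard deviation: square root of the mean of the squared deviations
# from the arithmetic mean (divisor n, not n - 1).
mean = sum(values) / n
variance = sum((x - mean) ** 2 for x in values) / n
sd = math.sqrt(variance)

print(value_range, round(sd, 2))
```

Using the divisor n - 1 instead would give a larger value (about 1.68), so the choice of divisor matters for small samples.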
Coefficient of Variation (C.V.): C.V. = $\frac{\text{S.D.}}{\text{Mean}} \times 100$
The coefficient of variation will be small if the variation is small. Of the two
groups, the one with the smaller C.V. is said to be more consistent.
Note: 1. Standard deviation is an absolute measure of dispersion.
2. Coefficient of variation is a relative measure of dispersion.
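As an illustration of how the C.V. is used to compare consistency, the sketch below computes it for two hypothetical sets of yields that share the same mean of 50. The data and the helper name `coefficient_of_variation` are assumptions made for this example, and the S.D. is computed with divisor n.

```python
import statistics

def coefficient_of_variation(values):
    """Return C.V. = (S.D. / mean) * 100, using the S.D. with divisor n."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Hypothetical yields (kg) for two varieties; both have a mean of 50.
variety_a = [48, 50, 52, 49, 51]
variety_b = [30, 70, 45, 55, 50]

cv_a = coefficient_of_variation(variety_a)
cv_b = coefficient_of_variation(variety_b)

# The group with the smaller C.V. is the more consistent one.
more_consistent = "A" if cv_a < cv_b else "B"

print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```

Because both groups have the same mean, the comparison here reduces to comparing the standard deviations; the C.V. becomes more useful when the means or the units differ.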
NORMAL DISTRIBUTION
The Normal Distribution (N.D.) was first discovered by De Moivre in 1733 as the limiting
form of the binomial model, and was later developed independently by Laplace and
Gauss.
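The curve of the normal distribution is shaped like a bell; it has a single mode and
is symmetric about its central value.
The maximum ordinate occurs at $x = \mu$: $[p(x)]_{\max} = \frac{1}{\sigma\sqrt{2\pi}}$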
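Standard Normal Distribution: If ‘X’ is a normal random variable with mean $\mu$ and
standard deviation $\sigma$, then $Z = \frac{X - \mu}{\sigma}$ is a standard normal
variate with mean 0 and standard deviation 1.
The graph of the normal distribution is also known as a Normal Curve or a Bell Curve
(see Figure below). To draw such a curve, one needs to specify two parameters, the mean
and the standard deviation. The graph below has a mean of zero and a standard deviation
of 1, i.e., ($\mu$ = 0, $\sigma$ = 1). A normal distribution
with a mean of zero and a standard deviation of 1 is also known as the Standard
Normal Distribution.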
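The properties of the normal curve mentioned above (symmetry, a single peak at the mean, and standardization to mean 0 and standard deviation 1) can be checked numerically. The following is a minimal sketch; the function names `normal_pdf` and `standardize` and the parameter values are illustrative assumptions, not part of these notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Ordinate of the normal curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def standardize(x, mu, sigma):
    """Convert a value of X to the standard normal variate Z = (X - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma = 50.0, 5.0   # hypothetical mean and standard deviation

# The maximum ordinate is attained at x = mu.
peak = normal_pdf(mu, mu, sigma)

# Symmetry: points equally far on either side of the mean have equal ordinates.
left = normal_pdf(mu - 3.0, mu, sigma)
right = normal_pdf(mu + 3.0, mu, sigma)

# A value two standard deviations above the mean standardizes to Z = 2.
z = standardize(60.0, mu, sigma)

print(peak, left, right, z)
```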
Testing of Hypothesis
Introduction: An estimate based on sample values does not equal the true value
in the population, owing to the inherent variation in the population. Different samples
will give different estimates compared to the true value. It has to be verified
whether the difference between the sample estimate and the population value is
due to sampling fluctuation or is a real difference. If the difference is due to sampling
fluctuation only, it can safely be said that the sample belongs to the population
under question; if the difference is real, we have every reason to believe that the
sample may not belong to the population under question. The following are a few
technical terms in this context.
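Hypothesis: A statistical hypothesis is a statement about the population, for example:
the population follows a Normal Distribution. There are two types of hypothesis, namely
the null hypothesis and the alternative hypothesis.
Null Hypothesis: Any statistical hypothesis under test is called the null
hypothesis. It is usually denoted by H0. Examples:
1. H0: $\mu = \mu_0$
2. H0: $\mu_1 = \mu_2$
Alternative Hypothesis: It is denoted by H1. For the second example above,
H1: $\mu_1 \neq \mu_2$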
... of the general magnitude and the study of variation with respect to one or more
characteristics.
Population: The aggregate of objects under study is called the population or universe,
i.e., the totality of all the objects under consideration.
In practice, parameter values are not known, and estimates based on the
sample values are generally used.
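Examples of such estimates are the sample mean ($\bar{x}$) and the sample variance ($s^2$), where
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$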
Random sampling: If every unit of the population has an equal chance of being
included in the sample, then the sampling is called random sampling.
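Simple Hypothesis: If the hypothesis specifies the population completely, it is called a
simple hypothesis. For example, in the case of a normal population with mean $\mu$ and
standard deviation $\sigma$, a simple null hypothesis is of the form H0: $\mu = \mu_0$, $\sigma = \sigma_0$.
Composite Hypothesis: If the hypothesis does not specify the distribution of the
population completely, it is called a composite hypothesis. The following are some
examples:
H0: $\mu \le \mu_0$ and $\sigma$ is known
H0: $\mu \ge \mu_0$ and $\sigma$ is known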
Types of Errors:
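Decision        H0 is true                       H0 is false
Reject H0       Type I error (Wrong decision)    Correct decision
Accept H0       Correct decision                 Type II error (Wrong decision)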
The probabilities of Type I and Type II errors are denoted by $\alpha$ and $\beta$ respectively.
Degrees of freedom: If ‘n’ is the total number of items and ‘k’ the total number of
constraints, then the degrees of freedom (d.f.) is given by d.f. = n − k.
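Level of Significance (LOS): The maximum probability at which we would be willing
to risk a Type I error is known as the level of significance; that is, the size of the
Type I error is called the level of significance. It is fixed
before collecting the sample information. An LOS of 5% means the results obtained will
be true in 95 out of 100 cases and may be wrong in 5 out of 100
cases.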
Critical value: While testing for the difference between the means of two
populations, our concern is whether the observed difference is too large to believe
that it has occurred just by chance. But then the question is how much difference
should be regarded as too large. It is
possible to define a cut-off or threshold value such that if the difference exceeds
this value, we say that it is not an occurrence by chance and hence there is
sufficient evidence to claim that the means are different. Such a value is called the
critical value.
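As an illustration of such a cut-off, the sketch below computes two-sided critical values under the assumption that the test statistic follows the standard normal distribution. These notes do not state which distribution applies, so this choice, and the helper name `two_sided_critical_value`, are assumptions made for the example. It uses `statistics.NormalDist`, available in Python 3.8 and later.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha):
    """Cut-off c such that P(|Z| >= c) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Critical values at the 5% and 1% levels of significance.
crit_5 = two_sided_critical_value(0.05)
crit_1 = two_sided_critical_value(0.01)

print(round(crit_5, 2), round(crit_1, 2))
```

A smaller level of significance gives a larger critical value, so a bigger difference is needed before it is declared not to be due to chance.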
4. The table (critical) value is found from the tables for the given level
of significance.
5. The null hypothesis will be rejected at the given level of significance if the
value of test statistic is greater than or equal to the critical value.
When the null hypothesis is rejected, the difference is said to be ‘significant’.
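The decision rule in the last two steps can be sketched in Python. The example below assumes a one-sample test of H0: μ = μ0 with a known population standard deviation and a standard normal test statistic; this particular test, the sample data, and the function name `z_test_decision` are assumptions for illustration and are not taken from these notes. The default critical value 1.96 is the two-sided table value of the standard normal distribution at the 5% level of significance.

```python
import math

def z_test_decision(sample, mu0, sigma, critical_value=1.96):
    """Return (test statistic, reject H0?) for H0: mu = mu0 with known sigma.

    critical_value is the table value for the chosen level of significance
    (1.96 corresponds to a two-sided test at the 5% level).
    """
    n = len(sample)
    sample_mean = sum(sample) / n
    # Test statistic: difference between means in units of the standard error.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Reject H0 if the statistic is greater than or equal to the critical value.
    reject = abs(z) >= critical_value
    return z, reject

# Hypothetical sample of yields (kg), tested against mu0 = 50 with sigma = 4.
sample = [52, 55, 49, 58, 54, 56, 53, 57, 51, 55]
z, reject = z_test_decision(sample, mu0=50, sigma=4)

print(round(z, 2), "significant" if reject else "not significant")
```

Taking the absolute value of the statistic makes this a two-sided test; a one-sided test would compare the signed statistic with a one-sided table value instead.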
*****