Basic Concepts of Statistics
SEVERINO B. SALERA JR
ASSO. PROF. 5
BISU, BILAR
Basics of Statistics
The arithmetic mean of Virat Kohli’s batting scores, also called his batting
average, is:
Sum of runs scored / Number of innings = 661/10 = 66.1
The arithmetic mean of his scores in the last 10 innings is 66.1.
Harmonic Mean
A harmonic progression is a sequence whose terms have reciprocals that form an
arithmetic progression. The harmonic mean (HM) is calculated by dividing the
number of terms by the sum of the reciprocals of the terms.
In particular cases, especially those involving rates and ratios, the harmonic
mean gives the most appropriate value of the mean. For example, if a vehicle
travels a specified distance at speed x (e.g. 60 km/h) and then travels the same
distance again at speed y (e.g. 40 km/h), the average speed is the harmonic mean
of x and y, i.e. 2/(1/60 + 1/40) = 48 km/h.
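The speed example above can be checked with a short Python sketch. The helper name `average_speed` is only illustrative, and the sketch assumes both legs cover the same distance; the result is compared with the standard library's `statistics.harmonic_mean`.

```python
from statistics import harmonic_mean

def average_speed(speeds):
    """Harmonic mean: number of terms divided by the sum of their reciprocals."""
    return len(speeds) / sum(1 / s for s in speeds)

speeds = [60, 40]  # km/h, one leg each over the same distance
manual = average_speed(speeds)
library = harmonic_mean(speeds)

print(round(manual, 6), round(library, 6))  # both are 48 km/h up to rounding
```

The arithmetic mean of the two speeds would be 50 km/h, which overstates the average because more time is spent travelling at the slower speed.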
Geometric Mean
The Geometric Mean (GM) is the average value or mean which
signifies the central tendency of a set of numbers by taking
the nth root of the product of their values.
Basically, we multiply the n numbers together and take the
nth root of the product, where n is the total number
of values.
For example: for a given set of two numbers such as 3 and 1, the
geometric mean is equal to √(3×1) = √3 ≈ 1.732.
Use of Geometric Mean
For example, suppose you have an investment which earns 10%
the first year, 60% the second year, and 20% the third year. What
is its average rate of return?
It is not the arithmetic mean, because what these numbers mean is
that in the first year your investment was multiplied (not added to)
by 1.10, in the second year it was multiplied by 1.60, and in the third
year it was multiplied by 1.20. The relevant quantity is the
geometric mean of these three numbers.
The question about finding the average rate of return can be
rephrased as: "by what constant factor would your investment need
to be multiplied each year in order to achieve the same effect as
multiplying by 1.10 one year, 1.60 the next, and 1.20 the third?"
If you calculate this geometric mean,
(1.10 × 1.60 × 1.20)^(1/3) = (2.112)^(1/3),
you get approximately 1.283, so the average rate of return is about
28% (not 30%, which is what the arithmetic mean of 10%, 60%, and
20% would give you).
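As a rough check of this calculation, the sketch below computes the geometric mean of the three yearly growth factors 1.10, 1.60, and 1.20. The function name `geometric_mean` and the variable names are illustrative choices, and the code assumes all values are positive.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

growth_factors = [1.10, 1.60, 1.20]  # yearly multipliers from the example

avg_factor = geometric_mean(growth_factors)
avg_rate = avg_factor - 1  # average yearly rate of return
arithmetic_rate = sum(f - 1 for f in growth_factors) / len(growth_factors)

print(round(avg_factor, 3))       # about 1.283
print(round(arithmetic_rate, 3))  # about 0.3, which overstates the return
```

Multiplying the starting amount by `avg_factor` three times gives the same final value as applying the three actual yearly factors in turn.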
Median
The median is the middle value of a dataset once
the dataset has been arranged in ascending or
descending order.
When the dataset contains an even number of
values, then the median value of the dataset can
be found by taking the mean of the middle two
values.
If you have a skewed distribution, the median is
usually the best measure of central tendency.
The median is less sensitive to outliers (extreme
scores) than the mean and is thus a better measure
than the mean for highly skewed distributions, e.g.
family income. For example, the mean of 20, 30, 40,
and 990 is (20+30+40+990)/4 = 270. The median
of these four observations is (30+40)/2 = 35. Here 3
observations out of 4 lie between 20 and 40, so the
mean of 270 fails to give a realistic picture of
the major part of the data. It is dominated by the
extreme value 990.
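The comparison above can be reproduced with Python's standard `statistics` module. This is only a small sketch using the four observations from the example; the variable names are illustrative.

```python
from statistics import mean, median

data = [20, 30, 40, 990]

data_mean = mean(data)      # pulled upward by the extreme value 990
data_median = median(data)  # average of the two middle values, 30 and 40

print(data_mean, data_median)

# Dropping the extreme value changes the mean sharply, the median only slightly.
trimmed = [20, 30, 40]
print(mean(trimmed), median(trimmed))
```

The median moves from 35 to 30 when 990 is dropped, while the mean falls from 270 to 30.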
Mode
Range: It is simply the difference between the maximum value and the
minimum value in a data set. Example: 1, 3, 5, 6, 7 => Range = 7 − 1 = 6
Variance: Subtract the mean from each value in the set, square each
difference, add the squares, and divide the sum by the total number of values
in the data set. Variance (σ²) = ∑(X − μ)² / N
Standard Deviation: The square root of the variance is known as the standard
deviation, i.e. S.D. = σ = √(σ²).
Quartiles and Quartile Deviation: The quartiles are values that divide a list of
numbers into quarters. The quartile deviation is half of the distance between the
third and the first quartile.
Mean and Mean Deviation: The average of numbers is known as the mean and
the arithmetic mean of the absolute deviations of the observations from a
measure of central tendency is known as the mean deviation (also called mean
absolute deviation).
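The following sketch applies these definitions to the data set 1, 3, 5, 6, 7 used in the range example. It assumes the population form of the variance (dividing by N) and takes deviations from the mean for the mean deviation; the function names are illustrative only.

```python
import math

def population_variance(xs):
    """Mean of the squared deviations from the mean (divides by N)."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def mean_deviation(xs):
    """Mean of the absolute deviations from the mean."""
    mu = sum(xs) / len(xs)
    return sum(abs(x - mu) for x in xs) / len(xs)

data = [1, 3, 5, 6, 7]

variance = population_variance(data)  # 23.2 / 5
std_dev = math.sqrt(variance)         # square root of the variance
mean_dev = mean_deviation(data)       # 9.6 / 5

print(round(variance, 2), round(std_dev, 3), round(mean_dev, 2))
```

For sample data, many textbooks divide by N − 1 instead of N; that variant is not shown here.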
Range
It is the simplest method of measurement of dispersion.
It is defined as the difference between the largest and the
smallest item in a given distribution.
Range = Largest item (L) – Smallest item (S)
Interquartile Range
It is defined as the difference between the Upper Quartile and
Lower Quartile of a given distribution.
Interquartile Range = Upper Quartile (Q3) – Lower Quartile (Q1)
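A minimal sketch of the range, interquartile range, and quartile deviation follows, again using the data 1, 3, 5, 6, 7. Several conventions exist for locating quartiles; this example assumes the "inclusive" method of Python's `statistics.quantiles`, so other textbooks or tools may give slightly different Q1 and Q3 values.

```python
from statistics import quantiles

data = [1, 3, 5, 6, 7]

# Range = largest item (L) - smallest item (S)
data_range = max(data) - min(data)

# Cut points that split the data into four parts: Q1, Q2 (median), Q3
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

iqr = q3 - q1           # interquartile range
quartile_dev = iqr / 2  # half the distance between Q3 and Q1

print(data_range, q1, q3, iqr, quartile_dev)
```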
Variance
Variance is a measure of how far a set of data (numbers) is spread
out from its mean (average) value.
The larger the variance, the more scattered the data are around the
mean; the smaller the variance, the less scattered they are.
Therefore, it is called a measure of the spread of data about the mean.
The formula for variance is
Var(X) = E[(X − μ)²]
The variance is the square of the standard deviation, i.e.,
Variance = (Standard deviation)² = σ²
Variance