Statistics Notes

Statistics is the study of collecting, organizing, analyzing, and interpreting data. It involves descriptive statistics like measures of central tendency to summarize data, and inferential statistics to draw conclusions from samples. Mathematical statistics applies mathematical techniques to understand data better and focus on applications. Descriptive statistics describes data through measures like mean, median and mode, while inferential statistics draws conclusions through statistical tests and identifying differences between groups. Data can be qualitative or quantitative and is often represented visually through graphs, charts, and other representations to analyze patterns and outliers. Common statistical models include measures of central tendency, measures of dispersion, skewness, ANOVA, degrees of freedom, and regression analysis.

Uploaded by

austin otieno
Copyright
© All Rights Reserved

Statistics

Statistics is a branch of mathematics that deals with collecting, organizing, analyzing,
interpreting, and presenting data. The statistical process consists of collecting data,
classifying it, representing it for easy interpretation, and analyzing it further. Statistics
also refers to drawing conclusions from sample data collected through surveys or experiments.
Fields such as psychology, sociology, and geology rely on statistics, and statistics in turn
relies on probability.

Mathematical Statistics
Statistics is used mainly to gain an understanding of data and to support various applications.
It is the process of collecting data, evaluating it, and summarizing it in a mathematical form.
Historically, statistics was the science of the state: it was used in the collection and
analysis of facts and data about a country, such as its economy and population. Mathematical
statistics applies mathematical techniques such as linear algebra, differential equations,
mathematical analysis, and probability theory.

There are two widely used methods of analyzing data in mathematical statistics:

• Descriptive Statistics
• Inferential Statistics

Descriptive Statistics
The descriptive method of statistics is used to describe the collected data and to summarize
its properties using measures of central tendency and measures of dispersion.

Inferential Statistics
This method of statistics is used to draw conclusions from the data. Inferential statistics
relies on statistical tests performed on samples, for example a test for a difference between
two groups. A test produces a p-value, which is compared with a chosen significance level α,
commonly α = 0.05. If the p-value is less than α, the result is said to be statistically
significant.
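As a rough illustration of comparing a p-value with α, the sketch below runs an exact permutation test for a difference in means between two small groups. The permutation test is only one possible choice of test, swapped in here because it needs nothing beyond the standard library; the function name `permutation_p_value` and the sample values are made up for this example.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Hypothetical helper: it tries every way of splitting the pooled values
    into groups of the original sizes and counts how often the gap between
    the group means is at least as large as the observed gap.
    """
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        gap = abs(sum_a / n_a - (total - sum_a) / n_b)
        if gap >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits

ALPHA = 0.05  # significance level
group_a = [12, 14, 15, 16, 18]  # made-up measurements
group_b = [20, 21, 23, 24, 26]

p_value = permutation_p_value(group_a, group_b)  # 2/252, about 0.0079
significant = p_value < ALPHA
print(f"p = {p_value:.4f}, significant at alpha = {ALPHA}: {significant}")
```

Because the two made-up groups do not overlap at all, only 2 of the 252 possible splits are as extreme as the observed one, so the p-value falls below α.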

Data Representation in Statistics


A collection of observations and facts is known as data. These observations and facts can be in
the form of numbers, measurements, or statements. There are two kinds of data: qualitative data
and quantitative data. Qualitative data is descriptive or categorical, while quantitative data
is numerical. Once the data has been collected, we aim at representing it in different forms of
graphs such as a bar graph, line graph, pie chart, stem and leaf plot, scatter plot, and so on.
Before the data is analyzed, outliers that arise from variability in the measurements are
removed. Let us look at different kinds of data representation in statistics.

Data Representation - Description

Bar Graph - A group of data represented with rectangular bars with lengths proportional to the
values is a bar graph. The bars can be plotted either vertically or horizontally.

Pie Chart - The pie chart is a type of graph in which a circle is divided into sectors, where
each sector represents a proportion of the whole.

Line Graph - The line graph represents the data in the form of a series that is connected with
straight lines. The points of the series are called markers.

Pictograph - Data shown in the form of pictures is a pictograph. Pictorial symbols for words,
objects, or phrases can be represented with different numbers.

Histogram - The histogram is a type of graph where the diagram consists of rectangles whose
area is proportional to the frequency of a variable and whose width is equal to the class
interval.

Frequency Distribution - The frequency distribution table in statistics showcases the data in
ascending order along with their corresponding frequencies. The frequency of the data is often
represented by f.
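A frequency distribution table can be built in a few lines. The sketch below is a minimal illustration using Python's `collections.Counter`; the list `scores` is made-up data.

```python
from collections import Counter

scores = [3, 1, 2, 3, 2, 3, 5, 1, 3, 2]  # made-up observations

# Count how often each value occurs, then list the values in ascending order.
frequency = Counter(scores)
table = sorted(frequency.items())

print("value   f")
for value, f in table:
    print(f"{value:>5} {f:>3}")
```

The frequencies add up to the number of observations, which is a quick check that no value was missed.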

Different Models of Statistics


Statistics is a broad term, and different statistical models are used for different purposes.
Listed below are a few models:

Skewness - In statistics, skewness is a measure of the asymmetry of a probability
distribution, that is, of how far the curve for the data departs from a symmetric curve such as
the normal distribution curve. The skewness can be positive, negative, or zero. A curve is said
to be skewed when one of its tails is longer than the other. If the longer tail extends to the
right, the distribution is called positively skewed (right-skewed), and if the longer tail
extends to the left, it is called negatively skewed (left-skewed).

ANOVA Statistics - ANOVA stands for Analysis of Variance. It tests whether the means of two or
more groups differ by comparing the variation between the groups with the variation within
them. For example, this model can be used to compare the performance of stocks over a period of
time.

Degrees of freedom - The degrees of freedom are the number of values that are free to vary
while a parameter is being estimated. For example, once the mean of n values is fixed, only
n - 1 of the values can vary freely.

Regression Analysis - In this model, a statistical process estimates the relationship between
variables. It describes how a dependent variable changes when an independent variable is
changed.
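Two of the models above, skewness and regression, are easy to try out numerically. The sketch below is only an illustration under stated assumptions: it uses the moment coefficient of skewness (one of several definitions) and an ordinary least-squares straight line with a single independent variable. The function names `skewness` and `fit_line` and all sample values are made up.

```python
from statistics import fmean

def skewness(data):
    """Moment coefficient of skewness, m3 / m2**1.5, using population moments."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])
    m3 = fmean([(x - m) ** 3 for x in data])
    return m3 / m2 ** 1.5

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

right_tailed = [1, 2, 2, 3, 3, 3, 10]       # long tail on the right
left_tailed = [-x for x in right_tailed]    # mirror image: long tail on the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative
print(skewness(symmetric))     # 0.0

hours = [1, 2, 3, 4, 5]          # independent variable (made up)
marks = [52, 54, 56, 58, 60]     # dependent variable, exactly 50 + 2 * hours
slope, intercept = fit_line(hours, marks)
print(slope, intercept)          # 2.0 50.0
```

The slope says how much the dependent variable changes for a one-unit change in the independent variable.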

Measures of Central Tendency in Statistics


Measures of central tendency and measures of dispersion are considered the basis of
descriptive statistics. A measure of central tendency is a representative value for the given
data that gives us an idea of where the data points are centered. A measure of dispersion then
describes how the data are scattered around this central value. We use the mean, median, and
mode as measures of central tendency. In our day-to-day life, we find the average height of
students, the average income, the average score in exams, or the average score of a player. The
different measures of central tendency for the data are:

• Arithmetic Mean
• Median
• Mode
• Geometric Mean
• Harmonic Mean
Mean, Median and Mode in Statistics
The mean is the arithmetic average of a data set, found by adding the numbers in the set and
dividing by the number of observations. The median is the middle number of the data set when it
is listed in either ascending or descending order. Lastly, the mode is the number that occurs
most often in the data set. For n observations, we have

• Mean = $\bar{x} = \frac{\sum x}{n}$
• Median = $\left(\frac{n+1}{2}\right)$th term if n is odd.
• Median = $\frac{\left(\frac{n}{2}\right)\text{th term} + \left(\frac{n}{2}+1\right)\text{th term}}{2}$ if n is even.
• Mode = The value which occurs most frequently
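These formulas can be checked with Python's standard `statistics` module. The snippet below is a small sketch with made-up samples, one of odd length and one of even length, to show both median cases.

```python
from statistics import mean, median, mode

odd_sample = [7, 3, 9, 3, 5]        # n = 5, sorted: 3, 3, 5, 7, 9
even_sample = [7, 3, 9, 3, 5, 11]   # n = 6, sorted: 3, 3, 5, 7, 9, 11

mean_odd = mean(odd_sample)         # (7 + 3 + 9 + 3 + 5) / 5 = 5.4
median_odd = median(odd_sample)     # (n + 1)/2 = 3rd term = 5
median_even = median(even_sample)   # average of 3rd and 4th terms = (5 + 7)/2 = 6.0
mode_odd = mode(odd_sample)         # 3 occurs most often

print(mean_odd, median_odd, median_even, mode_odd)
```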
Measures of Dispersion in Statistics
The measures of central tendency do not suffice to describe the given data completely. Thus we
need to describe the variability by a value called the measure of dispersion.
The different measures of dispersion are:

• The range in statistics is calculated as the difference between the maximum value and the
minimum value of the data points.
• The quartile deviation is an absolute measure of dispersion. The quartiles divide the ordered
data points into 4 quarters. Find the median of the data points. The median of the data points
to the left of this median is the lower quartile and the median of the data points to the
right of this median is the upper quartile. Upper quartile - lower quartile is
the interquartile range. Half of this is the quartile deviation.
• The mean deviation is the average of the absolute differences between the items in a
distribution and the mean or median of that series.
• The standard deviation is a measure of the amount of variation of a set of values around
their mean.
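The range and the quartile deviation can be sketched as follows. This is a minimal illustration that takes the lower quartile as the median of the values below the overall median and the upper quartile as the median of the values above it; other quartile conventions exist and can give slightly different numbers. The helper `quartiles_by_halves` and the data are made up.

```python
from statistics import median

def quartiles_by_halves(data):
    """Return (lower quartile, median, upper quartile) using medians of halves."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]         # values left of the median
    upper_half = ordered[(n + 1) // 2 :]   # values right of the median
    return median(lower_half), median(ordered), median(upper_half)

data = [4, 7, 1, 9, 12, 15, 3]             # sorted: 1, 3, 4, 7, 9, 12, 15

data_range = max(data) - min(data)         # 15 - 1 = 14
q1, q2, q3 = quartiles_by_halves(data)     # 3, 7, 12
iqr = q3 - q1                              # interquartile range = 9
quartile_deviation = iqr / 2               # 4.5

print(data_range, (q1, q2, q3), iqr, quartile_deviation)
```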

Mean Deviation For ungrouped data


In statistics, the frequency distribution of data can be discrete or continuous. For n
individual observations $x_1, x_2, x_3, \ldots, x_n$, the mean deviation
about the mean or the median is calculated as follows:
Mean Deviation for ungrouped data = sum of absolute deviations/number of observations
= $\frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$ (for the mean deviation about the median, replace $\bar{x}$ with the median M)
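A short sketch of the mean deviation for ungrouped data follows. It uses absolute deviations, because the signed deviations from the mean always add up to zero. The helper name `mean_deviation` and the data are made up for illustration.

```python
from statistics import fmean, mean, median

def mean_deviation(data, center):
    """Average absolute deviation of the observations from `center`."""
    return fmean([abs(x - center) for x in data])

data = [3, 5, 6, 8, 18]                              # made-up observations

x_bar = mean(data)                                   # 40 / 5 = 8
m = median(data)                                     # 6

md_about_mean = mean_deviation(data, x_bar)          # (5 + 3 + 2 + 0 + 10) / 5 = 4.0
md_about_median = mean_deviation(data, m)            # (3 + 1 + 0 + 2 + 12) / 5 = 3.6
signed_sum = sum(x - x_bar for x in data)            # 0: signed deviations cancel out

print(md_about_mean, md_about_median, signed_sum)
```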

Mean Deviation for Discrete Grouped data


In a discrete frequency distribution, each distinct value is listed together with its
frequency. Let there be n distinct data points $x_1, x_2, x_3, \ldots, x_n$ occurring with
frequencies $f_1, f_2, f_3, \ldots, f_n$, and let $N = \sum_{i=1}^{n} f_i$ be the sum of the frequencies.

a) Mean deviation about mean

• We find the mean $\bar{x}$ using $\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i}$. This is
the ratio of the sum of the products of the observations $x_i$ and their respective
frequencies $f_i$ to the sum of the frequencies.
• Equivalently, Mean = $\frac{1}{N} \sum_{i=1}^{n} x_i f_i$
• After this, find the deviations of the observations $x_i$ from the mean $\bar{x}$ and take their
absolute values, i.e. $|x_i - \bar{x}|$ for all i = 1, 2, 3, ..., n
• Mean Deviation
= MD($\bar{x}$) = $\frac{\sum_{i=1}^{n} f_i |x_i - \bar{x}|}{\sum_{i=1}^{n} f_i}$

b) Mean deviation about median

• Find the median by arranging the observations in ascending order.
• Obtain the cumulative frequencies. Then identify the first observation whose cumulative
frequency is ≥ N/2, where N = sum of frequencies.
• This observation is the required median M. Taking the absolute values of the deviations
from the median, we calculate MD(median)
= $\frac{1}{N} \sum_{i=1}^{n} f_i |x_i - M|$
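The steps for discrete grouped data can be sketched as below. This is an illustrative outline, not a definitive implementation: the function names and the data are made up, the values are assumed to be listed in ascending order, and the median is taken by the cumulative-frequency rule described above.

```python
from itertools import accumulate

def grouped_mean(values, freqs):
    """Mean = sum(x_i * f_i) / sum(f_i)."""
    return sum(x * f for x, f in zip(values, freqs)) / sum(freqs)

def grouped_median(values, freqs):
    """First value whose cumulative frequency is >= N/2 (values ascending)."""
    half = sum(freqs) / 2
    for x, cumulative in zip(values, accumulate(freqs)):
        if cumulative >= half:
            return x

def grouped_mean_deviation(values, freqs, center):
    """MD = sum(f_i * |x_i - center|) / sum(f_i)."""
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / sum(freqs)

values = [10, 20, 30, 40]   # made-up distinct values, ascending
freqs = [1, 3, 4, 2]        # N = 10, cumulative frequencies 1, 4, 8, 10

x_bar = grouped_mean(values, freqs)                         # 270 / 10 = 27.0
m = grouped_median(values, freqs)                           # 30
md_mean = grouped_mean_deviation(values, freqs, x_bar)      # 76 / 10 = 7.6
md_median = grouped_mean_deviation(values, freqs, m)        # 70 / 10 = 7.0

print(x_bar, m, md_mean, md_median)
```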

Mean Deviation for Continuous Grouped data


Here the data points can take any value within a range, so they are continuous. They are
measured and represented using intervals on the real number line. Because the possible values
are not countable, the data are arranged in classes, each with its own frequency.

a) Mean deviation about mean

In a continuous frequency distribution, each class is represented by its mid-point, and the
mean is computed from these mid-points. Then the same procedure is followed as in the case of
a discrete frequency distribution.

b) Mean deviation about median

Median = $l + \frac{\frac{N}{2} - C}{f} \times h$, where the median class is the first class interval whose
cumulative frequency is ≥ N/2, N is the sum of frequencies, l, f, and h are the lower limit,
the frequency, and the width of the median class, and C is the cumulative frequency of the
class just preceding the median class.
After finding the median M, $|x_i - M|$ is obtained for the mid-point $x_i$ of each class.
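The median formula for continuous grouped data can be tried on a small table. The sketch below assumes equal-width classes given by their lower limits; the function name `grouped_median` and the class data are made up for illustration.

```python
from itertools import accumulate

def grouped_median(lower_limits, width, freqs):
    """Median = l + ((N/2 - C) / f) * h for equal-width classes."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))
    for i, cf in enumerate(cumulative):
        if cf >= n / 2:                              # first class reaching N/2
            c = cumulative[i - 1] if i > 0 else 0    # cf of the preceding class
            return lower_limits[i] + (n / 2 - c) / freqs[i] * width

lower_limits = [0, 10, 20, 30, 40]   # classes 0-10, 10-20, ..., 40-50 (made up)
width = 10
freqs = [5, 8, 12, 9, 6]             # N = 40, cumulative 5, 13, 25, 34, 40

# Median class is 20-30: l = 20, C = 13, f = 12, h = 10.
median_value = grouped_median(lower_limits, width, freqs)   # 20 + (7/12) * 10

# Mean deviation about the median, using the class mid-points as x_i.
midpoints = [low + width / 2 for low in lower_limits]
md_median = sum(f * abs(x - median_value)
                for x, f in zip(midpoints, freqs)) / sum(freqs)

print(round(median_value, 2), round(md_median, 2))   # 25.83 9.96
```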

Standard Deviation and Variance


Two other prominent measures of dispersion in statistics are the variance and the standard
deviation. The mean deviation about the mean or the median relies on absolute values, which
are awkward to handle algebraically. Taking the squares of all the deviations avoids this
difficulty.
• If $\sum_{i=1}^{N} (x_i - \bar{x})^2$ is zero, then every observation equals the mean and
there is no dispersion at all.
• If the sum is small, the observations are closer to the mean, indicating a lower degree of
dispersion.
• If the sum is large, there is a higher degree of dispersion of the observations from the
mean $\bar{x}$.
• Thus this sum is a reasonable indicator of the degree of dispersion. Dividing it by the
number of observations N gives the proper measure of dispersion, denoted as $\sigma^2$ and
termed the variance. Thus the variance is given
as $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$.
• The positive square root of the variance is called the standard deviation.
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}}$.
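The variance and standard deviation formulas translate directly into code. The sketch below computes them by hand for made-up data and compares the results with `pvariance` and `pstdev` from Python's `statistics` module, which also divide by N. (The module's `variance` and `stdev` functions divide by N - 1 instead and would give different values.)

```python
from math import sqrt
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up observations, N = 8

n = len(data)
x_bar = sum(data) / n                                    # 40 / 8 = 5.0
squared_deviations = [(x - x_bar) ** 2 for x in data]    # 9, 1, 1, 1, 0, 0, 4, 16
variance = sum(squared_deviations) / n                   # 32 / 8 = 4.0
std_dev = sqrt(variance)                                 # 2.0

print(variance, std_dev)
print(pvariance(data), pstdev(data))   # same values from the library
```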

Coefficient of Variation
The coefficient of variation is used to compare the variability of two or more frequency
distributions. It is the ratio of the standard deviation to the mean, expressed as a
percentage.

CV = $\frac{\sigma}{\bar{x}} \times 100$.

The distribution that has a greater coefficient of variation has more variability around the central
value than the distribution having a smaller value of the coefficient of variation.
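A small sketch of this comparison follows. The two made-up data sets have the same standard deviation but different means, so their coefficients of variation differ. The helper name `coefficient_of_variation` is hypothetical.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(data):
    """CV = (standard deviation / mean) * 100, using the population formula."""
    return pstdev(data) / fmean(data) * 100

set_a = [48, 50, 52]   # mean 50
set_b = [8, 10, 12]    # mean 10, same spread as set_a

cv_a = coefficient_of_variation(set_a)
cv_b = coefficient_of_variation(set_b)

print(f"CV of A = {cv_a:.2f}%, CV of B = {cv_b:.2f}%")
```

Here B has five times the coefficient of variation of A, so B is more variable relative to its central value even though both sets have the same standard deviation.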

Important Notes

• The discipline of data collection and organization is called statistics. We interpret results
based on the analysis done using the measures of central tendency and the measures of
dispersion.
• The frequency distribution of data is represented using bar graphs, histograms, pie charts,
stem and leaf plots, line graphs, or ogives.
• The data collected can be either quantitative (numerical: discrete and continuous) or
qualitative (categorical).


Examples of Statistics
• Example 1: Compute the mean deviation about the mean from the following data.

Size (x)        2   4   6   8   10
Frequency (f)   2   4   5   3   1

• Solution: In statistics, the mean deviation about the mean is calculated using the
formula: Mean deviation = (Σ f |x − $\bar{x}$|)/N, where the mean is $\bar{x}$ = (Σ f x)/N and N = Σ f.
N = 2 + 4 + 5 + 3 + 1 = 15
$\bar{x}$ = (2×2 + 4×4 + 6×5 + 8×3 + 10×1)/15
= (4 + 16 + 30 + 24 + 10)/15
= 84/15 = 5.6
Σ f |x − $\bar{x}$| = 2×3.6 + 4×1.6 + 5×0.4 + 3×2.4 + 1×4.4
= 7.2 + 6.4 + 2.0 + 7.2 + 4.4 = 27.2
Mean deviation = 27.2/15 ≈ 1.81
• Answer: The mean deviation about the mean ≈ 1.81
• Example 2: The mean of 5 observations is 4.4 and their variance is 8.24. If 3 of the
observations are 1, 2, and 6, find the other two observations.

Solution: Let the other two observations be a and b.

Then we have (1 + 2 + 6 + a + b)/5 = 4.4

(1 + 2 + 6 + a + b) = 22
a + b = 22 - 9 = 13
a + b = 13 ----------->(1)
Given: variance = 8.24
We know that in statistics, variance is calculated as:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$ = 8.24
8.24 = (1/5) [$3.4^2 + 2.4^2 + 1.6^2 + (a - 4.4)^2 + (b - 4.4)^2$]
41.20 = 11.56 + 5.76 + 2.56 + $a^2 + b^2$ - 2 × 4.4(a + b) + 2 × $4.4^2$
41.20 = 19.88 + $a^2 + b^2$ - 114.4 + 38.72
$a^2 + b^2$ = 97 ----------->(2)
From (1), we have $(a + b)^2$ = 169, i.e. $a^2 + b^2 + 2ab$ = 169 ----------->(3)
Subtracting (2) from (3), we have 2ab = 72 ----------->(4)
(2) - (4) gives $a^2 + b^2 - 2ab$ = 97 - 72 = 25
Thus a - b = ±5
Combining a - b = +5 and a - b = -5 in turn with (1), we get
a = 9, b = 4 (or) a = 4, b = 9
Answer: The other two observations are 4 and 9.

• Example 3: Find the standard deviation of 8, 10, 12, 14, 16.

Solution: Given N = 5

Σx = 8 + 10 + 12 + 14 + 16 = 60

$\bar{x}$ = 60/5 = 12

In statistics, Standard deviation = √Variance

Variance = $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$

= 1/5 [$(8-12)^2 + (10-12)^2 + (12-12)^2 + (14-12)^2 + (16-12)^2$]

= 1/5 [16 + 4 + 0 + 4 + 16]

= 40/5 = 8

Standard deviation σ = √8 ≈ 2.83

Answer: The standard deviation of the given data = 2.83
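The answers to Examples 2 and 3 can be cross-checked numerically. The sketch below is one possible check, not the method used in the worked solutions: for Example 2 it relies on the standard identity Σx² = N(σ² + x̄²), which is not stated in the text above, and the helper name `missing_pair` is made up.

```python
from math import sqrt
from statistics import pstdev

def missing_pair(known, n, mean, variance):
    """Recover two unknown observations from the mean and population variance."""
    s = n * mean - sum(known)                                    # a + b
    q = n * (variance + mean ** 2) - sum(x * x for x in known)   # a^2 + b^2
    product = (s * s - q) / 2                                    # ab
    gap = sqrt(s * s - 4 * product)                              # |a - b|
    return (s - gap) / 2, (s + gap) / 2

# Example 2: mean 4.4, variance 8.24, known observations 1, 2, 6.
a, b = missing_pair([1, 2, 6], n=5, mean=4.4, variance=8.24)
print(round(a, 6), round(b, 6))     # 4.0 9.0

# Example 3: standard deviation of 8, 10, 12, 14, 16 (dividing by N).
sigma = pstdev([8, 10, 12, 14, 16])
print(round(sigma, 2))              # 2.83
```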

FAQs on Statistics
What is Statistics?

Statistics is a branch of mathematics that deals with collecting, organizing, analyzing,
interpreting, and presenting data. It also refers to arriving at conclusions about a
population with the use of sample data.

What are the Two Types of Statistics?

The two different types of statistics are:

Descriptive Statistics: It is used to summarize the data and its properties using mean and standard
deviation.
Inferential Statistics: It is used to get a conclusion from the data collected.

What is Descriptive Statistics?

Descriptive statistics describe the data features and provide summaries about the entire or sample
population. We calculate the measures of central tendencies and measures of dispersion to
summarize the data, in this type of statistics.

What is Inferential Statistics?

Inferential statistics makes predictions and draws inferences from the data.
Many statistical tests are performed to arrive at conclusions. Inferential statistics is
closely connected with probability and probability distributions.

How is Statistics Used in Mathematics?

Statistics is a part of applied mathematics that uses probability theory to summarize the
sample data we collect and to draw conclusions from it. Probability lets us state how likely it
is that a conclusion drawn from the data holds, rather than declaring it simply true or false.

What is the Purpose of Statistics?

Statistics helps in understanding data better and describing it accurately. It also helps in
planning a statistical study properly. Finally, statistics uses tables, diagrams, and graphs to
present information clearly.

What is the Importance of Statistics in Real Life?


Statistics provides strategies to gather information, examine it, and present the outcomes
effectively. It is an important process behind how we make discoveries in science, make
decisions based on data, and make predictions.

What Are Examples of Statistics?

We can consider a class of students as a sample of the population of all the students in the
school. We can calculate their average score in tests, their average height, weight, etc. based
on the data collected. The required parameters, determined using the statistical measures, are
then analyzed and interpreted further, as desired. For example, the scores of the students in
the previous semester and this semester can be compared.
