Basics of Statistics
INTERVIEW
PREPARATION SERIES
Harsh, the CEO of HungerKids (a
quick-bites startup producing items
like chips, sandwiches, cakes, and
desi beverages), recently got an
opportunity to pitch his business on
Shark Tank’s platform.
After the pitch was over, Harsh got back
home, and his 12-year-old kid (Nandu), who
had watched the complete show, was curious
to understand how he could answer all the
questions so easily.
Harsh replied that he used basic
statistics. Now Nandu wanted to
know what statistics is. Harsh
takes Nandu through his pitch again.
Let’s look at his pitch and the
questions asked by the sharks.
Hello sharks, I am Harsh. Today I will present 5 products that I am manufacturing in my company - HungerKid Pvt. Ltd., established 5 years ago.
Welcome Harsh, what have been your yearly sales till now?
On average my sales have been 35 lakhs per year.
But what is Average?
See, my yearly sales have been 10, 20, 25, 45, 75.
Then why did you say 35 lakhs per year?
This number is called the Average.
[Figure: Sales by Year, with the MEAN marked]
It represents all the other points in the graph. So, the mean or average is
the central value that describes the data.
Oh, got it!
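As a small illustration of where the 35 comes from, the sketch below computes the mean of the five yearly sales figures quoted in the pitch. The variable names are only illustrative and do not come from the original slides.

```python
# Yearly sales in lakhs, as quoted in the pitch
yearly_sales = [10, 20, 25, 45, 75]

# Mean = sum of all values divided by the number of values
mean_sales = sum(yearly_sales) / len(yearly_sales)

print(mean_sales)  # 35.0
```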
The central value is the mean, which considers all points in the data.
Let me explain it.
Yearly sales are 10, 20, 25, 45, 75 respectively, so the mean is (10+20+25+45+75)/5 = 35.
Now, if my sales increase, the mean will also increase.
For example, if my sales had been 5, 15, 10, 20, 150,
then my mean would be 40, i.e. (5+15+10+20+150)/5,
which is pulled up by the extreme value 150.
Right? So my mean will shift if I have extremely high or
extremely low values in the data. It does not depict the exact
middle value, because it is computed from all data points.
For this reason, we use the "median" in statistics to determine the
middle value, i.e. the value in the middle once the data is sorted. Here,
▪︎My median for 10, 20, 25, 45, 75 is 25 lakhs.
▪︎"Outliers" refers to values that are either extremely small or
extremely high. As a result of the presence of such values, the
mean will shift to the left or right.
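To make the effect of an outlier concrete, here is a minimal sketch that compares the mean and the median for the two sets of sales figures used above. It relies on Python's standard statistics module; the variable names are illustrative only.

```python
from statistics import mean, median

steady_sales = [10, 20, 25, 45, 75]
skewed_sales = [5, 15, 10, 20, 150]  # 150 is the extreme value (outlier)

# For the original figures: mean is 35, median is 25
steady_mean, steady_median = mean(steady_sales), median(steady_sales)

# With the outlier: the mean is pulled up to 40, while the median
# (middle value of the sorted data 5, 10, 15, 20, 150) is only 15
skewed_mean, skewed_median = mean(skewed_sales), median(skewed_sales)

print(steady_mean, steady_median)
print(skewed_mean, skewed_median)
```

In the second dataset four of the five values are 20 or below, so the median of 15 describes a typical year better than the mean of 40.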
Let's summarize-
Statistics is a way of collecting, analyzing, and interpreting
data. In simpler terms, it's a way to make sense of
numbers and use them to solve problems.
We have also summarized and described our data. This is
“Descriptive Statistics” - a part of statistics.
Descriptive statistics is a branch of statistics that involves
summarizing and describing a set of data. It provides a way
to analyze and understand the characteristics of a dataset,
such as its central tendency, variability, and shape.
As you saw, we summarized and described everything in just one
value, right?
So, when we describe the data using a single value around
which the entire data revolves, that value is known as the
"central value" or the "measure of central tendency".
These measures are:
Mean
Median
Mode
Measures of central tendency:
You asked about the central value, but what about the values which are not lying in the center? And those that are extreme?
Oh, I didn’t think about it.
Measures of dispersion are statistical measures that
describe the spread or variability of a dataset. They
provide information on how spread out the data points
are from the central tendency of the dataset.
[Figure: Sales by Year, with the MEAN marked]
Let's revisit this.
Here are the sales of the products:
Product   Period 1   Period 2   Period 3   Period 4   Period 5
A         100        150        200        250        300
B         200        250        300        350        400
C         300        350        400        450        500
D         400        450        500        550        600
E         500        550        600        650        700
Okay, tell me about the MRP of the 5 products you have.
Product A: 30
Product B: 130
Product C: 200
Product D: 300
Product E: 230
Do you know why they asked?
No.
To get an idea about the MRP range of my products.
Range?
The range is the difference between the largest and smallest values in a dataset.
The range is an important measure of dispersion, as it provides a quick and
simple way to understand the spread of a dataset.
A larger range indicates more spread-out data, while a smaller range
indicates that the data is more tightly clustered around a central
value. This information is useful for comparing different datasets or drawing
conclusions about the variability of a particular variable.
So the range of my products' MRP is 300-30=270.
Ohhkay
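The range calculation above can be sketched in a few lines. The dictionary below simply restates the MRP values from the pitch; its name and layout are assumptions made for illustration.

```python
# MRP of each product, as stated in the pitch
mrp = {"A": 30, "B": 130, "C": 200, "D": 300, "E": 230}

# Range = largest value minus smallest value
price_range = max(mrp.values()) - min(mrp.values())

print(price_range)  # 270
```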
What’s the variation of sales for the various products?
While calculating variance, we squared the difference between each value and the mean. Why did we square each difference?
Good question!
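One commonly given reason, sketched here rather than taken from the original slides, is that the raw differences from the mean always cancel out to zero, so they cannot measure spread on their own. Squaring makes every difference non-negative. The example uses the Product A sales from the table above; the variable names are illustrative.

```python
# Product A sales over the five periods
sales_a = [100, 150, 200, 250, 300]
mean_a = sum(sales_a) / len(sales_a)  # 200.0

# Raw differences from the mean: negatives and positives cancel out
deviations = [x - mean_a for x in sales_a]
total_deviation = sum(deviations)
print(total_deviation)  # 0.0

# Squared differences are all non-negative, so they add up to a
# meaningful total instead of cancelling
squared_deviations = [d ** 2 for d in deviations]
total_squared = sum(squared_deviations)
print(total_squared)  # 25000.0
```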
Then what is the standard deviation?
Why do we take the square root of the variance?
So that we get a value in the same units as the original data.
Understanding variance is difficult. If we look at the variance
of product A, it’s 5100. Now, it’s difficult to know what
this value represents, because it is in squared units. But if we take
the square root of the variance, we get about 71.41, which means that the
sales of product A typically lie about 71 units above or below the
average sale value of product A.
SD of Product A = sqrt(variance) = sqrt(5100) ≈ 71.41
Let’s conclude - a smaller deviation means the data is clustered
closer to the mean, and a larger deviation means the
data lies farther from the mean and is spread more widely.
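As a rough check, the sketch below computes the variance and standard deviation of Product A directly from the sales table, using Python's statistics module. It assumes the population formulas (dividing by the number of values), which the slides do not specify. With the table figures the variance works out to 5000 rather than the 5100 quoted above, so the quoted figure may come from a calculation that is not shown here.

```python
from statistics import pvariance, pstdev

# Product A sales over the five periods (from the sales table)
sales_a = [100, 150, 200, 250, 300]

# Population variance: average of squared differences from the mean
variance_a = pvariance(sales_a)

# Standard deviation: square root of the variance, in the same
# units as the sales figures themselves
sd_a = pstdev(sales_a)

print(variance_a)
print(round(sd_a, 2))  # 70.71
```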
Which product has more stable sales?
Very simple!
How did you answer this?
Using one of the most crucial concepts - Coefficient of Variation.
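The coefficient of variation is the standard deviation divided by the mean, so it expresses spread relative to the size of the sales. The sketch below applies it to the sales table from earlier, assuming the population standard deviation; the slides do not show their own working, so the names and the choice of formula here are illustrative assumptions.

```python
from statistics import mean, pstdev

# Sales per product over the five periods (from the sales table)
sales = {
    "A": [100, 150, 200, 250, 300],
    "B": [200, 250, 300, 350, 400],
    "C": [300, 350, 400, 450, 500],
    "D": [400, 450, 500, 550, 600],
    "E": [500, 550, 600, 650, 700],
}

# Coefficient of variation = standard deviation / mean
cv = {product: pstdev(values) / mean(values) for product, values in sales.items()}

for product, value in cv.items():
    print(f"Product {product}: CV = {value:.3f}")

# The lowest CV means the least variation relative to the average
most_stable = min(cv, key=cv.get)
print(most_stable)  # E
```

All five products have the same standard deviation in this table, so the product with the highest average sales ends up with the lowest relative variation.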
DESCRIPTIVE STATISTICS
MEASURES OF CENTRAL TENDENCY
MEASURES OF DISPERSION
Content Designer : Hanit Kaur
Graphic Designer : Adithya Prasad
Content Lead : Sumit Shukla
With Love