Lecture-6: Introduction To Data Science
• Types of statistics:
– descriptive statistics
– inferential statistics
Descriptive Statistics: Variability (Spread): First Quartile and Third Quartile
• The lower half of a data set is the set of all values that are to
the left of the median value when the data has been put into
increasing order.
• The upper half of a data set is the set of all values that are to
the right of the median value when the data has been put into
increasing order.
• The first quartile, denoted by Q1, is the median of the lower
half of the data set. This means that about 25% of the numbers
in the data set lie below Q1 and about 75% lie above Q1.
• The third quartile, denoted by Q3, is the median of the upper
half of the data set. This means that about 75% of the numbers
in the data set lie below Q3 and about 25% lie above Q3.
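The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the median itself is excluded from both halves when the data set has an odd number of values; the function name `quartiles` and the sample numbers are made up for this example. Other conventions (for instance, interpolating percentiles) can give slightly different quartile values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)  # put the data into increasing order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]            # values to the left of the median
    upper = data[half + (n % 2):]  # values to the right of the median
    return median(lower), median(data), median(upper)


# Hypothetical sample data, not taken from the lecture.
q1, med, q3 = quartiles([7, 1, 3, 9, 5, 11, 13])
print(q1, med, q3)  # 3 7 11
```

For the sorted data 1, 3, 5, 7, 9, 11, 13 the median is 7, the lower half is 1, 3, 5 and the upper half is 9, 11, 13, so Q1 = 3 and Q3 = 11.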
Source: http://web.mnstate.edu/peil/MDEV102/U4/S36/S363.html
Source: https://www.slideshare.net/Sazedur92/measures-of-dispersion-73562437
Box-and-whisker plots
Five-Number Summary
Definitions:
• The minimum value of a data set is the least value in the set.
• The maximum value of a data set is the greatest value in the
set.
• The range of a data set is the distance between the maximum
and minimum values. To compute the range of a data set, we
subtract the minimum from the maximum:
range = maximum – minimum.
• The interquartile range of a data set is the distance between
the first and third quartiles:
interquartile range = Q3 – Q1.
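As a rough sketch of how these quantities fit together, the Python below computes the five-number summary (minimum, Q1, median, Q3, maximum) and then the range and interquartile range from it. The function name `five_number_summary` and the sample data are assumptions for illustration, and the quartiles use the median-of-halves definition given earlier, with the median excluded from both halves when the count is odd.

```python
from statistics import median


def five_number_summary(values):
    """Return the minimum, Q1, median, Q3 and maximum of a data set."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    return {
        "minimum": data[0],
        "Q1": median(data[:half]),            # median of the lower half
        "median": median(data),
        "Q3": median(data[half + (n % 2):]),  # median of the upper half
        "maximum": data[-1],
    }


# Hypothetical sample data, not taken from the lecture.
summary = five_number_summary([4, 8, 15, 16, 23, 42])
data_range = summary["maximum"] - summary["minimum"]  # range = maximum - minimum
iqr = summary["Q3"] - summary["Q1"]                   # interquartile range = Q3 - Q1
print(summary, data_range, iqr)
```

These five values are the ones a box-and-whisker plot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.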
Probability
Source: Statistics And Probability Tutorial | Statistics And Probability for Data Science | Edureka
Disjoint (Mutually Exclusive) Events
Probability Distribution
Probability Density Function
Normal distribution
Types of Probability