Unit 4 Part 1
for
Data Sciences
What is statistics?
● Statistics is the collection, organization, analysis and interpretation of
data.
● Statistics is mainly used to draw numerical conclusions.
● For example, if someone asks how many people watch YouTube, the answer
“many people watch YouTube” carries little information. A numerical answer
is more meaningful: YouTube has 2 billion+ monthly active users, and users
spend a daily average of 18 minutes on it. Statistics is the means of
making such numerical inferences.
Statistics includes
● Design of experiments: Planning how data is collected so that the
characteristics of the dataset can be understood
● Sampling: Selecting and studying a subset (sample) of the population
● Descriptive statistics: Summarization of data
● Inferential statistics: Drawing conclusions from data, for example by
hypothesis testing
● Probability theory: Quantifying how likely outcomes are
Main statistical methods
● Descriptive statistics uses tools such as the mean and standard
deviation to summarize the data in a sample.
● Inferential statistics, on the other hand, looks at data that is subject
to random variation and draws conclusions from it.
Descriptive statistics is distinguished from inferential statistics
by its aim: it summarizes the sample rather than using the data to
learn more about the population.
Descriptive statistics
● A raw dataset is difficult to describe.
● Descriptive statistics describes the dataset in a much simpler
manner through:
– Measures of central tendency (mean, median, mode)
– Measures of spread (range, quartiles, percentiles, absolute deviation,
variance and standard deviation)
– Measure of symmetry (skewness)
– Measure of peakedness (kurtosis)
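Descriptive statistics using Python
● import statistics as s
● s.mean(collection) – arithmetic mean
● s.mode(collection) – most frequent value
● s.median(collection) – middle value (average of the two middle values for an even count)
● s.harmonic_mean(collection) – harmonic mean
● s.median_low(collection) – lower of the two middle values
● s.median_high(collection) – higher of the two middle values
● s.variance(collection) – sample variance
● s.stdev(collection) – sample standard deviation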
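The calls listed above can be tried on a small sample. This is a minimal sketch: the list `scores` is hypothetical data made up for illustration, and the range is computed by hand because the `statistics` module has no range function.

```python
import statistics as s

# Hypothetical sample of exam scores, used only for illustration.
scores = [62, 70, 70, 75, 81, 88]

mean = s.mean(scores)                  # arithmetic mean
median = s.median(scores)              # average of the two middle values (even count)
median_low = s.median_low(scores)      # lower of the two middle values
median_high = s.median_high(scores)    # higher of the two middle values
mode = s.mode(scores)                  # most frequent value
harmonic = s.harmonic_mean(scores)     # harmonic mean
variance = s.variance(scores)          # sample variance (divides by n - 1)
stdev = s.stdev(scores)                # square root of the sample variance
data_range = max(scores) - min(scores) # range: largest minus smallest value

print(mean, median, mode, data_range, stdev)
```

Note that `s.variance` and `s.stdev` treat the data as a sample; `s.pvariance` and `s.pstdev` are the population versions.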
Probability Theory
Phenomena are either deterministic or non-deterministic.
Deterministic Phenomena
● There exists a mathematical model that allows “perfect”
prediction of the phenomenon's outcome.
● Many examples exist in physics and chemistry (the exact
sciences).
Non-deterministic Phenomena
● No mathematical model exists that allows “perfect” prediction
of the phenomenon's outcome.
● Non-deterministic phenomena may be divided into two groups:
1. Random phenomena
– The individual outcomes cannot be predicted, but in the long run the
outcomes exhibit statistical regularity.
2. Haphazard phenomena
– The outcomes are unpredictable and exhibit no statistical regularity,
even in the long run.
[Diagram: phenomena are either deterministic or non-deterministic; non-deterministic phenomena are either haphazard or random.]
Random Phenomena
Examples
1. Tossing a coin – outcomes S = {Head, Tail}
The result of an individual toss (Head or Tail) cannot be predicted.
In the long run, we can predict that heads will occur 50% of the time
and tails 50% of the time.
2. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}
The outcome of an individual roll cannot be predicted, but in the long run
each outcome will occur 1/6 of the time.
This follows from symmetry: each side is the same, so no side should occur
more frequently than another in the long run. If the die is
not balanced this may not be true.
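The long-run regularity of the coin and the die can be seen in a small simulation. This is only a sketch: the helper name `relative_frequencies`, the number of trials and the seed are assumptions chosen for illustration, and the pseudo-random generator stands in for a balanced coin or die.

```python
import random
from collections import Counter

def relative_frequencies(outcomes, n_trials, seed=0):
    """Draw n_trials equally likely outcomes and return each outcome's observed share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.choice(outcomes) for _ in range(n_trials))
    return {o: counts[o] / n_trials for o in outcomes}

coin = relative_frequencies(["Head", "Tail"], 100_000)
die = relative_frequencies([1, 2, 3, 4, 5, 6], 100_000)

print(coin)  # each share should be close to 1/2
print(die)   # each share should be close to 1/6
```

No single toss or roll is predictable, yet the shares settle near 1/2 and 1/6 as the number of trials grows.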
Terminology
● The sample space, S, for a random phenomenon is the set of
all possible outcomes.
Examples
1. Tossing a coin – outcomes S = {Head, Tail}
2. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}
Terminology
● An event, E, is any subset of the sample space, S, i.e. any set
of outcomes (not necessarily all outcomes) of the random
phenomenon.
[Venn diagram: the event E drawn as a region inside the sample space S.]
Example
Rolling a die – E is an event consisting of three of the six possible outcomes.
Special Events
Union
A ∪ B
The event A ∪ B occurs if the event A occurs or the event B
occurs (or both).
Intersection
A ∩ B
The event A ∩ B occurs if the event A occurs and the
event B occurs.
Complement
Ā
The event Ā occurs if the event A does not occur.
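Union, intersection and complement map directly onto Python set operations. In this sketch the events `A` and `B` are hypothetical, chosen only to illustrate the three operations on the die sample space.

```python
# Sample space for one roll of a die.
S = {1, 2, 3, 4, 5, 6}

# Two hypothetical events, chosen only for illustration.
A = {2, 4, 6}   # "an even number is rolled"
B = {4, 5, 6}   # "a number greater than 3 is rolled"

union = A | B          # A ∪ B: A occurs or B occurs (or both)
intersection = A & B   # A ∩ B: A occurs and B occurs
complement_A = S - A   # complement of A: A does not occur

print(union, intersection, complement_A)
```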
Definition: mutually exclusive
Two events A and B are called mutually exclusive if:
A ∩ B = ∅
If two events A and B are mutually exclusive, they have no outcomes in
common and so cannot occur at the same time.
Definition: probability of an event E
Suppose that the sample space S = {o1, o2, o3, … oN} has a finite
number, N, of outcomes.
Also, each of the outcomes is equally likely (because of
symmetry).
Then for any event E
P[E] = n(E) / n(S) = n(E) / N = (no. of outcomes in E) / (total no. of outcomes)
Note: the symbol n(A) = no. of elements of A.
Thus this definition of P[E] applies only when the sample space is finite
and every outcome is equally likely.
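The counting definition P[E] = n(E) / n(S) can be sketched in a few lines. The function name `probability` and the event "an even number is rolled" are assumptions for illustration; exact fractions are used so no rounding occurs.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Classical probability n(E) / n(S): finite sample space, equally likely outcomes."""
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

S = {1, 2, 3, 4, 5, 6}   # one roll of a balanced die
E = {2, 4, 6}            # hypothetical event: an even number is rolled

p_even = probability(E, S)
print(p_even)
```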
If two events A and B are mutually exclusive (A ∩ B = ∅), then:
P[A ∪ B] = P[A] + P[B]
i.e.
P[A or B] = P[A] + P[B]
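Rule: the additive rule
(In general)
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
or
P[A or B] = P[A] + P[B] – P[A and B]
Logic: when P[A] is added to P[B], the outcomes in A ∩ B are counted
twice, so P[A ∩ B] must be subtracted once;
hence
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]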
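The additive rule can be checked by direct counting on the die sample space. This is a sketch under the equally-likely assumption; the events `A` and `B` and the helper `p` are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a balanced die
A = {2, 4, 6}            # hypothetical event: an even number
B = {4, 5, 6}            # hypothetical event: a number greater than 3

def p(event):
    """Classical probability of an event within S (equally likely outcomes)."""
    return Fraction(len(event), len(S))

lhs = p(A | B)                 # P[A ∪ B] by counting the union directly
rhs = p(A) + p(B) - p(A & B)   # additive rule: the overlap is subtracted once

print(lhs, rhs)
```

Without the subtraction, the outcomes 4 and 6 (which lie in both events) would be counted twice.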
Example:
Saskatoon and Moncton are two of the cities competing for the
World University Games. (There are also many others.) The
organizers are narrowing the competition to the final 5 cities.
There is a 20% chance that Saskatoon will be amongst the final
5. There is a 35% chance that Moncton will be amongst the
final 5 and an 8% chance that both Saskatoon and Moncton
will be amongst the final 5. What is the probability that
Saskatoon or Moncton will be amongst the final 5?
Solution:
Let A = the event that Saskatoon is amongst the final 5.
Let B = the event that Moncton is amongst the final 5.
Given P[A] = 0.20, P[B] = 0.35, and P[A ∩ B] = 0.08
What is P[A ∪ B]?
Note: “and” ≡ ∩, “or” ≡ ∪.
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
= 0.20 + 0.35 – 0.08 = 0.47
Rule for complements
P[Ā] = 1 – P[A]
or
P[not A] = 1 – P[A]
Complement
The event Ā occurs if the event A does not occur.
Logic:
A and Ā are mutually exclusive,
and S = A ∪ Ā,
thus 1 = P[S] = P[A] + P[Ā]
and P[Ā] = 1 – P[A]
Conditional Probability
● Frequently, before observing the outcome of a random
experiment, you are given information regarding the
outcome.
● How should this information be used in predicting the
outcome?
● Namely, how should probabilities be adjusted to take
this information into account?
● Usually the information is given in the following form:
you are told that the outcome belongs to a given event
(i.e. you are told that a certain event has occurred).
Definition
Suppose that we are interested in computing the probability of
event A and we have been told event B has occurred.
Then the conditional probability of A given B is defined to be:
P[A | B] = P[A ∩ B] / P[B], if P[B] ≠ 0
Rationale:
If we’re told that event B has occurred then the sample space is
restricted to B.
The probability within B has to be normalized. This is achieved by
dividing by P[B].
The event A can now only occur if the outcome is in A ∩ B.
Hence the new probability of A is:
P[A | B] = P[A ∩ B] / P[B]
An Example
The Academy Awards is soon to be shown.
For a specific married couple the probability that the husband
watches the show is 80%, the probability that his wife watches
the show is 65%, while the probability that they both watch the
show is 60%.
If the husband is watching the show, what is the probability that
his wife is also watching the show?
Solution:
Let B = the event that the husband watches the show
P[B] = 0.80
Let A = the event that his wife watches the show
P[A] = 0.65 and P[A ∩ B] = 0.60
P[A | B] = P[A ∩ B] / P[B] = 0.60 / 0.80 = 0.75
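The same calculation can be written as a small function. The name `conditional_probability` is an assumption for illustration; the inputs are the figures from the example, written as exact fractions so that the division does not pick up floating-point rounding.

```python
from fractions import Fraction

def conditional_probability(p_a_and_b, p_b):
    """Return P[A | B] = P[A ∩ B] / P[B]; undefined when P[B] = 0."""
    if p_b == 0:
        raise ValueError("P[B] must be non-zero")
    return p_a_and_b / p_b

# Values from the Academy Awards example.
p_b = Fraction(80, 100)         # husband watches
p_a_and_b = Fraction(60, 100)   # both watch

p_wife_given_husband = conditional_probability(p_a_and_b, p_b)
print(float(p_wife_given_husband))
```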
Definition
Two events A and B are called independent if
P[A ∩ B] = P[A] P[B]
Note: if P[B] ≠ 0 and P[A] ≠ 0 then
P[A | B] = P[A ∩ B] / P[B] = P[A] P[B] / P[B] = P[A]
and
P[B | A] = P[A ∩ B] / P[A] = P[A] P[B] / P[A] = P[B]
Thus in the case of independence the conditional probability of an event is not affected by the knowledge of
the other event.
Difference between independence
and mutually exclusive
Two mutually exclusive events are independent only in the special case where
P[A] = 0 or P[B] = 0, because mutual exclusivity gives P[A ∩ B] = 0, which must then equal P[A] P[B].
For independent events, P[A ∩ B] = P[A] P[B],
or (when P[B] ≠ 0)
P[A ∩ B] / P[B] = P[A] = P[A] / P[S]
The ratio of the probability of the set
A within B is the same as the ratio of
the probability of the set A within
the entire sample space S.
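The contrast between the two ideas can be checked by counting on the die sample space. This is a sketch only: the events and helper names below are hypothetical, and outcomes are assumed equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a balanced die

def p(event):
    """Classical probability of an event within S."""
    return Fraction(len(event), len(S))

def is_independent(a, b):
    """True when P[A ∩ B] = P[A] P[B]."""
    return p(a & b) == p(a) * p(b)

def is_mutually_exclusive(a, b):
    """True when A ∩ B is empty."""
    return len(a & b) == 0

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
odd = {1, 3, 5}

print(is_independent(even, at_most_four), is_mutually_exclusive(even, at_most_four))
print(is_independent(even, odd), is_mutually_exclusive(even, odd))
```

Here "even" and "at most four" overlap yet are independent, while "even" and "odd" are mutually exclusive and therefore not independent: knowing one occurred rules the other out.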
The multiplicative rule of probability
P[A ∩ B] = P[A] P[B | A], if P[A] ≠ 0
P[A ∩ B] = P[B] P[A | B], if P[B] ≠ 0
and
P[A ∩ B] = P[A] P[B]
if A and B are independent.
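As a quick check of the multiplicative rule, the figures from the Academy Awards example can be reused. This is a sketch with exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Probabilities from the Academy Awards example.
p_a = Fraction(65, 100)         # wife watches
p_b = Fraction(80, 100)         # husband watches
p_a_and_b = Fraction(60, 100)   # both watch

p_a_given_b = p_a_and_b / p_b   # P[A | B]
p_b_given_a = p_a_and_b / p_a   # P[B | A]

# Both factorizations of the multiplicative rule recover P[A ∩ B].
via_b = p_b * p_a_given_b
via_a = p_a * p_b_given_a

# A and B are not independent here, so the plain product differs.
plain_product = p_a * p_b

print(via_b, via_a, plain_product)
```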
● In an experiment, a measurement is usually
denoted by a variable such as X.
● P(X ∈ [10.8, 11.2]) = 0.25 or P(10.8 ≤ X ≤ 11.2) = 0.25
Both equations state that the probability that the
random variable X assumes a value in [10.8, 11.2] is
0.25.
Probability Properties
Continuous Random Variables
Probability Density Function
● The probability distribution, or simply distribution, of
a random variable X is a description of the set of the
probabilities associated with the possible values for X.
Cumulative Distribution Function
Mean and Variance
Normal Distribution
Undoubtedly, the most widely used model for
the distribution of a random variable is a
normal distribution.
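The density, cumulative distribution, mean and variance of a normal random variable can be explored with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The parameters below (mean 0, standard deviation 1) are an assumption made only for illustration.

```python
from statistics import NormalDist

# Hypothetical parameters: a standard normal (mean 0, standard deviation 1).
Z = NormalDist(mu=0.0, sigma=1.0)

density_at_mean = Z.pdf(0.0)                 # value of the probability density function at the mean
p_below_mean = Z.cdf(0.0)                    # cumulative distribution function: P[Z ≤ 0]
p_within_one_sd = Z.cdf(1.0) - Z.cdf(-1.0)   # P[-1 ≤ Z ≤ 1] as a difference of CDF values

print(Z.mean, Z.variance, density_at_mean, p_below_mean, p_within_one_sd)
```

For a continuous random variable, the probability of an interval is obtained from the cumulative distribution function, not from the density value at a single point.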
Probability Mass Function
Cumulative Distribution Function