Unit 4 Part 1


Descriptive Statistics for Data Sciences
What is statistics?
● It is the collection, organization, analysis and interpretation of
data.
● Statistics is mainly used to draw numerical conclusions.
● For example, if anyone asks how many people watch YouTube,
we cannot simply answer "many people watch YouTube"; we
have to answer in numerical terms that carry more meaning.
We can say there are 2 billion+ monthly active users, and that
users spend a daily average of 18 minutes. This is the
numerical way to answer such questions, and statistics is the
medium used to make such inferences.
Statistics includes
● Design of experiments: used to understand the characteristics
of the dataset
● Sampling: used to understand the samples
● Descriptive statistics: summarization of data
● Inferential statistics: concluding about data via hypotheses
● Probability theory: likelihood estimation
Main statistical methods
● Descriptive statistics uses tools like the mean and standard
deviation on a sample to summarize data.
● Inferential statistics, on the other hand, looks at data that can
randomly vary, and then draws conclusions from it.
Descriptive statistics is distinguished from inferential statistics
by its aim: to summarize the sample rather than use the data to
learn more about the population.
Descriptive statistics
● A raw dataset is difficult to describe.
● Descriptive statistics describe the dataset in a much simpler
manner through:
– Measures of central tendency (mean, median, mode)
– Measures of spread (range, quartiles, percentiles, absolute deviation,
variance and standard deviation)
– Measure of symmetry (skewness)
– Measure of peakedness (kurtosis)
Descriptive statistics using Python
● import statistics as s
● s.mean(collection)
● s.mode(collection)
● s.median(collection)
● s.harmonic_mean(collection)
● s.median_low(collection)
● s.median_high(collection)
● s.variance(collection)
● s.stdev(collection)
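A minimal sketch of these functions in action, on a small hypothetical dataset:

```python
import statistics as s

# Small illustrative dataset (hypothetical values)
collection = [2, 4, 4, 4, 5, 5, 7, 9]

print(s.mean(collection))         # → 5 (arithmetic mean)
print(s.mode(collection))         # → 4 (most frequent value)
print(s.median(collection))       # → 4.5 (average of the two middle values)
print(s.median_low(collection))   # → 4 (lower of the two middle values)
print(s.median_high(collection))  # → 5 (higher of the two middle values)
print(s.harmonic_mean(collection))  # reciprocal of the mean of reciprocals
print(s.variance(collection))       # sample variance
print(s.stdev(collection))          # sample standard deviation
```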
Probability Theory

Phenomena are of two kinds: deterministic and non-deterministic.
Deterministic Phenomena

There exists a mathematical model that allows "perfect"
prediction of the phenomenon's outcome.

Many examples exist in Physics and Chemistry (the exact
sciences).
Non-deterministic Phenomena

No mathematical model exists that allows "perfect" prediction
of the phenomenon's outcome.
Non-deterministic Phenomena

may be divided into two groups.

1. Random phenomena
– Unable to predict the outcomes, but in the long run the
outcomes exhibit statistical regularity.

2. Haphazard phenomena
– Unpredictable outcomes, with no long-run exhibition of
statistical regularity in the outcomes.
Random Phenomena

Examples
1. Tossing a coin – outcomes S = {Head, Tail}
Unable to predict on each toss whether it is Head or Tail.
In the long run one can predict that 50% of the time heads will
occur and 50% of the time tails will occur.
2. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}

Unable to predict the outcome, but in the long run one can
determine that each outcome will occur 1/6 of the time.
Use symmetry: each side is the same, so one side should not occur
more frequently than another side in the long run. If the die is
not balanced this may not be true.
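This long-run statistical regularity can be illustrated with a short simulation (a sketch; the seed and roll count are arbitrary choices):

```python
import random
from collections import Counter

random.seed(0)  # reproducible illustration

# Each individual roll is unpredictable, but the relative
# frequencies settle near 1/6 in the long run.
rolls = [random.randint(1, 6) for _ in range(60_000)]
counts = Counter(rolls)

for face in range(1, 7):
    print(face, counts[face] / len(rolls))  # each close to 1/6 ≈ 0.1667
```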
Terminology
● The sample space, S, for a random phenomenon is the set of
all possible outcomes.

Examples
1. Tossing a coin – outcomes S = {Head, Tail}
2. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}
Terminology
● The event, E, is any subset of the sample space, S, i.e. any set
of outcomes (not necessarily all outcomes) of the random
phenomenon.

[Venn diagram: event E inside sample space S]
Examples

1. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}
E = the event that an even number is rolled
= {2, 4, 6}
Special Events

The Null Event, the empty event – ∅

∅ = { } = the event that contains no outcomes

The Entire Event, the Sample Space – S
S = the event that contains all outcomes

The empty event, ∅, never occurs.
The entire event, S, always occurs.
Set operations on Events
Union
Let A and B be two events, then the union of A and B is the event
(denoted by A ∪ B) defined by:

A ∪ B = {e | e belongs to A or e belongs to B}

The event A ∪ B occurs if the event A occurs or the event B
occurs.

[Venn diagram: shaded region A ∪ B]
Intersection

Let A and B be two events, then the intersection of A and B is the
event (denoted by A ∩ B) defined by:

A ∩ B = {e | e belongs to A and e belongs to B}

The event A ∩ B occurs if the event A occurs and the event B
occurs.

[Venn diagram: shaded region A ∩ B]
Complement

Let A be any event, then the complement of A (denoted by Ā) is
defined by:

Ā = {e | e does not belong to A}

The event Ā occurs if the event A does not occur.

[Venn diagram: shaded region Ā, the outside of A]
Definition: mutually exclusive
Two events A and B are called mutually exclusive if:

A ∩ B = ∅

If two events A and B are mutually exclusive then:

1. They have no outcomes in common.
2. They can't occur at the same time: the outcome of the random
experiment cannot belong to both A and B.

[Venn diagram: disjoint sets A and B]
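Since events are just sets of outcomes, Python's built-in set operations mirror these definitions directly (the events here are chosen for illustration, using the die example):

```python
# Events as Python sets (die example)
S = {1, 2, 3, 4, 5, 6}   # sample space
A = {2, 4, 6}            # an even number is rolled
B = {1, 3, 5}            # an odd number is rolled
E = {4, 5, 6}            # a number greater than 3 is rolled

print(A | E)   # union A ∪ E → {2, 4, 5, 6}
print(A & E)   # intersection A ∩ E → {4, 6}
print(S - A)   # complement of A → {1, 3, 5}
print(A & B)   # mutually exclusive: A ∩ B → set() (the empty event)
```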
Definition: probability of an Event E
Suppose that the sample space S = {o1, o2, o3, …, oN} has a finite
number, N, of outcomes.
Also each of the outcomes is equally likely (because of
symmetry).
Then for any event E

P[E] = n(E) / n(S) = n(E) / N = (no. of outcomes in E) / (total no. of outcomes)

Note: the symbol n(A) = no. of elements of A.

This definition of P[E] applies only to the special case when

1. The sample space has a finite no. of outcomes, and
2. Each outcome is equi-probable.

If this is not true a more general definition of
probability is required.
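A minimal sketch of this counting definition, using the even-number event from the die example:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}              # finite, equally likely outcomes
E = {e for e in S if e % 2 == 0}    # event: an even number is rolled

# P[E] = n(E) / n(S)
p_E = Fraction(len(E), len(S))
print(p_E)  # → 1/2
```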
Rule: The additive rule
(Mutually exclusive events)

If A ∩ B = ∅ (A and B mutually exclusive), then

P[A ∪ B] = P[A] + P[B]
i.e.
P[A or B] = P[A] + P[B]

Mutually exclusive events have no outcomes in common; they
can't occur at the same time, so the outcome of the random
experiment cannot belong to both A and B.
Rule: The additive rule
(In general)

P[A ∪ B] = P[A] + P[B] – P[A ∩ B]

or
P[A or B] = P[A] + P[B] – P[A and B]

Logic: when P[A] is added to P[B], the outcomes in A ∩ B are
counted twice, hence

P[A ∪ B] = P[A] + P[B] – P[A ∩ B]

Example:
Saskatoon and Moncton are two of the cities competing for the
World university games. (There are also many others). The
organizers are narrowing the competition to the final 5 cities.
There is a 20% chance that Saskatoon will be amongst the final
5. There is a 35% chance that Moncton will be amongst the
final 5 and an 8% chance that both Saskatoon and Moncton
will be amongst the final 5. What is the probability that
Saskatoon or Moncton will be amongst the final 5?
Solution:
Let A = the event that Saskatoon is amongst the final 5.
Let B = the event that Moncton is amongst the final 5.
Given P[A] = 0.20, P[B] = 0.35, and P[A ∩ B] = 0.08
What is P[A ∪ B]?
Note: "and" ≡ ∩, "or" ≡ ∪.
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
= 0.20 + 0.35 – 0.08 = 0.47
Rule for complements

P[Ā] = 1 – P[A]

or
P[not A] = 1 – P[A]
Logic:
A and Ā are mutually exclusive,
and S = A ∪ Ā

thus 1 = P[S] = P[A] + P[Ā]
and P[Ā] = 1 – P[A]
Conditional Probability

Frequently, before observing the outcome of a random
experiment, you are given information regarding the
outcome.

How should this information be used in prediction of the
outcome?

Namely, how should probabilities be adjusted to take
into account this information?

Usually the information is given in the following form:
you are told that the outcome belongs to a given event
(i.e. you are told that a certain event has occurred).
Definition
Suppose that we are interested in computing the probability of
event A and we have been told event B has occurred.
Then the conditional probability of A given B is defined to be:

P[A|B] = P[A ∩ B] / P[B],  if P[B] ≠ 0

Rationale:
If we're told that event B has occurred then the sample space is
restricted to B.
The probability within B has to be normalized. This is achieved by
dividing by P[B].
The event A can now only occur if the outcome is in A ∩ B.
Hence the new probability of A is:

P[A|B] = P[A ∩ B] / P[B]

[Venn diagram: A, B, and A ∩ B within S]
An Example
The Academy Awards show is soon to be shown.
For a specific married couple the probability that the husband
watches the show is 80%, the probability that his wife watches
the show is 65%, while the probability that they both watch the
show is 60%.
If the husband is watching the show, what is the probability that
his wife is also watching the show?
Solution:
Let B = the event that the husband watches the show
P[B] = 0.80
Let A = the event that his wife watches the show
P[A] = 0.65 and P[A ∩ B] = 0.60

P[A|B] = P[A ∩ B] / P[B] = 0.60 / 0.80 = 0.75
Definition
Two events A and B are called independent if

P[A ∩ B] = P[A] P[B]

Note: if P[B] ≠ 0 and P[A] ≠ 0, then

P[A|B] = P[A ∩ B] / P[B] = P[A] P[B] / P[B] = P[A]

and P[B|A] = P[A ∩ B] / P[A] = P[A] P[B] / P[A] = P[B]

Thus in the case of independence the conditional probability of
an event is not affected by knowledge of the other event.
Difference between independence
and mutually exclusive

Mutually exclusive
Two mutually exclusive events are independent only in the special
case where P[A] = 0 or P[B] = 0 (so that P[A ∩ B] = 0).

Mutually exclusive events are otherwise highly dependent: A and
B cannot occur simultaneously, so if one event occurs the other
event does not occur.

[Venn diagram: disjoint sets A and B]
Independent events

P[A ∩ B] = P[A] P[B]

or P[A ∩ B] / P[B] = P[A] = P[A] / P[S]

The ratio of the probability of the set A within B is the same as
the ratio of the probability of the set A within the entire sample
space S.

[Venn diagram: A and B overlapping within S]
The multiplicative rule of probability

P[A ∩ B] = P[A] P[B|A]  if P[A] ≠ 0
         = P[B] P[A|B]  if P[B] ≠ 0

and
P[A ∩ B] = P[A] P[B]

if A and B are independent.
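A sketch checking independence and the multiplicative rule on a fair die, with the illustrative events A = "an even number" and B = "a number at most 4":

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}        # an even number is rolled
B = {1, 2, 3, 4}     # a number at most 4 is rolled

def prob(event):
    # Classical probability on a finite, equally likely sample space
    return Fraction(len(event), len(S))

# Independence check: is P[A ∩ B] equal to P[A] P[B]?
print(prob(A & B))        # → 1/3
print(prob(A) * prob(B))  # → 1/3, so A and B are independent

# Multiplicative rule: P[A ∩ B] = P[B] P[A|B]
p_A_given_B = prob(A & B) / prob(B)
print(prob(B) * p_A_given_B)  # → 1/3 again
```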


Random Variables

• In an experiment, a measurement is usually
denoted by a variable such as X.

• In a random experiment, a variable whose
measured value can change (from one replicate of
the experiment to another) is referred to as a
random variable.
Probability

A probability is usually expressed in terms of a random
variable.
• For the part length example, X denotes the part
length and the probability statement can be written
in either of the following forms:

P(10.8 ≤ X ≤ 11.2) = 0.25
P(X ∈ [10.8, 11.2]) = 0.25

Both equations state that the probability that the
random variable X assumes a value in [10.8, 11.2] is
0.25.
Probability Properties
Continuous Random Variables
Probability Density Function
• The probability distribution, or simply distribution, of
a random variable X is a description of the set of
probabilities associated with the possible values of X.
Cumulative Distribution Function
Mean and Variance
Normal Distribution
Undoubtedly, the most widely used model for
the distribution of a random variable is a
normal distribution.

• Central limit theorem


• Gaussian distribution
Central Limit Theorem
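The central limit theorem can be illustrated with a rough simulation (the seed, sample size and replicate count are arbitrary choices): averaging many die rolls produces sample means that cluster around the die's mean of 3.5.

```python
import random
import statistics as s

random.seed(1)  # reproducible illustration

# Each replicate: the mean of 40 rolls of a fair die.
# A single roll is uniform on {1..6}, yet the sample means pile up
# around 3.5 with an approximately normal (bell-shaped) spread.
means = [s.mean([random.randint(1, 6) for _ in range(40)])
         for _ in range(5_000)]

print(s.mean(means))   # close to 3.5
print(s.stdev(means))  # close to sqrt(35/12) / sqrt(40) ≈ 0.27
```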
Discrete Random Variables

Only measurements at discrete points are possible.

Probability Mass Function
Cumulative Distribution Function
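For a discrete random variable the probability mass function assigns a probability to each point, and the cumulative distribution function accumulates them; a sketch for a fair die:

```python
from fractions import Fraction
from itertools import accumulate

# PMF of a fair die: f(x) = P(X = x) = 1/6 for x = 1..6
values = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in values}

# CDF: F(x) = P(X <= x), a running sum of the PMF
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

print(pmf[3])  # → 1/6
print(cdf[3])  # → 1/2
print(cdf[6])  # → 1 (total probability)
```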
