Correlation, Probability
Correlation
Pearson Correlation Coefficient
Interpretation
• Range: -1 to +1
• Positive correlation: Variables move up together
• Example: Correlation of 0.80 between hours spent studying and test scores
• Negative correlation: As one variable moves up, the other moves down
• Example: Correlation of -0.70 between hours spent watching TV and physical fitness
• Zero correlation: Variables have no linear relationship
• Example: Correlation of 0.02 between shoe size and IQ score
Positive, Negative, No Correlation
Pearson Correlation Coefficient for Earlier Example
Height  Weight
65      68
67      69
68      70
66      69
64      65
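The coefficient for this table can be checked with a short sketch. It assumes the standard Pearson definition (sum of products of deviations divided by the square root of the product of the sums of squared deviations); the function and variable names are illustrative, not from the slides.

```python
from math import sqrt

heights = [65, 67, 68, 66, 64]
weights = [68, 69, 70, 69, 65]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations from the mean of x
    dy = [y - mean_y for y in ys]  # deviations from the mean of y
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


r = pearson(heights, weights)
print(round(r, 4))  # 0.9042
```

A value close to +1 indicates a strong positive linear relationship between height and weight in this small sample.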
Covariance Example
• C:\code\Data Analytics\correlation-covariance-crypto-gold.py
• Interpretation: Correlation
• Interpretation: Covariance
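The difference between the two interpretations can be sketched with the height and weight table from the earlier example (this is not the referenced crypto-gold script, whose contents are not shown here). The sketch assumes the sample covariance with an n - 1 denominator, and the rescaling factor 2.54 is an arbitrary illustrative choice.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Correlation = covariance divided by the product of standard deviations."""
    return sample_covariance(xs, ys) / sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )


heights = [65, 67, 68, 66, 64]
weights = [68, 69, 70, 69, 65]
scaled_heights = [h * 2.54 for h in heights]  # same data on a different scale

cov_original = sample_covariance(heights, weights)
cov_scaled = sample_covariance(scaled_heights, weights)
corr_original = correlation(heights, weights)
corr_scaled = correlation(scaled_heights, weights)

print(round(cov_original, 3), round(cov_scaled, 3))    # 2.75 6.985
print(round(corr_original, 4), round(corr_scaled, 4))  # 0.9042 0.9042
```

Covariance changes with the scale of the variables, so its sign is informative but its magnitude is hard to compare across datasets. Correlation is scale-free and always lies between -1 and +1.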
Why Spearman Rank Correlation?
• Pearson correlation coefficient: Works well when the relationship is linear, but poorly when the relationship is non-linear (for example, steadily increasing but curved)
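A small sketch of this limitation, using made-up data where y is the cube of x (an assumption chosen only for illustration). The relationship is perfectly increasing but not linear, so Pearson on the raw values falls below 1, while Pearson on the ranks (which is what Spearman measures) equals 1.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest). Assumes no ties, for brevity."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


xs = list(range(1, 11))    # hypothetical data: 1..10
ys = [x ** 3 for x in xs]  # monotonic but strongly curved

pearson_r = pearson(xs, ys)
spearman_rho = pearson(ranks(xs), ranks(ys))

print(round(pearson_r, 3))     # 0.928
print(round(spearman_rho, 3))  # 1.0
```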
Spearman Rank Correlation
Spearman Rank Correlation Coefficient for Earlier Example
Height  Weight
65      68
67      69
68      70
66      69
64      65
x_i   R_x   y_i   R_y   d_i = R_x - R_y   d_i^2
65    2     68    2      0                0
67    4     69    3.5    0.5              0.25
68    5     70    5      0                0
66    3     69    3.5   -0.5              0.25
64    1     65    1      0                0
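The rank table can be reproduced with the sketch below. It assumes the usual shortcut formula rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), which matches the d_i and d_i^2 columns above, and gives tied values the average of their positions (the two weights of 69 share rank 3.5). Function names are illustrative.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    result = []
    for v in values:
        first = ordered.index(v) + 1  # first 1-based position of v
        ties = ordered.count(v)       # how many values equal v
        result.append(first + (ties - 1) / 2)
    return result


heights = [65, 67, 68, 66, 64]
weights = [68, 69, 70, 69, 65]

rank_x = average_ranks(heights)  # [2.0, 4.0, 5.0, 3.0, 1.0]
rank_y = average_ranks(weights)  # [2.0, 3.5, 5.0, 3.5, 1.0]

d_squared = [(rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y)]
n = len(heights)
rho = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(sum(d_squared))  # 0.5
print(round(rho, 3))   # 0.975
```

When ties are present the shortcut formula is only approximate. Computing the Pearson coefficient on the ranks gives about 0.9747 for this data, so the difference here is negligible.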
Population and Sample, Probability Theory
Sampling and Population
[Figure: Population, Sampling, Sample]
Event (E)
Events and Sample Space: Examples
• Consider students in our class
Presence event   Activeness event
present          active
absent           inactive
Sample space
{(present, active), (present, inactive), (absent, active), (absent, inactive)}
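The sample space above is the set of all combinations of the two events' outcomes, which can be sketched as a Cartesian product. The variable names are illustrative.

```python
from itertools import product

presence = ["present", "absent"]
activeness = ["active", "inactive"]

# Every combination of one presence outcome with one activeness outcome
sample_space = list(product(presence, activeness))
print(sample_space)
# [('present', 'active'), ('present', 'inactive'),
#  ('absent', 'active'), ('absent', 'inactive')]

# An event is a subset of the sample space, e.g. "the student is present"
present_event = [outcome for outcome in sample_space if outcome[0] == "present"]
print(len(present_event), "of", len(sample_space))  # 2 of 4
```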
Probability scale: Impossible (0), Equal chance (0.5), Certain (1)
Matches  Probability of one specific outcome  Possible outcomes
1        0.50 = 50%                           0, 1
2        0.25 = 25%                           00, 01, 10, 11
3        0.125 = 12.5%                        000, 001, 010, 011, 100, 101, 110, 111
4        0.0625 = 6.25%                       0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
• Imagine the team actually winning four consecutive matches
• Theoretical probability would diminish to 6.25%
• But the team has actually won! So, the empirical probability will be 100%
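The outcome lists and probabilities above can be generated with a short sketch. It assumes that 1 stands for a win and 0 for a loss, and that every outcome sequence is equally likely; both are assumptions made for illustration.

```python
from itertools import product


def outcomes(n_matches):
    """All win/loss sequences for n matches, as strings of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n_matches)]


for n in range(1, 5):
    space = outcomes(n)
    # Each specific sequence has probability 1 / (number of sequences)
    print(n, len(space), 1 / len(space))

# Theoretical probability of winning all four matches
theoretical = 1 / len(outcomes(4))

# Empirical probability: share of observed series that were four straight wins.
# Only one series was observed, and the team won every match.
observed_series = ["1111"]
empirical = observed_series.count("1111") / len(observed_series)

print(theoretical, empirical)  # 0.0625 1.0
```

The theoretical value comes from counting equally likely outcomes before the matches are played, while the empirical value comes from what was actually observed.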
Marginal, Joint, Conditional Probability
Basic Terms
• Independent events: Outcome of Event A does not impact the outcome of Event B
• We take out one marble, note its colour, put the marble back (called replacement)
• We then take out a second marble and note its colour
• What is the probability that both are red?
• Dependent events: Outcome of Event A does impact the outcome of Event B
• We take out one marble, note its colour, do not put the marble back (no replacement)
• We then take out a second marble and note its colour
• What is the probability that both are red?
Basic Terms
• Marginal probability: Probability of an event irrespective of the outcome of another variable: P(A)
• Joint probability: Events A and B are happening together, whether they are independent or dependent: P(A,B)
• Independent events: If A and B are independent, P(A,B) = P(A) × P(B)
• Dependent events: If A and B are dependent, P(A,B) = P(A) × P(B|A), where P(B|A) is the conditional probability of B given A
Marginal Probability
Joint Probability of Independent Events
Joint Probability of Dependent Events
• Dependent event: The outcome of Event A impacts the outcome of Event B
• A jar contains 2 blue marbles and 3 red marbles
• If you take two marbles out of the jar without putting the first one back, what is the probability that they are both red?
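The jar question can be worked through with exact fractions. The sketch also shows the with-replacement (independent) case for comparison; reusing the same jar for that case is an assumption, since the slides give the jar contents only for the dependent case.

```python
from fractions import Fraction

red, blue = 3, 2
total = red + blue

# Independent events (with replacement): P(A,B) = P(A) x P(B)
p_both_red_with_replacement = Fraction(red, total) * Fraction(red, total)

# Dependent events (no replacement): P(A,B) = P(A) x P(B|A)
# After one red marble is removed, 2 red remain out of 4 marbles.
p_first_red = Fraction(red, total)
p_second_red_given_first_red = Fraction(red - 1, total - 1)
p_both_red_without_replacement = p_first_red * p_second_red_given_first_red

print(p_both_red_with_replacement)     # 9/25
print(p_both_red_without_replacement)  # 3/10
```

Without replacement the probability is 3/10 = 0.30, lower than the 9/25 = 0.36 obtained with replacement, because removing a red marble makes a second red less likely.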
Conditional Probability to Bayes’ Theorem
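A short derivation, assuming the standard definition of conditional probability and the joint probability rule for dependent events stated earlier, P(A,B) = P(A) × P(B|A):

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A, B)}{P(B)}, \qquad P(B) > 0 \\
P(A, B) &= P(A)\, P(B \mid A) \\
\Rightarrow \quad P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{align*}
```

The last line is Bayes' theorem: it turns the known P(B|A) into the wanted P(A|B).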
Bayes’ Theorem Example
• 10% of patients entering a clinic have liver disease
• 5% of patients entering a clinic are alcoholic
• Out of the patients who have liver disease, 7% are alcoholics
• A = Liver disease; So P(A) = 0.10
• B = Alcoholic; So P(B) = 0.05
• B|A = Patient is alcoholic, given that the patient has liver disease; So P(B|A) = 0.07
• Find P(A|B), i.e. probability that the patient has liver disease, given that the patient is alcoholic
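The liver disease example can be worked out with a small helper that applies Bayes' theorem, P(A|B) = P(B|A) × P(A) / P(B). The function name `bayes` is illustrative, not from the slides.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


p_liver = 0.10                  # P(A)
p_alcoholic = 0.05              # P(B)
p_alcoholic_given_liver = 0.07  # P(B|A)

p_liver_given_alcoholic = bayes(p_alcoholic_given_liver, p_liver, p_alcoholic)
print(round(p_liver_given_alcoholic, 2))  # 0.14
```

So a patient who is alcoholic has a 14% chance of having liver disease, compared with 10% for patients overall.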
Bayes’ Theorem Example
• Rain prediction
• Overall historical probability of rain: P(R) = 0.30
• Sky condition
• P(Overcast|Rain) = 0.8
• P(Clear-sky|Rain) = 0.2
• P(Overcast) = 0.6
• Find P(Rain|Overcast), because today it is overcast
• Suppose A = Rain, B = Overcast sky condition
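Substituting the given values into Bayes' theorem, with A = Rain and B = Overcast:

```latex
\[
P(\text{Rain} \mid \text{Overcast})
  = \frac{P(\text{Overcast} \mid \text{Rain})\, P(\text{Rain})}{P(\text{Overcast})}
  = \frac{0.8 \times 0.30}{0.6}
  = 0.40
\]
```

Seeing an overcast sky raises the probability of rain from the historical 30% to 40%.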
Bayes’ Theorem
• Dangerous fires are rare (1%)
• Smoke is quite common due to barbecues (10%)
• 90% of dangerous fires cause smoke
• What is the probability that we have a dangerous fire when there is smoke?
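Substituting the given values into Bayes' theorem, where Fire and Smoke are labels chosen here for the two events:

```latex
\[
P(\text{Fire} \mid \text{Smoke})
  = \frac{P(\text{Smoke} \mid \text{Fire})\, P(\text{Fire})}{P(\text{Smoke})}
  = \frac{0.9 \times 0.01}{0.10}
  = 0.09
\]
```

Even when there is smoke, the probability of a dangerous fire is only 9%, because smoke is far more common than dangerous fires.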