Lecture: Week 9 Statistical Inference
(Reasoning)
Stats: Learning Outcomes
After this class, you will be able to:
1. Define the correlation r and explain what its value tells us.
2. Estimate the value of r roughly from a scatter plot: in particular, is
it positive or negative, and is it close to 1 or −1?
3. Define and distinguish the terms ‘sample’ and ‘population’.
4. Define what is meant by the p-value.
5. State the null and alternative hypotheses for the purpose of performing
significance tests for some set of data.
6. Depending on the p-value given, state conclusions about which
hypothesis to accept.
Statistical Inference: Correlation
• A professor noted over years that students who had a better
command of English did better in the course ART 101.
• How can she test this observation? In other words, what does it take
for her to say “Indeed, students who have a better command of English
do better in ART 101”?
• Note that this is a sort of inductive argument.
• There is a simple statistical concept that can help in making such
inductive inference: Correlation.
Statistical Inference: Correlation
• When a scatter plot of two quantitative variables suggests a
straight-line pattern, we say there is correlation between these variables.
• The strength of the correlation is reflected in how closely the data
points cluster around a straight line.
[Figure: scatter plots A and B]
• The variables in B have a stronger correlation than those in A.
Statistical Inference: Correlation
• In the figure below, the data on the right appear to have a stronger
correlation.
• Wrong: both panels show the same data plotted on different scales.
• So judging by eye is not sufficient.
Statistical Inference: Correlation
• Correlation quantified:
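Assuming this slide refers to the usual Pearson correlation coefficient, r for n data pairs (x_i, y_i), with sample means \bar{x} and \bar{y} and sample standard deviations s_x and s_y, can be written as:

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor is a standardized value, so r has no units and does not change when the measurement units of x or y change.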
Statistical Inference: Correlation
• Positive correlation (r > 0): when one variable increases, the other
tends to increase as well.
• Negative correlation (r < 0): when one variable increases, the other
tends to decrease.
• Correlation makes no use of the distinction between explanatory
(independent) and response (dependent) variables. It makes no
difference which variable you call x and which you call y in calculating
the correlation.
Statistical Inference: Correlation
• Correlation requires that both variables be quantitative, so that it
makes sense to do the arithmetic indicated by the formula for r.
• The correlation r is always a number between −1 and 1:
- Values of r near 0 indicate a very weak linear relationship.
- Values of r close to −1 or 1 indicate that the points lie close to a
straight line.
- The extreme values r = −1 and r = 1 occur only when the points in a
scatterplot lie exactly along a straight line.
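The properties listed so far can be checked numerically. The sketch below is only an illustration: `pearson_r` is a hypothetical helper that computes the standard Pearson coefficient from sums of deviations, and the small data lists are made up for the demonstration, not taken from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]    # points exactly on a rising line
falling = [10 - 3 * v for v in x]  # points exactly on a falling line
noisy = [2, 1, 4, 3, 6]            # roughly rising, but not on a line

r_rising = pearson_r(x, rising)    # exactly on a rising line: r = 1
r_falling = pearson_r(x, falling)  # exactly on a falling line: r = -1
r_noisy = pearson_r(x, noisy)      # strictly between 0 and 1
r_swapped = pearson_r(noisy, x)    # swapping x and y gives the same r

print(r_rising, r_falling, r_noisy, r_swapped)
```

Swapping the two arguments leaves the result unchanged because the formula treats x and y symmetrically.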
Statistical Inference: Correlation
• Correlation measures the strength of only the linear relationship
between two variables. Correlation does not describe curved
relationships between variables, no matter how strong they are.
• This does not mean that there is no relation between variables whose
scatter plot is not linear!
• It means that you cannot use r as a measure of the strength of the
relationship in this case.
• So first plot the data to check that the scatterplot suggests a straight
line, then calculate r.
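A small made-up example shows why plotting first matters. Here y is completely determined by x, yet r is 0 because the pattern is a symmetric curve and not a line. The helper `pearson_r` and the data are illustrative assumptions, not part of the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y = x^2 on x values symmetric around 0 (a parabola).
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curve = pearson_r(x, y)
print(r_curve)  # zero: the left and right halves of the curve cancel out
```

A scatterplot of these points would immediately reveal the strong curved relationship that r misses.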
Statistical Inference: Correlation
• Examples:
Statistical Inference: Correlation-DO
• Coffee is a leading export from several developing countries. When
coffee prices are high, farmers often clear forest to plant more coffee
trees. Here are data for five years on prices paid to coffee growers in
Indonesia and the rate of deforestation in a national park that lies in a
coffee-producing region.
• Find the correlation r and make a scatterplot using the Pearson
Correlation Coefficient Calculator:
https://www.socscistatistics.com/tests/pearson/default2.aspx
Statistical Inference: Correlation-DO
• Your results should look like this:
• r = 0.9552
Statistical Inference: Correlation-DO
• Do the two exercises posted to the LMS under the current week, using the
Pearson Correlation Coefficient Calculator at the link provided in the
previous slide.
Statistical Inference: Significance Tests
• Recall the following type of inductive inference:
• Enumerative induction (inductive generalization):
when we arrive at a generalization about a group of things after
observing only some members of that group.
Symbolically:
X (proportion/percent) of the observed members (i.e., the sample) of A
are B. Therefore, X (proportion/percent) of the entire group
(the population) of A are B.
Statistical Inference: Significance Tests
Conclusions
HCQ (hydroxychloroquine) might offer no benefits in terms of decreasing
the viral load and radiological improvement in patients with COVID-19.
HCQ appears to be associated with higher odds of all-cause mortality and NAEs.
Statistical Inference: Significance Tests-DO
• The coach of basketball team A claims that his team is doing better than
team B in three-pointers. He bases his claim on the following statistics
for both teams over the last 8 games (the SAMPLE).
Team A:  12   7  13   7   8   7  11   9    Mean A = 9.25
Team B:   4   7   6  11   9   8   4   7    Mean B = 7
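One way to turn the coach's claim into a significance test is sketched below. The notes do not show which test the lecture uses, so the exact permutation test here is a substituted technique, chosen because it needs only the Python standard library. The hypotheses are also assumptions: the null hypothesis says there is no real difference between the teams, and the one-sided alternative says team A makes more three-pointers on average.

```python
from itertools import combinations

# Three-pointers per game from the slide (the sample).
team_a = [12, 7, 13, 7, 8, 7, 11, 9]
team_b = [4, 7, 6, 11, 9, 8, 4, 7]

def mean(values):
    return sum(values) / len(values)

observed_diff = mean(team_a) - mean(team_b)  # 9.25 - 7 = 2.25

# Under the null hypothesis the team labels are interchangeable, so every
# way of splitting the 16 scores into two groups of 8 is equally likely.
pooled = team_a + team_b
n_a = len(team_a)
n_b = len(pooled) - n_a
total = sum(pooled)

count_extreme = 0
count_all = 0
for chosen in combinations(range(len(pooled)), n_a):
    sum_a = sum(pooled[i] for i in chosen)
    diff = sum_a / n_a - (total - sum_a) / n_b
    if diff >= observed_diff:
        count_extreme += 1
    count_all += 1

# One-sided p-value: share of splits at least as favorable to "team A".
p_value = count_extreme / count_all
print(f"observed difference = {observed_diff}, p-value = {p_value:.4f}")
```

A small p-value (0.05 is a common cutoff, assumed here and not given in the notes) would count as evidence against the null hypothesis. A large one means the sample does not support the coach's claim strongly enough.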