Spurious Relationship

A spurious relationship is a mathematical relationship between two or more variables that are associated but not causally related. This can occur due to coincidence or the presence of a third, unseen confounding variable. Examples include ice cream sales appearing related to drownings (both are driven by heat), birth rates appearing related to stork nestings (both are related to the weather), and football game outcomes appearing related to election outcomes (with no causal link at all). Statistical analyses try to avoid mistaking spurious relationships for causal ones by controlling for potential confounding variables.

Uploaded by

isabella343

Spurious relationship

In statistics, a spurious relationship or spurious correlation[1][2] is a mathematical relationship in which
two or more events or variables are associated but not causally related, due to either coincidence or the
presence of a certain third, unseen factor (referred to as a "common response variable", "confounding
factor", or "lurking variable").

[Figure: Whereas a mediator is a factor in the causal chain (top), a confounder is a spurious factor
incorrectly implying causation (bottom).]

Examples

An example of a spurious relationship can be found in the time-series literature, where a spurious
regression is a regression that provides misleading statistical evidence of a linear relationship between
independent non-stationary variables. In fact, the non-stationarity may be due to the presence of a
unit root in both variables.[3][4] In particular, any two nominal economic variables are likely to be correlated
with each other, even when neither has a causal effect on the other, because each equals a real variable
times the price level, and the common presence of the price level in the two data series imparts correlation
to them. (See also spurious correlation of ratios.)
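
The spurious-regression effect can be reproduced in a small simulation. The sketch below is illustrative only: the series length, the number of trials, the seed, and the use of a Fisher z-test as the "naive" significance test are all assumptions, not something taken from the cited literature. It generates pairs of independent random walks (unit-root series) and counts how often a test that assumes independent observations declares their correlation significant, first in levels and then after first-differencing.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def looks_significant(xs, ys, z_crit=1.96):
    """Naive two-sided 5% test of zero correlation via the Fisher z-transform.

    The test assumes independent observations, which is exactly the
    assumption that a random walk violates.
    """
    r = pearson_r(xs, ys)
    return abs(math.atanh(r)) * math.sqrt(len(xs) - 3) > z_crit


def random_walk(rng, n):
    """Cumulative sum of n independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(xs):
    """First differences, which recover the independent shocks."""
    return [b - a for a, b in zip(xs, xs[1:])]


def rejection_rates(trials=500, n=100, seed=0):
    """Share of trials flagged as significant, in levels and in differences."""
    rng = random.Random(seed)
    hits_levels = hits_diffs = 0
    for _ in range(trials):
        x, y = random_walk(rng, n), random_walk(rng, n)
        hits_levels += looks_significant(x, y)
        hits_diffs += looks_significant(diff(x), diff(y))
    return hits_levels / trials, hits_diffs / trials


rate_levels, rate_diffs = rejection_rates()
print(f"levels: {rate_levels:.2f}, first differences: {rate_diffs:.2f}")
```

Under these assumptions the rejection rate in levels should come out far above the nominal 5%, while the rate for the differenced series should stay close to it, which is the pattern the paragraph above describes.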

Another example of a spurious relationship can be seen by examining a city's ice cream sales. The sales
might be highest when the rate of drownings in city swimming pools is highest. To allege that ice cream
sales cause drowning, or vice versa, would be to imply a spurious relationship between the two. In reality, a
heat wave may have caused both. The heat wave is an example of a hidden or unseen variable, also known
as a confounding variable.
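
A minimal simulation can make the role of the confounder concrete. Everything numeric below is made up for illustration (the effect sizes 2.0 and 1.5, the sample size, and the seed are assumptions, not data from any city): heat drives both ice cream sales and drownings, and neither affects the other. The raw correlation is strong, but the partial correlation, computed after regressing each variable on heat and correlating the residuals, should be close to zero.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, xs):
    """Residuals from a least-squares regression of ys on xs with an intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(1)
n = 5000

# Hypothetical data-generating process: heat is the only common cause.
heat = [rng.gauss(0.0, 1.0) for _ in range(n)]
ice_cream = [2.0 * h + rng.gauss(0.0, 1.0) for h in heat]
drownings = [1.5 * h + rng.gauss(0.0, 1.0) for h in heat]

raw_r = pearson_r(ice_cream, drownings)
partial_r = pearson_r(residuals(ice_cream, heat), residuals(drownings, heat))

print(f"raw correlation:               {raw_r:.3f}")
print(f"correlation controlling heat:  {partial_r:.3f}")
```

The contrast between the two printed numbers is the point: the association disappears once the hidden variable is held fixed.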

Another commonly noted example is a series of Dutch statistics showing a positive correlation between the
number of storks nesting in a series of springs and the number of human babies born at that time. Of course
there was no causal connection; they were correlated with each other only because they were correlated
with the weather nine months before the observations.[5]

In rare cases, a spurious relationship can occur between two completely unrelated variables without any
confounding variable, as was the case between the success of the Washington Commanders professional
football team in a specific game before each presidential election and the success of the incumbent
President's political party in said election. For 16 consecutive elections between 1940 and 2000, the
Redskins Rule correctly matched whether the incumbent President's political party would retain or lose the
Presidency. The rule eventually failed shortly after Elias Sports Bureau discovered the correlation in 2000;
in 2004, 2012 and 2016, the results of the Commanders' game and the election did not match.[6][7][8] In a
similar spurious relationship involving the National Football League, in the 1970s, Leonard Koppett noted
a correlation between the direction of the stock market and the winning conference of that year's Super
Bowl, the Super Bowl indicator; the relationship maintained itself for most of the 20th century before
reverting to more random behavior in the 21st.[9]
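
One way to see how a perfect 16-election streak can arise by coincidence alone is to count how many purely random yes/no indicators would match a fixed sequence of 16 binary outcomes. The sketch below is a thought experiment under stated assumptions: the pool of one million candidate indicators is hypothetical (the sources do not say how many patterns anyone examined), and each indicator is modelled as 16 fair coin flips.

```python
import random

N_ELECTIONS = 16          # length of the streak discussed above
N_CANDIDATES = 1_000_000  # hypothetical number of unrelated indicators examined

rng = random.Random(2)

# Encode 16 binary outcomes as the bits of one integer.
election_outcomes = rng.getrandbits(N_ELECTIONS)

# Count random indicators that agree with the outcomes in all 16 elections.
perfect_matches = sum(
    rng.getrandbits(N_ELECTIONS) == election_outcomes
    for _ in range(N_CANDIDATES)
)

# Each indicator matches with probability 1 / 2**16.
expected_matches = N_CANDIDATES / 2 ** N_ELECTIONS

print(f"expected perfect matches: {expected_matches:.1f}")
print(f"observed perfect matches: {perfect_matches}")
```

With a large enough pool of candidates, a handful of perfect matches is expected even though none of the indicators has any connection to the outcomes, and there is no reason for such a match to persist in later elections.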

Hypothesis testing
Often one tests a null hypothesis of no correlation between two variables, and chooses in advance to reject
the hypothesis if the correlation computed from a data sample would have occurred in less than (say) 5% of
data samples if the null hypothesis were true. A true null hypothesis will then be retained 95% of the time,
but in the other 5% of samples the true null of zero correlation will be wrongly rejected, causing
acceptance of a correlation which is spurious (an event known as a Type I error). Here the spurious
correlation in the sample resulted from random selection of a sample that did not reflect the true properties
of the underlying population.
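
The 5% figure can be checked by simulation. The following sketch draws many pairs of independent samples, so the null hypothesis of zero correlation is true by construction, and records how often a two-sided 5% test rejects it. The sample size, the number of trials, the seed, and the choice of the Fisher z approximation as the test are assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rejects_null(xs, ys, z_crit=1.96):
    """Two-sided 5% test of zero correlation (Fisher z approximation)."""
    z = math.atanh(pearson_r(xs, ys)) * math.sqrt(len(xs) - 3)
    return abs(z) > z_crit


def false_positive_rate(trials=2000, n=50, seed=3):
    """Share of truly uncorrelated samples in which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rejections += rejects_null(x, y)
    return rejections / trials


type_i_rate = false_positive_rate()
print(f"share of spurious 'significant' correlations: {type_i_rate:.3f}")
```

The printed share should land near 0.05: each of those rejections is a spurious correlation produced by sampling variation alone.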

Detecting spurious relationships

The term "spurious relationship" is commonly used in statistics and in particular in experimental research
techniques, both of which attempt to understand and predict direct causal relationships (X → Y). A
non-causal correlation can be spuriously created by an antecedent which causes both (W → X and W → Y).
If a mediating variable (X → W → Y) goes undetected, the analysis estimates a total effect rather than a
direct effect, because no adjustment is made for the mediating variable W. Because of this, experimentally
identified correlations do not represent causal relationships unless spurious relationships can be ruled out.

Experiments

In experiments, spurious relationships can often be identified by controlling for other factors, including
those that have been theoretically identified as possible confounding factors. For example, consider a
researcher trying to determine whether a new drug kills bacteria; when the researcher applies the drug to a
bacterial culture, the bacteria die. But to help in ruling out the presence of a confounding variable, another
culture is subjected to conditions that are as nearly identical as possible to those facing the first-mentioned
culture, but the second culture is not subjected to the drug. If there is an unseen confounding factor in those
conditions, this control culture will die as well, so that no conclusion of efficacy of the drug can be drawn
from the results of the first culture. On the other hand, if the control culture does not die, then the researcher
cannot reject the hypothesis that the drug is efficacious.

Non-experimental statistical analyses

Disciplines whose data are mostly non-experimental, such as economics, usually employ observational data
to establish causal relationships. The body of statistical techniques used in economics is called
econometrics. The main statistical method in econometrics is multivariable regression analysis. Typically a
linear relationship such as

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_k x_k + e$$

is hypothesized, in which $y$ is the dependent variable (hypothesized to be the caused variable), $x_j$ for
$j = 1, \ldots, k$ is the $j$th independent variable (hypothesized to be a causative variable), and $e$ is the
error term (containing the combined effects of all other causative variables, which must be uncorrelated with
the included independent variables). If there is reason to believe that none of the $x_j$s is caused by $y$,
then estimates of the coefficients $a_j$ are obtained. If the null hypothesis that $a_j = 0$ is rejected, then
the alternative hypothesis that $a_j \neq 0$ and equivalently that $x_j$ causes $y$ cannot be rejected. On the
other hand, if the null hypothesis that $a_j = 0$ cannot be rejected, then equivalently the hypothesis of no
causal effect of $x_j$ on $y$ cannot be rejected. Here the notion of causality is one of contributory
causality: If the true value $a_j \neq 0$, then a change in $x_j$ will result in a change in $y$ unless some
other causative variable(s), either included in the regression or implicit in the error term, change in such a
way as to exactly offset its effect; thus a change in $x_j$ is not sufficient to change $y$. Likewise, a change
in $x_j$ is not necessary to change $y$, because a change in $y$ could be caused by something implicit in the
error term (or by some other causative explanatory variable included in the model).

Regression analysis controls for other relevant variables by including them as regressors (explanatory
variables). This helps to avoid mistaken inference of causality due to the presence of a third, underlying,
variable that influences both the potentially causative variable and the potentially caused variable: its effect
on the potentially caused variable is captured by directly including it in the regression, so that effect will not
be picked up as a spurious effect of the potentially causative variable of interest. In addition, the use of
multivariate regression helps to avoid wrongly inferring that an indirect effect of, say x1 (e.g., x1 → x2 → y)
is a direct effect (x1 → y).
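
The distinction between an indirect and a direct effect can be illustrated with simulated data. In the sketch below the causal chain is x1 → x2 → y with no direct path from x1 to y; the coefficients 0.7 and 1.5, the sample size, and the seed are arbitrary assumptions. Regressing y on x1 alone recovers the total effect (0.7 × 1.5 = 1.05), whereas including x2 as a regressor drives the coefficient on x1 toward its true direct effect of zero.

```python
import random


def centered(values):
    """Subtract the mean, which absorbs the regression intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def slope_one(y, x):
    """Slope from least squares of y on x with an intercept."""
    y, x = centered(y), centered(x)
    return dot(x, y) / dot(x, x)


def slopes_two(y, x1, x2):
    """Slopes (b1, b2) from least squares of y on x1 and x2 with an intercept."""
    y, x1, x2 = centered(y), centered(x1), centered(x2)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(5)
n = 5000

# Hypothetical chain: x1 affects y only through the mediator x2.
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [0.7 * a + rng.gauss(0.0, 1.0) for a in x1]
y = [1.5 * b + rng.gauss(0.0, 1.0) for b in x2]

total_effect = slope_one(y, x1)
direct_effect, mediator_effect = slopes_two(y, x1, x2)

print(f"y on x1 alone:     coefficient on x1 = {total_effect:.2f}")
print(f"y on x1 and x2:    coefficient on x1 = {direct_effect:.2f}, "
      f"on x2 = {mediator_effect:.2f}")
```

The explicit 2×2 solve is used only to keep the example free of third-party dependencies; a regression library would give the same coefficients.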

Just as an experimenter must be careful to employ an experimental design that controls for every
confounding factor, so also must the user of multiple regression be careful to control for all confounding
factors by including them among the regressors. If a confounding factor is omitted from the regression, its
effect is captured in the error term by default, and if the resulting error term is correlated with one (or more)
of the included regressors, then the estimated regression may be biased or inconsistent (see omitted variable
bias).
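
The size of this bias can be worked out for the simplest case. As a sketch, assume a true model with two regressors, written here as $y = a_0 + a_1 x_1 + a_2 x_2 + e$ with $e$ uncorrelated with both regressors (the coefficient names are notation assumed for this illustration), and suppose the confounder $x_2$ is left out. The slope obtained by regressing $y$ on $x_1$ alone then converges to:

```latex
\begin{aligned}
\operatorname{plim}\,\hat{a}_1^{\text{short}}
  &= \frac{\operatorname{Cov}(x_1, y)}{\operatorname{Var}(x_1)} \\
  &= \frac{\operatorname{Cov}(x_1,\; a_0 + a_1 x_1 + a_2 x_2 + e)}{\operatorname{Var}(x_1)} \\
  &= a_1 + a_2\,\frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}
     + \underbrace{\frac{\operatorname{Cov}(x_1, e)}{\operatorname{Var}(x_1)}}_{=\,0}.
\end{aligned}
```

The second term is the omitted-variable bias: it vanishes only if the omitted factor has no effect on $y$ ($a_2 = 0$) or is uncorrelated with $x_1$. For example, with made-up values $a_1 = 0$, $a_2 = 2$, $\operatorname{Cov}(x_1, x_2) = 0.8$ and $\operatorname{Var}(x_1) = 1.64$, the short regression converges to $2 \times 0.8 / 1.64 \approx 0.98$ even though $x_1$ has no causal effect on $y$ at all.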

In addition to regression analysis, the data can be examined to determine if Granger causality exists. The
presence of Granger causality indicates both that x precedes y, and that x contains unique information
about y.
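
The idea behind a Granger causality test can be sketched with a single lag: compare a regression of y on its own lag with one that also includes the lag of x, and ask whether the extra regressor reduces the residual sum of squares by more than chance would explain. The code below is a simplified illustration, not a full test procedure: the one-lag specification, the simulated coefficients, the sample size, and the seed are assumptions, and practical tests usually choose the lag length from the data and use a statistics library.

```python
import random


def centered(values):
    """Subtract the mean, which absorbs the regression intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def rss_one(y, x):
    """Residual sum of squares from regressing y on x with an intercept."""
    y, x = centered(y), centered(x)
    b = dot(x, y) / dot(x, x)
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))


def rss_two(y, x1, x2):
    """Residual sum of squares from regressing y on x1 and x2 with an intercept."""
    y, x1, x2 = centered(y), centered(x1), centered(x2)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return sum((yi - b1 * p - b2 * q) ** 2 for p, q, yi in zip(x1, x2, y))


def granger_f(cause, effect):
    """F statistic for H0: the lag of `cause` adds nothing to a one-lag model of `effect`."""
    target, own_lag, other_lag = effect[1:], effect[:-1], cause[:-1]
    rss_restricted = rss_one(target, own_lag)
    rss_unrestricted = rss_two(target, own_lag, other_lag)
    dof = len(target) - 3  # intercept plus two slopes
    return (rss_restricted - rss_unrestricted) / (rss_unrestricted / dof)


rng = random.Random(4)
n = 500

# Hypothetical process: x leads y by one period; y never feeds back into x.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0.0, 1.0))

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"F statistic, x -> y: {f_x_to_y:.1f}")
print(f"F statistic, y -> x: {f_y_to_x:.1f}")
```

With one restriction and a sample of this size, the 5% critical value of the F distribution is a little under 4, so a large statistic in the x → y direction is evidence that past x helps predict y. As the text notes, this establishes precedence and predictive information, not causation in the experimental sense.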

Other relationships
There are several other relationships defined in statistical analysis as follows.

Direct relationship
Mediating relationship
Moderating relationship

See also
Causality
Correlation does not imply causation
Illusory correlation
Model specification
Omitted-variable bias
Post hoc fallacy
Statistical model validation
One in ten rule

Literature
David A. Freedman (1983) A Note on Screening Regression Equations, The American
Statistician, 37:2, 152-155, DOI: 10.1080/00031305.1983.10482729

Footnotes
1. Burns, William C., "Spurious Correlations (https://web.archive.org/web/20190925212058/http://www.burns.com/wcbspurcorl.htm)", 1997.
2. Pearl, Judea. "UCLA 81st Faculty Research Lecture Series" (http://singapore.cs.ucla.edu/LECTURE/lecture_sec1b.htm). singapore.cs.ucla.edu. Retrieved 2019-11-10.
3. Yule, G. Udny (1926-01-01). "Why do we Sometimes get Nonsense-Correlations between Time-Series? A Study in Sampling and the Nature of Time-Series" (https://semanticscholar.org/paper/bcaa3dd240555b9e93197f49f34531abecf439e1). Journal of the Royal Statistical Society. 89 (1): 1–63. doi:10.2307/2341482 (https://doi.org/10.2307%2F2341482). JSTOR 2341482 (https://www.jstor.org/stable/2341482). S2CID 126346450 (https://api.semanticscholar.org/CorpusID:126346450).
4. Granger, Clive W. J.; Ghysels, Eric; Swanson, Norman R.; Watson, Mark W. (2001). Essays in Econometrics: Collected Papers of Clive W. J. Granger (https://archive.org/details/essaysineconomet0001gran). Cambridge University Press. ISBN 978-0521796491.
5. Sapsford, Roger; Jupp, Victor, eds. (2006). Data Collection and Analysis. Sage. ISBN 0-7619-4362-5.
6. Hofheimer, Bill (October 30, 2012). "'Redskins Rule': MNF's Hirdt on intersection of football & politics" (http://www.espnfrontrow.com/2012/10/redskins-rule-mnfs-hirdt-on-intersection-of-football-politics/). ESPN. Retrieved October 16, 2016.
7. Manker, Rob (November 7, 2012). "Redskins Rule: Barack Obama's victory over Mitt Romney tackles presidential predictor for its first loss" (http://articles.chicagotribune.com/2012-11-07/business/ct-talk-redskins-rule-1108-20121107_1_popular-vote-home-game-redskins-victory). Chicago Tribune. Retrieved November 8, 2012.
8. Pohl, Robert S. (2013). Urban Legends & Historic Lore of Washington (https://books.google.com/books?id=rZIVBAAAQBAJ). The History Press. pp. 78–80. ISBN 978-1625846648.
9. Don Peppers. "Big Data. Super Bowl. Small Minds" (http://www.linkedin.com/today/post/article/20130204035821-17102372-big-data-super-bowl-small-minds). Retrieved December 31, 2015.

References
Gumbel, E.J. (1926), "Spurious correlation and its significance to physiology", Journal of the American Statistical Association, 21 (154): 179–194, doi:10.1080/01621459.1926.10502169 (https://doi.org/10.1080%2F01621459.1926.10502169)
Banerjee, A.; Dolado, J.; Galbraith, J. W.; Hendry, D. F. (1993). Co-Integration, Error-Correction, and the Econometric Analysis of Non-Stationary Data. Oxford University Press. pp. 70–81. ISBN 0-19-828810-7.
Pearl, Judea (2000). Causality: Models, Reasoning and Inference (https://archive.org/details/causalitymodelsr0000pear). Cambridge University Press. ISBN 0521773628.

External links
Spurious correlations (http://www.tylervigen.com/spurious-correlations): a collection of examples

Retrieved from "https://en.wikipedia.org/w/index.php?title=Spurious_relationship&oldid=1166798804"
