Influential Observation

This document discusses influential observations in statistics. An influential observation is one whose removal would noticeably change the results of a statistical calculation, particularly the parameter estimates in regression analysis. Several methods are described for measuring influence, including DFBETA which measures the difference in parameter estimates with and without the observation. High leverage points and outliers are also discussed as atypical observations that can strongly influence the regression line. The bottom datasets in Anscombe's quartet provide examples of influential points and outliers.

Uploaded by

sophia787

Influential observation

In statistics, an influential observation is an observation for a statistical calculation whose deletion from the dataset would noticeably change the result of the calculation.[1] In particular, in regression analysis an influential observation is one whose deletion has a large effect on the parameter estimates.[2]

Assessment
Figure: In Anscombe's quartet the two datasets on the bottom both contain influential points. All four sets are identical when examined using simple summary statistics, but vary considerably when graphed. If one point is removed, the line would look very different.

Various methods have been proposed for measuring influence.[3][4] Assume an estimated regression $\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}$, where $\mathbf{y}$ is an n×1 column vector for the response variable, $\mathbf{X}$ is the n×k design matrix of explanatory variables (including a constant), $\mathbf{e}$ is the n×1 residual vector, and $\mathbf{b}$ is a k×1 vector of estimates of some population parameter $\beta$. Also define $\mathbf{H} \equiv \mathbf{X}(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}$, the projection matrix of $\mathbf{X}$. Then we have the following measures of influence:

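1. $\text{DFBETA}_i \equiv \mathbf{b} - \mathbf{b}_{(-i)} = \dfrac{(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{x}_i^{\mathsf{T}} e_i}{1 - h_i}$, where $\mathbf{b}_{(-i)}$ denotes the coefficients estimated with the i-th row $\mathbf{x}_i$ of $\mathbf{X}$ deleted, and $h_i = \mathbf{x}_i(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{x}_i^{\mathsf{T}}$ denotes the i-th value of matrix $\mathbf{H}$'s main diagonal. Thus DFBETA measures the difference in each parameter estimate with and without the influential point. There is a DFBETA for each variable and each observation (if there are N observations and k variables there are N·k DFBETAs).[5] The table shows DFBETAs for the third dataset from Anscombe's quartet (bottom left chart in the figure):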
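| x    | y     | intercept | slope   |
|------|-------|-----------|---------|
| 10.0 | 7.46  | -0.005    | -0.044  |
| 8.0  | 6.77  | -0.037    | 0.019   |
| 13.0 | 12.74 | -357.910  | 525.268 |
| 9.0  | 7.11  | -0.033    | 0       |
| 11.0 | 7.81  | 0.049     | -0.117  |
| 14.0 | 8.84  | 0.490     | -0.667  |
| 6.0  | 6.08  | 0.027     | -0.021  |
| 4.0  | 5.39  | 0.241     | -0.209  |
| 12.0 | 8.15  | 0.137     | -0.231  |
| 7.0  | 6.42  | -0.020    | 0.013   |
| 5.0  | 5.73  | 0.105     | -0.087  |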

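2. DFFITS (difference in fits) measures the scaled change in the fitted value for an observation when that observation is deleted.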

3. Cook's D measures the effect of removing a data point on all the parameters combined.[2]
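
The three measures above can be sketched numerically as follows. This is a minimal NumPy illustration, assuming a simple regression of y on a constant and x for the third Anscombe dataset (x and y taken from the table above). The DFFITS and Cook's D formulas used here are the common textbook definitions, which the text above does not spell out, and all variable names are illustrative. Software often reports a scaled variant of DFBETA (usually called DFBETAS), so the raw differences computed here need not match the tabulated values.

```python
import numpy as np

# Third Anscombe dataset (x and y columns of the table above).
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])

n = len(x)
X = np.column_stack([np.ones(n), x])           # n x k design matrix with a constant
k = X.shape[1]

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                          # OLS coefficient estimates
e = y - X @ b                                  # residual vector
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)    # main diagonal of H (leverages)

# Raw DFBETA_i = (X'X)^{-1} x_i' e_i / (1 - h_i); row i holds (intercept, slope).
dfbeta = (X @ XtX_inv) * (e / (1 - h))[:, None]

# Cross-check against an explicit refit with row i deleted.
def refit_without(i):
    keep = np.arange(n) != i
    coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    return coef

dfbeta_refit = np.array([b - refit_without(i) for i in range(n)])
assert np.allclose(dfbeta, dfbeta_refit)

# Assumed textbook definitions (not given in the text above):
s2 = e @ e / (n - k)                                   # residual variance
s2_loo = ((n - k) * s2 - e**2 / (1 - h)) / (n - k - 1) # variance with row i deleted
t = e / np.sqrt(s2_loo * (1 - h))                      # externally studentized residuals
dffits = t * np.sqrt(h / (1 - h))
cooks_d = e**2 * h / (k * s2 * (1 - h) ** 2)

print("largest Cook's D at row", int(np.argmax(cooks_d)))
print("largest |DFFITS| at row", int(np.argmax(np.abs(dffits))))
```

Under these assumptions, both summary measures single out the observation at x = 13.0, y = 12.74, the same row that dominates the DFBETA table.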

Outliers, leverage and influence

An outlier may be defined as a data point that differs significantly from other observations.[6][7] A high-leverage point is an observation made at extreme values of the independent variables.[8] Both types of atypical observations will force the regression line to be close to the point.[2] In Anscombe's quartet, the bottom right image has a point with high leverage and the bottom left image has an outlying point.
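
The distinction between an outlier and a high-leverage point can be illustrated on the two bottom datasets of Anscombe's quartet. The sketch below is a hedged example: it assumes the bottom left chart is the third dataset and the bottom right chart is the fourth, uses the standard published values for the fourth dataset (they do not appear in the text above), and the helper name `leverage_and_residuals` is hypothetical.

```python
import numpy as np

def leverage_and_residuals(x, y):
    """Return (h, e): hat-matrix diagonal and OLS residuals for y ~ 1 + x."""
    X = np.column_stack([np.ones(len(x)), x])
    H = X @ np.linalg.inv(X.T @ X) @ X.T   # projection (hat) matrix
    e = y - H @ y                          # residuals
    return np.diag(H), e

# Third dataset (assumed to be the bottom left chart).
x3 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])

# Fourth dataset (assumed to be the bottom right chart; standard published values).
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])

h3, e3 = leverage_and_residuals(x3, y3)
h4, e4 = leverage_and_residuals(x4, y4)

i3 = int(np.argmax(np.abs(e3)))   # row with the largest residual in dataset III
i4 = int(np.argmax(h4))           # row with the largest leverage in dataset IV

print(f"III: row {i3}, residual {e3[i3]:.3f}, leverage {h3[i3]:.3f}")
print(f"IV:  row {i4}, residual {e4[i4]:.3f}, leverage {h4[i4]:.3f}")
```

In the third dataset the unusual point shows up as a large residual at moderate leverage. In the fourth, the single point at x = 19 has leverage equal to 1, so the fitted line passes through it and its residual is zero; it would not be flagged by residuals alone.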

See also
Influence function (statistics)
Outlier
Leverage
Partial leverage
Regression analysis
Cook's distance § Detecting highly influential observations
Anomaly detection

References
1. Burt, James E.; Barber, Gerald M.; Rigby, David L. (2009), Elementary Statistics for Geographers (https://books.google.com/books?id=p7YMOPuu8ugC&pg=PA513), Guilford Press, p. 513, ISBN 9781572304840.
2. Everitt, Brian (1998). The Cambridge Dictionary of Statistics (https://archive.org/details/cambridgediction00ever_0). Cambridge, UK New York: Cambridge University Press. ISBN 0-521-59346-8.
3. Winner, Larry (March 25, 2002). "Influence Statistics, Outliers, and Collinearity Diagnostics" (http://stat.ufl.edu/~winner/sta6127/influence.doc).
4. Belsley, David A.; Kuh, Edwin; Welsch, Roy E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity (https://books.google.com/books?id=GECBEUJVNe0C&pg=PA11). Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. pp. 11–16. ISBN 0-471-05856-4.
5. "Outliers and DFBETA" (http://www.albany.edu/faculty/kretheme/PAD705/SupportMat/DFBETA.pdf) (PDF). Archived (https://web.archive.org/web/20130511013229/http://www.albany.edu/faculty/kretheme/PAD705/SupportMat/DFBETA.pdf) (PDF) from the original on May 11, 2013.
6. Grubbs, F. E. (February 1969). "Procedures for detecting outlying observations in samples". Technometrics. 11 (1): 1–21. doi:10.1080/00401706.1969.10490657 (https://doi.org/10.1080%2F00401706.1969.10490657). "An outlying observation, or "outlier," is one that appears to deviate markedly from other members of the sample in which it occurs."
7. Maddala, G. S. (1992). "Outliers" (https://books.google.com/books?id=nBS3AAAAIAAJ&pg=PA89). Introduction to Econometrics (https://archive.org/details/introductiontoec00madd/page/89) (2nd ed.). New York: MacMillan. pp. 89 (https://archive.org/details/introductiontoec00madd/page/89). ISBN 978-0-02-374545-4. "An outlier is an observation that is far removed from the rest of the observations."
8. Everitt, B. S. (2002). Cambridge Dictionary of Statistics. Cambridge University Press. ISBN 0-521-81099-X.

Further reading
Dehon, Catherine; Gassner, Marjorie; Verardi, Vincenzo (2009). "Beware of 'Good' Outliers and Overoptimistic Conclusions". Oxford Bulletin of Economics and Statistics. 71 (3): 437–452. doi:10.1111/j.1468-0084.2009.00543.x (https://doi.org/10.1111%2Fj.1468-0084.2009.00543.x). S2CID 154376487 (https://api.semanticscholar.org/CorpusID:154376487).
Kennedy, Peter (2003). "Robust Estimation" (https://books.google.com/books?id=B8I5SP69e4kC&pg=PA372). A Guide to Econometrics (Fifth ed.). Cambridge: The MIT Press. pp. 372–388. ISBN 0-262-61183-X.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Influential_observation&oldid=1159896875"
