Multiple Imputation Presentation

The document discusses multiple imputation (MI) as a method for handling missing data, emphasizing the importance of understanding the types of missing data (MCAR, MAR, MNAR) and the implications for analysis. It provides guidelines for implementing MI, including determining the appropriate number of imputations and the variables to include in the imputation model. The talk encourages discussion on best practices and alternatives to MI, highlighting the need for transparency in the imputation process.

Uploaded by

Wiji Astuti

MULTIPLE IMPUTATION

Adrienne D. Woods
Methods Hour Brown Bag
April 14, 2017
A COLLECTIVIST APPROACH TO BEST
PRACTICES

• As I began learning about MI last semester, I realized that there are a lot of
guidelines that are not often followed…
• …or, if they are, nobody reports what they did!
• …or, guidelines that are outdated and/or different across disciplines

• This talk is…

• Focused primarily on large samples (ECLS-K ~21,400)…
• …on issues associated with MNAR data
• …in the hopes of sharing what I’ve learned (and mitigating future frustration)
• …Open to debate/discussion!
THE WHY: MISSING DATA!

DISCUSS: Why might you choose to impute data?

• Most commonly, folks impute due to issues of power associated with reduced sample size
• Several methods of dealing with missing data…but also, several less efficient/poorer alternatives than MI (e.g., mean substitution)
• “Missing by design” studies
THE WHY: TYPES OF MISSING DATA

• Missing Completely at Random

• Missing at Random
• Missing Not at Random
• DISCUSS: How do you define this?
THE WHY: TYPES OF MISSING DATA

• Missing Not at Random

• Graham (2009): “non-ignorable missingness”

[Figure: diagram of SMOKING1, PROGRAM, and SMOKING2]

• Tabachnick & Fidell (2013): MNAR is related to the DV, as determined by significant t-tests with the DV
• η² for effect sizes in large samples
THE WHY: TYPES OF MISSING DATA

• Missing Not at Random

• Issue: no way to truly determine MAR vs. MNAR in your data

“[Controlling] variables that help account for the mechanisms resulting in missing data (e.g., race/ethnicity, age, gender, SES)…leads to a reasonable assumption of missing at random (MAR).” (Hibel, Farkas, & Morgan, 2010)

Is this good enough?

Even if researchers have MNAR data, they typically still impute…

• T&F (2013) recommend modeling predictors of missingness alongside other variables as dummies
• In small samples with nonnormality, MI performed similarly to FIML (Shin, Davison, & Long, 2016)
• But, *estimates will still be biased!*
THE WHAT: WHAT IS MULTIPLE
IMPUTATION?

“To the uninitiated, multiple imputation is a bewildering technique that
differs substantially from conventional statistical approaches. As a result,
the first-time user may get lost in a labyrinth of imputation models, missing
data mechanisms, multiple versions of the data, pooling, and so on.”

– Van Buuren & Groothuis-Oudshoorn (2011)

THE WHAT: WHAT IS MULTIPLE
IMPUTATION?

• Single imputation methods (mean replacement, regression, etc.) assume perfect estimation of imputed values and ignore between-imputation variability
• May result in artificially small standard errors and increased likelihood of Type I errors, and are only appropriate for MCAR data
• Imputed values from single regression imputation always lie right on the regression line; but, real data always deviate from the regression line by some amount
• MI creates several datasets with estimated values for missing information
• Incorporates uncertainty into the standard errors of the estimates by accounting for variability between imputed solutions

Acock, 2005; Graham, 2009; Hibel, Farkas, & Morgan, 2010; Schafer, 1999
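To make the last bullet concrete, here is a minimal Python sketch of how between-imputation variability enters the pooled standard error under Rubin's rules. The five estimates and standard errors are made-up numbers for illustration only, not results from this talk, and the function name pool_rubin is hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool m point estimates and their standard errors with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                        # pooled point estimate
    w = mean([se ** 2 for se in std_errors])       # within-imputation variance
    b = variance(estimates)                        # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b                        # total variance
    return q_bar, t ** 0.5

# Hypothetical coefficient estimates from m = 5 imputed datasets (illustrative only)
est = [0.50, 0.55, 0.45, 0.60, 0.40]
se = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_est, pooled_se = pool_rubin(est, se)
naive_se = mean(se)  # what you would report if you ignored between-imputation variability
print(f"pooled estimate = {pooled_est:.2f}, pooled SE = {pooled_se:.3f}, naive SE = {naive_se:.3f}")
```

The pooled SE comes out larger than the naive SE because the estimates disagree across imputed datasets; a single imputation has no way to see that disagreement.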
THE WHAT: WHAT IS MULTIPLE
IMPUTATION?

Van Buuren & Groothuis-Oudshoorn (2011): Seven Choices

BEFORE VS. AFTER MI
THE WHAT: WHAT IS MULTIPLE
IMPUTATION?
Van Buuren & Groothuis-Oudshoorn (2011): Seven Choices

THE HOW: GUIDELINES FOR MI

1. Decide whether data are MAR or MNAR; the latter requires additional modeling assumptions
2. Form of imputation model
• Depends on scale of each variable to be imputed
• Incorporates knowledge about relationship between variables
Van Buuren & Groothuis-Oudshoorn (2011): Seven Choices

THE HOW: GUIDELINES FOR MI

3. Which variables should you include as predictors in the imputation model?

• Any variables you plan to use in later analyses (including controls)
• General advice: use as many as possible (could get unwieldy!)
• Although, some (e.g., Kline, 2005; Hardt, Herke, & Leonhart, 2012) believe that this introduces more imprecision, especially if the auxiliary variable explains less than 10% of the variance in missingness on Y… thoughts?
AN EXAMPLE…

                      Math Competency                   School Belongingness
                      Attempt 1        Attempt 2        Attempt 1        Attempt 2
                      Std. B (SE)      Std. B (SE)      Std. B (SE)      Std. B (SE)
Constant              0.54 (.61)       1.39 (.75)       1.97 (.43)***    2.08 (.54)***
Male                  0.06 (.06)       0.05 (.06)       -0.04 (.04)      -0.04 (.04)
Black                 0.23 (.09)**     0.13 (.07)       -0.10 (.06)      -0.05 (.05)
Hispanic              0.04 (.07)       0.03 (.07)       -0.08 (.05)      -0.05 (.05)
Asian                 -0.06 (.15)      -0.01 (.14)      0.02 (.10)       0.02 (.09)
K-8 Read Gain         -0.22 (.15)      -0.22 (.13)      -0.01 (.10)      0.08 (.10)
K-8 Math Gain         0.83 (.17)***    0.78 (.16)***    0.09 (.02)       0.07 (.11)
Special Ed. Dosage    0.08 (.03)**     0.07 (.03)*      0.04 (.02)+      0.05 (.02)*
Special Ed. Recency   0.01 (.03)       0.02 (.02)       -0.01 (.02)      -0.01 (.02)
+p < .10, *p < .05, **p < .01, ***p < .001

Stata Code (second attempt)

mi impute chained ///
    (pmm, knn(10)) R1_KAGE WKSESL WKMOMED C7SDQRDC C7SDQMTC C7SDQINT ///
        C7LOCUS C7CONCPT belong peers C1R4RSCL C1R4MSCL readgain mathgain ///
        C5SDQRDC C5SDQMTC C5SDQINT C6SDQRDC C6SDQMTC C6SDQINT C5SDQPRC ///
        C6SDQPRC T1LEARN T1CONTRO T1INTERP T1INTERN T1EXTERN P1NUMSIB ///
    (logit) youngma retained single headst premat ///
    (ologit) C7HOWFAR C7LONLY C7SAD sped_dos ///
    = sped_rec race_r gender, ///
    add(1) rseed(53421) burnin(100) dots force augment

What I changed:
- Accidentally left out three variables that I wanted to use in my analysis model as autoregressive controls (bolded on the original slide)
- Both m = 70
- Predictors of interest are Special Ed. Dosage and Special Ed. Recency (did not impute into the latter)
Van Buuren & Groothuis-Oudshoorn (2011): Seven Choices

THE HOW: GUIDELINES FOR MI

4. Imputing variables that are functions of other (incomplete) variables

• Sum scores, interaction variables, ratios, etc…
• DON’T transform! (could impute outliers; Graham, 2009)
• Standardized variables??? (my guess is no…)
5. Order in which variables should be imputed
6. Setup of starting imputations and the number of iterations
• Includes k-nearest neighbors if using predictive mean matching
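For anyone unsure what the knn setting controls, the following is a simplified Python sketch of predictive mean matching with one predictor. It is an illustration under stated assumptions, not what Stata or mice actually run: real implementations also draw the regression coefficients from their posterior before matching, which this sketch skips. The function name pmm_impute and the toy data are hypothetical.

```python
import random

def pmm_impute(x, y, k=10, seed=53421):
    """Fill None entries of y using simplified predictive mean matching on x."""
    rng = random.Random(seed)
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]

    # Ordinary least squares of y on x, fitted on the observed cases only.
    n = len(obs)
    mx = sum(xi for xi, _ in obs) / n
    my = sum(yi for _, yi in obs) / n
    sxx = sum((xi - mx) ** 2 for xi, _ in obs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in obs) / sxx
    intercept = my - slope * mx

    def predict(xi):
        return intercept + slope * xi

    filled = []
    for xi, yi in zip(x, y):
        if yi is not None:
            filled.append(yi)          # observed values are never changed
            continue
        target = predict(xi)
        # The k observed cases whose predicted values are closest to the target.
        donors = sorted(obs, key=lambda d: abs(predict(d[0]) - target))[:k]
        filled.append(rng.choice(donors)[1])   # borrow one donor's observed value
    return filled

# Toy data (illustrative only): y is roughly 2x with a small zigzag, three values missing
x = list(range(20))
y = [2.0 * xi + (1 if xi % 2 else -1) for xi in x]
for i in (3, 10, 17):
    y[i] = None

completed = pmm_impute(x, y, k=3)
print([completed[i] for i in (3, 10, 17)])
```

Because every imputed value is borrowed from an observed case, PMM cannot produce impossible values; a larger k gives more donor variety at the cost of looser matches.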
Van Buuren & Groothuis-Oudshoorn (2011): Seven Choices

THE HOW: GUIDELINES FOR MI

7. How many multiply imputed datasets, m, should you create?

• Previously, m = 3-5 considered acceptable in social sciences
• But, your estimates can change, especially if you have MNAR data…
e.g., with m = 3, p = .04… with m = 10, p = .08
• “Impute one dataset, see how long it takes, and then base your decision about m on time
constraints and software capability.” (Van Buuren & Groothuis-Oudshoorn, 2011)
NO.
New rule: more is better!
• “Setting m too low may result in large simulation error, especially if the fraction of missing
information is high.”
THE HOW: GUIDELINES FOR MI

• Fraction of Missing Information (FMI)
• Statistical formula based on the amount of missing data in the simplest case (Rubin, 1987)
• Rule of thumb: set m at least equal to the percentage of incomplete cases; the FMI will typically be smaller than the fraction of incomplete cases
• Target for the efficiency of imputations: FMI/m ≈ .01 (or smaller)
• Annoying in that this depends on m, but m depends on FMI (Spratt et al., 2010)
• But, you could impute a few datasets, check FMI, then impute again…then check FMI again! (White, Royston, & Wood, 2011; Graham, Olchowski, & Gilreath, 2007)
AN EXAMPLE…
First, imputed one dataset to make sure the code worked without error. Then, imputed up to m = 4 to check FMI:

Multiple-imputation estimates        Imputations     =          4
Multinomial logistic regression      Number of obs   =      4,359
                                     Average RVI     =     0.2141
                                     Largest FMI     =     0.6596
DF adjustment:   Large sample        DF:     min     =       8.65
                                             avg     = 143,247.46
                                             max     =   1.94e+07
Model F test:       Equal FMI        F( 165,15025.7) =       4.43
Within VCE type:       Robust        Prob > F        =     0.0000

FMI/m = 0.6596/4 = .165

Then, imputed another 46 datasets to get to m = 50, and checked FMI again:

Multiple-imputation estimates        Imputations     =         50
Multinomial logistic regression      Number of obs   =      4,359
                                     Average RVI     =     0.1927
                                     Largest FMI     =     0.3521
DF adjustment:   Large sample        DF:     min     =     402.64
                                             avg     =  28,528.17
                                             max     = 813,522.80
Model F test:       Equal FMI        F( 145,259060.3) =      4.81
Within VCE type:       Robust        Prob > F        =     0.0000

FMI/m = 0.3521/50 = .007
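The check applied to the two runs above is simple arithmetic, sketched here in Python. The .01 target is the rule of thumb from this talk; the helper names fmi_per_imputation and min_imputations are hypothetical.

```python
import math

def fmi_per_imputation(largest_fmi, m):
    """Ratio used in the rule of thumb: largest FMI divided by number of imputations."""
    return largest_fmi / m

def min_imputations(largest_fmi, target=0.01):
    """Smallest m that brings FMI/m down to the target, taking the FMI as given."""
    return math.ceil(largest_fmi / target)

# (largest FMI, m) pairs read off the two Stata outputs above
runs = [(0.6596, 4), (0.3521, 50)]
for fmi, m in runs:
    ratio = fmi_per_imputation(fmi, m)
    print(f"m = {m:>2}: FMI/m = {ratio:.3f}, m implied by this FMI = {min_imputations(fmi)}")
```

Note that the FMI estimate itself moved from .66 to .35 between runs, so the implied m is only as good as the current FMI estimate; that is why the check is repeated after adding imputations.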
SOFTWARE PACKAGES

• R – mice package
• Completely syntax-based, can get out of hand for uninitiated/beginners
• Stata – multiple imputation feature (mi commands)
• Subsequent data analyses are run with “mi estimate:” as a prefix to the analysis command
• SPSS – multiple imputation feature
• Creates one dataset or imputes X separate datasets (useful for HLM, for example)
• But, limited in options
• e.g., can’t manipulate knn
CO-CONSTRUCTED KNOWLEDGE &
DISCUSSION:

Main Take-Aways:
• First, always know what type of missing data you are working with
• Base m on FMI – rule of thumb is FMI/m < .01
• Know your analysis model beforehand and include at least all analysis variables in imputation model
(including interaction terms)
• Above all, be explicit about your choices.
• Include software you used to impute, auxiliary variables, etc.
• If not written out in actual manuscript, add to appendices!

…Other discussion points or best practices?

…What might be some alternatives to multiple imputation that folks could use, and why?
THANK YOU! 
REFERENCES

Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576.
Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8(3), 206-213.
Rubin, D. B. (1987). Comment. Journal of the American Statistical Association, 82(398), 543-546.
Shin, T., Davison, M. L., & Long, J. D. (2016). Maximum likelihood versus multiple imputation for missing data in small longitudinal samples with nonnormality.
Spratt, M., Carpenter, J., Sterne, J. A., Carlin, J. B., Heron, J., Henderson, J., & Tilling, K. (2010). Strategies for multiple imputation in longitudinal studies. American Journal of Epidemiology, 172(4), 478-487.
Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics (6th ed.). Pearson.
Van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3).
White, I. R., Royston, P., & Wood, A. M. (2011). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377-399.
RELATIVE EFFICIENCY OF M

“The variability between sets of imputations depends on both the number of imputations used and the
fraction of missing information. However, the fraction of missing information is itself estimated using the
between- and within-imputation variances, and thus may have substantial variability when estimated from
small numbers of imputations. Monte Carlo variation among sets of small numbers of imputations can be
substantial enough to materially affect conclusions, particularly where the original data set is small. One
approach might be to estimate the Monte Carlo variation and use that to decide the appropriate number of
imputations.” (p. 486, Spratt et al., 2010)

“The early literature focused on efficiency, and the conclusion was that you could usually get by with three
to five data sets. Schafer (1999) upped that number slightly when he stated that “Unless rates of missing
information are unusually high, there tends to be little or no practical benefit to using more than five to ten
imputations.” That conclusion was based on Rubin’s formula for relative efficiency: 1/(1+F/M) where F is the
fraction of missing information and M is the number of imputations. Thus, even with 50% missing
information, five imputed data sets would produce point estimates that were 91% as efficient as those based
on an infinite number of imputations. Ten data sets would yield 95% efficiency. But what’s good enough for
efficiency isn’t necessarily good enough for standard error estimates, confidence intervals, and p-values.”
(Allison, 2012)
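The efficiency figures in the Allison quote can be reproduced directly from the formula 1/(1+F/M). A small Python sketch follows; the function name relative_efficiency is hypothetical.

```python
def relative_efficiency(f, m):
    """Rubin's relative efficiency of m imputations versus infinitely many.

    f is the fraction of missing information, m the number of imputations.
    """
    return 1 / (1 + f / m)

# 50% missing information, as in the quote
for m in (5, 10, 50):
    print(f"M = {m:>2}: relative efficiency = {relative_efficiency(0.5, m):.3f}")
```

With F = .5 this gives about .909 for M = 5 and .952 for M = 10, matching the 91% and 95% in the quote. The gain from 10 to 50 imputations looks tiny on this scale, which is the quote's point: efficiency of the point estimate is not the right yardstick for standard errors and p-values.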
