MICE

The document discusses various R packages for multiple imputation, with a focus on the MICE package, which uses Multivariate Imputation via Chained Equations to handle missing data. It explains how MICE assumes missing data is Missing at Random (MAR) and provides methods for imputing missing values based on other variables. Practical examples are given, including generating missing values, visualizing missing data patterns, and performing imputations using the MICE package.

Uploaded by

Hemant sharma

PACKAGES USED FOR MULTIPLE IMPUTATION

List of R Packages
1. MICE
2. Amelia
3. missForest
4. Hmisc
5. mi

MICE Package
MICE (Multivariate Imputation via Chained Equations) is one of the commonly used
packages among R users. Creating multiple imputations, rather than a single imputation
(such as the mean), accounts for the uncertainty in the missing values.

MICE assumes that the missing data are Missing at Random (MAR), which means that
the probability that a value is missing depends only on the observed values and can be
predicted using them. It imputes data on a variable-by-variable basis by specifying an
imputation model per variable.

For example: Suppose we have variables X1, X2, ..., Xk. If X1 has missing values, then it
will be regressed on the other variables X2 to Xk. The missing values in X1 will then be
replaced by the predicted values obtained. Similarly, if X2 has missing values, then X1
and X3 to Xk will be used as independent variables in the prediction model, and the
missing values in X2 will be replaced with the predicted values.
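
The regress-and-replace cycle described above can be sketched in base R. This is a simplified illustration, not the mice implementation: it starts from a column-mean fill (an assumption made here), uses plain lm() predictions, and adds no random noise, whereas mice draws imputations at random so that they reflect uncertainty.

```r
# Simplified sketch of a chained-equations cycle (not the mice implementation).
set.seed(1)
dat <- iris[, 1:4]
# Knock out a few values in two columns for illustration
dat$Sepal.Length[sample(nrow(dat), 10)] <- NA
dat$Petal.Width[sample(nrow(dat), 10)] <- NA

miss <- is.na(dat)                 # remember where the gaps were
filled <- dat
# Step 0: start from a crude fill (column means)
for (v in names(filled)) {
  filled[[v]][miss[, v]] <- mean(dat[[v]], na.rm = TRUE)
}

# Cycle: regress each incomplete variable on all the others, fitting only on
# rows where it was observed, and overwrite its originally missing cells
# with the predictions.
for (iter in 1:5) {
  for (v in names(filled)[colSums(miss) > 0]) {
    fit <- lm(reformulate(setdiff(names(filled), v), response = v),
              data = filled[!miss[, v], ])
    filled[[v]][miss[, v]] <- predict(fit, newdata = filled[miss[, v], ])
  }
}

summary(filled)
```

Only the cells that were missing change from one pass to the next; the observed values are never touched.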

By default, mice imputes continuous (numeric) variables with predictive mean matching,
which builds on a linear regression fit. Logistic regression is used for binary
categorical variables. Once this cycle is complete, multiple data sets are generated.
These data sets differ only in the imputed values. Generally, it’s considered good
practice to build models on these data sets separately and then combine their results.

Precisely, the methods used by this package are:

1. pmm (Predictive Mean Matching) – For numeric variables
2. logreg (Logistic Regression) – For binary variables (factors with 2 levels)
3. polyreg (Polytomous logistic regression) – For unordered factor variables (> 2 levels)
4. polr (Proportional odds model) – For ordered factor variables (> 2 levels)
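
To make the first method less abstract, here is a minimal base-R sketch of the idea behind predictive mean matching for a single numeric variable. The function name pmm_sketch and the donor count k = 5 are assumptions for illustration. The sketch uses a plain lm() fit, whereas mice also draws the regression parameters at random, so treat it as the core idea rather than the package's algorithm.

```r
# Simplified predictive mean matching for one numeric variable.
# Assumption: a plain lm() fit; mice additionally draws the regression
# parameters at random so that imputations reflect model uncertainty.
pmm_sketch <- function(y, x, k = 5) {
  obs <- !is.na(y)
  fit <- lm(y ~ ., data = data.frame(y = y, x)[obs, , drop = FALSE])
  pred <- predict(fit, newdata = data.frame(x))
  y_imp <- y
  for (i in which(!obs)) {
    # the k observed cases whose predicted value is closest to case i's
    donors <- order(abs(pred[obs] - pred[i]))[seq_len(k)]
    # borrow the *observed* value of one randomly chosen donor
    y_imp[i] <- y[obs][donors[sample.int(k, 1)]]
  }
  y_imp
}

set.seed(42)
y <- iris$Sepal.Width
y[sample(length(y), 15)] <- NA
imputed <- pmm_sketch(y, iris[, c("Sepal.Length", "Petal.Length", "Petal.Width")])
```

Because every imputed value is copied from a real observation, the imputations always stay within the range of values actually seen in the data.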

Let’s understand it practically now.

> path <- "../Data/Tutorial"
> setwd(path)

#load data
> data <- iris

#Get summary
> summary(iris)

MICE assumes that values are missing at random, so let’s seed missing values in our
data set using the prodNA function. You can access this function by installing the
missForest package.

#Generate 10% missing values at random (prodNA comes from missForest)

> install.packages("missForest")
> library(missForest)
> iris.mis <- prodNA(iris, noNA = 0.1)

#Check missing values introduced in the data

> summary(iris.mis)
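
If you are curious what this step amounts to, the following base-R sketch blanks out a fixed share of cells chosen completely at random, which is a special case of MAR. The helper name make_missing is hypothetical and the seed is added here only for reproducibility. It mirrors the idea rather than reproducing the missForest source.

```r
# Base-R sketch of blanking out a fixed share of cells at random.
make_missing <- function(x, prop = 0.1) {
  n_cells <- nrow(x) * ncol(x)
  hit <- rep(FALSE, n_cells)
  # pick floor(prop * cells) distinct cells, regardless of row or column
  hit[sample(n_cells, floor(n_cells * prop))] <- TRUE
  x[matrix(hit, nrow = nrow(x), ncol = ncol(x))] <- NA
  x
}

set.seed(123)
iris_na <- make_missing(iris, prop = 0.1)
sum(is.na(iris_na))   # number of cells set to NA
```

Note that the 10% applies to the table as a whole, so individual columns end up with slightly more or fewer missing values.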

Next, I’ll remove the categorical variable so that we can focus on continuous values
here. To treat a categorical variable, simply encode the levels and follow the
procedure below.

#remove categorical variables

> iris.mis <- subset(iris.mis, select = -c(Species))
> summary(iris.mis)
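
If you would rather keep a categorical column, the encoding mentioned above could look like the sketch below (variable names are illustrative). Keep in mind that integer codes impose an order on the levels, and that mice can also impute factor columns directly with the logreg, polyreg and polr methods listed earlier.

```r
# Integer-encode a factor, then map the codes back to labels afterwards.
species <- iris$Species
species_code <- as.integer(species)   # codes follow the order of levels(species)
species_back <- factor(levels(species)[species_code], levels = levels(species))

table(species_code)
```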

#install MICE
> install.packages("mice")
> library(mice)

The mice package has a function called md.pattern(). It returns a table of the
missing-value patterns across the variables in a data set.

> md.pattern(iris.mis)

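Let’s understand this table. The exact counts depend on the random draw made by
prodNA, so yours will differ. Each row is a missingness pattern, and the number at the
start of the row is how many observations show it. In this run, there are 98
observations with no missing values, 10 observations with a missing value in
Sepal.Length, 13 with a missing value in Sepal.Width, and so on.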
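
The same kind of pattern table can be approximated in base R, which helps when reading the md.pattern() output. This is a sketch on freshly generated missing values (20 cells, chosen here for illustration), so the counts will not match the ones quoted above.

```r
# Base-R look at missingness patterns (1 = observed, 0 = missing).
set.seed(7)
x <- as.matrix(iris[, 1:4])
x[sample(length(x), 20)] <- NA        # blank out 20 distinct cells

# One string per row, e.g. "1011" means the second variable is missing
pattern <- apply(!is.na(x), 1, function(r) paste(as.integer(r), collapse = ""))
pattern_counts <- sort(table(pattern), decreasing = TRUE)  # rows per pattern
missing_per_var <- colSums(is.na(x))                        # gaps per column

pattern_counts
missing_per_var
```

The pattern "1111" corresponds to the complete rows, and the per-column totals correspond to the bottom row of the md.pattern() table.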

This looks ugly, right? We can also create a visual which represents missing values. It
looks pretty cool too. Let’s check it out.

> install.packages("VIM")
> library(VIM)
> mice_plot <- aggr(iris.mis, col=c('navyblue','yellow'),
                    numbers=TRUE, sortVars=TRUE,
                    labels=names(iris.mis), cex.axis=.7,
                    gap=3, ylab=c("Missing data","Pattern"))

Let’s quickly understand this. In this run, 67% of the observations have no missing
value. There are 10% missing values in Petal.Length, 8% missing values in Petal.Width
and so on. You can also look at the histogram, which shows the proportion of missing
values in each variable.
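
A quick sanity check on the 67% figure: if each of the 4 variables is missing about 10% of its values, roughly independently, the expected share of complete rows is 0.9^4, or about 66%. The sketch below works this out and compares it with one simulated draw. The independence assumption is only approximate here, because a fixed number of cells is blanked out.

```r
# Expected share of complete rows under roughly independent 10% missingness
p_missing <- 0.10
n_vars <- 4
expected_complete <- (1 - p_missing)^n_vars   # 0.9^4 = 0.6561

# One simulated draw on the four numeric iris columns
set.seed(2024)
x <- as.matrix(iris[, 1:4])
x[sample(length(x), floor(length(x) * p_missing))] <- NA
share_complete <- mean(complete.cases(x))     # observed share of complete rows
share_missing_by_var <- colMeans(is.na(x))    # observed share missing per column

expected_complete
share_complete
share_missing_by_var
```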

Now, let’s impute the missing values.

> imputed_Data <- mice(iris.mis, m=5, maxit = 50, method = 'pmm', seed = 500)
> summary(imputed_Data)

Multiply imputed data set

Call:
mice(data = iris.mis, m = 5, method = "pmm", maxit = 50, seed = 500)
Number of multiple imputations: 5
Missing cells per column:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
          13           14           16           15
Imputation methods:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       "pmm"        "pmm"        "pmm"        "pmm"
VisitSequence:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
           1            2            3            4
PredictorMatrix:
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length            0           1            1           1
Sepal.Width             1           0            1           1
Petal.Length            1           1            0           1
Petal.Width             1           1            1           0
Random generator seed value: 500

Here is an explanation of the parameters used:

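1. m – Number of imputed data sets to create (5 here)
2. maxit – Number of iterations of the chained equations run for each imputation (50 here)
3. method – Imputation method to use; we used predictive mean matching ('pmm')
4. seed – Seed for the random number generator, so the imputations can be reproduced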

#check imputed values

> imputed_Data$imp$Sepal.Width

Since there are 5 imputed data sets, you can select any of them using the complete() function.

#get complete data ( 2nd out of 5)

> completeData <- complete(imputed_Data,2)

Also, if you wish to build models on all 5 data sets, you can do it in one go
using the with() command. You can also combine the results from these models and obtain
a consolidated output using the pool() command.

#build predictive model

> fit <- with(data = imputed_Data, expr = lm(Sepal.Width ~ Sepal.Length +
                                              Petal.Width))

#combine results of all 5 models

> combine <- pool(fit)
> summary(combine)
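
pool() combines the per-data-set results using what are usually called Rubin's rules. The base-R sketch below writes them out for a single coefficient. The five estimates and squared standard errors are made-up numbers for illustration only, not output from the iris example.

```r
# Rubin's rules for one coefficient, written out by hand.
# Hypothetical inputs: estimates and squared standard errors from m = 5 fits.
est <- c(0.52, 0.49, 0.55, 0.50, 0.53)
se2 <- c(0.010, 0.011, 0.009, 0.010, 0.012)
m <- length(est)

pooled_est  <- mean(est)          # pooled point estimate
within_var  <- mean(se2)          # average sampling variance (W)
between_var <- var(est)           # variance of the estimates across imputations (B)
total_var   <- within_var + (1 + 1 / m) * between_var
pooled_se   <- sqrt(total_var)

c(estimate = pooled_est, std.error = pooled_se)
```

The between-imputation term is what a single imputation leaves out: the more the five estimates disagree, the larger the pooled standard error becomes.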

Please note that I’ve used the command above just for demonstration purposes. You can
replace the variables with your own and try it.
