Plyr Package in R Programming

The Plyr package in R is designed for data manipulation, allowing users to split, apply, and combine data using functions like ddply(), ldply(), adply(), join(), and summarise(). It facilitates tasks such as aggregating data, transforming datasets, and joining data frames based on common columns. Users can install and load the package to perform various data operations efficiently.

Uploaded by

sibi00424


What is Plyr Package?

Plyr is a package for data manipulation in R that provides a set of functions for
splitting, applying, and combining data. It is based on the concept of split-apply-
combine, where a dataset is first split into smaller subsets, a function is applied to
each subset, and the results are then combined into a single output. This process is
useful for tasks such as aggregating data, summarizing data, and transforming data.
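
The three steps can also be written out by hand. The following is a minimal sketch that uses only base R and the built-in mtcars dataset, not plyr itself; the variable names (pieces, summaries, manual_result) are chosen for illustration only.

```r
# Split-apply-combine by hand, using only base R and the built-in mtcars data.

# Split: one data frame per distinct value of cyl
pieces <- split(mtcars, mtcars$cyl)

# Apply: reduce each piece to a one-row summary
summaries <- lapply(pieces, function(piece) {
  data.frame(cyl = piece$cyl[1], mean_mpg = mean(piece$mpg))
})

# Combine: stack the one-row summaries into a single data frame
manual_result <- do.call(rbind, summaries)
manual_result
```

The plyr functions described below wrap these three steps in a single call.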

Installing and Loading Plyr Package:

Before using the plyr package, it needs to be installed and loaded into R. The
package can be installed using the following command:

install.packages("plyr")

After the package is installed, it can be loaded into R using the following command:
library(plyr)

1. Splitting Data using ddply( ) functions:

The ddply() function splits a data frame into subsets, applies a function to each
subset, and combines the results into a new data frame. In plyr's naming scheme,
the first letter gives the input type and the second letter gives the output type, so
“ddply” takes a data frame (d) and returns a data frame (d). Here are the main
arguments of ddply():

`.data`: The input data frame that you want to split and process.

`.variables`: One or more grouping variables that define how the data should be
split, for example .(cyl).

`.fun`: A function that you want to apply to each subset of the data frame.

`...`: Additional arguments that are passed to the function specified in .fun.

Here’s an example of how to use ddply() to calculate the mean miles per gallon
(mpg) of cars in the mtcars dataset, grouped by the number of cylinders in the
engine:

library(plyr)

# Using ddply to group by number of cylinders and calculate mean mpg

ddply(mtcars, .(cyl), summarise, mean_mpg = mean(mpg))

In this example, ddply() is used to group the mtcars dataset by the cyl variable
(number of cylinders), and then the summarise() function is used to calculate the
mean mpg for each group. The resulting output is a data frame with two columns:
cyl and mean_mpg.
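
As a further sketch, ddply() can group by more than one variable, and the applied function does not have to be summarise(). The example below assumes plyr is installed; heaviest_car is a hypothetical helper written only for this illustration.

```r
library(plyr)

# Group by two variables; summarise() can compute several statistics at once
by_cyl_gear <- ddply(mtcars, .(cyl, gear), summarise,
                     mean_mpg = mean(mpg),
                     n_cars   = length(mpg))

# Any function that accepts a data frame can be used in place of summarise().
# heaviest_car is a hypothetical helper: it keeps the row with the largest wt.
heaviest_car <- function(piece) piece[which.max(piece$wt), c("mpg", "wt")]
heaviest <- ddply(mtcars, .(cyl), heaviest_car)

by_cyl_gear
heaviest
```

In both results ddply() adds the grouping columns itself, so the applied function only needs to return the computed values.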

2. Combining the results using ldply( ) function:

The ldply() function applies a function to each element of a list and combines the
results into a single data frame. In plyr's naming scheme, the “l” means the input is
a list and the “d” means the output is a data frame. Each element's result
contributes one or more rows, and the rows are stacked on top of each other in list
order. Here are the main arguments of ldply():

`.data`: The input list that you want to convert to a data frame.

`.fun`: An optional function that you want to apply to each element of the list
before the results are combined into a data frame.

`...`: Additional arguments that are passed to the function specified in .fun.

Example:
library(plyr)

# Create a list of data frames

countries_1 <- data.frame(country = c("USA", "Canada", "Mexico"), population =
c(328, 37, 130))
countries_2 <- data.frame(country = c("Brazil", "Argentina", "Chile"), population =
c(211, 45, 19))
countries_list <- list(countries_1, countries_2)

# Use ldply() to combine the list of data frames into a single data frame
combined_df <- ldply(countries_list, data.frame)

# View the resulting data frame

combined_df
In this example, we first create two data frames (countries_1 and countries_2)
using the data.frame() function. Then, we put these data frames into a list called
countries_list. Finally, we use the ldply() function to combine all the data frames in
countries_list into a single data frame called combined_df. The resulting data frame
has six rows, one for each country in the original data frames.
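
When the list has names, ldply() can record which element each row came from. The sketch below assumes a plyr version that supports the .id argument of ldply(); the list names (north, south) and the column name region are made up for this example.

```r
library(plyr)

# A named list; the names are hypothetical labels chosen for this sketch
regions <- list(
  north = data.frame(country = c("USA", "Canada"), population = c(328, 37)),
  south = data.frame(country = c("Brazil", "Chile"), population = c(211, 19))
)

# With .id set, the list names are stored in a column called "region"
labelled <- ldply(regions, .id = "region")

# .fun is applied to each element before the results are stacked
totals <- ldply(regions,
                function(df) data.frame(total = sum(df$population)),
                .id = "region")

labelled
totals
```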

3. Combining Data using adply( ) function:

The adply() function splits an array (or a data frame) along one or more of its
dimensions, applies a function to each piece, and combines the results into a new
data frame. The “a” in adply() stands for “array”, meaning that the input can be an
array with any number of dimensions, and the “d” means that the output is a data
frame. The arguments for adply() are:

`.data`: the input array, matrix, or data frame.

`.margins`: the dimensions of the array to split over (for a matrix, 1 splits by row
and 2 splits by column).

`.fun`: the function to apply to each piece of the array (for example, an anonymous
function that calculates the sum of each row).

`...`: additional arguments to pass to the function specified in .fun (if any).

Example:

library(plyr)

# Create a sample matrix

mat <- matrix(1:9, nrow = 3)

# Display created matrix

mat

# Use adply() to calculate the sum of each row

result <- adply(mat, 1, function(x) sum(x))

# View the result

result

In this example, the adply() function is used to apply the sum() function to each row
of the matrix mat. The second argument (1) specifies that we want to split the array
by its first dimension, so each piece consists of one row and all columns. The third
argument is an anonymous function that calculates the sum of that row. The
resulting result data frame has three rows (one for each row in mat) and two
columns: an index column identifying the row (named X1 by default) and a column
holding the sum (named V1 by default). The sums are 12, 15, and 18.
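
Splitting over the second dimension works the same way. The following sketch, which assumes plyr is installed, computes the mean and standard deviation of each column of the same matrix; col_stats is a name chosen for this illustration.

```r
library(plyr)

mat <- matrix(1:9, nrow = 3)

# Margin 2 splits the matrix into columns. Returning a named vector
# produces one output row per column, with one output column per name.
col_stats <- adply(mat, 2, function(x) c(mean = mean(x), sd = sd(x)))

col_stats
```

Each row of col_stats describes one column of mat, with the statistics in the columns mean and sd.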

4. Join Two Data Frames using join( ) function:

join() is a function from the plyr package in R that is used to join two data frames by
one or more common columns. The join() function takes several arguments,
including:

`x`, `y`: The data frames to join.

`by`: The column(s) to join the data frames on. If omitted, all columns common to
both data frames are used.

`type`: The type of join to perform: “left” (the default), “right”, “inner”, or “full”.

`match`: How to handle duplicate keys: “all” (the default) keeps every match, while
“first” keeps only the first match and is faster.

Example:

library(plyr)

# Create two sample data frames

df1 <- data.frame(
id = c(1, 2, 3),
name = c("Alice", "Bob", "Charlie")
)

df2 <- data.frame(

id = c(2, 3, 4),
age = c(25, 30, 35)
)

# Print the created dataset

df1
df2

# Use join() to combine the data frames

result <- join(df1, df2, by = "id")

# View the result

result

In this example, the join() function is used to combine two data frames (df1 and df2)
based on a common column (id). The by argument specifies the name of the
common column. Because the default join type is “left”, the resulting result data
frame has three columns (id, name, age) and three rows, one for each row of df1.
Bob and Charlie receive their ages from df2. Alice (id 1) has no match in df2, so her
age is NA. The row with id 4 exists only in df2 and is dropped.
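
The effect of the type argument is easiest to see side by side. This sketch assumes plyr is installed and reuses the same two data frames; the result names (left_rows, inner_rows, full_rows) are chosen for illustration.

```r
library(plyr)

df1 <- data.frame(id = c(1, 2, 3), name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(id = c(2, 3, 4), age = c(25, 30, 35))

left_rows  <- join(df1, df2, by = "id", type = "left")   # every row of df1
inner_rows <- join(df1, df2, by = "id", type = "inner")  # ids present in both
full_rows  <- join(df1, df2, by = "id", type = "full")   # ids present in either

nrow(left_rows)
nrow(inner_rows)
nrow(full_rows)
```

An inner join is the choice when only the rows with a match in both data frames should be kept.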

5. Summary Statistics using summarise( ) function:

The summarise() function in the plyr package of R creates a new data frame that
contains summary statistics computed from the columns of an existing one. On its
own it summarises the whole data frame; to calculate statistics by group, pass it to
ddply() as the function to apply. The summarise() function takes the following
arguments:

`.data`: The data frame to summarise.

`...`: Named expressions that calculate summary statistics (e.g. mean =
mean(value), sd = sd(value)).

Example:

# Load the plyr package

library(plyr)

# Create a data frame with two columns: group and value

df <- data.frame(group = c("A", "A", "B", "B", "B"), value = c(2, 4, 6, 8, 10))

# Summarize the data by group, calculating the
# mean and standard deviation of the value column
summary_df <- ddply(df, .(group), summarise, mean = mean(value), sd = sd(value))

# Print the summary data frame to the console

summary_df

In this code, we use ddply() to split df by the group column and apply summarise()
to each piece. (plyr has no group_by() function; that function belongs to the dplyr
package.) We calculate the mean and standard deviation of the value column using
the mean() and sd() functions, respectively, and give the resulting columns the
names mean and sd. The resulting summary_df data frame has a row for each
group in the original df data frame, with columns for group, mean, and sd.
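
To make the difference between whole-data and per-group summaries concrete, here is a short sketch that assumes plyr is installed. It calls plyr::summarise explicitly, because loading dplyr after plyr masks summarise() with dplyr's version; the names overall and per_group are chosen for illustration.

```r
library(plyr)

df <- data.frame(group = c("A", "A", "B", "B", "B"), value = c(2, 4, 6, 8, 10))

# Without ddply(), summarise() collapses the whole data frame to one row
overall <- plyr::summarise(df, mean = mean(value), sd = sd(value))

# Wrapped in ddply(), the same expressions are evaluated once per group
per_group <- ddply(df, .(group), plyr::summarise,
                   mean = mean(value), sd = sd(value))

overall
per_group
```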
