How to Write Multiple Excel Files From Column Values - R programming
Last Updated :
28 Apr, 2025
A data frame is a cell-based structure comprising rows and columns belonging to the same or different data types. Each cell in the data frame is associated with a unique value, either a definite value or a missing value, indicated by NA. The data frame structure is in complete accordance with the Excel sheet structure. Therefore, the data frame values can be easily written into the Excel sheet, specified by the XLSX extension.
The XLSX package in R Programming Language can be used to perform read, write, and manipulation operations with Excel files. It is one of the most prominent packages available in all the varied environments to perform Excel document(xls and xlsx) formatting. It has to be downloaded and installed and downloaded into the working space using the command:
install.packages("xlsx")
Approach 1: Using the split() method
A data frame can be declared as containing the set of values using the in-built data.frame() method in R. The data frame may contain distinct or repetitive values in one or more columns. The split() method can be used to segregate the input data frame into multiple subsets of it depending on the values contained in the argument columns. The number of groups that are created depends on the number of distinct values obtained as a result of the application of the column condition. This method has the following syntax :
Syntax:
split ( dataframe , function)
Arguments :
- dataframe - The input data frame to be split
- function - The factor on the basis of which the data frame is to be divided.
The resultant data frames can then be subjected to the lapply() method iterating up to the number of data frames obtained. This method in R returns a list of the same length as that of the supplied data frame, each element of which is obtained as the result of applying FUN to the corresponding element of the input data frame.
Syntax:
lapply (X, FUN, ...)
Arguments:
- X - an atomic vector or data frame to apply the function on
- FUN - the function to be applied over each element of X.
Each time an iteration over the subset data frame is performed. the contents of the data frame are written onto an xlsx sheet whose complete path is provided and concatenated with the corresponding distinct cell value of the input column in the split() method. The number of sheets created on the path is equivalent to the number of different grouped data frames created using the split() method.
The following code snippet uses the column col3 of the data frame and the number of sheets corresponds to the unique values of the column:
R
# invoking the required packages
library(xlsx)
#creating a data frame
data_frame <- data.frame(col1=c(1:10),
col2=c("Anna","Mindy","Bindy",
"Tindy","Ron",
"Charles","Zoe","Dan",
"Lincoln","Burrows"),
col3=c("CS","CA","Eco","Eco","CA",
"Eco","CS","CS","CS","Eco"))
# segregating data based on the col3 values
data_mod <- split(data_frame, data_frame$col3)
# printing the obtained groups
print("Segregated dataframes")
print(data_mod)
Output :
[1] "Segregated dataframes"
$CA
col1 col2 col3
2 2 Mindy CA
5 5 Ron CA
$CS
col1 col2 col3
1 1 Anna CS
7 7 Zoe CS
8 8 Dan CS
9 9 Lincoln CS
$Eco
col1 col2 col3
3 3 Bindy Eco
4 4 Tindy Eco
6 6 Charles Eco
10 10 Burrows Eco
Now, let's create the XLSX files iteratively.
R
# getting the size of different
# number of groups obtained
size <- length(data_mod)
#creating number of lists equivalent
# to the size of the generated groups
lapply(1:size,
function(i)
write.xlsx(data_mod[[i]],
file = paste0("/Users/mallikagupta/Desktop/",
names(data_mod[i]), ".xlsx")))
Output:
[[1]]
NULL
[[2]]
NULL
[[3]]
NULL
Explanation:
Since there are three distinct groups belonging to the col3 of the data_frame, therefore, three different excel sheets are created at the specified path location, with the name corresponding to the distinct col3 value of that particular group. The entire rows of the input data frame are copied to the particular excel sheet that is specified.
Excel files created by using the data frame
Excel files with CA value as col3
Excel files with CS value as col3
Excel files with Eco value as col3Approach 2: Using the group_split() method
The group_split() method can also be used to divide the input data frame into multiple subsets based on the columns used as arguments. Columns or logical conditions can also be specified as the argument of this method. The method can also be supplied using the piping operator to the data frame.
Syntax:
group_split ( dataframe , function)
Arguments :
- dataframe - The input data frame to be split
- function - The factor on the basis of which the data frame is to be divided.
In the following code snippet, a condition is applied over column col4 in the input data frame. The condition checks if the value of col4 is greater than 2. Based on the boolean TRUE or FALSE value, two groups are made. Therefore, two excel files are created, one with a col4 value <=2 and the other with a col4 value >2 respectively.
R
# invoking the required packages
library("xlsx")
library("dplyr")
# creating a data frame
data_frame <- data.frame(col1 = c(1:10),
col2=c("Anna","Mindy","Bindy",
"Tindy","Ron",
"Charles","Zoe","Dan",
"Lincoln","Burrows"),
col3=c("CS","CA","Eco","Eco","CA",
"Eco","CS","CS","CS","Eco"),
col4=c(1,3,2,2,3,4,1,4,1,2))
# segregating data based on the boolean condition of
# whether the col3 values is greater that 2 or not
data_mod <- data_frame %>%
group_split(col4>2)
# printing the different groups created
print("Segregated data frames")
# two groups are created
print(data_mod)
Output:
[1] "Segregated data frames"
<list_of<
tbl_df<
col1 : integer
col2 : character
col3 : character
col4 : double
col4 > 2: logical
>[2]>
[[1]]
# A tibble: 6 × 5
col1 col2 col3 col4 `col4 > 2`
<int> <chr> <chr> <dbl> <lgl>
1 1 Anna CS 1 FALSE
2 3 Bindy Eco 2 FALSE
3 4 Tindy Eco 2 FALSE
4 7 Zoe CS 1 FALSE
5 9 Lincoln CS 1 FALSE
6 10 Burrows Eco 2 FALSE
[[2]]
# A tibble: 4 × 5
col1 col2 col3 col4 `col4 > 2`
<int> <chr> <chr> <dbl> <lgl>
1 2 Mindy CA 3 TRUE
2 5 Ron CA 3 TRUE
3 6 Charles Eco 4 TRUE
4 8 Dan CS 4 TRUE
R
# getting the size of different
# number of groups obtained
size <- length(data_mod)
# creating number of lists equivalent
# to the size of the generated groups
lapply(1:size,
function(i)
write.xlsx(data_mod[[i]],
file = paste0("/Users/mallikagupta/Desktop/new_",
i, ".xlsx")))
Output :
[[1]]
NULL
[[2]]
NULL
Explanation :
The last column in the returned set of Tibbles is created with the header "col4>2", and it points to two types of values, a logical TRUE and a FALSE value respectively. The col4 value is validated each time and then the entire row is copied onto the respective table.
Excel files with 3, and 4 values as col4
Excel files with 1, 2 values as col4
Similar Reads
Non-linear Components
In electrical circuits, Non-linear Components are electronic devices that need an external power source to operate actively. Non-Linear Components are those that are changed with respect to the voltage and current. Elements that do not follow ohm's law are called Non-linear Components. Non-linear Co
11 min read
Spring Boot Tutorial
Spring Boot is a Java framework that makes it easier to create and run Java applications. It simplifies the configuration and setup process, allowing developers to focus more on writing code for their applications. This Spring Boot Tutorial is a comprehensive guide that covers both basic and advance
10 min read
Class Diagram | Unified Modeling Language (UML)
A UML class diagram is a visual tool that represents the structure of a system by showing its classes, attributes, methods, and the relationships between them. It helps everyone involved in a projectâlike developers and designersâunderstand how the system is organized and how its components interact
12 min read
Steady State Response
In this article, we are going to discuss the steady-state response. We will see what is steady state response in Time domain analysis. We will then discuss some of the standard test signals used in finding the response of a response. We also discuss the first-order response for different signals. We
9 min read
Backpropagation in Neural Network
Back Propagation is also known as "Backward Propagation of Errors" is a method used to train neural network . Its goal is to reduce the difference between the modelâs predicted output and the actual output by adjusting the weights and biases in the network.It works iteratively to adjust weights and
9 min read
Polymorphism in Java
Polymorphism in Java is one of the core concepts in object-oriented programming (OOP) that allows objects to behave differently based on their specific class type. The word polymorphism means having many forms, and it comes from the Greek words poly (many) and morph (forms), this means one entity ca
7 min read
Decorators in Python
In Python, decorators are a powerful and flexible way to modify or extend the behavior of functions or methods, without changing their actual code. A decorator is essentially a function that takes another function as an argument and returns a new function with enhanced functionality. Decorators are
10 min read
3-Phase Inverter
An inverter is a fundamental electrical device designed primarily for the conversion of direct current into alternating current . This versatile device , also known as a variable frequency drive , plays a vital role in a wide range of applications , including variable frequency drives and high power
13 min read
What is Vacuum Circuit Breaker?
A vacuum circuit breaker is a type of breaker that utilizes a vacuum as the medium to extinguish electrical arcs. Within this circuit breaker, there is a vacuum interrupter that houses the stationary and mobile contacts in a permanently sealed enclosure. When the contacts are separated in a high vac
13 min read
AVL Tree Data Structure
An AVL tree defined as a self-balancing Binary Search Tree (BST) where the difference between heights of left and right subtrees for any node cannot be more than one. The absolute difference between the heights of the left subtree and the right subtree for any node is known as the balance factor of
4 min read