Aggregate data using custom functions in R
Last Updated :
12 Apr, 2024
In this article, we will explore several ways to aggregate data with custom functions in the R programming language.
What is a custom function?
Custom functions are an essential part of R programming. They allow users to create reusable blocks of code tailored to their specific needs. A custom function encapsulates a series of operations under one name, which makes code more readable and easier to maintain.
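As a minimal sketch of the idea, the snippet below defines a small custom function and calls it on a vector. The name value_range and the sample numbers are illustrative assumptions, not taken from the examples in this article.

```r
# Hypothetical custom function: spread between the largest and smallest value
value_range <- function(x) {
  max(x) - min(x)
}

# A custom function is called like any built-in function
spread <- value_range(c(10, 15, 20))
print(spread)
```

Once defined, a function like this can be reused anywhere a function is expected, including as the FUN argument of aggregate().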
How to aggregate data using custom functions
The aggregate() function in R splits a data frame into groups and applies a function to each group. The function is supplied through the FUN argument, so any custom function that takes a vector and returns a summary value can be used. The sections below aggregate data by sum, mean, median, and standard deviation.
Aggregating data by sum using the custom function
This method aggregates data by sum using a custom function. In the example below, we create a data frame of sales by date and compute the total sold in each month with a custom function that wraps sum().
R
# creating data frame
df <- data.frame(
date = as.Date(c("2024-01-01", "2024-01-15", "2024-02-10", "2024-02-20", "2024-03-20",
"2024-03-15")),
sold = c(100, 150, 200, 250,300,350)
)
print("The original dataframe is")
print(df)
# Custom function that returns the sum of a vector
result = function(x) {
return(sum(x))
}
print("After calculating the sum is")
sales_permonth <- aggregate(sold ~ format(date, "%Y-%m"),
data = df, FUN = result)
print(sales_permonth)
Output:
[1] "The original dataframe is"
date sold
1 2024-01-01 100
2 2024-01-15 150
3 2024-02-10 200
4 2024-02-20 250
5 2024-03-20 300
6 2024-03-15 350
[1] "After calculating the sum is"
format(date, "%Y-%m") sold
1 2024-01 250
2 2024-02 450
3 2024-03 650
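In the output above, the grouping column is named after the expression format(date, "%Y-%m"). One way to get a cleaner column name is to store the month in its own column before aggregating. The sketch below assumes the same data; the names month and total_sold are illustrative.

```r
# Same sales data as in the example above
df <- data.frame(
  date = as.Date(c("2024-01-01", "2024-01-15", "2024-02-10",
                   "2024-02-20", "2024-03-20", "2024-03-15")),
  sold = c(100, 150, 200, 250, 300, 350)
)

# Store the year-month label in its own column (assumed name: month)
df$month <- format(df$date, "%Y-%m")

# Custom function that returns the sum of a vector
total_sold <- function(x) sum(x)

# Group by the new column so the result has a readable header
sales_permonth <- aggregate(sold ~ month, data = df, FUN = total_sold)
print(sales_permonth)
```

The totals are the same as before; only the name of the grouping column changes.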
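In the example below, we create a data frame of goods and prices and compute the total price for each item. Here the built-in sum() is passed directly as FUN; a custom wrapper such as the one in the previous example would give the same result.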
R
goods=c("a","b","c","d","b","c","a")
prices=c(100,200,300,400,500,600,700)
#creating data frame
df = data.frame(goods,prices)
print(df)
print("After calculating the sum is")
res = aggregate(prices ~ goods , data = df, FUN = sum)
print(res)
Output:
goods prices
1 a 100
2 b 200
3 c 300
4 d 400
5 b 500
6 c 600
7 a 700
[1] "After calculating the sum is"
goods prices
1 a 800
2 b 700
3 c 900
4 d 400
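A custom function becomes more useful when it does something the built-in does not. The sketch below, using the same goods and prices, sums only the prices above a cutoff. The function sum_above and its threshold parameter are hypothetical; the example relies on aggregate() forwarding extra named arguments to FUN.

```r
goods <- c("a", "b", "c", "d", "b", "c", "a")
prices <- c(100, 200, 300, 400, 500, 600, 700)
df <- data.frame(goods, prices)

# Hypothetical custom function: sum only the values above a threshold
sum_above <- function(x, threshold) {
  sum(x[x > threshold])
}

# threshold = 250 is forwarded by aggregate() to sum_above()
res <- aggregate(prices ~ goods, data = df, FUN = sum_above, threshold = 250)
print(res)
```

Passing the cutoff as an argument keeps the function reusable, since the same definition can be called with different thresholds.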
Aggregating data by mean using the custom function
This method aggregates data by mean using a custom function. In the example below, we create a data frame of names and scores and compute the mean score for each name.
R
names=c("a","a","b","c","c","b")
scores=c(100,95,90,80,85,70)
# creating data frame
df = data.frame(names,scores)
print("The original dataframe is")
print(df)
# calculating mean
cal_mean = function(x) {
return(mean(x))
}
print("After calculating the mean is")
result = aggregate(scores ~names, data = df,
FUN = cal_mean)
print(result)
Output:
[1] "The original dataframe is"
names scores
1 a 100
2 a 95
3 b 90
4 c 80
5 c 85
6 b 70
[1] "After calculating the mean is"
names scores
1 a 97.5
2 b 80.0
3 c 82.5
In the example below, we create a data frame of teams and run rates and compute the mean run rate for each team.
R
team = c("csk", "rcb", "rcb", "srh", "srh","csk",'csk')
run_rate= c(80, 85, 70, 85, 85, 86, 95)
# creating data frame
df = data.frame(team, run_rate)
print("The original dataframe is")
print(df)
cal_mean = function(x) {
return(mean(x))
}
print("After calculating the mean is")
# Aggregating data by group
result <- aggregate(run_rate ~ team, data = df,
FUN = cal_mean)
print(result)
Output:
[1] "The original dataframe is"
team run_rate
1 csk 80
2 rcb 85
3 rcb 70
4 srh 85
5 srh 85
6 csk 86
7 csk 95
[1] "After calculating the mean is"
team run_rate
1 csk 87.0
2 rcb 77.5
3 srh 85.0
Aggregating data by median using the Custom Function
This method aggregates data by median using a custom function. In the example below, we create a data frame of categories and values and compute the median value for each category.
R
# Sample data
prices <- data.frame(
category = c("A", "A","A", "B", "B","B", "C", "C","C"),
values = c(10, 15, 20, 23, 30, 25, 40, 55, 60)
)
print("The original dataframe is")
print(prices)
# calculating median
cal_median = function(x) {
return(median(x))
}
result = aggregate(values ~ category,
data = prices, FUN = cal_median)
print("After calculating the median is")
print(result)
Output:
[1] "The original dataframe is"
category values
1 A 10
2 A 15
3 A 20
4 B 23
5 B 30
6 B 25
7 C 40
8 C 55
9 C 60
[1] "After calculating the median is"
category values
1 A 15
2 B 25
3 C 55
In the example below, we create a data frame of names and numbers and compute the median of r_no for each name.
R
name=c("a","b","c","b","a","b")
r_no=c(350,355,355,360,365,370)
# creating data frame
product_prices = data.frame(name, r_no )
print("The original dataframe is")
print(product_prices)
# To calculate median
calculate_median = function(x) {
return(median(x))
}
res<- aggregate(r_no~ name, data = product_prices,
FUN = calculate_median)
print(res)
Output:
[1] "The original dataframe is"
name r_no
1 a 350
2 b 355
3 c 355
4 b 360
5 a 365
6 b 370
name r_no
1 a 357.5
2 b 360.0
3 c 355.0
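A custom function can also return more than one statistic per group. In that case aggregate() stores the results as a matrix inside a single column, so a common follow-up step is to flatten it with do.call(data.frame, ...). The sketch below reuses the category data from the first median example; the function name med_and_spread is an illustrative assumption.

```r
prices <- data.frame(
  category = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  values = c(10, 15, 20, 23, 30, 25, 40, 55, 60)
)

# Hypothetical custom function returning two named statistics per group
med_and_spread <- function(x) {
  c(median = median(x), spread = max(x) - min(x))
}

res <- aggregate(values ~ category, data = prices, FUN = med_and_spread)

# res$values is a matrix column; flatten it into ordinary columns
flat <- do.call(data.frame, res)
print(flat)
```

After flattening, the statistics appear as separate columns named values.median and values.spread.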
Aggregating data by standard deviation using the Custom Function
This method aggregates data by standard deviation using a custom function. In the example below, we create a data frame of batches and numbers and compute the standard deviation of number for each batch.
R
batch = c("x", "y", "x", "y", "x","x")
number = c(20, 35, 20, 34, 25,40)
df <- data.frame(batch, number)
print(df)
cus_sd <- function(x) {
return(sd(x, na.rm = TRUE))
}
res = aggregate(number ~ batch, data = df, FUN = cus_sd)
print(res)
Output:
batch number
1 x 20
2 y 35
3 x 20
4 y 34
5 x 25
6 x 40
batch number
1 x 9.4648472
2 y 0.7071068
In the example below, we create a data frame of student names and CGPA values and compute the standard deviation of cgpa for each student.
R
names = c("raju", "ravi", "rakesh", "raju", "rakesh","ravi")
cgpa = c(7.5, 8.5, 7.0, 9.5, 8.8, 8.0)
df <- data.frame(names, cgpa)
print(df)
cus_sd <- function(x) {
return(sd(x, na.rm = TRUE))
}
print("After calculating the standard deviation is")
res = aggregate( cgpa ~ names, data = df, FUN = cus_sd)
print(res)
Output:
names cgpa
1 raju 7.5
2 ravi 8.5
3 rakesh 7.0
4 raju 9.5
5 rakesh 8.8
6 ravi 8.0
[1] "After calculating the standard deviation is"
names cgpa
1 raju 1.4142136
2 rakesh 1.2727922
3 ravi 0.3535534
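The custom functions above pass na.rm = TRUE to sd(), although the sample data has no missing values. With the formula interface, aggregate() drops rows containing NA by default (na.action = na.omit) before the custom function is called, so na.rm only matters when na.action = na.pass is used. The sketch below illustrates this with one assumed missing value added to the batch data.

```r
batch <- c("x", "y", "x", "y", "x", "x")
number <- c(20, 35, NA, 34, 25, 40)  # one value assumed missing for illustration
df <- data.frame(batch, number)

cus_sd <- function(x) {
  return(sd(x, na.rm = TRUE))
}

# Default: the row with NA is dropped before FUN is called
res_default <- aggregate(number ~ batch, data = df, FUN = sd)

# na.pass keeps the NA row, so plain sd() returns NA for batch "x"
res_strict <- aggregate(number ~ batch, data = df, FUN = sd,
                        na.action = na.pass)

# The custom function removes the NA itself, so the result is usable again
res_custom <- aggregate(number ~ batch, data = df, FUN = cus_sd,
                        na.action = na.pass)

print(res_default)
print(res_strict)
print(res_custom)
```

Handling NA inside the custom function is mainly useful when several columns are aggregated at once and a missing value in one column should not discard the whole row.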
Conclusion
In conclusion, we learned how to aggregate data with custom functions in R. Because aggregate() accepts any function through its FUN argument, the same pattern works for sums, means, medians, standard deviations, and other group summaries.