Package ‘grr’

October 13, 2022

Title Alternative Implementations of Base R Functions
Version 0.9.5
Author Craig Varrichio <[email protected]>
Maintainer Craig Varrichio <[email protected]>
Description Alternative implementations of some base R functions, including sort, order,
and match. Functions are simplified but can be faster or have other advantages.
Depends R (>= 3.0.0)
License GPL-3
RoxygenNote 5.0.1
NeedsCompilation yes
Repository CRAN
Date/Publication 2016-08-26 20:35:38

R topics documented:
convertBase
extract
grr
matches
order2
sample2
sort2

Index

convertBase Convert string representations of numbers in any base to any other base.

Description
Convert string representations of numbers in any base to any other base.

Usage
convertBase(x, base1 = 10, base2 = 10)

Arguments
x a vector of integers or strings to be converted
base1 the base of x
base2 the base of the output
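
The manual gives no example for convertBase, so the following is a minimal base-R sketch of the same kind of conversion, not the package's implementation. It relies on strtoi, as.hexmode, and as.octmode from base R; the helpers to_base_sketch and convert_sketch are hypothetical names introduced here, and they assume small non-negative integers and bases from 2 to 36.

```r
# Base R can already parse strings written in a given base ...
strtoi("ff", base = 16L)     # hexadecimal "ff" is 255
strtoi("1010", base = 2L)    # binary "1010" is 10

# ... and format integers as hexadecimal or octal strings.
format(as.hexmode(255L))     # "ff"
format(as.octmode(511L))     # "777"

# Hypothetical helper (not part of grr): write a non-negative integer
# as a string of digits in the given base (2 to 36).
to_base_sketch <- function(n, base) {
  digits <- c(0:9, letters)  # "0".."9", "a".."z"
  if (n == 0) return("0")
  out <- character(0)
  while (n > 0) {
    out <- c(digits[n %% base + 1], out)  # prepend the lowest digit
    n <- n %/% base
  }
  paste(out, collapse = "")
}

# Hypothetical helper (not part of grr): parse each element in base1,
# then re-write it in base2.
convert_sketch <- function(x, base1 = 10L, base2 = 10L) {
  vapply(as.character(x),
         function(s) to_base_sketch(strtoi(s, base = base1), base2),
         character(1), USE.NAMES = FALSE)
}

hex_to_bin <- convert_sketch("ff", 16L, 2L)
dec_to_hex <- convert_sketch(c(10, 255), 10L, 16L)
print(hex_to_bin)
print(dec_to_hex)
```

The sketch goes through R integers, so it cannot handle values beyond the integer range; whether convertBase shares that limit is not stated in this manual.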

extract Extract/return parts of objects

Description
Alternative to built-in Extract or [. Allows for extraction operations that are agnostic to the data
type of the object. For example, extract(x,i) will work on lists, vectors, data frames, matrices,
etc.

Usage
extract(x, i = NULL, j = NULL)

Arguments

x object from which to extract elements

i, j indices specifying elements to extract. Can be numeric, character, or logical
vectors.

Details

Extraction is 2-100x faster on data frames than with the built-in [ operation, but it does not
preserve row names.
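
The row-name caveat is why the examples reset row names before calling identical(). A small base-R sketch (no grr calls; the toy data frame and index vector are made up for illustration) shows what base [ does to row names when rows are repeated:

```r
# Toy data, made up for illustration.
orders <- data.frame(orderNum = 1:5, sku = c(10L, 20L, 30L, 40L, 50L))
idx <- c(3L, 3L, 1L)          # row 3 is selected twice (oversampling)

b <- orders[idx, ]
kept_names <- rownames(b)     # original row names, made unique by base R
print(kept_names)

rownames(b) <- NULL           # reset to the default 1..n row names
reset_names <- rownames(b)
print(reset_names)
```

A result that does not carry the original row names can only be compared with the base-R result after such a reset.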

Examples
#Typically about twice as fast on normal subselections
orders<-data.frame(orderNum=1:1e5,
sku=sample(1e3, 1e5, TRUE),
customer=sample(1e4,1e5,TRUE))
a<-sample(1e5,1e4)
system.time(b<-orders[a,])
system.time(c<-extract(orders,a))
rownames(b)<-NULL
rownames(c)<-NULL
identical(b,c)

#Speedup increases to 50-100x with oversampling

a<-sample(1e5,1e6,TRUE)
system.time(b<-orders[a,])
system.time(c<-extract(orders,a))
rownames(b)<-NULL
rownames(c)<-NULL
identical(b,c)

#Can create function calls that work for multiple data types
alist<-as.list(1:50)
avector<-1:50
extract(alist,1:5)
extract(avector,1:5)
extract(orders,1:5)

## Not run:
orders<-data.frame(orderNum=as.character(sample(1e5, 1e6, TRUE)),
sku=sample(1e3, 1e6, TRUE),
customer=sample(1e4,1e6,TRUE))
system.time(a<-sample(1e6,1e7,TRUE))
system.time(b<-orders[a,])
system.time(c<-extract(orders,a))

## End(Not run)

grr Alternative Implementations of Base R Functions

Description
Alternative implementations of some base R functions, including sort, order, and match. Functions
are simplified but can be faster or have other advantages. See the documentation of individual
functions for details and benchmarks.

Details
Note that these functions cannot be considered drop-in replacements for the functions in base R.
They do not implement all the same parameters and do not work for all data types. Utilize these
with caution in specialized applications that require them.

matches Value Matching

Description
Returns a lookup table or list of the positions of ALL matches of its first argument in its second and
vice versa. Similar to match, though that function only returns the first match.

Usage
matches(x, y, all.x = TRUE, all.y = TRUE, list = FALSE, indexes = TRUE,
nomatch = NA)

Arguments
x vector. The values to be matched. Long vectors are not currently supported.
y vector. The values to be matched. Long vectors are not currently supported.
all.x logical; if TRUE, then each value in x will be included even if it has no matching
values in y
all.y logical; if TRUE, then each value in y will be included even if it has no matching
values in x
list logical. If TRUE, the result will be returned as a list of vectors, each vector
being the matching values in y. If FALSE, result is returned as a data frame with
repeated values for each match.
indexes logical. Whether to return the indices of the matches or the actual values.
nomatch the value to be returned in the case when no match is found. If not provided
and indexes=TRUE, items with no match will be represented as NA. If set to
NULL, items with no match will be set to an index value of length+1. If
indexes=FALSE, they will default to NA.

Details
This behavior can be imitated by using joins to create lookup tables, but matches is simpler and
faster: usually faster than the best joins in other packages and thousands of times faster than the
built-in merge.
all.x/all.y correspond to the four types of database joins in the following way:

left     all.x=TRUE,  all.y=FALSE
right    all.x=FALSE, all.y=TRUE
inner    all.x=FALSE, all.y=FALSE
full     all.x=TRUE,  all.y=TRUE

Note that NA values will match other NA values.
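
To make the join correspondence concrete, here is a slow base-R sketch that builds a table of index pairs under the same all.x/all.y rules. It is an illustration only: match_pairs_sketch is a hypothetical helper, not grr's implementation, and its row order and column names are assumptions that need not agree with what matches returns.

```r
# Hypothetical helper (not part of grr): index pairs for every match of x in y.
match_pairs_sketch <- function(x, y, all.x = TRUE, all.y = TRUE) {
  # For each x[i], collect every position j where y[j] matches x[i].
  # match() treats NA as matching NA, in line with the note above.
  hits <- lapply(seq_along(x), function(i) {
    which(match(y, x[i], nomatch = 0L) > 0L)
  })
  pairs <- data.frame(x = rep(seq_along(x), lengths(hits)),
                      y = as.integer(unlist(hits)))
  if (all.x) {
    # Keep x positions without a match, paired with NA.
    lone_x <- which(lengths(hits) == 0L)
    pairs <- rbind(pairs,
                   data.frame(x = lone_x,
                              y = rep(NA_integer_, length(lone_x))))
  }
  if (all.y) {
    # Keep y positions without a match, paired with NA.
    lone_y <- setdiff(seq_along(y), pairs$y)
    pairs <- rbind(pairs,
                   data.frame(x = rep(NA_integer_, length(lone_y)),
                              y = lone_y))
  }
  pairs
}

x <- c(1, 2, 2, NA)
y <- c(2, NA, 3, 2)
inner_pairs <- match_pairs_sketch(x, y, all.x = FALSE, all.y = FALSE)
left_pairs  <- match_pairs_sketch(x, y, all.x = TRUE,  all.y = FALSE)
right_pairs <- match_pairs_sketch(x, y, all.x = FALSE, all.y = TRUE)
full_pairs  <- match_pairs_sketch(x, y, all.x = TRUE,  all.y = TRUE)
print(full_pairs)
```

Each 2 in x pairs with both 2s in y, and the NA in x pairs with the NA in y, so the inner result has five rows; the left and right variants each add one unmatched row, and the full variant adds both.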

Examples
one<-as.integer(1:10000)
two<-as.integer(sample(1:10000,1e3,TRUE))
system.time(a<-lapply(one, function (x) which(two %in% x)))
system.time(b<-matches(one,two,all.y=FALSE,list=TRUE))

#Only retain items from one with a match in two

b<-matches(one,two,all.x=FALSE,all.y=FALSE,list=TRUE)
length(b)==length(unique(two))

one<-round(runif(1e3),3)
two<-round(runif(1e3),3)
system.time(a<-lapply(one, function (x) which(two %in% x)))
system.time(b<-matches(one,two,all.y=FALSE,list=TRUE))

one<-as.character(1:1e5)
two<-as.character(sample(1:1e5,1e5,TRUE))
system.time(b<-matches(one,two,list=FALSE))
system.time(c<-merge(data.frame(key=one),data.frame(key=two),all=TRUE))

## Not run:
one<-as.integer(1:1000000)
two<-as.integer(sample(1:1000000,1e5,TRUE))
system.time(b<-matches(one,two,indexes=FALSE))
if(requireNamespace("dplyr",quietly=TRUE))
system.time(c<-dplyr::full_join(data.frame(key=one),data.frame(key=two)))
if(require(data.table,quietly=TRUE))
system.time(d<-merge(data.table(data.frame(key=one))
,data.table(data.frame(key=two))
,by='key',all=TRUE,allow.cartesian=TRUE))

one<-as.character(1:1000000)
two<-as.character(sample(1:1000000,1e5,TRUE))
system.time(a<-merge(one,two)) #Times out
system.time(b<-matches(one,two,indexes=FALSE))
if(requireNamespace("dplyr",quietly=TRUE))
system.time(c<-dplyr::full_join(data.frame(key=one),data.frame(key=two)))
if(require(data.table,quietly=TRUE))
{
system.time(d<-merge(data.table(data.frame(key=one))
,data.table(data.frame(key=two))
,by='key',all=TRUE,allow.cartesian=TRUE))
identical(b[,1],as.character(d$key))
}

## End(Not run)

order2 Ordering vectors

Description
Simplified implementation of order. For large vectors, it is typically about 3x faster for numbers
and 20x faster for characters.

Usage
order2(x)

Arguments
x a vector of class numeric, integer, character, factor, or logical. Long vectors are
not supported.

Examples
chars<-as.character(sample(1e3,1e4,TRUE))
system.time(a<-order(chars))
system.time(b<-order2(chars))
identical(chars[a],chars[b])

ints<-as.integer(sample(1e3,1e4,TRUE))
system.time(a<-order(ints))
system.time(b<-order2(ints))
identical(ints[a],ints[b])

nums<-runif(1e4)
system.time(a<-order(nums))
system.time(b<-order2(nums))
identical(nums[a],nums[b])

logs<-as.logical(sample(0:1,1e6,TRUE))
system.time(a<-order(logs))
system.time(b<-order2(logs))
identical(logs[a],logs[b])

facts<-as.factor(as.character(sample(1e3,1e4,TRUE)))
system.time(a<-order(facts))
system.time(b<-order2(facts))
identical(facts[a],facts[b])

#How are special values like NA and Inf handled?

#For numerics, values sort intuitively, with the important note that NA and
#NaN will come after all real numbers but before Inf.
(function (x) x[order2(x)])(c(1,2,NA,NaN,Inf,-Inf))
#For characters, values sort correctly with NA at the end.
(function (x) x[order2(x)])(c('C','B',NA,'A'))
#For factors, values sort correctly with NA at the end.
(function (x) x[order2(x)])(as.factor(c('C','B',NA,'A')))

#Ordering a data frame using order2

df<-data.frame(one=as.character(1:4e5),
two=sample(1:1e5,4e5,TRUE),
three=sample(letters,4e5,TRUE),stringsAsFactors=FALSE)
system.time(a<-df[order(df$one),])
system.time(b<-df[order2(df$one),])
system.time(a<-df[order(df$two),])
system.time(b<-df[order2(df$two),])

## Not run:
chars<-as.character(sample(1e5,1e6,TRUE))
system.time(a<-order(chars))
system.time(b<-order2(chars))

ints<-as.integer(sample(1e5,1e6,TRUE))
system.time(result<-order(ints))
system.time(result<-order2(ints))

nums<-runif(1e6)
system.time(result<-order(nums))
system.time(result<-order2(nums))

logs<-as.logical(sample(0:1,1e7,TRUE))
system.time(result<-order(logs))
system.time(result<-order2(logs))

facts<-as.factor(as.character(sample(1e5,1e6,TRUE)))
system.time(a<-order(facts))
system.time(b<-order2(facts))
identical(facts[a],facts[b])

## End(Not run)
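
For comparison with the special-value note in the examples above, this base-R sketch shows where base order places NA and NaN for the same input vector. It uses only base R and makes no claim about order2 beyond what the example comments state.

```r
x <- c(1, 2, NA, NaN, Inf, -Inf)

# Default na.last = TRUE: NA and NaN are placed after every number, Inf included.
ordered_last <- x[order(x)]

# na.last = FALSE puts them first; na.last = NA drops them.
ordered_first <- x[order(x, na.last = FALSE)]
ordered_drop  <- x[order(x, na.last = NA)]

print(ordered_last)
print(ordered_first)
print(ordered_drop)
```

Because the examples describe order2 as placing NA and NaN before Inf, results from order and order2 should not be expected to agree on numeric vectors that contain both missing values and Inf.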

sample2 A wrapper for sample.int and extract that makes it easy to quickly sample rows
from any object, including Matrix and sparse matrix objects.

Description
Row names are not preserved.

Usage
sample2(x, size, replace = FALSE, prob = NULL)

Arguments
x object from which to extract elements
size a positive number, the number of items to choose.
replace Should sampling be with replacement?
prob A vector of probability weights for obtaining the elements of the vector being
sampled.

Examples

#Sampling from a list

l1<-as.list(1:1e6)
b<-sample2(l1,1e5)

#Sampling from a data frame

orders<-data.frame(orderNum=sample(1e5, 1e6, TRUE),
sku=sample(1e3, 1e6, TRUE),
customer=sample(1e4,1e6,TRUE),stringsAsFactors=FALSE)

a<-sample2(orders,250000)

#With oversampling sample2 can be much faster than the alternatives,
#with the caveat that it does not preserve row names.
system.time(a<-sample2(orders,2000000,TRUE))
system.time(b<-orders[sample.int(nrow(orders),2000000,TRUE),])
## Not run:

system.time(c<-dplyr::sample_n(orders,2000000,replace=TRUE))

#Can quickly sample for sparse matrices while preserving sparsity

sm<-Matrix::rsparsematrix(20000000,10000,density=.0001)
sm2<-sample2(sm,1000000)

## End(Not run)
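
The title describes sample2 as a wrapper for sample.int and extract. As a rough analogue, the sketch below pairs sample.int with base [ so that one call works on vectors, lists, matrices, and data frames. The helper sample_rows_sketch is hypothetical and is not the package's code; it makes no attempt to reproduce sample2's speed or its sparse-matrix support.

```r
# Hypothetical helper (not part of grr): sample elements or rows with base R.
sample_rows_sketch <- function(x, size, replace = FALSE, prob = NULL) {
  two_d <- length(dim(x)) == 2L               # matrix or data frame?
  n <- if (two_d) nrow(x) else length(x)
  idx <- sample.int(n, size, replace = replace, prob = prob)
  out <- if (two_d) x[idx, , drop = FALSE] else x[idx]
  # Mirror the documented behavior: row names are not preserved.
  if (is.data.frame(out)) rownames(out) <- NULL
  out
}

set.seed(1)
orders <- data.frame(orderNum = 1:100, sku = sample(10, 100, TRUE))
s_df   <- sample_rows_sketch(orders, 250, replace = TRUE)  # oversampling
s_list <- sample_rows_sketch(as.list(1:50), 5)
s_mat  <- sample_rows_sketch(matrix(1:20, nrow = 10), 3)
print(dim(s_df))
print(dim(s_mat))
```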

sort2 Sorting vectors

Description
Simplified implementation of sort. For large vectors, it is typically about 2x faster for numbers
and 20x faster for characters and factors.

Usage
sort2(x)

Arguments
x a vector of class numeric, integer, character, factor, or logical. Long vectors are
not supported.

Examples
chars<-as.character(sample(1e3,1e4,TRUE))
system.time(a<-sort(chars))
system.time(b<-sort2(chars))
identical(a,b)

ints<-as.integer(sample(1e3,1e4,TRUE))
system.time(a<-sort(ints))
system.time(b<-sort2(ints))
identical(a,b)

nums<-runif(1e4)
system.time(a<-sort(nums))
system.time(b<-sort2(nums))
identical(a,b)

logs<-as.logical(sample(0:1,1e6,TRUE))
system.time(result<-sort(logs))
system.time(result<-sort2(logs))

facts<-as.factor(as.character(sample(1e3,1e4,TRUE)))
system.time(a<-sort(facts))
system.time(b<-sort2(facts))
identical(a,b)

#How are special values like NA and Inf handled?

#For numerics, values sort intuitively, with the important note that NA and
#NaN will come after all real numbers but before Inf.
sort2(c(1,2,NA,NaN,Inf,-Inf))
#For characters, values sort correctly with NA at the end.
sort2(c('C','B',NA,'A'))
#For factors, values sort correctly with NA at the end.
sort2(as.factor(c('C','B',NA,'A')))

## Not run:
chars<-as.character(sample(1e5,1e6,TRUE))
system.time(a<-sort(chars))
system.time(b<-sort2(chars))

ints<-as.integer(sample(1e5,1e6,TRUE))
system.time(result<-sort(ints))
system.time(result<-sort2(ints))

nums<-runif(1e6)
system.time(result<-sort(nums))
system.time(result<-sort2(nums))

logs<-as.logical(sample(0:1,1e7,TRUE))
system.time(result<-sort(logs))
system.time(result<-sort2(logs))

facts<-as.factor(as.character(sample(1e5,1e6,TRUE)))
system.time(a<-sort(facts))
system.time(b<-sort2(facts))

## End(Not run)
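
One difference from base R is worth checking before swapping sort for sort2: base sort drops NA by default, while the example comments above describe sort2 as keeping NA at the end. The base-R half of that comparison can be sketched as follows (only base functions are called):

```r
x <- c("C", "B", NA, "A")

sorted_default <- sort(x)                  # default na.last = NA drops the NA
sorted_keep_na <- sort(x, na.last = TRUE)  # keeps the NA, placed at the end

print(sorted_default)
print(sorted_keep_na)
```

If sort2 behaves as its example comments describe, its result on such a vector would be comparable to sort(x, na.last = TRUE) rather than to plain sort(x).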
Index

as.hexmode
as.octmode

convertBase

Extract
extract

grr
grr-package (grr)

match
matches
merge

order
order2

sample.int
sample2
sort
sort2
strtoi
