Cheat Sheet
Updated: 09/16
* Matches at least 0 times
+ Matches at least 1 time
? Matches at most 1 time; optional string
{n} Matches exactly n times
{n,} Matches at least n times
{,n} Matches at most n times
{n,m} Matches between n and m times
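A small sketch of how the quantifiers behave, using base R only. The test vector x and the result names are hypothetical and not part of the cheat sheet.

```r
# x is an illustrative vector: "a", then zero to three "b", then "c"
x <- c("ac", "abc", "abbc", "abbbc")

star     <- grepl("ab*c", x)      # b at least 0 times: TRUE TRUE TRUE TRUE
plus     <- grepl("ab+c", x)      # b at least 1 time:  FALSE TRUE TRUE TRUE
opt      <- grepl("ab?c", x)      # b at most 1 time:   TRUE TRUE FALSE FALSE
exact2   <- grepl("ab{2}c", x)    # b exactly 2 times:  FALSE FALSE TRUE FALSE
atleast2 <- grepl("ab{2,}c", x)   # b at least 2 times: FALSE FALSE TRUE TRUE
range12  <- grepl("ab{1,2}c", x)  # b 1 to 2 times:     FALSE TRUE TRUE FALSE
```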
> string <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")
> pattern <- "t.m"
grep(pattern, string)
[1] 1 3
grep(pattern, string, value = TRUE)
[1] "Hiphopopotamus"
[2] "time for bottomless lyrics"
grepl(pattern, string)
[1] TRUE FALSE TRUE
stringr::str_detect(string, pattern)
[1] TRUE FALSE TRUE
regexpr(pattern, string)
find starting position and length of first match
gregexpr(pattern, string)
find starting position and length of all matches
stringr::str_locate(string, pattern)
find starting and end position of first match
stringr::str_locate_all(string, pattern)
find starting and end position of all matches
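As a rough illustration of what the base locate functions return for the example data, the following sketch pulls the start positions and match lengths out of the result objects. The variable names are assumptions made for this example.

```r
string <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")

# regexpr(): one start position per element, -1 where nothing matches
first <- regexpr("t.m", string)
first_start <- as.integer(first)            # 10 -1 1
first_len   <- attr(first, "match.length")  # 3 -1 3

# gregexpr(): a list with all start positions per element
all_hits <- gregexpr("t.m", string)
third_starts <- as.integer(all_hits[[3]])   # 1 13 ("tim" and "tom")
```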
regmatches(string, regexpr(pattern, string))
extract first match [1] "tam" "tim"
regmatches(string, gregexpr(pattern, string))
extracts all matches, outputs a list
[[1]] "tam" [[2]] character(0) [[3]] "tim" "tom"
stringr::str_extract(string, pattern)
extract first match [1] "tam" NA "tim"
stringr::str_extract_all(string, pattern)
extract all matches, outputs a list
stringr::str_extract_all(string, pattern, simplify = TRUE)
extract all matches, outputs a matrix
stringr::str_match(string, pattern)
extract first match + individual character groups
stringr::str_match_all(string, pattern)
extract all matches + individual character groups
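str_match() only returns groups when the pattern contains parentheses. A base R sketch of the same idea, assuming the hypothetical pattern "t(.)m" (the cheat sheet's pattern with one added group) and using regexec():

```r
string <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")

# regexec() records the first match plus each parenthesised group;
# regmatches() turns that into one character vector per element
m <- regmatches(string, regexec("t(.)m", string))

m[[1]]  # "tam" "a"
m[[2]]  # character(0), no match
m[[3]]  # "tim" "i"
```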
sub(pattern, replacement, string)
replace first match
gsub(pattern, replacement, string)
replace all matches
stringr::str_replace(string, pattern, replacement)
replace first match
stringr::str_replace_all(string, pattern, replacement)
replace all matches
strsplit(string, pattern) or stringr::str_split(string, pattern)
split string using pattern
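A short sketch of replacing and splitting with base R. The pattern "o", the replacement "0", and the result names are chosen only for illustration.

```r
string <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")

# sub() touches only the first match in each element
first_only <- sub("o", "0", string)
# "Hiph0popotamus" "Rhymen0ceros" "time f0r bottomless lyrics"

# gsub() replaces every match
all_o <- gsub("o", "0", string)
# "Hiph0p0p0tamus" "Rhymen0cer0s" "time f0r b0tt0mless lyrics"

# strsplit() returns a list with one character vector per input element
words <- strsplit(string[3], " ")
# list(c("time", "for", "bottomless", "lyrics"))
```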
^ Start of the string
$ End of the string
\\b Empty string at either edge of a word
\\B NOT the edge of a word
\\< Beginning of a word
\\> End of a word
[[:digit:]] or \\d Digits; [0-9]
\\D Non-digits; [^0-9]
[[:lower:]] Lower-case letters; [a-z]
[[:upper:]] Upper-case letters; [A-Z]
[[:alpha:]] Alphabetic characters; [A-Za-z]
[[:alnum:]] Alphanumeric characters; [A-Za-z0-9]
\\w Word characters; [A-Za-z0-9_]
\\W Non-word characters
[[:xdigit:]] Hexadec. digits; [0-9A-Fa-f]
[[:blank:]] Space and tab
[[:space:]] or \\s Space, tab, vertical tab, newline,
form feed, carriage return
\\S Not space; [^[:space:]]
[[:punct:]] Punctuation characters;
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
[[:graph:]] Graphical char.;
[[:alnum:][:punct:]]
[[:print:]] Printable characters;
[[:alnum:][:punct:]] and space
[[:cntrl:]] Control characters; \n, \r etc.
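The character classes above can be tried out with a sketch like the following. The sample string x and the result names are hypothetical; note that the shorthand classes need a doubled backslash inside an R string.

```r
x <- "Room 42b, Floor 3!"

no_digits   <- gsub("[[:digit:]]", "", x)    # "Room b, Floor !"
only_digits <- gsub("\\D", "", x)            # "423"
no_punct    <- gsub("[[:punct:]]", "", x)    # "Room 42b Floor 3"
underscored <- gsub("[[:space:]]+", "_", x)  # "Room_42b,_Floor_3!"

# all upper-case letters, extracted as a character vector
capitals <- regmatches(x, gregexpr("[[:upper:]]", x))[[1]]  # "R" "F"
```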
. Any character except \n
| Or, e.g. (a|b)
[…] List permitted characters, e.g. [abc]
[a-z] Specify character ranges
[^…] List excluded characters
(…) Grouping, enables back referencing using
\\N where N is an integer
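A minimal sketch of alternation, grouping and back references in base R. The inputs and result names are made up for illustration.

```r
x <- c("hello", "banana", "world")

# (.)\\1 : any character immediately followed by itself
doubled <- grepl("(.)\\1", x)     # TRUE FALSE FALSE ("ll" in "hello")

# (a|b) : alternation inside a group
starts_ab <- grepl("^(a|b)", x)   # FALSE TRUE FALSE

# \\1 and \\2 in the replacement refer to the first and second group
swapped <- sub("(\\w+)@(\\w+)", "\\2 at \\1", "user@example")
# "example at user"
```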
\n New line
\r Carriage return
\t Tab
\v Vertical tab
\f Form feed
(?=) Lookahead (requires perl = TRUE),
e.g. (?=xy): position followed by 'xy'
(?!) Negative lookahead (perl = TRUE);
position NOT followed by pattern
(?<=) Lookbehind (perl = TRUE), e.g.
(?<=xy): position following 'xy'
(?<!) Negative lookbehind (perl = TRUE);
position NOT following pattern
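The four lookarounds can be sketched on a tiny, made-up vector as follows. Each call passes perl = TRUE, as the table requires; the names are assumptions for this example.

```r
x <- c("xy", "xz", "ay")

ahead      <- grepl("x(?=y)", x, perl = TRUE)   # x followed by y:      TRUE FALSE FALSE
not_ahead  <- grepl("x(?!y)", x, perl = TRUE)   # x not followed by y:  FALSE TRUE FALSE
behind     <- grepl("(?<=x)y", x, perl = TRUE)  # y preceded by x:      TRUE FALSE FALSE
not_behind <- grepl("(?<!x)y", x, perl = TRUE)  # y not preceded by x:  FALSE FALSE TRUE

# Lookarounds match a position only, so the "x" is not replaced
replaced <- sub("(?<=x)y", "Y", x, perl = TRUE)  # "xY" "xz" "ay"
```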
(?(if)then) If-then-condition (perl = TRUE); use
lookaheads, optional char. etc in if-clause
(?(if)then|else) If-then-else-condition (perl = TRUE)
*see, e.g. http://www.regular-expressions.info/lookaround.html
http://www.regular-expressions.info/conditional.html
By default R uses POSIX extended regular
expressions. You can switch to PCRE regular
expressions using perl = TRUE for base functions.
stringr uses ICU regular expressions, which already
support Perl-style features such as lookarounds.
All functions can be used with literal searches
using fixed = TRUE for base or by wrapping
patterns with fixed() for stringr.
Most base functions (grep, grepl, sub, gsub, regexpr,
gregexpr) can be made case insensitive
by specifying ignore.case = TRUE.
Metacharacters (. * + etc.) can be used as
literal characters by escaping them. Characters
can be escaped using \\ or by enclosing them
in \\Q...\\E.
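A sketch comparing the ways of matching a metacharacter literally in base R. The input vector and result names are hypothetical; the \\Q...\\E form is shown with perl = TRUE here as a cautious choice.

```r
x <- c("1+1=2", "11=2")

escaped   <- grepl("1\\+1", x)                   # TRUE FALSE
quoted    <- grepl("\\Q1+1\\E", x, perl = TRUE)  # TRUE FALSE
literal   <- grepl("1+1", x, fixed = TRUE)       # TRUE FALSE

# Unescaped, + is a quantifier: "one or more 1, then another 1"
unescaped <- grepl("1+1", x)                     # FALSE TRUE
```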
By default the asterisk * is greedy, i.e. it always
matches the longest possible string. It can be
used in lazy mode by adding ?, i.e. *?.
Greedy mode can be turned off using (?U). This
switches the syntax, so that (?U)a* is lazy and
(?U)a*? is greedy.
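The difference between greedy and lazy matching shows up clearly on a string with several possible end points. A minimal sketch with a made-up input:

```r
x <- "<b>bold</b> and <i>italic</i>"

# Greedy: runs from the first "<" to the last ">"
greedy <- regmatches(x, regexpr("<.*>", x))
# "<b>bold</b> and <i>italic</i>"

# Lazy: stops at the first ">" that completes a match
lazy <- regmatches(x, regexpr("<.*?>", x))
# "<b>"

# (?U) inverts the default, so a plain * becomes lazy
ungreedy <- regmatches(x, regexpr("(?U)<.*>", x, perl = TRUE))
# "<b>"
```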
Regular expressions can be made case insensitive
using (?i). In backreferences, the strings can be
converted to lower or upper case using \\L or \\U
(e.g. \\L\\1). This requires perl = TRUE.
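A brief sketch of case-insensitive matching and case conversion in replacements, using base R with perl = TRUE where the conversion operators are involved. The inputs and names are illustrative only.

```r
# Two ways to ignore case: a function argument or an inline flag
ci_arg  <- grepl("hip", "Hiphopopotamus", ignore.case = TRUE)  # TRUE
ci_flag <- grepl("(?i)hip", "Hiphopopotamus", perl = TRUE)     # TRUE

# \\U upper-cases the back reference that follows it
upper <- gsub("(\\w+)", "\\U\\1", "time for lyrics", perl = TRUE)
# "TIME FOR LYRICS"

# Capitalise only the first letter of each word
title <- gsub("\\b(\\w)", "\\U\\1", "time for lyrics", perl = TRUE)
# "Time For Lyrics"

# \\L lower-cases the back reference
lower <- sub("(.*)", "\\L\\1", "LYRICS", perl = TRUE)
# "lyrics"
```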
CC BY Ian Kopacka • ian.kopacka@ages.at
Regular expressions can conveniently be
created using rex::rex().

