The Default Behavior For Matching Can Be Changed

The document discusses various regular expression modifiers in Perl. It describes modifiers like /i for case-insensitive matching, /x for whitespace and comments in patterns, and character set modifiers like /a. It provides details on some modifiers and examples of using modifiers like /x to make patterns more readable.

Uploaded by

Mahendar S


The default behavior for matching can be changed, using various modifiers. Modifiers that relate to the interpretation of the pattern are listed just below. Modifiers that alter the way a pattern is used by Perl are detailed in "Regexp Quote-Like Operators" in perlop and "Gory details of parsing quoted constructs" in perlop. Modifiers can be added dynamically; see "Extended Patterns" below.

m

Treat the string being matched against as multiple lines. That is, change "^" and "$" from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string.

s

Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.
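As a minimal sketch of the difference these two modifiers make (the sample string and variable names here are purely illustrative, not part of the original text):

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, "^" matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;       # one element: "first"

# With /m, "^" also matches just after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;      # two elements: "first", "second"

# Without /s, "." stops at a newline, so this fails; with /s it succeeds.
my ($no_s)   = $text =~ /(first.*second)/;     # undef
my ($with_s) = $text =~ /(first.*second)/s;    # "first line\nsecond"

# With /ms, "." may cross newlines while "^" and "$" still anchor per line.
my ($both) = $text =~ /^(second.*?)$/ms;       # "second line"
```

The lazy quantifier in the last pattern is what keeps the match on one line; a greedy ".*" under /s would happily run to the end of the string.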

i

Do case-insensitive pattern matching. For example, "A" will match "a" under /i.

If locale matching rules are in effect, the case map is taken from the current locale for
code points less than 255, and from Unicode rules for larger code points. However,
matches that would cross the Unicode rules/non-Unicode rules boundary (ords
255/256) will not succeed, unless the locale is a UTF-8 one. See perllocale.

There are a number of Unicode characters that match a sequence of multiple characters under /i. For example, LATIN SMALL LIGATURE FI should match the sequence fi. Perl is not currently able to do this when the multiple characters are in the pattern and are split between groupings, or when one or more are quantified. Thus

"\N{LATIN SMALL LIGATURE FI}" =~ /fi/i; # Matches

"\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i; # Doesn't match!

"\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i; # Doesn't match!

# The below doesn't match, and it isn't clear what $1 and $2 would
# be even if it did!!
"\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i; # Doesn't match!

Perl doesn't match multiple characters in a bracketed character class unless the
character that maps to them is explicitly mentioned, and it doesn't match them at all if
the character class is inverted, which otherwise could be highly confusing.
See "Bracketed Character Classes" in perlrecharclass, and "Negation" in
perlrecharclass.

x and xx

Extend your pattern's legibility by permitting whitespace and comments. Details in "/x and /xx" below.

p

Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are available for use after matching.

In Perl 5.20 and higher this is ignored. Due to a new copy-on-write mechanism, ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} will be available after the match regardless of the modifier.
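A short sketch of how these three variables divide up the target string; the input string and the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $line = "key=value;rest";
my ($before, $matched, $after);

# /p is written out so the code also works on Perls older than 5.20.
if ($line =~ /=\w+/p) {
    $before  = ${^PREMATCH};     # text to the left of the match
    $matched = ${^MATCH};        # the matched text itself
    $after   = ${^POSTMATCH};    # text to the right of the match
}
```

Here $before ends up as "key", $matched as "=value", and $after as ";rest".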

a, d, l, and u

These modifiers, all new in 5.14, affect which character-set rules (Unicode, etc.) are
used, as described below in "Character set modifiers".

n

Prevent the grouping metacharacters () from capturing. This modifier, new in 5.22, will stop $1, $2, etc. from being filled in.

"hello" =~ /(hi|hello)/; # $1 is "hello"
"hello" =~ /(hi|hello)/n; # $1 is undef

This is equivalent to putting ?: at the beginning of every capturing group:

"hello" =~ /(?:hi|hello)/; # $1 is undef

/n can be negated on a per-group basis. Alternatively, named captures may still be used.

"hello" =~ /(?-n:(hi|hello))/n; # $1 is "hello"
"hello" =~ /(?<greet>hi|hello)/n; # $1 is "hello", $+{greet} is "hello"

Other Modifiers

There are a number of flags that can be found at the end of regular expression
constructs that are not generic regular expression flags, but apply to the operation
being performed, like matching or substitution (m// or s/// respectively).

Flags described further in "Using regular expressions in Perl" in perlretut are:

c - keep the current position during repeated matching

g - globally match the pattern repeatedly in the string
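The two flags above can be sketched as follows. The tiny tokenizer is a hypothetical illustration, not something taken from perlretut; it relies on /g advancing pos() after each success and on /c leaving pos() alone after a failure, so that the next alternative is tried from the same spot:

```perl
use strict;
use warnings;

# /g in list context returns every match at once.
my @numbers = "a1b22c333" =~ /\d+/g;    # "1", "22", "333"

# /g in scalar context walks the string one match at a time.
my $input = "abc123def";
my @tokens;
while ((pos($input) // 0) < length $input) {
    if    ($input =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    elsif ($input =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1"  }
    else                             { last }   # nothing matched here
}
```

Without /c, the failed attempt at a word would reset pos($input), and the \G in the second pattern would anchor at the start of the string again.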

Substitution-specific modifiers described in "s/PATTERN/REPLACEMENT/msixpodualngcer" in perlop are:

e - evaluate the right-hand side as an expression

ee - evaluate the right side as a string then eval the result

o - pretend to optimize your code, but actually introduce bugs

r - perform non-destructive substitution and return the new value
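A brief sketch of /r, /e, and /ee in use; the strings and variable names are illustrative assumptions, and /r needs Perl 5.14 or later:

```perl
use strict;
use warnings;

my $price = "cost: 10 and 25";

# /r leaves $price untouched and returns the modified copy.
my $shouted = $price =~ s/cost/COST/r;

# /e evaluates the replacement as Perl code; /g repeats it for each match.
(my $doubled = $price) =~ s/(\d+)/$1 * 2/ge;

# /ee evaluates the replacement to get a string, then evals that string.
my $expr = '3 + 4';
(my $sum = 'result: EXPR') =~ s/EXPR/$expr/ee;
```

After this runs, $shouted is "COST: 10 and 25", $doubled is "cost: 20 and 50", $sum is "result: 7", and $price still holds its original value. Because /ee performs a string eval, it should only be used on replacement text that is fully trusted.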

Regular expression modifiers are usually written in documentation as e.g., "the /x modifier", even though the delimiter in question might not really be a slash. The modifiers /imnsxadlup may also be embedded within the regular expression itself using the (?...) construct, see "Extended Patterns" below.
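To make the two points in this paragraph concrete, here is a small sketch (the test strings are arbitrary) showing a modifier embedded in the pattern and a modifier attached to a non-slash delimiter:

```perl
use strict;
use warnings;

# (?i) turns on case-insensitivity for the rest of the enclosing group.
my $whole = "HELLO" =~ /(?i)hello/;

# (?i:...) limits the modifier to a single group.
my $scoped_ok  = "HeLLo World" =~ /(?i:hello) World/;   # matches
my $scoped_bad = "HeLLo WORLD" =~ /(?i:hello) World/;   # fails: "World" is case-sensitive

# The "/i modifier" works the same when the delimiter is not a slash.
my $braces = "HELLO" =~ m{hello}i;
```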

Details on some modifiers

Some of the modifiers require more explanation than
given in the "Overview" above.

/x and /xx

A single /x tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into more readable parts. Also, the "#" character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line. Hence, this is very much like an ordinary Perl code comment. (You can include the closing delimiter within the comment only if you precede it with a backslash, so be careful!)

Use of /x means that if you want real whitespace or "#" characters in the pattern (outside a bracketed character class, which is unaffected by /x), then you'll either have to escape them (using backslashes or \Q...\E) or encode them using octal, hex, or \N{} or \p{name=...} escapes. It is ineffective to try to continue a comment onto the next line by escaping the \n with a backslash or \Q.

You can use "(?#text)" to create a comment that ends earlier than the end of the current line, but text also can't contain the closing delimiter unless escaped with a backslash.

A common pitfall is to forget that "#" characters begin a comment under /x and are not matched literally. Just keep that in mind when trying to puzzle out why a particular /x pattern isn't working as expected.
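The rules above, including the "#" pitfall, can be sketched like this. The pattern and the sample strings are hypothetical:

```perl
use strict;
use warnings;

my $re = qr{
    item \s+       # literal text, then one or more whitespace characters
    \#             # an escaped hash sign matches a literal "#"
    (\d+)          # capture the digits
    [ ]            # a space inside a bracketed class is still literal under /x
    (?# an inline comment that ends before the end of the line ) ok
}x;

my ($number) = "item #42 ok" =~ $re;    # $number is "42"

# Pitfall: without the backslash, "#" starts a comment that swallows the
# rest of the pattern, so this is effectively just /item/.
my $broken = qr{ item # (\d+) }x;
my $too_loose = "item with no number" =~ $broken;   # true
```

Note that the comments inside the pattern avoid "$" and "@" as well as unbalanced braces, since the pattern text is still subject to variable interpolation and delimiter matching.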

Starting in Perl v5.26, if the modifier has a second "x" within it, it does everything that a single /x does, but additionally non-backslashed SPACE and TAB characters within bracketed character classes are also generally ignored, and hence can be added to make the classes more readable.

/ [d-e g-i 3-7]/xx
/[ ! @ " # $ % ^ & * () = ? <> ' ]/xx

may be easier to grasp than the squashed equivalents

/[d-eg-i3-7]/
/[!@"#$%^&*()=?<>']/
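One way to convince yourself that the spaced and squashed forms are equivalent is a quick check like the following sketch. It assumes Perl 5.26 or later, and the list of probe characters is arbitrary:

```perl
use v5.26;
use warnings;

my $spaced   = qr/[d-e g-i 3-7]/xx;
my $squashed = qr/[d-eg-i3-7]/;

# Collect any probe character on which the two classes disagree.
my @disagree = grep { ($_ =~ $spaced) xor ($_ =~ $squashed) }
               ('d', 'h', '5', ' ', 'f', '8');

# The space inside the /xx class is ignored, so a space does not match.
my $space_matches = " " =~ $spaced;
```

If the two classes are truly equivalent, @disagree stays empty and $space_matches is false.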

Taken together, these features go a long way towards making Perl's regular expressions more readable. Here's an example:

# Delete (most) C comments.
$program =~ s {
    /\*     # Match the opening delimiter.
    .*?     # Match a minimal number of characters.
    \*/     # Match the closing delimiter.
} []gsx;

Note that anything inside a \Q...\E stays unaffected by /x. And note that /x doesn't affect space interpretation within a single multi-character construct. For example (?:...) can't have a space between the "(", "?", and ":". Within any delimiters for such a construct, allowed spaces are not affected by /x, and depend on the construct. For example, all constructs using curly braces as delimiters, such as \x{...} can have blanks within but adjacent to the braces, but not elsewhere, and no non-blank space characters. An exception is Unicode properties, which follow Unicode rules, for which see "Properties accessible through \p{} and \P{}" in perluniprops.

The set of characters that are deemed whitespace are those that Unicode calls "Pattern White Space", namely:

U+0009 CHARACTER TABULATION

U+000A LINE FEED

U+000B LINE TABULATION

U+000C FORM FEED

U+000D CARRIAGE RETURN

U+0020 SPACE

U+0085 NEXT LINE

U+200E LEFT-TO-RIGHT MARK

U+200F RIGHT-TO-LEFT MARK

U+2028 LINE SEPARATOR

U+2029 PARAGRAPH SEPARATOR

Character set modifiers

/d, /u, /a, and /l, available starting in 5.14, are called the character set modifiers; they affect the character set rules used for the regular expression.

The /d, /u, and /l modifiers are not likely to be of much use to you, and so you need not worry about them very much. They exist for Perl's internal use, so that complex regular expression data structures can be automatically serialized and later exactly reconstituted, including all their nuances. But, since Perl can't keep a secret, and there may be rare instances where they are useful, they are documented here.

The /a modifier, on the other hand, may be useful. Its purpose is to allow code that is t

