Perl Training Regex

This document provides an overview of regular expressions (regexes) in Perl. It discusses the main components of regexes like quantifiers, character classes, anchors, backreferences and more. It provides examples of how to write regexes to match patterns in strings and how to use substitution to replace parts of a string. The document also demonstrates how to use regexes in Perl programs to search and manipulate text in files and strings.

Uploaded by

chandu_bezawada
PERL Regular Expressions

Regular Expressions (0)


It's a template that either matches or doesn't match a given string.

One of the most important features of Perl: strong regular expression support

/PATTERN/

Regular Expressions (1)


The "Dirty Dozen" metacharacters
\ . * + ? ( ) | [ { ^ $
These characters have special meaning in regular expressions.
A backslash in front of any metacharacter makes it non-special (literal).
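The effect of the backslash can be seen in a short sketch. The sample strings and variable names below are made up for illustration; `quotemeta` is Perl's built-in for escaping every metacharacter in a string.

```perl
use strict;
use warnings;

# Hypothetical sample text containing the metacharacters $ and .
my $text = 'Total: $9.99';

# Unescaped "." matches any character, so "9x99" matches too.
my $loose = ('9x99' =~ /9.99/) ? 1 : 0;

# Escaped "\." matches only a literal dot.
my $exact = ('9x99' =~ /9\.99/) ? 1 : 0;

# "\$" matches a literal dollar sign instead of anchoring.
my $found = ($text =~ /\$9\.99/) ? 1 : 0;

# quotemeta() puts a backslash in front of every metacharacter.
my $escaped = quotemeta('a+b');

print "$loose $exact $found $escaped\n";
```

When the text to match comes from user input, escaping it this way keeps it from being interpreted as a pattern.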

Regular Expressions (2)


. matches any character except a newline (\n).

Quantifiers decide how many times the preceding item is repeated.
/hello.you/ matches any string that has hello, followed by exactly one character, followed by you.
/to*ols/ the character before * may be repeated zero or more times. Matches tools, tooooools, tols (but not toxols!).
/to+ols/ the character before + must appear one or more times. Matches tools, tooooools (but not tols).
/to.*ols/ matches to, followed by any string (without a newline), followed by ols.

Regular Expressions(3)
/to?ols/ the character before ? is optional. Thus the only matching spellings are tools and tols.
/to{2}ls/ the number in {} gives the repetition count. Matches tools.
{count}   - match exactly count times
{min,max} - match at least min but not more than max times
{min,}    - match at least min times

Exercise: write the {} quantifier equivalent to each of *, + and ?.
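One way to check an answer to the exercise is to run each shorthand quantifier and a candidate {} spelling over the same words and compare the results. This is only a sketch: the word list is invented, and the patterns are anchored with ^ and $ so that the whole word must match.

```perl
use strict;
use warnings;

my @words = qw(tols tools toools);

# Each pair: a shorthand quantifier and a candidate {} spelling.
my @pairs = (
    [ qr/^to*ols$/, qr/^to{0,}ols$/  ],
    [ qr/^to+ols$/, qr/^to{1,}ols$/  ],
    [ qr/^to?ols$/, qr/^to{0,1}ols$/ ],
);

my $all_agree = 1;
for my $pair (@pairs) {
    my ($short, $brace) = @$pair;
    my @a = grep { $_ =~ $short } @words;
    my @b = grep { $_ =~ $brace } @words;
    # The two spellings should accept exactly the same words.
    $all_agree = 0 unless "@a" eq "@b";
    print "$short accepts: @a\n";
}

# Exact count: only the word with two o's before "ols".
my @exactly_two = grep { /^to{2}ols$/ } @words;
print "to{2}ols accepts: @exactly_two\n";
```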

Regular Expressions (4)


Grouping: parentheses ( ) group one or more characters into a single item.
/(tools)+/ matches tools, toolstools, toolstoolstoolstools.
Alternatives:
/hello (world|Perl)/ - matches hello world and hello Perl.
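A small sketch of why the parentheses matter in an alternation. The test strings are made up; without the group, | splits the whole pattern into "hello world" or "Perl".

```perl
use strict;
use warnings;

my @tests = ("hello world", "hello Perl", "Perl rocks");

# With parentheses the alternation only covers the second word.
my @grouped = grep { /hello (world|Perl)/ } @tests;

# Without them the alternatives are "hello world" and "Perl".
my @ungrouped = grep { /hello world|Perl/ } @tests;

# A quantifier after a group repeats the whole group.
my $repeat = ("toolstoolstools" =~ /^(tools)+$/) ? 1 : 0;

print scalar(@grouped), " ", scalar(@ungrouped), " $repeat\n";
```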

Regular Expressions (5)


Character class - a list of all characters allowed at one position
/Hello [abcde]/ matches Hello a, Hello b, ... Hello e
/Hello [a-e]/ - the same as above

Negating:
[^abc] any character except a, b, c

Regular Expressions (6)


Shortcuts
\d digit [0-9]
\w word character [A-Za-z0-9_]
\s whitespace [ \t\n\r\f]
Negation: ^ as the first character in a class, e.g. [^\d] matches a non-digit
\S anything not \s
\D anything not \d
\W anything not \w

Exercise: write the character classes for
1. Matching vowels
2. Matching consonants
3. Anything other than non-numbers
What is the difference between \D and [^\d]?
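The shortcuts can be tried on a sample string. The string below is invented for illustration; a match with /g in list context returns every match.

```perl
use strict;
use warnings;

my $s = "Room 42, floor_3!";

my @digits   = $s =~ /\d/g;     # every single digit
my @words    = $s =~ /\w+/g;    # runs of word characters
my @non_word = $s =~ /\W/g;     # everything that is not \w

# An explicit class behaves like the shortcut for plain ASCII text.
my @digits_class = $s =~ /[0-9]/g;

print "digits: @digits\n";
print "words: @words\n";
print "non-word count: ", scalar(@non_word), "\n";
```

Note that the underscore counts as a word character, so floor_3 is one word.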

Regular Expressions (7)


Anchors

The ^ character has three roles:
/^abc/ - ^ anchors the match at the beginning of the string
/a\^bc/ - \^ matches a literal ^
/[^abc]/ - ^ negates the character class

^ - marks the beginning of the string
$ - marks the end of the string
/^Hello Perl/ - matches "Hello Perl, goodbye Perl", but not "Perl Hello Perl"
What pattern will match blank lines?
/^\s*$/ - matches all blank lines
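A minimal sketch of the anchors at work, using a made-up list of lines. A line counts as blank here if it contains nothing but whitespace.

```perl
use strict;
use warnings;

my @lines = ("Hello Perl, goodbye Perl\n", "\n", "   \t\n", "Perl Hello Perl\n", "");

# ^ ... $ with only \s* in between: blank or whitespace-only lines.
my @blank = grep { /^\s*$/ } @lines;

# ^ pins the match to the start of the string.
my @starts_with = grep { /^Hello Perl/ } @lines;

# $ pins the match to the end (an optional final newline is allowed).
my @ends_with = grep { /Hello Perl$/ } @lines;

print scalar(@blank), " blank, ",
      scalar(@starts_with), " start with, ",
      scalar(@ends_with), " end with\n";
```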

Regular Expressions (8)


\b - matches at either end of a word (the start or the end of a group of \w characters)
/\bPerl\b/ - matches "Hello Perl" and "Perl", and also "Perl++" (+ is not a \w character), but not "Perls" or "Perl5"
/^\w+\b/ matches what part of "That's my house"?

\B - the negation of \b (matches wherever \b does not)
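Word boundaries are easy to get wrong, so it can help to test them directly. The sample strings below are invented; the point is that a boundary exists wherever a \w character meets a non-\w character or the edge of the string.

```perl
use strict;
use warnings;

my @samples = ("Perl", "Hello Perl", "Perl++", "Perls", "Perl5");

# "+" is not a word character, so "Perl++" still has a boundary after "l".
my @hits = grep { /\bPerl\b/ } @samples;

# \B succeeds only where there is no boundary, i.e. inside a word.
my $inside = ("Perl" =~ /\Berl/) ? 1 : 0;
my $start  = ("erl"  =~ /\Berl/) ? 1 : 0;

print join(", ", @hits), "\n";
print "$inside $start\n";
```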

Regular Expressions (9)


Backreferences:
/(World|Perl) \1/ - matches World World and Perl Perl (but not World Perl).

/((hello|hi) (world|Perl))/
\1 refers to ((hello|hi) (world|Perl))
\2 refers to (hello|hi)
\3 refers to (world|Perl)
Groups are numbered by the position of their opening parenthesis.

$1, $2, $3 hold the values of \1, \2, \3 after a regular expression has matched.

Regular Expressions (10)


Option modifiers
/i : case-insensitive matching
/s : . also matches \n
/m : ^ and $ also match next to embedded \n
/x : ignore whitespace (and allow # comments) inside the pattern
/o : compile the pattern only once
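The modifiers are easiest to understand by comparing the same pattern with and without each one. This is a sketch on a made-up two-line string.

```perl
use strict;
use warnings;

my $text = "Hello\nperl world";

# /i: ignore case.
my $i_mod = ($text =~ /PERL/i) ? 1 : 0;

# /s: without it "." refuses to match the newline.
my $dot_plain = ($text =~ /Hello.perl/)  ? 1 : 0;
my $dot_s     = ($text =~ /Hello.perl/s) ? 1 : 0;

# /m: without it "^" only matches at the very start of the string.
my $caret_plain = ($text =~ /^perl/)  ? 1 : 0;
my $caret_m     = ($text =~ /^perl/m) ? 1 : 0;

# /x: whitespace and comments inside the pattern are ignored.
my $x_mod = ($text =~ /
    perl    # the word perl
    \s+     # one or more whitespace characters
    world   # the word world
/x) ? 1 : 0;

print "$i_mod $dot_plain $dot_s $caret_plain $caret_m $x_mod\n";
```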

Regular Expressions (11)


Bind operator =~

Tells Perl to match the pattern on the right against the string on the left.

Pattern match operator m//

$str =~ /pattern/;
$str =~ m/pattern/;    # same thing; the m is optional with / delimiters

Regular Expressions (12)


When no variable is mentioned, the pattern is matched against the default variable $_.

# explicit variable
if( $str =~ /hello/ ){
    ...
}

# default variable $_
while( <STDIN> ){
    if( /hello/ ){
        ...
    }
}
@words = split /\s+/, $str;
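The same idea as a self-contained sketch that does not need input on STDIN. The list of lines is made up; a for loop without a loop variable also places each element in $_.

```perl
use strict;
use warnings;

my @lines = ("hello world", "goodbye", "say hello");

my $count = 0;
for (@lines) {               # each element is aliased to $_
    $count++ if /hello/;     # same as: $_ =~ /hello/
}

# In list context a match returns the captured groups.
my ($hours, $minutes) = "12:30" =~ /(\d+):(\d+)/;

print "$count lines contain hello\n";
print "hours=$hours minutes=$minutes\n";
```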

Examples
$date = "12 10 10";
if( $date =~ /(\d+)/ ){
    print $1.":".$2.":".$3.":\n";
}
#output ($2 and $3 are empty):
#12:::
if( $date =~ /(\d+)(\s+\1)+/ ){
    print $1.":".$2.":".$3.":\n";
}
#output (notice $3 is empty):
#10: 10::

$str = "Hello World";
if( $str =~ /((Hello|Hi) (World|Perl))/ ){
    print $1.":".$2.":".$3.":\n";
}
#output:
#Hello World:Hello:World:
$str = "Hello Perl Hi";
if( $str =~ /((Hello|Hi) (World|Perl)) \1/ ){
    print $1.":".$2.":".$3.":\n";
}
#output: none (no match)
$str = "Hi Perl Hi Perl";
if( $str =~ /((Hello|Hi) (World|Perl)) \1/ ){
    print $1.":".$2.":".$3.":\n";
}
#output:
#Hi Perl:Hi:Perl:

Examples
1. What is it?
/^0x[0-9a-fA-F]+$/

2. Date format: Month-Day-Year -> Year:Day:Month


$date = "12-31-1901";
$date =~ s/(\d+)-(\d+)-(\d+)/$3:$2:$1/;
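The date example as a runnable sketch. It also shows why the date has to be a quoted string: without quotes Perl evaluates 12-31-1901 as arithmetic. The variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $date = "12-31-1901";                    # Month-Day-Year as a string
$date =~ s/(\d+)-(\d+)-(\d+)/$3:$2:$1/;     # reorder the three captures

# Without quotes this is a subtraction, not a date.
my $not_a_date = 12 - 31 - 1901;

print "$date\n";
print "$not_a_date\n";
```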

Examples
3. Make a pattern that matches any line of input that has
the same word repeated two (or more) times in a row.
Whitespace between words may differ.

4. What part of "That's my house" does /^\w+\b/ match?

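Example (building the pattern for exercise 3 step by step)
1. /\w+/             #matches a word
2. /(\w+)/           #captures it, to remember later
3. /(\w+)\1/         #the same word two times
4. /(\w+)\s+\1/      #whitespace between the words
5. /\b(\w+)\s+\1/    #\b stops a false match on "This is a test"
6. /\b(\w+)\s+\1\b/  #trailing \b stops a false match on "This is the theory"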
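The last three steps can be compared on the two problem sentences plus one genuine repeat. This is a sketch; the third test string is invented, and the hash keys are labels only.

```perl
use strict;
use warnings;

my @inputs = ("This is a test", "This is the theory", "the  the cat");

my %patterns = (
    step4 => qr/(\w+)\s+\1/,
    step5 => qr/\b(\w+)\s+\1/,
    step6 => qr/\b(\w+)\s+\1\b/,
);

# For each step, collect the inputs it reports as containing a repeat.
my %hits;
for my $name (sort keys %patterns) {
    $hits{$name} = [ grep { $_ =~ $patterns{$name} } @inputs ];
    print "$name: ", scalar(@{ $hits{$name} }), " match(es)\n";
}
```

Step 4 is fooled by "Th-is is" and "the the-ory", step 5 only by the latter, and step 6 reports just the real repeat.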

Let's try
1) Write a regular expression that identifies a 24-hour clock time. For example: 0:01, 00:20, 15:00, 23:59

2) Write a regular expression that identifies a floating-point number. For example: 10, 10.0001, -0.1, +001.3456789

For both, write a single program that identifies these patterns in the input lines and prints out only the matched patterns.

Negated Match
if( $str =~ /hello/ ){ ... }    # true if $str contains hello
if( $str !~ /hello/ ){ ... }    # true if $str does not contain hello
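A short sketch of the negated match on a made-up list. Writing !~ gives the same result as negating an ordinary =~ match.

```perl
use strict;
use warnings;

my @lines = ("hello world", "goodbye", "say hello");

# Keep only the lines that do NOT contain "hello".
my @without = grep { $_ !~ /hello/ } @lines;

# Equivalent spelling with an explicit negation.
my @same = grep { !($_ =~ /hello/) } @lines;

print "@without\n";
```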

Regular Expressions (13)


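$& - what was actually matched
$` - what came before the match
$' - the rest of the string after the matched pattern
$` . $& . $' - the original string

Caution: never use these in your script unless you really need them (using them anywhere can slow down every pattern match in the program, especially in older versions of Perl).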
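A sketch of the three match variables on a made-up string, followed by an alternative that is not part of the slides: since Perl 5.10 the /p modifier makes the same pieces available as ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} for that one match only.

```perl
use strict;
use warnings;

my $str = "Hello Perl world";

my ($before, $matched, $after) = ('', '', '');
if ($str =~ /Perl/) {
    ($before, $matched, $after) = ($`, $&, $');
}

# Gluing the three pieces together gives back the original string.
my $rebuilt = $before . $matched . $after;

# Alternative (an addition to the slides): the /p modifier.
my ($pre, $mid, $post) = ('', '', '');
if ($str =~ /Perl/p) {
    ($pre, $mid, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "[$before][$matched][$after]\n";
```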

Regular Expressions (14)


Substitutions:
s/T/U/;                  #substitutes the first T with U (only once)
s/T/U/g;                 #global substitution (every T)
s/\s+/ /g;               #collapses runs of whitespace into one space
s/(\w+) (\w+)/$2 $1/g;   #swaps each pair of words
s/T/U/;                  #applied to the $_ variable
$str =~ s/T/U/;          #applied to $str
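The substitutions above, run on made-up strings. The `(my $copy = $orig) =~ s/.../.../` idiom copies the string first so the original stays unchanged; s/// itself returns the number of replacements it made.

```perl
use strict;
use warnings;

my $dna = "ATTGCT";

(my $once = $dna) =~ s/T/U/;                  # first T only
my $replaced = (my $all = $dna) =~ s/T/U/g;   # every T; returns the count

(my $collapsed = "a   b \t c") =~ s/\s+/ /g;  # runs of whitespace -> one space

(my $swapped = "hello world") =~ s/(\w+) (\w+)/$2 $1/;

print "$once $all $replaced\n";
print "$collapsed\n";
print "$swapped\n";
```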

Regular Expressions (15)


File extension renaming:
my ($from, $to) = @ARGV;
my @files = glob("*.$from");
foreach my $file (@files){
    my $newfile = $file;
    # anchor with $ so that only the extension at the end of the name is replaced
    $newfile =~ s/\.\Q$from\E$/.$to/;
    rename($file, $newfile);
}

Split and Join

$str = "aaa bbb   ccc    dddd";
@words = split /\s+/, $str;
$str = join ":", @words;
#result is aaa:bbb:ccc:dddd

@words = split /\s+/, $_;    # " aaa b" -> "", "aaa", "b"
@words = split;              # " aaa b" -> "aaa", "b"
@words = split ' ', $_;      # " aaa b" -> "aaa", "b"
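The difference between the split forms shows up only when the string starts with whitespace. A sketch with made-up input:

```perl
use strict;
use warnings;

my $line = "  aaa b";

# A pattern match at the very start produces an empty leading field.
my @with_pattern = split /\s+/, $line;

# The special case ' ' skips leading whitespace first.
my @with_space = split ' ', $line;

# split and join together: normalize the separators.
my $joined = join ":", split(' ', "aaa bbb   ccc    dddd");

print scalar(@with_pattern), " fields vs ", scalar(@with_space), " fields\n";
print "$joined\n";
```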

Grep

grep EXPR, LIST;


@results = grep /^>/, @array;
@results = grep /^>/, <FILE>;
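A sketch of grep on an in-memory list instead of a file handle. The sample data (header lines starting with >) is invented; in scalar context grep returns the number of matching elements.

```perl
use strict;
use warnings;

my @array = (">seq1", "ACGT", ">seq2", "TTGA");

# Expression form: keep the elements that start with ">".
my @headers = grep /^>/, @array;

# Block form, negated: everything that is not a header.
my @sequences = grep { !/^>/ } @array;

# Scalar context: just count the matches.
my $header_count = grep /^>/, @array;

print "@headers\n";
print "@sequences\n";
print "$header_count\n";
```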

Thank You !!!
