130-Linux Shell Scripting

The document provides an overview of regular expressions in Linux, explaining their purpose as patterns used to filter text in various utilities. It discusses different types of regular expression engines, such as POSIX Basic and Extended Regular Expressions, and details how to define patterns using special characters, character classes, and quantifiers. Additionally, it includes examples of using regular expressions in shell scripts to count executable files in directories defined by the PATH environment variable.

Shell Scripting

Session 12

Vahab Shalchian (ITIL v3 , LPIC-1 , LPIC-2 , LPIC-3)

Regular Expressions
What Are Regular Expressions?

A regular expression is a pattern template you define that a Linux
utility uses to filter text.
A Linux utility (such as the sed editor or the gawk program)
matches the regular expression pattern against data as that data
flows into the utility. If the data matches the pattern, it's accepted for
processing. If the data doesn't match the pattern, it's rejected.
Types of regular expressions

The biggest problem with using regular expressions is that there isn't
just one set of them. Several different applications use different types
of regular expressions in the Linux environment.

These include such diverse applications as programming languages
(Java, Perl, and Python), Linux utilities (such as the sed editor, the
gawk program, and the grep utility), and mainstream applications
(such as the MySQL and PostgreSQL database servers).
A regular expression is implemented using a regular expression
engine. A regular expression engine is the underlying software that
interprets regular expression patterns and uses those patterns to
match text.
In the Linux world, there are two popular regular expression engines:

• The POSIX Basic Regular Expression (BRE) engine

• The POSIX Extended Regular Expression (ERE) engine

Most Linux utilities at a minimum conform to the POSIX BRE engine
specifications, recognizing all of the pattern symbols it defines.
Unfortunately, some utilities (such as the sed editor) only
conform to a subset of the BRE engine specifications. This is due to
speed constraints, as the sed editor attempts to process text in the
data stream as quickly as possible.
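
The difference between the two engines is easiest to see with a symbol that only ERE defines. The following is a minimal sketch, assuming a sed that accepts the -E option to switch from BRE to ERE (GNU sed does); the variable names are only illustrative.

```shell
# BRE (sed's default): the plus sign is an ordinary character,
# so the pattern looks for the literal text "be+t" and "beet" does not match
bre=$(echo "beet" | sed -n '/be+t/p')

# ERE (-E): the plus sign means "one or more of the preceding character"
ere=$(echo "beet" | sed -n -E '/be+t/p')

echo "BRE result: [$bre]"
echo "ERE result: [$ere]"
```

The first result is empty and the second contains the line, even though the pattern text is identical.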
Defining BRE Patterns

The most basic BRE pattern is matching text characters in a data
stream.

Plain text
$ echo "This is a test" | sed -n '/test/p'
This is a test

$ echo "This is a test" | sed -n '/trial/p'

Regular expression patterns are case sensitive. This means they'll only
match text with the proper case of characters.
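
A short sketch of the case-sensitivity rule, reusing the sample text from the example above (the variable names are illustrative only):

```shell
# Lowercase "this" does not appear in the data, so nothing is printed
lower=$(echo "This is a test" | sed -n '/this/p')

# "This" with a capital T matches, so the line is printed
upper=$(echo "This is a test" | sed -n '/This/p')

echo "lowercase pattern: [$lower]"
echo "capitalized pattern: [$upper]"
```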
Special characters

Regular expression patterns assign a special meaning to a few
characters. If you try to use these characters in your text pattern, you
won't get the results you were expecting.
The special characters recognized by regular expressions are:

.*[]^${}\+?|()
For example, if you want to search for a dollar sign in your text, just
precede it with a backslash character:

$ cat data2
The cost is $4.00
$ sed -n '/\$/p' data2
The cost is $4.00
$
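
The same escaping applies to two characters that are easy to overlook: the backslash itself, and the forward slash, which is not a regular expression special character but is the pattern delimiter in sed. A minimal sketch with made-up sample text:

```shell
# The forward slash delimits the sed pattern, so a literal slash must be escaped
slash=$(echo "3 / 2" | sed -n '/\//p')

# A literal backslash is matched by escaping it with another backslash
backslash=$(printf '%s\n' 'C:\temp' | sed -n '/\\/p')

echo "$slash"
echo "$backslash"
```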
Starting at the beginning

The caret character (^) defines a pattern that starts at the beginning
of a line of text in the data stream. If the pattern is located any place
other than the start of the line of text, the regular expression pattern
fails.
To use the caret character, you must place it before the pattern
specified in the regular expression:
$ echo "The book store" | sed -n '/^book/p'
$
$ echo "Books are great" | sed -n '/^Book/p'
Books are great
$
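
The caret acts as an anchor only when it comes first in the pattern. Placed anywhere else in a BRE pattern it is treated as an ordinary character, as this small sketch with made-up sample text suggests:

```shell
# The caret is not at the start of the pattern, so it matches a literal "^"
mid=$(echo "This ^ is a test" | sed -n '/s ^/p')

echo "$mid"
```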
Looking for the ending

The opposite of looking for a pattern at the start of a line is looking
for it at the end of a line. The dollar sign ($) special character defines
the end anchor. Add this special character after a text
pattern to indicate that the line of data must end with the text
pattern:
$ echo "This is a good book" | sed -n '/book$/p'
This is a good book
$ echo "This book is good" | sed -n '/book$/p'
$
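
Both anchors can be used in the same pattern. The sketch below, with made-up input lines, shows two common combinations: matching a line that contains only the given text, and using ^$ (start immediately followed by end) to delete blank lines.

```shell
# Only the line that consists of exactly this text is printed
exact=$(printf 'this is a test\nI said this is a test\n' | sed -n '/^this is a test$/p')

# ^$ matches lines with nothing between start and end; the d command deletes them
noblank=$(printf 'first line\n\nlast line\n' | sed '/^$/d')

echo "$exact"
echo "$noblank"
```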
The dot character

The dot special character is used to match any single character except
a newline character. The dot character must match a character
though; if there's no character in the place of the dot, then the
pattern will fail.
$ cat data6
This is a test of a line.
The cat is sleeping.
That is a very nice hat.
This test is at line four.
at ten o'clock we'll go home.
$ sed -n '/.at/p' data6
The cat is sleeping.
That is a very nice hat.
This test is at line four.
$
Character classes
The dot special character is great for matching a character position
against any character, but what if you want to limit what characters to
match? This is called a character class in regular expressions.
To define a character class, you use square brackets. The brackets
should contain any character that you want to include in the class.

Here's an example of creating a character class:

$ sed -n '/[ch]at/p' data6
The cat is sleeping.
That is a very nice hat.
$
$ echo "Yes" | sed -n '/[Yy]es/p'
Yes
$ echo "yes" | sed -n '/[Yy]es/p'
yes
$
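
A pattern may contain more than one character class. As a sketch, one class per letter accepts any mix of upper and lower case; the word list here is invented for illustration.

```shell
# Each position accepts either case, so yes, YES and yEs match but no does not
matches=$(for word in yes YES yEs no; do
    echo "$word" | sed -n '/[Yy][Ee][Ss]/p'
done)

echo "$matches"
```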

Negating character classes

Instead of looking for a character contained in the class, you can look
for any character that's not in the class. To do that, just place a caret
character at the beginning of the character class:
$ sed -n '/[^ch]at/p' data6
This test is at line four.
$
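
A negated class still has to match a character. In this sketch (the input words are made up), "at" on a line by itself is rejected because there is nothing in the position before it:

```shell
# cat and hat are excluded by the class; "at" alone has no character to match
negated=$(printf 'cat\nhat\nbat\nat\n' | sed -n '/[^ch]at/p')

echo "$negated"
```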
Using ranges
You can use a range of characters within a character class by using the
dash symbol. Just specify the first character in the range, a dash, then
the last character in the range.

$ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8

60633
46201
45902
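
The following sketch repeats the five-digit check with inline sample data (invented here, not the contents of data8) and also shows that one class can hold several ranges:

```shell
# Five digits between the anchors: shorter and longer numbers are rejected
zips=$(printf '60633\n46201\n223001\n4353\n22203\n' |
    sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p')

# Two ranges in one class: a through c, or h through m
words=$(printf 'cat\nhat\ntat\nbat\n' | sed -n '/[a-ch-m]at/p')

echo "$zips"
echo "$words"
```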
The asterisk
Placing an asterisk after a character signifies that the character must
appear zero or more times in the text to match the pattern:
$ cat file1 | sed -n '/ie*k/p'
ik
iek
ieek
ieeek
ieeeek
$ cat file2 | sed -n '/b[ae]*t/p'
bt
bat
bet
baat
baaeeet
baeeaeeat
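
Two common uses of the asterisk, sketched with made-up input: making a single letter optional to cover spelling variants, and combining it with the dot so that any amount of text may sit between two words.

```shell
# u* allows zero or more u characters: color and colour match, colr does not
spelling=$(printf 'color\ncolour\ncolr\n' | sed -n '/colou*r/p')

# .* matches any run of characters between the two words
between=$(echo "this is a regular pattern expression" |
    sed -n '/regular.*expression/p')

echo "$spelling"
echo "$between"
```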
Extended Regular Expressions
The POSIX ERE patterns include a few additional symbols that are
used by some Linux applications and utilities. The gawk program
recognizes the ERE patterns, but the sed editor doesn't by default
(GNU sed accepts them when run with the -E option).

The question mark

The question mark indicates that the preceding character can appear
zero or one time, but that's all. It doesn't match repeating
occurrences of the character:
$ echo "bt" | gawk '/be?t/{print $0}'
bt
$ echo "bet" | gawk '/be?t/{print $0}'
bet
$ echo "beet" | gawk '/be?t/{print $0}'
$
$ echo "beeet" | gawk '/be?t/{print $0}'
$
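
The question mark can also follow a character class. This sketch uses plain awk on the assumption that the installed awk handles ERE patterns the same way gawk does here; the word list is invented.

```shell
# Zero or one character from the class: bt, bat and bet match,
# while bot (wrong letter) and baet, beet (two letters) do not
optional=$(for word in bt bat bet bot baet beet; do
    echo "$word" | awk '/b[ae]?t/{print $0}'
done)

echo "$optional"
```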
The plus sign
The plus sign indicates that the preceding character can appear one
or more times, but must be present at least once. The pattern doesn't
match if the character is not present:
$ echo "beeet" | gawk '/be+t/{print $0}'
beeet
$ echo "beet" | gawk '/be+t/{print $0}'
beet
$ echo "bet" | gawk '/be+t/{print $0}'
bet
$ echo "bt" | gawk '/be+t/{print $0}'
$
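
Like the other quantifiers, the plus sign works with a character class. A sketch using plain awk (assumed to behave like gawk for this pattern) and an invented word list:

```shell
# At least one character from the class is required between b and t
repeated=$(for word in bt bat bet beat baaeet boot; do
    echo "$word" | awk '/b[ae]+t/{print $0}'
done)

echo "$repeated"
```

Here bt fails because the class must match at least once, and boot fails because o is not in the class.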
Using braces
Curly braces are available in ERE to allow you to specify a limit on a
repeatable regular expression.
This is often referred to as an interval. You can express the interval in
two formats:
• m: The regular expression appears exactly m times.
• m,n: The regular expression appears at least m times, but no more
than n times.

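Note: Versions of gawk before 4.0 don't recognize regular expression
intervals by default. With those versions you must specify the
--re-interval command line option for the gawk program to recognize
regular expression intervals. Newer versions recognize intervals by
default and still accept the option.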
$ echo "bt" | gawk --re-interval '/be{1}t/{print $0}'
$
$ echo "bet" | gawk --re-interval '/be{1}t/{print $0}'
bet
$ echo "beet" | gawk --re-interval '/be{1}t/{print $0}'
$
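
The examples above use the exact-count form. The m,n form can be sketched the same way; to avoid depending on a particular gawk version, this sketch uses sed with the -E option, assuming a sed that supports it (GNU sed does).

```shell
# Between one and two e characters: bet and beet match, bt and beeet do not
interval=$(for word in bt bet beet beeet; do
    echo "$word" | sed -n -E '/be{1,2}t/p'
done)

echo "$interval"
```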
The pipe symbol
The pipe symbol allows you to specify two or more patterns that
the regular expression engine uses in a logical OR formula when
examining the data stream. If any of the patterns match the data
stream text, the text passes. If none of the patterns match, the data
stream text fails.

The format for using the pipe symbol is:

expr1|expr2|...
Here's an example of this:

$ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
The cat is asleep
$ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'
The dog is asleep
$ echo "The sheep is asleep" | gawk '/cat|dog/{print $0}'
$
Grouping expressions
Regular expression patterns can also be grouped by using
parentheses. When you group a regular expression pattern, the group is
treated like a standard character. You can apply a special character to
the group just as you would to a regular character. For example:
$ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
Sat
$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
$
$ echo "cat" | gawk '/(c|b)a(b|t)/{print $0}'
cat
$ echo "cab" | gawk '/(c|b)a(b|t)/{print $0}'
cab
$ echo "bat" | gawk '/(c|b)a(b|t)/{print $0}'
bat
$ echo "bab" | gawk '/(c|b)a(b|t)/{print $0}'
bab
$ echo "tab" | gawk '/(c|b)a(b|t)/{print $0}'
$
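
Grouping also controls how far the pipe symbol reaches when anchors are involved. In this sketch (plain awk assumed to behave like gawk, sample word invented), the ungrouped pattern means "starts with cat OR ends with dog", while the grouped one requires the whole line to be cat or dog:

```shell
# Without a group, each anchor belongs to only one alternative
loose=$(echo "catalog" | awk '/^cat|dog$/{print $0}')

# With a group, both anchors apply to whichever alternative matches
strict=$(echo "catalog" | awk '/^(cat|dog)$/{print $0}')

echo "ungrouped: [$loose]"
echo "grouped: [$strict]"
```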
Regular Expressions in Action
Counting directory files

Write a shell script that counts the executable files that are present
in the directories defined in your PATH environment variable.
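#!/bin/bash
# Count the executable files in each directory listed in PATH
# Replace each colon with a space so the for loop can split the list
mypath=$(echo "$PATH" | sed 's/:/ /g')
for directory in $mypath
do
    count=0
    # Skip PATH entries that are not existing directories
    [ -d "$directory" ] || continue
    for item in "$directory"/*
    do
        # Count only regular files that are executable
        if [ -f "$item" ] && [ -x "$item" ]
        then
            count=$(( count + 1 ))
        fi
    done
    echo "$directory - $count"
done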
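
As a further sketch of regular expressions applied to PATH, an end anchor can pick out only the directories whose names end in "bin". The function name bin_dirs and the sample path string are hypothetical, and tr is used here (an addition, not part of the script above) to put each directory on its own line.

```shell
# Hypothetical helper: print the entries of a PATH-style string that end in "bin"
bin_dirs() {
    # tr puts each directory on its own line; sed keeps lines ending in "bin"
    echo "$1" | tr ':' '\n' | sed -n '/bin$/p'
}

result=$(bin_dirs "/usr/local/bin:/usr/games:/opt/tools/sbin:/home/user/scripts")
echo "$result"
```

Calling bin_dirs "$PATH" applies the same filter to the real search path.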
