Linux Systems Administration Sed - Awk - and Perl v1
Lesson 2: Sed
Sed
Multiple Commands
Output Redirection
Sed Script Files
Lesson 3: Awk
Awk
Awk Example
Lesson 5: Perl
What is Perl?
Getting Started
Variables
Operators
Lesson 6: If Statements and Loops
If Statements
Embedded Ifs and Multiple Conditions
String Conditions
Lesson 7: Loops
While Loops
For Loops
Last and Next
Lesson 9: Input
Getting Input
In More Ways Than One
Still More Ways
Opening a File for Reading
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
See http://creativecommons.org/licenses/by-sa/3.0/legalcode for more information.
Introduction to Scripting
To learn a new skill or technology, you have to experiment. The more you experiment, the more you learn. Our system
is designed to maximize experimentation and help you learn to learn a new skill.
We'll program as much as possible to be sure that the principles sink in and stay with you.
Each time we discuss a new concept, you'll put it into code and see what YOU can do with it. On occasion we'll even
give you code that doesn't work, so you can see common mistakes and how to recover from them. Making mistakes
is actually another good way to learn.
Above all, we want to help you to learn to learn. We give you the tools to take control of your own learning experience.
When you complete an OST course, you know the subject matter, and you know how to expand your knowledge, so
you can handle changes like software and operating system updates.
Type the code. Resist the temptation to cut and paste the example code we give you. Typing the code
actually gives you a feel for the programming task. Then play around with the examples to find out what else
you can make them do, and to check your understanding. It's highly unlikely you'll break anything by
experimentation. If you do break something, that's an indication to us that we need to improve our system!
Take your time. Learning takes time. Rushing can have negative effects on your progress. Slow down and
let your brain absorb the new information thoroughly. Taking your time helps to maintain a relaxed, positive
approach. It also gives you the chance to try new things and learn more than you otherwise would if you
blew through all of the coursework too quickly.
Experiment. Wander from the path often and explore the possibilities. We can't anticipate all of your
questions and ideas, so it's up to you to experiment and create on your own. Your instructor will help if you
go completely off the rails.
Accept guidance, but don't depend on it. Try to solve problems on your own. Going from
misunderstanding to understanding is the best way to acquire a new skill. Part of what you're learning is
problem solving. Of course, you can always contact your instructor for hints when you need them.
Use all available resources! In real-life problem-solving, you aren't bound by false limitations; in OST
courses, you are free to use any resources at your disposal to solve problems you encounter: the Internet,
reference books, and online help are all fair game.
Have fun! Relax, keep practicing, and don't be afraid to make mistakes! Your instructor will keep you at it
until you've mastered the skill. We want you to get that satisfied, "I'm so cool! I did it!" feeling. And you'll have
some projects to show off when you're done.
Lesson Format
We'll try out lots of examples in each lesson. We'll have you write code, look at code, and edit existing code. The code
will be presented in boxes that will indicate what needs to be done to the code inside.
Whenever you see white boxes like the one below, you'll type the contents into the editor window to try the example
yourself. The CODE TO TYPE bar on top of the white box contains directions for you to follow:
CODE TO TYPE:
White boxes like this contain code for you to try out (type into a file to run).
If you have already written some of the code, new code for you to add looks like this.
If we want you to remove existing code, the code to remove will look like this.
We may also include instructive comments that you don't need to type.
We may run programs and do some other activities in a terminal session in the operating system or other
command-line environment. These will be shown like this:
INTERACTIVE SESSION:
Code and information presented in a gray OBSERVE box is for you to inspect and absorb. This information is often
color-coded, and followed by text explaining the code in detail:
OBSERVE:
Gray "Observe" boxes like this contain information (usually code specifics) for you to
observe.
The paragraph(s) that follow may provide additional details on information that was highlighted in the Observe box.
We'll also set especially pertinent information apart in "Note" boxes:
Note Notes provide information that is useful, but not absolutely necessary for performing the tasks at hand.
Tip Tips provide information that might help make the tools easier for you to use, such as shortcut keys.
WARNING Warnings provide information that can help prevent program crashes and data loss.
The CodeRunner Screen
This course is presented in CodeRunner, OST's self-contained environment. We'll discuss the details later, but here's
a quick overview of the various areas of the screen:
These videos explain how to use CodeRunner:
Code Editor Demo
Coursework Demo
INTERACTIVE SESSION:
cold1:~$
When you see this prompt, you are logged in and ready to work! After you finish working on your coursework, log out
from the machine using the exit command:
INTERACTIVE SESSION:
cold1:~$ exit
Great! You know how to access your Linux prompt. Now, before we start using Linux, I'd like you to know a little about its
history and why you've made a smart decision in choosing to learn the language:
What is Scripting?
Scripting is the term applied to a small program that is typically used to make a large job easier or to automate a
repetitive task. A script might be run using cron to rotate system logs every few days. You can also use scripts to clean
old users out of the password file or to retain statistics about system usage over time.
In this lesson, you will learn how to use sed and awk on the command line. Beginning with these concepts, you will
see how powerful Perl can be in the hands of a systems administrator.
Line Editing
Before we get into sed, we need to gain a basic understanding of line editing. You are probably very familiar with text
editors and word processors that allow you to look at a document while you move around and make changes. This is
sometimes called full text editing because you can see the entire document.
Line editing, on the other hand, is used to make changes to a single line of a file at a time. This is done using special
editing commands rather than making the changes by hand.
Let's use ed (a standard Unix line editor) to make this all a little clearer. First we need a file to edit. Copy
/httpd/conf/httpd.conf into your home directory on the cold.
After the command prompt, type the following commands:
cold:~$ cp /httpd/conf/httpd.conf .
After the command prompt, type the following commands:
cold:~$ ed httpd.conf
1168702
p
q
cold:~$
The first thing you will notice is that ed reports the number of characters in the file. I am not really sure why a line editor
reports the number of characters, but it does. After displaying the number of characters, your cursor just sits there
without a prompt. ed has positioned you at the last line of the file and is waiting for you to tell it what to do. The final
instruction used was the q command to quit the current editing session. Use p to print the current line. In the example
above, the last line of the httpd.conf file is empty.
Note The standard Unix method of typing ctrl+c to break out of a command will not work with ed.
It does not do us much good to just sit on the last line of a file. We can specify a different line by typing a number. For
example, if we want to go to line 5, we would just type 5.
After the command prompt, type the following commands:
cold:~$ ed httpd.conf
1168702
5
ServerRoot /httpd
The d command deletes the current line. After deleting line 5 we printed out the current line. We are still on line 5, but the
old line is gone (we are at what used to be line 6). w is typed to save our changes. This writes the file. Finally, ed
displays the updated total number of characters in the file after writing.
Typing one command at a time can be pretty boring though. Lucky for us, ed allows us to combine commands into a
single string.
After the command prompt, type the following commands:
cold:~$ ed httpd.conf
1168684
5dp
ScoreBoardFile /httpd/logs/httpd.scoreboard
wq
1168614
cold:~$
The first string says, "go to line 5, delete the line, and then print the current line (the new line 5)." This string can be
broken up into two parts: the address and the command. The address in this case is simply a line number. That is all
fine and good, but it is only useful to use a line number if we know exactly where we want to make changes.
Patterns as Addresses
Instead of using line numbers, ed allows us to pick lines that match a specified pattern. Patterns are Unix regular
expressions contained inside two forward slashes (/). Let us find and print the first line that contains "htdocs".
After the command prompt, type the following commands:
cold:~$ ed httpd.conf
1168614
/htdocs/
#DocumentRoot /httpd/htdocs
Chances are, there are multiple lines that will match that pattern. We can use g to run a global command.
The print command is implied. Let us look at what we have. This is a global pattern that is a regular expression. The
default command is to print out the matching lines. Hmmm... global regular expression print. You remember the grep
command, right? That is no coincidence.
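The family resemblance can be checked from the shell. The sketch below is only an illustration: sample.conf is a made-up stand-in for httpd.conf, and the ed run is skipped if ed is not installed. It assumes ed accepts the -s option to suppress the character counts.

```shell
# Build a small stand-in for httpd.conf (contents are made up for illustration).
workdir=$(mktemp -d)
cat > "$workdir/sample.conf" <<'EOF'
ServerName example
#DocumentRoot /httpd/htdocs
ErrorLog logs/error_log
#DocumentRoot "/home/webpages/userworld/htdocs"
EOF

# grep prints every line that matches the regular expression.
grep_out=$(grep 'htdocs' "$workdir/sample.conf")
echo "$grep_out"

# The same request phrased as an ed global command: g/re/p, then q to quit.
if command -v ed >/dev/null 2>&1; then
    printf '%s\n' 'g/htdocs/p' 'q' | ed -s "$workdir/sample.conf"
fi
rm -rf "$workdir"
```

Both commands should list the same two DocumentRoot lines.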
Imagine for a moment that this is not an important file (actually, it is not, it is just a copy). We can add commands to our
address pattern.
We have just deleted every line that contains "htdocs". Oops. If you try to quit ed after making changes and without
writing the file first, it will give you a question mark "?". If you want to quit without saving changes, just type q a second
time.
OBSERVE:
s/pattern/replacement/
Recall that we needed to use the g flag in order to use a command on every line that matched the address pattern. The
same is true for the substitute (s) command. It will only replace the first occurrence of the pattern within a line unless we
specify otherwise with a trailing g. Put this together with what we already know and we can replace every occurrence of
"specific" with "precise".
After the command prompt, type the following commands:
cold:~$ ed httpd.conf
1168614
g/specific/s/specific/precise/g
p
# you might expect, make sure that you have preciseally enabled it
This command will match any line with "specific" in it and then replace every occurrence of "specific" with "precise."
The current line becomes the last line where a change took place. Printing the line shows that the results may not be as
we intended. Remember, we are matching a pattern, not just words. That is how "specifically" became "preciseally". A
better set of commands would include spaces on either side of "specific" like the following:
The first command would replace "specific" found in the middle of a line. Do you remember what the other two would
do?
Sed
Sed
When we want to write scripts that can use the capabilities of ed we look to sed. You can think of this as Scripting with
Ed. First, let us learn how to use sed on the command line.
Unlike ed, we cannot use sed interactively. We have to give sed a file and a list of commands to perform on it. Let us
apply some of the same commands we went through with ed to sed to see how sed operates.
After the command prompt, type the following commands:
cold:~$ sed -n '/htdocs/p' httpd.conf
#DocumentRoot /httpd/htdocs
#DocumentRoot "/home/webpages/userworld/htdocs"
This is the same output we got when we used g/htdocs/ with ed. One of the major differences between sed and ed is
that sed looks at every line in the file automatically. As a result we do not have to include the beginning g. Additionally,
sed will print out every line in the file regardless of whether it was changed, unless we tell it not to with the -n flag.
(Printing out every line is useful for output redirection, as we will see later.) Also distinguishing sed from ed is the
inclusion of the p command to print out the lines that were changed or, in this case, the lines that matched the pattern
of "htdocs".
After the command prompt, type the following commands:
cold:~$ sed -n 's/specific/precise/gp' httpd.conf
# Note that from this point forward you must preciseally allow
# you might expect, make sure that you have preciseally enabled it
Notice that we did not include the address pattern. Since sed looks at every line of the file, when the address and
replacement pattern are the same, it is not necessary to include the address pattern. However, we could replace only
on lines that contain "Note".
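Here is a minimal sketch of that idea. The two sample lines are reconstructed from the output shown above; sample.conf is a hypothetical file name.

```shell
workdir=$(mktemp -d)
cat > "$workdir/sample.conf" <<'EOF'
# Note that from this point forward you must specifically allow
# you might expect, make sure that you have specifically enabled it
EOF

# Only lines matching /Note/ are candidates; on those, replace every "specific".
# -n suppresses the default output and the trailing p prints the changed lines.
note_out=$(sed -n '/Note/s/specific/precise/gp' "$workdir/sample.conf")
echo "$note_out"
rm -rf "$workdir"
```

Only the first line is printed, because the second line never matches the "Note" address and is left alone.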
After the command prompt, type the following commands:
Multiple Commands
Sed will also execute more than one command as it reads through the lines of a file. There are two ways we can get
sed to execute multiple commands in a single pass. One way is to use the -e flag for each command we want to
execute.
After the command prompt, type the following commands:
After the command prompt, type the following commands:
This proves that both commands get executed on each line. We have already printed the line with the first command,
but the second command matches the line as well and ends up causing it to be printed again.
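A small sketch of the double-printing effect, assuming a two-line sample file and two hypothetical patterns chosen so that one line matches both:

```shell
workdir=$(mktemp -d)
cat > "$workdir/sample.conf" <<'EOF'
ServerName example
#DocumentRoot /httpd/htdocs
EOF

# Each -e adds one command; sed runs both, in order, on every input line.
# The second line matches both patterns, so it is printed twice.
twice_out=$(sed -n -e '/htdocs/p' -e '/DocumentRoot/p' "$workdir/sample.conf")
echo "$twice_out"
rm -rf "$workdir"
```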
Another way to use multiple commands is to separate them with a semi-colon.
After the command prompt, type the following commands:
cold:~$ sed -n '/htdocs/s/home/house/g ; /htdocs/p' httpd.conf
#DocumentRoot /httpd/htdocs
#DocumentRoot "/house/webpages/userworld/htdocs"
This command replaces "home" with "house" on any line that contains "htdocs." Also, any line with "htdocs" is printed
regardless of whether a change took place.
Output Redirection
We can make changes to the lines of a file, but how do we save them? By taking advantage of Unix output redirection.
Remember we are using the -n option to keep sed from printing out all of the lines. If we remove that option, the entire
file, with any changes, will be printed to standard output. We can redirect this output to another file.
After the command prompt, type the following commands:
cold:~$ sed '/htdocs/s/home/house/g' httpd.conf > httpd2.conf
This command line makes the desired changes to the lines and prints all of the lines to the new file: httpd2.conf. When
you are not using the -n option, you typically do not want to use the print option (p) either. Doing so would cause the
lines to print out twice.
WARNING Remember, we cannot write to the same file from which we are reading. Attempting to do this can
cause unpredictable results.
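One common way to respect that warning is to write to a second file and rename it only after sed has finished. The sketch below uses made-up file contents in a temporary directory, and the mv step is a suggestion rather than something the lesson prescribes.

```shell
workdir=$(mktemp -d)
printf '%s\n' '#DocumentRoot "/home/webpages/userworld/htdocs"' 'ServerName home' \
    > "$workdir/httpd.conf"

# Without -n, sed prints every line, changed or not, so the new file is complete.
sed '/htdocs/s/home/house/g' "$workdir/httpd.conf" > "$workdir/httpd2.conf"

# Only after sed has finished is it safe to replace the original.
mv "$workdir/httpd2.conf" "$workdir/httpd.conf"
redirect_out=$(cat "$workdir/httpd.conf")
echo "$redirect_out"
rm -rf "$workdir"
```

The "ServerName home" line is untouched because it does not match the "htdocs" address.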
After the command prompt, type the following commands:
Instead of deleting lines or replacing whole patterns, we have the option of translation. The idea behind translation is
that you can replace one set of characters with another set, but they do not have to be in any particular order or next to
each other. Here is an example:
After the command prompt, type the following commands:
cold:~$ sed -n 'y/DR/dr/ ; /htdocs/p' httpd.conf
#documentroot /httpd/htdocs
#documentroot "/home/webpages/userworld/htdocs"
Notice, we are not replacing "DR" with "dr", but instead we are translating any uppercase D to a lowercase d and any
uppercase R to a lowercase r. "DocumentRoot" becomes "documentroot". Keep in mind though, the changes are
applied on any line with a capital D or R, not just the ones we printed out. You are not restricted to using like letters
either. We could have replaced A with z or t with 3.
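A quick sketch of both points, using short made-up input strings piped into sed:

```shell
# y maps characters one-for-one: D becomes d and R becomes r wherever they appear.
y_out=$(printf '%s\n' 'DocumentRoot /httpd/htdocs' 'ReaD' | sed 'y/DR/dr/')
echo "$y_out"

# The two character sets need not be related: A becomes z and t becomes 3.
y_mixed=$(echo 'At' | sed 'y/At/z3/')
echo "$y_mixed"
```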
/Indexes/d
s/ document / letter /g
s/ document$/ letter/
s/^document /letter /
/htdocs/s/home/house/g
w httpd2.conf
Here the w command writes lines to a file. It allows us to create the output file in the script instead of redirecting the output
on the command line. Let us add the -n flag back into the sed command line.
After the command prompt, type the following commands:
cold:~$ sed -nf changes.sed httpd.conf
The -f flag lets us specify the script file to use when processing our file.
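To try the whole flow in isolation, here is a minimal sketch that builds a tiny input file and a three-line script in a temporary directory. The names sample.conf and out.conf are hypothetical; changes.sed follows the command line above.

```shell
workdir=$(mktemp -d)
cat > "$workdir/sample.conf" <<'EOF'
Options Indexes MultiViews
#DocumentRoot "/home/webpages/userworld/htdocs"
EOF

# One sed command per line; w writes the lines that reach it to out.conf.
cat > "$workdir/changes.sed" <<'EOF'
/Indexes/d
/htdocs/s/home/house/g
w out.conf
EOF

# -n keeps sed quiet on standard output; the w command still writes the file.
script_out=$(cd "$workdir" && sed -n -f changes.sed sample.conf && cat out.conf)
echo "$script_out"
rm -rf "$workdir"
```

The "Indexes" line is deleted before it reaches w, so only the edited DocumentRoot line lands in out.conf.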
Sed, just like most Unix commands, can read from standard input rather than reading a file. All we have to do is pipe
the output from another command into sed.
After the command prompt, type the following commands:
cold:~$ ps aux |sed -n 'y/1234567890/abcdefghij/ ; /username/p'
username acdii j.j j.b bcbh acfj pts/j S Aprcj j:jj -bash
username aedge j.j j.b bccf achj pts/b S Mayja j:jj -bash
username cagcj j.j j.a bedh gdd pts/b R ab:df j:jj ps aux
username cagca j.j j.j acdh dbh pts/b S ab:df j:jj sed -n y/abcdefgh
You can read about many more sed commands on the sed man page. See you at the next lesson!
Awk
Awk
Awk is the companion of sed. It works almost the same way, by examining each line of input. The main difference
between the two is that awk automatically divides each line into fields; sed does not. Awk divides a line into fields by
separating words divided by spaces, but later we will learn how to change the field separator to anything we want. But
right now, let us look at how to print out every line that contains "Indexes" using awk.
After the command prompt, type the following commands:
cold:~$ awk '/Indexes/ { print $0 }' httpd.conf
Options FollowSymLinks ExecCGI Indexes Includes
Options FollowSymLinks Indexes Includes
# This may also be "None", "All", or any combination of "Indexes",
Options Indexes FollowSymLinks ExecCGI Includes
Options Indexes MultiViews
Here we have matched the line and used the print command to print out $0. Since awk divided the line into fields, we
need a way to reference them. $0 references the entire line. If we wanted to print out just the second field, it would be an
easy change.
After the command prompt, type the following commands:
cold:~$ awk '/Indexes/ { print $2 }' httpd.conf
FollowSymLinks
FollowSymLinks
This
Indexes
Indexes
Cool, huh? But using httpd.conf does not really show off the true power of awk. Let us try working with a different file.
After the command prompt, type the following commands:
cold:~$ cp /etc/passwd .
You are already familiar enough with /etc/passwd to know that it has colon separated fields. Let us make awk separate
by colons instead of spaces.
After the command prompt, type the following commands:
cold:~$ awk -F ":" '{ print $1 }' passwd
root
bin
daemon
...
This command prints out the first field, which happens to contain the usernames. We can print out whichever field we
want though. Having only this basic understanding of awk, we can see immediately how useful it is when combined
with other Unix commands. Check out the command below:
The same thing could be done without grep, though it would be slightly slower.
Observe the following:
The commands inside of the back ticks (`) are executed first and the resulting output is used as a list for kill -9. ps aux
retrieves a list of all of the running processes and the output is piped to awk which prints out the second field of any
line that contains "username". The second field happens to be the PID of the processes. The result is that all of the
processes owned by that username get killed.
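Since kill -9 is not something to experiment with casually, here is a harmless sketch of the same mechanics. It assumes canned text in the "ps aux" layout (the values are made up) and uses echo to show the command that would run instead of running it.

```shell
# Canned lines in "ps aux" layout (values are made up); field 2 is the PID.
ps_sample='username  8210  0.0  0.2  2336 1376 pts/1 S Sep25 0:00 -bash
root         1  0.0  0.1  1520  592 ?     S Sep01 0:04 init
username  8694  0.0  0.2  2332 1364 pts/2 S Sep25 0:00 -bash'

# The back-ticked pipeline runs first; its output becomes the argument list.
pids=`printf '%s\n' "$ps_sample" | awk '/username/ { print $2 }'`

# echo shows the command that would run, without killing anything.
echo kill -9 $pids
```

Keep in mind that the pattern matches "username" anywhere on the line, so on a real system it can also pick up other users' processes whose command line happens to contain that text.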
Awk Example
Let us say we want to find out the total number of unique users currently logged into a system. We can retrieve a list of
users with the w command.
After the command prompt, type the following commands:
cold:~$ w
11:09am up 1:37, 1 user, load average: 0.00, 0.00, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
kbutson pts/8 cold.useractive. 10:36am 0.00s 0.09s 0.02s w
Hmm...well, that was not very hard. We have just one unique user (there may be more when you try it). Let us look at an
example of a server that is a little more active. In your home directory you will find a file called w.txt. Let us look at the
contents of this file.
After the command prompt, type the following commands:
cold:~$ cat w.txt
3:36pm up 24 days, 20:17, 22 users, load average: 0.05, 0.06, 0.08
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
johndoe2 ttyq0 v1740426-a.domai 10:31am 5:03m 0.05s 0.05s -bash
fsmith ttypf work987.useracti 2:01pm 1:20m 4.08s 4.03s pine
johndoe2 ttyq3 work1.useractive 11:25am 14:12 0.19s 0.19s -bash
alexandr ttyp6 host.useractive. 8:08am 2:06m 0.08s 0.08s -bash
johndoe2 ttyq5 work1.useractive 11:28am 3:51m 0.06s 0.06s -bash
george11 ttypd work56.useractiv Mon 1pm 0.00s 0.60s 0.07s w
monkey ttyq7 work2.useractive 2:02pm 27:21 0.66s ? -
george11 ttyq8 work56.useractiv 1:11pm 1:54 1.20s 0.94s emacs -l
johndoe2 ttyq2 work1.useractive 11:24am 3:38m 0.22s ? -
johndoe2 ttyp0 v1740426-a.domai 1:18am 5:23m 0.29s 0.13s slogin f
monkey ttyp1 v1825631-a.domai Mon12pm 3:40 0.27s 0.27s -bash
george11 ttyp5 v1740426-a.domai 2:08am 6:12m 0.09s 0.09s -bash
johndoe2 ttyp8 work1.useractive Fri 2pm 18:59 0.23s ? -
mouse ttype v946166-a.domain 12:41pm 10:59 0.88s 0.84s /usr/loc
monkey ttyq9 monkeybar.useract 11:56am 50:37 0.64s 0.51s emacs i
johndoe2 ttypb work1.useractive 11:21am 3:23m 0.07s 0.07s -bash
monkey ttyp7 v1825631-a.domai Mon12pm 10:58m 0.63s 0.51s emacs in
mouse ttyp3 v946166-a.domain 12:43pm 2:22 1.43s ? -
catdog ttyp9 work52.useractiv 3:06pm 8:31 1.15s 1.13s pine
monkey ttyp4 v1825631-a.domai 1:29am 11:33m 0.39s 0.31s emacs in
fsmith ttyqc work987.useracti 1:19pm 2:06m 0.07s 0.07s -tcsh
If we count these, we will see that there are 7 unique users. Our goal is to retrieve that number directly by using several
Unix commands. How do we build something that will accomplish this? Well, think about how you counted them. The
first step was to eliminate all of the stuff that did not matter, everything that is not a username. We began by getting rid
of the header information provided by w so that all of the lines retrieved were of the same format. Lucky for us, w had a
-h flag that suppressed the headers. Output representing w -h was stored in a file called wh.txt.
The next step is to take all of the lines (which now look pretty much the same) and figure out how to return only the
usernames. This is where awk comes in handy.
After the command prompt, type the following commands:
cold:~$ cat wh.txt |awk '{ print $1 }'
johndoe2
fsmith
johndoe2
alexandr
johndoe2
george11
monkey
george11
johndoe2
johndoe2
monkey
george11
johndoe2
mouse
monkey
johndoe2
monkey
mouse
catdog
monkey
fsmith
That is already a lot easier to read. Unix provides us with another useful command called uniq. This command will
remove duplicates from a sorted list. But how do you get your list sorted in the first place? Since sorting is a pretty common
task, I bet Unix has a sort command.
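The reason sorting comes first is that uniq only collapses duplicates that sit next to each other. A short sketch, using a handful of the usernames from the listing above:

```shell
# A few usernames in arrival order (taken from the first column shown above).
names='johndoe2
fsmith
johndoe2
alexandr
fsmith'

# uniq only collapses adjacent duplicates, so on unsorted input nothing changes.
unsorted_out=$(printf '%s\n' "$names" | uniq)

# After sort, identical names are neighbors and uniq keeps one of each.
sorted_out=$(printf '%s\n' "$names" | sort | uniq)
echo "$sorted_out"
```

The first pipeline still returns five lines; the second returns three.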
After the command prompt, type the following commands:
Almost there. This is really easy to read, but remember our ultimate goal is to find the actual number of unique users.
The wc command stands for word count. With the -l (a lowercase L) option, wc will return the number of lines (which
is also the number of unique users).
After the command prompt, type the following commands:
cold:~$ cat wh.txt |awk '{ print $1 }' |sort |uniq |wc -l
7
Awesome.
As this example shows, it is useful to think of all these different Unix commands as tools in a toolbox. In most cases,
using only one tool won't do the job, but using many tools together in the right order can solve almost any problem.
Basic Shell Scripting
Shell Scripts
We are already familiar with the concept of a Unix shell. It is the environment that allows you to interact with the server. We
use the shell by issuing it a series of commands at the prompt. Many times, we find ourselves issuing the same set of
commands over and over. When this is the case, we can write a shell script to speed things up. A script is a term often
used to refer to a small program.
Shell scripts are mostly just Unix commands with the addition of variables and logical statements. These variables and
statements are capabilities built in to your shell. Each shell (sh, tcsh, etc.) has a little different syntax, but we will
continue focusing on bash.
Let us take a look at a very basic shell script. Edit a file called script.sh.
uptime
Save this file. The first line needs to be in every bash shell script. It indicates the location of bash on the server. We
need to make this script executable before we can run it.
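Here is a minimal sketch of the full cycle: write a script, make it executable, run it. It assumes bash is installed at /bin/bash, which may differ on your server, and it uses a simple echo in place of a real command.

```shell
workdir=$(mktemp -d)

# Assumption: bash lives at /bin/bash; the location may differ on your system.
cat > "$workdir/script.sh" <<'EOF'
#!/bin/bash
echo "script ran"
EOF

# Until the execute bit is set, running the script fails with "Permission denied".
chmod 755 "$workdir/script.sh"
run_out=$("$workdir/script.sh")
echo "$run_out"
rm -rf "$workdir"
```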
After the command prompt, type the following commands:
After the command prompt, type the following commands:
cold:~$ ./script.sh
7
Now, anytime we want to find out the number of unique users on the system, instead of typing out that whole huge
command line, all we have to do is run script.sh.
Shell Variables
Just like our shell has environment variables (ex: PATH), our shell scripts may have variables too. These variables are
assigned values and used in exactly the same way they are on our shell. Edit a new file called var.sh.
VAR="hello"
echo $VAR
When we want to assign a value to a variable, we simply indicate the variable name. However, when we want to access
the variable, we need to use the "$" symbol. Save and execute var.sh.
After the command prompt, type the following commands:
VAR=hello
echo $VAR
echo $PATH
After the command prompt, type the following commands:
cold:~$ ./var.sh
hello
/users/username/.gemhome/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
The next step is to assign the output of a command (or set of commands) to a variable that we can use later. Edit the
previous script.sh file again.
Here, we have added backward single quotation marks around our command string to indicate that we want to store
the output in NUM. By the way, the backward quotation mark key is the same key as the tilde (~) key.
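A small sketch of the technique, using a made-up list of names in place of live w output so that the result is predictable:

```shell
# Back quotes run the enclosed pipeline and hand its output to the assignment.
NUM=`printf '%s\n' alice bob alice | sort | uniq | wc -l`

# $(...) is an equivalent spelling that is easier to read and to nest.
NUM2=$(printf '%s\n' alice bob alice | sort | uniq | wc -l)

echo $NUM
```

Both variables end up holding 2, the number of distinct names.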
After the command prompt, type the following commands:
cold:~$ ./script.sh
2
IF Statements
Many times, we will want to execute a certain series of commands only if a variable has a specific value. To test for
these conditions, we need to use an if statement. Let us try it. Edit script.sh again.
if [ $NUM = 1 ]; then
w
echo $NUM
fi
Adjust the condition depending on the number of users currently on the cold when you try this example. Save and
execute script.sh.
After the command prompt, type the following commands:
cold:~$ ./script.sh
2:42pm up 31 days, 2:58, 2 users, load average: 0.00, 0.00, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
username pts/1 home.server.co 25Sep01 1:23m 0.12s 0.12s -bash
username pts/2 home.server.co 25Sep01 0.00s 0.22s 0.02s bash ./scr
1
Here we can see the script in action. First it gathers the number of unique users on the system. If there is one user, it
will execute the w command and print out the value of NUM. This reads as "If NUM is equal to 1, then perform the
desired operations".
Note that the syntax (format and spacing) of the if statement must be exact. For example, this will not work:
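The sketch below spells out the spacing rules. The two commented-out lines are typical mistakes chosen for illustration, not necessarily the exact case the lesson had in mind.

```shell
NUM=1

# Correct: spaces after "[", around "=", and before "]"; "then" follows a ";".
if [ "$NUM" = 1 ]; then
    result="matched"
else
    result="not matched"
fi
echo "$result"

# Common mistakes, left commented out because they do not behave as intended:
#   if [$NUM = 1]; then    # "[1" is read as a command name and is not found
#   if [ $NUM=1 ]; then    # "1=1" is a single non-empty word, so always true
```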
That works well when comparing numerical values, but we also need to be able to compare strings. In order to
compare strings, we use double quotation marks.
if [ $NUM = 1 ]; then
w
echo $NUM
fi
As always, replace "username" with your own login. Here we are checking to make sure the USER environment
variable is equal to our username.
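A minimal sketch of a quoted string comparison. To keep it runnable anywhere, a hypothetical variable current_user stands in for $USER, and "username" is a placeholder login.

```shell
# Hypothetical login name; substitute your own.
expected_user="username"
current_user="username"   # stands in for $USER so the sketch runs anywhere

# Quoting both sides keeps the test valid even if a value is empty or has spaces.
if [ "$current_user" = "$expected_user" ]; then
    msg="hello, $current_user"
else
    msg="unexpected user"
fi
echo "$msg"
```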
After the command prompt, type the following commands:
cold:~$ ./script.sh
3:28pm up 31 days, 3:44, 2 users, load average: 0.00, 0.00, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
username pts/1 home.server.co 25Sep01 2:09m 0.12s 0.12s -bash
username pts/2 home.server.co 25Sep01 0.00s 0.24s 0.02s bash ./scr
1
username 8210 0.0 0.2 2336 1376 pts/1 S Sep25 0:00 -bash
username 8694 0.0 0.2 2332 1364 pts/2 S Sep25 0:00 -bash
username 30964 0.0 0.1 2000 972 pts/2 S 15:28 0:00 bash ./script.sh
username 30972 0.0 0.1 2548 744 pts/2 R 15:28 0:00 ps aux
username 30973 0.0 0.1 1520 592 pts/2 S 15:28 0:00 grep username
Wrap Up
Shell scripts are used most often during system startup when other scripting languages might not be available. Take a
look at /etc/rc.d/rc.sysinit for a good shell script example.
With the increasing popularity of Perl, people are using shell scripts much less for day-to-day scripting needs. As a
result, we will be focusing on Perl for the rest of this course.
Perl
What is Perl?
The first version of Perl (Practical Extraction and Report Language) was written by Larry Wall in 1987. Since its creation,
Perl has become popular with systems administrators because of its ease of use and power with text manipulation. It
is more versatile than sed or awk and is typically easier to write than other languages. Perl is an immensely popular
language for writing CGI scripts, but we will be concentrating on its usefulness to systems administrators.
Perl is an interpreted language, meaning that a script's instructions are translated and carried out by the perl program
each time the script is executed. Conversely, a compiled language is converted to machine language before its execution.
We are going to spend the next few lessons becoming familiar with the fundamentals of Perl syntax before applying it
to systems administrator tasks.
Note If you have taken our CGI/Perl class, a lot of the Perl syntax will be familiar to you. However, the
application of Perl in a non-CGI environment will be something you are not familiar with yet.
Getting Started
The first thing we need to do is locate the Perl executable. We can find it by using the code below:
After the command prompt, type the following commands:
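One way to look it up, assuming a POSIX-style shell, is the command -v builtin (which perl is a common alternative). This is only a sketch; the path it prints depends on your system, and the fallback message is there in case Perl is not installed.

```shell
# command -v prints the full path of the first perl found on PATH.
perl_path=$(command -v perl || echo "perl not found on PATH")
echo "$perl_path"
```

Whatever path is reported is the one to put on the first line of your scripts.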
The very first line of our Perl scripts will contain the location of the Perl executable. Let us go ahead and write our first
Perl script. Edit a file called test.pl so that it contains the following lines:
#!/usr/local/bin/perl
print "blaa";
This may sound a tad obvious, but it is important to make sure that the first line starts at the very beginning on the far
left side. Also, notice that the print statement ends in a semicolon. Most lines in Perl will end in a semicolon because
that denotes the end of a statement. Save the script and exit your editor.
After the command prompt, type the following commands:
cold:~$ ./test.pl
bash: ./test.pl: Permission denied
Hmm...that is not good. We have to make the script executable before we can use it.
After the command prompt, type the following commands:
cold:~$ chmod 755 test.pl
cold:~$ ./test.pl
blaacold:~$
We just printed out "blaa", but it would be a little nicer if it were on a line by itself.
Make this change to test.pl:
#!/usr/local/bin/perl
print "blaa\n";
The backslash (\) is called the escape character in Perl. This indicates to Perl that the character right after it should be
treated specially. For instance, the combination \n represents a new line. The escape character can also be used to
print quotes that would otherwise interfere with our print statement.
print "\"blaa\"\n";
After the command prompt, type the following commands:
cold:~$ ./test.pl
"blaa"
Variables
Just printing out lines of text does not do us much good on its own, though. Let us go over some different variable
types in Perl. A variable can be thought of as a box that contains a piece of data. When storing something in a variable,
it does not matter if it is a number or a string.
$var = 3;
$var2 = "This is a string";
print "$var\n";
print "$var2\n";
After the command prompt, type the following commands:
cold:~$ ./test.pl
3
This is a string
Regular variables in Perl (called scalars) always start with $. Here, we have created a couple of simple variables and
printed them out. They are not particularly useful individually, though. We need to be able to change the values of
those variables and use them in conjunction with each other.
Operators
In the previous example, when we created our variables we gave them values with the assignment operator, or equals
sign (=). We can also perform mathematical operations on them. Let us create a new file called operator.pl.
Add these lines to operator.pl:
#!/usr/local/bin/perl
$var1 = 3;
$var2 = 5;
Here we can see that basic math operations work just fine. Test this script to make sure you get the right values.
Remember, you will have to make the file executable before you can run it.
Note: Comments in Perl start with a pound sign (#). Anything after a pound sign is ignored when the script is
executing. Comments are used to explain the code to other programmers who look at your code. It is
always a good idea to comment your code, especially as your code gets more complex.
$var1 = 3;
$var2 = 5;
Notice that we can use a variable in the operation to assign it a new value. Here we have used the current value of
$var1 to come up with a new one. The exponent operator (**) raises $var1 to the power of $var2.
An operator you might not be familiar with is the modulus (%). This is not a percentage operation. The modulus acts
like a division operation except that, instead of returning the answer, it returns the remainder. For example: 5 % 2 = 1 (5
divided by 2 equals 2 with a remainder of 1).
All of these operations deal with numbers, but we know that a variable can store a string as well. Two strings can be
"added" together with the concatenation operator (which happens to be a period).
$var1 = 3;
$var2 = 5;
$string1 = "This is my ";
$string2 = "sentence.";
The concatenation operator simply pushes all the strings together into one string. Save operator.pl and test it out.
After the command prompt, type the following commands:
cold:~$ ./operator.pl
243 3
This is my sentence. Neat, huh?
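The pieces of operator.pl discussed above can be pulled together into one sketch. This is a reconstruction under assumptions, not the lesson's exact listing: the names $sum, $difference, $product, $quotient, $remainder, and $sentence are made up for illustration, and use strict, use warnings, and my are additions to the lesson's style. The exponent and modulus steps are chosen so that they reproduce the 243 and 3 shown in the session above.

```perl
use strict;
use warnings;

my $var1 = 3;
my $var2 = 5;

# Basic math on two variables.
my $sum        = $var1 + $var2;    # 8
my $difference = $var2 - $var1;    # 2
my $product    = $var1 * $var2;    # 15
my $quotient   = $var2 / $var1;    # roughly 1.67; Perl does not truncate division

# Use the current value of $var1 to compute its new value.
$var1 = $var1 ** $var2;            # 3 raised to the 5th power is 243
my $remainder = $var1 % $var2;     # 243 divided by 5 leaves a remainder of 3

# The period pushes strings together.
my $string1  = "This is my ";
my $string2  = "sentence.";
my $sentence = $string1 . $string2 . " Neat, huh?";

print "$sum $difference $product\n";
print "$var1 $remainder\n";
print "$sentence\n";
```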
Perl provides us with shortcuts for a lot of operations. Each shortcut is simply a shorter way of writing an expanded
version of the same operation.
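As a sketch, here are some of the shortcut operators Perl provides, each with its expanded version in a comment. The variable names $count and $text are made up for illustration, and the selection of operators is an assumption about which shortcuts are worth showing first.

```perl
use strict;
use warnings;

my $count = 10;

$count += 5;     # same as $count = $count + 5;    now 15
$count -= 3;     # same as $count = $count - 3;    now 12
$count *= 2;     # same as $count = $count * 2;    now 24
$count /= 4;     # same as $count = $count / 4;    now 6
$count **= 2;    # same as $count = $count ** 2;   now 36
$count %= 10;    # same as $count = $count % 10;   now 6
$count++;        # same as $count = $count + 1;    now 7
$count--;        # same as $count = $count - 1;    now 6

my $text = "sed, awk";
$text .= ", and Perl";    # same as $text = $text . ", and Perl";

print "$count\n";
print "$text\n";
```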
If Statements and Loops
If Statements
Oftentimes we will need to perform a test on a variable before executing some code. We do this using an if statement.
These statements are of the form "if, then":
if(condition){
   operations
}
If the condition turns out to be true, then the operations are executed. If the condition turns out to be false, then the
operations are not executed. In programming, true is commonly represented and interpreted as 1, while false is
represented and interpreted as 0.
Let us look at a couple of examples to see this in action. Create a new file called if.pl.
#!/usr/local/bin/perl
$var1 = 3;
$var2 = 4;
if($var1 == 4){
   print "third test true\n";
}else{
   print "third test false\n";
}
The first if statement tests to see if the variable is less than five. If so, the print statement is executed; otherwise the
script continues after the closing bracket. The second if statement is checked regardless of the result of the first one.
The same is true for the third if statement. In this last case, we are testing to see if the variable is equal to four. Notice
that the test for equality is a double equals sign (==); this is not the same as the assignment operator (=). Additionally,
if the variable is not equal to four, everything inside of the else statement is executed.
Let's save if.pl and try it out. Be sure to change the file permissions first.
After the command prompt, type the following commands:
cold:~$ chmod 755 if.pl
cold:~$ ./if.pl
first test true
second test true
third test false
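A minimal sketch of three independent tests that produces the output shown above. The first and third conditions follow the description in the text; the second condition ($var2 > $var1) is an assumption, since the lesson's own second test is not shown here. The $output variable is also an addition, used so the result can be inspected as one string.

```perl
use strict;
use warnings;

my $var1   = 3;
my $var2   = 4;
my $output = "";

# First test: is $var1 less than five?
if ($var1 < 5) {
    $output .= "first test true\n";
}

# Second test (assumed condition): checked no matter how the first test went.
if ($var2 > $var1) {
    $output .= "second test true\n";
}

# Third test: == compares two values, while = would assign one.
if ($var1 == 4) {
    $output .= "third test true\n";
} else {
    $output .= "third test false\n";
}

print $output;
```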
All of the above operators deal with numbers (with the exception of ! (not), which will invert any conditional statement).
But what about when we are comparing strings? When comparing strings we have to use a different set of operators.
So, you are probably wondering how one string can be less than another string. When Perl compares two strings this
way, it tests how they rank alphabetically. For example, "abc" would be less than "xyz".
Embedded Ifs and Multiple Conditions
$var1 = 1;
$var2 = 0;
$var3 = 5;
if($var1 == 1){
   if($var2 == 0){
      print("\$var2 is 0\n");
   }
   if($var2 == 1){
      print("\$var1 is 1 and \$var2 is 1\n");
   }else{
      if($var3 == 2){
         print("\$var1 is 1, \$var2 is not 1 or 0, and \$var3 is 2\n");
      }
   }
}
Here we have embedded if statements. The second if would not be checked unless the first one was true. Notice that
the if statements are indented to match up with their closing brackets and the code inside is indented even farther. This
is not a requirement for Perl syntax, but it helps to keep things straight. If every statement were written justified to the left,
it would be much harder to tell where one if statement ended and the next one began. Note also that we have used the
escape character (\) in order to print "$". Without the escape character, Perl would substitute the value of the variable in
its place.
We can make the above code run a little faster by using elsif, which stands for else if.
Make the following changes to if.pl:
#!/usr/local/bin/perl
$var1 = 1;
$var2 = 0;
$var3 = 5;
if($var1 == 1){
   if($var2 == 0){
      print("\$var2 is 0\n");
   }elsif($var2 == 1){
      print("\$var1 is 1 and \$var2 is 1\n");
   }else{
      if($var3 == 2){
         print("\$var1 is 1, \$var2 is not 1 or 0, and \$var3 is 2\n");
      }
   }
}
This code has the same output as before. Save and test if.pl to see the results.
Embedded if statements are not the only way to test for multiple conditions. We can also combine several tests in a
single condition.
$var1 = 1;
$var2 = 0;
$var3 = 5;
The if statements are still testing to see if the conditions are true, but the conditions are a little more complicated. In the
first case, the print statement will only be executed if $var1 and $var2 are both equal to 1. A double ampersand (&&)
represents "AND", while a double pipe (||) represents "OR". The condition for the last if statement returns true if either
one (or both) of the tests is true.
After the command prompt, type the following commands:
cold:~$ ./if.pl
$var1 or $var2 equals 1
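A sketch of combined conditions that gives the output above. The exact wording of the first message is an assumption, since only the second message appears in the session; $output is added so the result can be checked as a single string.

```perl
use strict;
use warnings;

my $var1   = 1;
my $var2   = 0;
my $output = "";

# AND: both tests must be true. $var2 is 0, so this block is skipped.
if ($var1 == 1 && $var2 == 1) {
    $output .= "\$var1 and \$var2 equal 1\n";
}

# OR: true if either test (or both) is true. $var1 is 1, so this block runs.
if ($var1 == 1 || $var2 == 1) {
    $output .= "\$var1 or \$var2 equals 1\n";
}

print $output;
```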
String Conditions
We mentioned before that strings do not use the normal conditional operators. Let us do a quick example to show this
in action. Edit a new file called ifstring.pl.
Add the following code to ifstring.pl:
#!/usr/local/bin/perl
$var1 = "hello";
$var2 = "goodbye";
$var3 = "hello";
if($var1 eq $var3){
   print "These two strings are the same: $var1 $var3\n";
}
if($var1 ne $var2){
   print "These two strings are not the same: $var1 $var2\n";
}
if($var1 lt $var2){
   print "$var1 comes before $var2\n";
}else{
   print "$var1 comes after $var2\n";
}
After the command prompt, type the following commands:
cold:~$ chmod 755 ifstring.pl
cold:~$ ./ifstring.pl
These two strings are the same: hello hello
These two strings are not the same: hello goodbye
hello comes after goodbye
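Why the two sets of operators matter can be seen when a string happens to hold digits. The sketch below is an illustration only; the values and messages are made up. As numbers, 10 is greater than 9, but as strings "10" ranks before "9" because the character 1 sorts before the character 9.

```perl
use strict;
use warnings;

my $output = "";

# String comparison ranks values alphabetically, character by character.
if ("abc" lt "xyz") {
    $output .= "abc comes before xyz\n";
}

# Numeric comparison treats both sides as numbers.
if (10 > 9) {
    $output .= "as numbers, 10 is greater than 9\n";
}

# The same digits compared as strings give the opposite ranking.
if ("10" lt "9") {
    $output .= "as strings, 10 comes before 9\n";
}

print $output;
```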
Loops
While Loops
An extension of the if statements in the previous lesson is the ability to keep checking the condition and keep
executing the code over and over. This is done with loops. Let us start by looking at while loops. Create a new file
called while.pl to use for these examples.
#!/usr/local/bin/perl
$i = 0;
Save and test this example. Be sure to change the file permissions before testing it.
$i starts out as zero, which causes the condition of the while statement to be true. The value of $i is printed out and
the condition is checked again. The condition is still true, so the print statement is executed yet again. This is what's
known as an infinite loop. It will go on forever until we stop it. If you're testing out a program and it seems to be stalled,
there's a good chance you've got an accidental infinite loop.
$i = 0;
Excellent. Now, $i will be incremented by one every time through the loop. The value will be printed out when it is 0, 1,
and 2, but when $i equals 3 the condition will be false and the loop will end. Let us check it out.
After the command prompt, type the following commands:
cold:~$ ./while.pl
0
1
2
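A sketch of a while loop that behaves as described and prints 0, 1, and 2. The condition $i < 3 is inferred from the text ("when $i equals 3 the condition will be false"); $output is an addition so the result can be inspected.

```perl
use strict;
use warnings;

my $i      = 0;
my $output = "";

# The condition is checked before every pass through the loop.
while ($i < 3) {
    $output .= "$i\n";
    $i++;    # without this line $i stays at 0 and the loop never ends
}

print $output;
```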
Pretty simple, right? Later we will see that we are not limited to simple math tests in our conditions. Lots of Perl
functions will return true or false values based on the success of what they were trying to do. There are many ways to
provide useful test conditions for a loop.
For Loops
Using a variable such as $i to loop a specific number of times is very common. There is another type of loop that is
built perfectly for this situation: the for loop. The difference between the two is that with a for loop the initial value,
condition, and change are all contained in one spot. Let us create a script called for.pl to demonstrate the workings of
a for loop.
Add the following code to for.pl:
#!/usr/local/bin/perl
Notice that the three parts of the for statement are separated by semicolons. And we aren't limited to using such
simple increments; any expression will do.
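As a sketch, the while loop from the previous section can be written as a for loop with the initial value, condition, and change on one line. The second loop, which counts by twos, is a made-up example of a change that is not a simple increment; $output and $evens are names added for illustration.

```perl
use strict;
use warnings;

my $output = "";

# initial value; condition; change
for (my $i = 0; $i < 3; $i++) {
    $output .= "$i\n";
}

# The change can be any expression, such as adding two each time.
my $evens = "";
for (my $j = 0; $j <= 6; $j += 2) {
    $evens .= "$j ";
}

print $output;
print "$evens\n";
```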
Last and Next
last works just as its name implies. It makes the current iteration of the loop the last one. The loop will not even
finish executing the rest of its code block. Go back and modify your while.pl script.
#!/usr/local/bin/perl
$i = 0;
After the command prompt, type the following commands:
cold:~$ ./while.pl
0
1
2
3
4
5
6
Oh no! $i is greater than 5!
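One loop that would produce the output above is sketched here. The loop condition ($i < 10) is an assumption; the test on $i and the message are taken from the session output.

```perl
use strict;
use warnings;

my $i      = 0;
my $output = "";

while ($i < 10) {
    $output .= "$i\n";
    if ($i > 5) {
        $output .= "Oh no! \$i is greater than 5!\n";
        last;    # leave the loop right away; the increment below is skipped
    }
    $i++;
}

print $output;
```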
Now let us take a look at next.
After the command prompt, type the following commands:
cold:~$ ./while.pl
0
1
2
3
4
5
9
When $i was equal to 5, we added 4 to it and started the loop over. Notice that it skipped the last part of the loop, where
it would have normally incremented $i by one.
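A sketch of a loop using next that matches the output and the explanation above. The loop condition ($i < 10) is an assumption chosen so that 9 is the last value printed.

```perl
use strict;
use warnings;

my $i      = 0;
my $output = "";

while ($i < 10) {
    $output .= "$i\n";
    if ($i == 5) {
        $i += 4;
        next;    # go straight back to the condition; the increment below is skipped
    }
    $i++;
}

print $output;
```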
Arrays and Hashes
Arrays
So far, all we have been using is regular variables to store a number or a string. What if we have a whole set of data we
want to store? Do we have to create a bunch of variables? No, luckily we can store them in an array. An array is simply
a list of variables that are usually used to store a series of related data. Whether it contains a bunch of numbers or
lines from a file, arrays are used the same way. Edit a new file called array.pl.
@myarray = (1,2,3,4,5);
@words = ("one","two","three","four","five");
When we reference an element of an array we have to use the dollar symbol ($) again, just as if it were a regular
variable. Square brackets are used to indicate the index, or position, in the array. Arrays start with an index of zero (0).
This means that the first element of the array is at 0, the second at 1, and so on. Save array.pl and try it out. Be sure to
change the file permissions before testing it.
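A short sketch of reading and changing individual elements of the two arrays defined above. Which elements the lesson itself prints is not shown, so the choices here ($first_number, $third_word, $last_word) are made up for illustration.

```perl
use strict;
use warnings;

my @myarray = (1, 2, 3, 4, 5);
my @words   = ("one", "two", "three", "four", "five");

# One element is a single value, so it is written with $ and an index.
my $first_number = $myarray[0];    # index 0 is the first element
my $third_word   = $words[2];      # index 2 is the third element
my $last_word    = $words[4];      # five elements, so the last index is 4

# Elements can be assigned to as well as read.
$words[1] = "TWO";

print "$first_number $third_word $last_word\n";
print "$words[0] $words[1]\n";
```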
Perl defines an array called @ARGV for us automatically. It contains command line arguments passed to the script
when it is executed.
#!/usr/local/bin/perl
$#ARGV will contain the number of the last index in the array. This turns out to be the total number of elements minus
one. Let us test this script out a few times to see how it works.
After the command prompt, type the following commands:
cold:~$ ./array.pl one
Maximum index: 0
Last argument: one
cold:~$ ./array.pl one two three
Maximum index: 2
Last argument: three
cold:~$ ./array.pl
Maximum index: -1
Last argument:
Here we can see that it always prints out the last argument passed to the script.
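A sketch of a script that prints the same two lines as the session above. The guard for the case with no arguments is an addition: with warnings enabled, reading an element that does not exist would otherwise produce a warning when it is printed.

```perl
use strict;
use warnings;

# $#ARGV is the last index of @ARGV, or -1 when no arguments were given.
my $max_index = $#ARGV;

my $last = "";
if ($max_index >= 0) {
    $last = $ARGV[$max_index];
}

print "Maximum index: $max_index\n";
print "Last argument: $last\n";
```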
Printing Array
We have seen how to print out individual elements of an array, but what if we wanted to print out the whole thing? We
could try this:
Make the following changes to array.pl:
#!/usr/local/bin/perl
@words = ("one","two","three","four","five");
print @words;
After the command prompt, type the following commands:
cold:~$ ./array.pl
onetwothreefourfivecold:~$
Yuck. That is horrible. It would be nice if we could print each element, one at a time, and add a newline or something.
We need a way to start at the beginning of the array and stop at the end. Using a while loop solves this problem.
#!/usr/local/bin/perl
@words = ("one","two","three","four","five");
$i=0;
while($i <= $#words){
   print "$words[$i]\n";
   $i++;
}
The index of the array starts at zero, so that is where we will start our variable. We increment $i every time through the
loop (progressing through the elements of the array) as long as it is less than or equal to the maximum index. The
results look a little better than before.
After the command prompt, type the following commands:
cold:~$ ./array.pl
one
two
three
four
five
This method is obviously not just useful for printing an array. It allows us to perform operations on each element of the
array separately. We will get lots of practice with this concept in the later lessons.
Foreach
A foreach loop allows us to do something for each element in a list. Check this out:
@words = ("one","two","three","four","five");
Pretty nice, huh? The output is exactly the same as before, plus the code is a little easier to understand.
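A sketch of the foreach version of the array printing loop. The loop variable name $word is an assumption; each time through the loop it holds the next element, so no index is needed.

```perl
use strict;
use warnings;

my @words  = ("one", "two", "three", "four", "five");
my $output = "";

# $word takes on each element of @words in turn.
foreach my $word (@words) {
    $output .= "$word\n";
}

print $output;
```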
Hashes
Hashes are a lot like arrays in that they store a list of data. Instead of a numbered index, hashes associate values with
a key. These keys are used to reference the values much like the index of an array. Hashes are sort of like small
databases. Let us take a look at a pre-defined hash that stores environment variables by creating a script called hash.pl.
#!/usr/local/bin/perl
print %ENV;
To reference a whole hash we use %. Test this script to see what happens. It printed out a big mess, right? Referencing
the whole hash turns out not to be very useful. Instead we can print out a hash value by using its key.
print "$ENV{'USER'}\n";
The "USER" environment variable stores your username. To get the value from a hash we have to change the % to a $
and include the key inside braces after the hash name.
After the command prompt, type the following commands:
cold:~$ ./hash.pl
username
$key = "USER";
print $ENV{$key}."\n";
After the command prompt, type the following commands:
cold:~$ ./hash.pl
username
Notice the use of the period (concatenation operator) to push two strings together.
#!/usr/local/bin/perl
%names = ("john","smith",
"sally","johnson",
"bill","peterson");
Here we defined a hash by giving it a list of key/value pairs. We used the keys function to get a list of the keys from our
hash. Each key was then used within foreach to print out our hash table. Perl does not promise any particular order for
the keys, so your lines may come out in a different order.
After the command prompt, type the following commands:
cold:~$ ./hash.pl
john smith
sally johnson
bill peterson
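A sketch of looping over the keys of the %names hash. The loop variable $first and the counter $count are names added for illustration. Because the order returned by keys is not fixed, the lines may not appear in the order shown in the session above.

```perl
use strict;
use warnings;

my %names = ("john",  "smith",
             "sally", "johnson",
             "bill",  "peterson");

my $count = 0;

# keys returns the list of keys; each key looks up its own value.
foreach my $first (keys %names) {
    print "$first $names{$first}\n";
    $count++;
}

print "$count entries\n";
```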
If we did not want to use the keys at all, we could use the values function instead of the keys function.
Perl also provides us with the sort function that will let us alphabetize our list with ease.
%names = ("john","smith",
"sally","johnson",
"bill","peterson");
After the command prompt, type the following commands:
cold:~$ ./hash.pl
bill peterson
john smith
sally johnson
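A sketch that sorts the keys before looping, which gives the alphabetized output above. The second loop, over sorted values, is an added illustration of the values function; the names $output and $surnames are assumptions.

```perl
use strict;
use warnings;

my %names = ("john",  "smith",
             "sally", "johnson",
             "bill",  "peterson");

# sort puts the keys in alphabetical order before foreach sees them.
my $output = "";
foreach my $first (sort keys %names) {
    $output .= "$first $names{$first}\n";
}

# values returns just the values, which can be sorted the same way.
my $surnames = "";
foreach my $last (sort values %names) {
    $surnames .= "$last\n";
}

print $output;
print $surnames;
```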
Input
Getting Input
Up until this point we have really just been learning about basic Perl syntax. Now it is time to get some real work done.
Let us try to do the same sort of things we were doing with sed by searching for a pattern in a file and printing out the
line. First, we have to get input.
One simple way of getting input is to capture the output of another Unix command. We do this by using back ticks. Edit
input.pl.
#!/usr/local/bin/perl
Some of this we have seen before and some we have not. Getting our input takes place as an assignment operation
for an array. @input will contain an element for each line of the contents from /etc/protocols. We already know that we
can use foreach to look at each line of an array. All we need to do now is to search for a pattern. In the if statement
we have another condition operator that we have not seen before. Using =~ allows us to match our variable against a
pattern. In this case, we want to see if $line contains "ICMP". If a line matches the pattern it is printed out; otherwise the
foreach loop just goes on to the next line.
Note: The companion operator to =~ is !~. It allows us to match all lines that do not contain a specific pattern.
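A sketch of =~ and !~ at work. To keep it self-contained, the array is filled with three sample lines written in the style of /etc/protocols instead of being captured from a command; the sample lines and the names $matched and $unmatched are assumptions.

```perl
use strict;
use warnings;

# Sample lines standing in for the output of a command such as cat /etc/protocols.
my @input = (
    "icmp\t1\tICMP\t# internet control message protocol\n",
    "tcp\t6\tTCP\t# transmission control protocol\n",
    "udp\t17\tUDP\t# user datagram protocol\n",
);

my $matched   = "";
my $unmatched = 0;

foreach my $line (@input) {
    if ($line =~ /ICMP/) {    # true when the line contains ICMP
        $matched .= $line;
    }
    if ($line !~ /ICMP/) {    # true when the line does not contain ICMP
        $unmatched++;
    }
}

print $matched;
print "$unmatched lines did not match\n";
```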
Make the following changes to input.pl:
#!/usr/local/bin/perl
@input = <STDIN>;
The only difference here is that we are getting our input from STDIN instead of from a Unix command directly. Let us
see it in action.
After the command prompt, type the following commands:
cold:~$ cat /etc/protocols | ./input.pl
icmp 1 ICMP # internet control message protocol
ipv6-icmp 58 IPv6-ICMP # ICMP for IPv6
This method of grabbing the contents of the file allows us to change the input without modifying the script.
while(<STDIN>){
   if($_ =~ /ICMP/){
      print $_;
   }
}
What is $_ all about? This is a special variable that Perl uses to represent a default pattern matching space. It will be
equal to each line of STDIN in turn and can be used just like any other variable. The neat part about $_ is that since it
is the default space, it is not always necessary to include it. Check this out:
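A sketch of the same loop with $_ left out. So that it runs without typed input, it reads from a string opened as a filehandle, which stands in for STDIN here; the sample text and the names $text, $stream, and $output are assumptions.

```perl
use strict;
use warnings;

# An in-memory stream standing in for STDIN.
my $text = "hello\nICMP\ngoodbye\n";
open(my $stream, "<", \$text) or die "cannot open in-memory stream: $!";

my $output = "";
while (<$stream>) {
    if (/ICMP/) {          # same as: if ($_ =~ /ICMP/)
        $output .= $_;
        print;             # same as: print $_;
    }
}
close($stream);
```

In a script that reads real input, <$stream> would simply be <STDIN>.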
That looks a lot simpler, correct? Instead of piping in the contents of a file as input, let us use this script to observe the
while loop treating the input as a stream. If we run the script by itself, it is still expecting input, so we have to type it
ourselves.
After the command prompt, type the following commands:
cold:~$ ./input.pl
hello
what is happening?
ICMP
ICMP
we are creating the stream as we go
to end the stream, hit Ctrl+d
We can see what we are typing, but behind the scenes our script is using it as input. This becomes apparent when we
type ICMP. As soon as we hit enter on that line it is used as the next pattern space in the while loop. It matches the
pattern and our script prints out ICMP. Then we continue on with the stream. To end the stream we have to hit Ctrl+d.
In Unix, this sends an EOF (end of file) which indicates the end of the input stream. The loop will end and Perl will
continue on with the rest of the script.
Opening a File for Reading
open(FILE,"/etc/protocols");
while(<FILE>){
   if(/ICMP/){
      print;
   }
}
close(FILE);
Save and test this script to make sure you get the same output as before. The open function creates a filehandle from
which to read. It is used just like STDIN was used before. We need to close the file when we are done with it.
If we do not want to keep the file open the whole time, we can store the contents in an array and close the file before
doing any work on its contents.
open(FILE,"/etc/protocols");
@lines = <FILE>;
close(FILE);
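The listings above assume the file can always be opened. As a hedged variation, open returns a false value when it fails, so its result can be tested like any other condition. The three-argument form of open and the $fh variable are a different style from the lesson's FILE handle, shown here as an alternative rather than as the lesson's method.

```perl
use strict;
use warnings;

my $path = "/etc/protocols";
my @lines;

# open returns false on failure, and $! then holds the reason.
if (open(my $fh, "<", $path)) {
    @lines = <$fh>;    # one element per line of the file
    close($fh);
} else {
    warn "could not open $path: $!\n";
}

my $count = scalar(@lines);
print "read $count lines from $path\n";
```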
We will discuss later how we can open files for writing, either by creating a new file or appending to an existing one.
Data Manipulation
Split
Recall that we used awk to break up the fields of the password file. Perl uses the split command to let us do the same
thing. Let us edit a new script called passwd.pl.
#!/usr/local/bin/perl
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
Just like before, we are reading in a file (/etc/passwd) and putting the contents into an array. Then we use foreach to
perform an operation on each line of the array.
The split command takes a string, in this case $line, and divides it into an array of strings. It separates the elements
by whatever pattern is given as its first argument. Each time through the loop, @fields contains data for the next line.
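A sketch of the split step on its own. Instead of reading /etc/passwd, it uses two sample lines in the same colon-separated format so that it runs anywhere; the sample lines and the $usernames variable are assumptions.

```perl
use strict;
use warnings;

# Sample lines in /etc/passwd format, newline included as if read from a file.
my @lines = (
    "root:x:0:0:root:/root:/bin/bash\n",
    "bin:x:1:1:bin:/bin:\n",
);

my $usernames = "";
foreach my $line (@lines) {
    # Divide the line at every colon; the username is the first field.
    my @fields = split(/:/, $line);
    $usernames .= "$fields[0]\n";
}

print $usernames;
```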
Push
Currently, passwd.pl prints out a list of usernames as they appear in the /etc/passwd file. But we want an alphabetized
list instead. We know we can easily sort a list; that is not the issue. Somehow we have to create a list of usernames
that can be sorted later. One way to do this would require a variable to "increment" each time through the foreach
loop, like so:
$i=0;
foreach $line (@lines){
   @fields = split(/:/,$line);
   print "$fields[0]\n";
   $users[$i] = $fields[0];
   $i++;
}
That seems like an awful lot of work just to add a new element into an array. It would be nice if we could just push
something onto an array.
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
#!/usr/local/bin/perl
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
When you test this script, you will see that the usernames are now printed out in alphabetical order.
Note: The push function has a counterpart called unshift which does exactly the same thing except the new
elements are added to the beginning of the array instead of to the end.
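A sketch of push, sort, and unshift together. It uses sample lines in /etc/passwd format rather than the real file, and the name added with unshift ("nobody") is purely hypothetical.

```perl
use strict;
use warnings;

my @lines = (
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:2:2:daemon:/sbin:\n",
    "bin:x:1:1:bin:/bin:\n",
    "adm:x:3:4:adm:/var/adm:\n",
);

my @users;
foreach my $line (@lines) {
    my @fields = split(/:/, $line);
    push(@users, $fields[0]);    # add the username to the end of @users
}

# sort returns a new, alphabetized list and leaves @users as it was.
my @sorted = sort(@users);

# unshift adds to the beginning of the array instead of the end.
unshift(@users, "nobody");

foreach my $user (@sorted) {
    print "$user\n";
}
```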
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
This would work, but it is really a rather rough way of doing it. As always with Perl, there's probably a better way.
Add the following code to passwd.pl:
#!/usr/local/bin/perl
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
The reverse function does exactly what its name implies. It takes an array, reverses the elements, and outputs a new
array. We could still use a large print statement to output the new lines, but that's an awful lot of work. What if we were
not dealing with a known number of fields? Fortunately, Perl gives us split to break up the fields and it gives us join to
put them back together. To use join, all we have to do is specify a string to use as a separator and a list of elements to
join. Be sure to save this script before trying it out.
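A sketch of reverse and join on a single list of fields. The fields are written out by hand, with no trailing newline, so this shows only what the two functions do in the simplest case; the names @backwards and $line are assumptions.

```perl
use strict;
use warnings;

my @fields = ("root", "x", "0", "0", "root", "/root", "/bin/bash");

# reverse returns a new list with the elements in the opposite order.
my @backwards = reverse(@fields);

# join glues the elements together with the separator between them.
my $line = join(":", @backwards);

print "$line\n";
```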
After the command prompt, type the following commands:
hottub:~$ ./passwd.pl |head
/bin/bash
:/root:root:0:0:x:root
:/bin:bin:1:1:x:bin
:/sbin:daemon:2:2:x:daemon
:/var/adm:adm:4:3:x:adm
:/var/spool/lpd:lp:7:4:x:lp
Whoa! Hold on a second. I thought we were supposed to have reversed lines, so why is it all broken up like this? Let
us examine what is happening a little closer by looking at the first line of /etc/passwd.
root:x:0:0:root:/root:/bin/bash
This is the whole line, correct? Well, not really. The key to understanding this problem is realizing that the line has a
newline character at the end of it.
root:x:0:0:root:/root:/bin/bash\n
This is what the line actually looks like. Notice that we have that newline character at the end.
Our line is split up at all of the colons, giving us seven separate elements. Then we reverse the order of the elements.
After rejoining the line with colons, we can see how the newline is embedded into the new line. When we
print this line out, the newline causes a line break even though we might not have intended it.
Chomp
This trailing newline problem is so common that Perl has a special function specifically designed to get rid of it when it
occurs. There is a function called chop that removes the last character from each of the elements in a list. That's good,
but we could accidentally remove characters that are not newlines. Fortunately we have the chomp function. It
removes a trailing newline from each item in a list (or nothing if there is not a newline). Let us check this out in action.
open(FILE,"/etc/passwd");
@lines = <FILE>;
close(FILE);
Save this script and try it out. Notice that this time we get the results we expected to get the first time.
Instead of giving chomp a whole list, we could have just given it the last field (which we know had the newline).
OBSERVE:
chomp($fields[$#fields]);
Since we have removed the newline, we need to re-add it before printing it out. It is simple enough just to add on the
newline as part of the print statement.
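A sketch of the whole split, chomp, reverse, join sequence on one sample line. The line is the first /etc/passwd entry discussed above; $reversed is a name added for illustration.

```perl
use strict;
use warnings;

my $line = "root:x:0:0:root:/root:/bin/bash\n";

my @fields = split(/:/, $line);

# chomp strips a trailing newline from every element that has one.
chomp(@fields);

my $reversed = join(":", reverse(@fields));

# The newline is gone, so add one back when printing.
print "$reversed\n";
```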
Note: Some of the entries in /etc/passwd do not have a shell listed. If we were to remove the newline from $line
before using split, we would lose the last field. Since nothing would exist after the last colon, an element
is not created. Thus, when we use join the extra colon is lost. Try it out and see.
See you at the next lesson!
Fun with Regular Expressions
Regular Expressions
So up until now we have claimed that Perl makes text manipulation easy. Regular expressions are what give Perl a lot
of its power. Unfortunately, working with regular expressions is not easy. Simple word matching patterns aren't difficult,
but regular expressions can get very complex. For this reason, we're going to spend some extra time on regular
expressions.
Unix regular expressions and Perl regular expressions aren't identical. They are very similar, but a lot of the things
that can be done with Perl will not work on the Unix command line.
Quick Review
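Let us briefly go over what we already know about regular expressions. We know that matching a simple string is fairly
easy. We just include it inside of a pair of forward slashes.
/string/
A group of characters in square brackets matches any one of those characters.
/[abc]/
A group can be combined with ordinary characters.
/str[io]ng/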
Let's look carefully at the way Perl would use this regular expression to match a string. Say we're searching for this
pattern in the string below:
We start at the first character and continue searching until we match the first thing in the pattern.
After we see an s, we look for t and r. If both of those are found, we try to match the group [io].
The "u" is not an i or an o so we stop matching. We continue through the string looking for the beginning of our pattern
again.
It sure does. So now all we have to do is check for ng; if that matches we are done.
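The walk-through can be tried out with a sketch like the one below. The sample string is made up for illustration: it contains "strung", where the match fails at the u, followed by "strong", where it succeeds. The parentheses capture the matched text into $1, a feature not yet covered at this point in the lesson.

```perl
use strict;
use warnings;

# Hypothetical sample text: "strung" fails at the u, "strong" then matches.
my $text  = "A strung bow needs a strong string.";
my $found = "";

# Parentheses around the pattern save the matched text in $1.
if ($text =~ /(str[io]ng)/) {
    $found = $1;
}

print "matched: $found\n";
```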
Multipliers
Multipliers allow us to match part of a pattern more than once. For instance, what if I wanted to match "Ned" and "need"
with the same expression? Somehow I have to match multiple e's.
/[Nn]e+d/
The plus sign (+) stands for one or more. This expression will match an upper or lowercase n, followed by one or
more e's and a d. But what if we needed to match "Nd" as well?
/[Nn]e*d/
The star (*) works in the same way except that it stands for zero or more. The last simple multiplier we should talk about
is the question mark (?). The question mark lets us match zero or one of something. It is kind of like asking, "Is there
one of these?"
/ab?c/
This would match abc and ac. Another way to use multipliers is to specify the minimum and maximum values directly
using braces. There are three main ways to define a multiplier using braces. First, a single number in braces, such as
{4}, means it will match exactly 4 times. If we follow that up with a comma, as in {4,}, it would match 4 or more times.
The third and final way would be to specify a maximum value. To match 4 to 6 times we would use {4,6}. This method
can be used to make equivalents to the simple multipliers we already talked about: + is the same as {1,}, * is the same
as {0,}, and ? is the same as {0,1}.
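A sketch comparing each simple multiplier with its brace equivalent. The word list is made up for illustration; each counter records how many of the words a pattern matches, so a simple multiplier and its brace form should end up with the same count.

```perl
use strict;
use warnings;

my @words = ("Nd", "Ned", "need", "Neeed", "nod");

my $plus     = 0;
my $plus_b   = 0;
my $star     = 0;
my $star_b   = 0;
my $option   = 0;
my $option_b = 0;

foreach my $word (@words) {
    if ($word =~ /[Nn]e+d/)     { $plus++; }        # one or more e's
    if ($word =~ /[Nn]e{1,}d/)  { $plus_b++; }      # the same, written with braces
    if ($word =~ /[Nn]e*d/)     { $star++; }        # zero or more e's
    if ($word =~ /[Nn]e{0,}d/)  { $star_b++; }
    if ($word =~ /[Nn]e?d/)     { $option++; }      # zero or one e
    if ($word =~ /[Nn]e{0,1}d/) { $option_b++; }
}

print "+ matched $plus, {1,} matched $plus_b\n";
print "* matched $star, {0,} matched $star_b\n";
print "? matched $option, {0,1} matched $option_b\n";
```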
Special Characters
If there is one thing Perl has a lot of, it is special characters. The +, *, and ? from above are examples of special
characters (characters that have special meanings depending on how they are used).
Two very useful special characters are the caret (^) and the dollar sign ($). When used in a regular expression, the ^
symbol will match the beginning of a line and $ matches the end. For example, if we wanted to match all lines that
contain "The" at the beginning we would use this:
/^The/
To match all lines that contain "the end" at the end we would use this:
/the end$/
$ matches before any newline characters so you do not have to worry about those getting in the way. Let us not forget
that these two are special characters. Remember that $ is used when we reference variables as well. The ^ also has
another use. If it is included at the beginning of a character group within the brackets, the group will match any
character that is not in the list. Let us look at an example:
/[^io]/
In this case, the grouping will match anything but i and o. Another special character is the period (.). We have already
seen that it can be used as the concatenation operator for strings. But inside of a regular expression it is a whole
different story. A period represents any character.
/str.ng/
This expression will match "string", "strong", "str9ng", etc. The only thing a period will not match is a newline. Periods
are often used in conjunction with multipliers.
/^The.*nice/
Here we are matching any line that starts with "The" and contains any number of characters followed by "nice". That is
all fine and good, but you are probably wondering how we would match an actual period in the text.
The Escape Character
The escape character, represented by a backslash (\), indicates to Perl that the character right after it should be treated
differently than it normally would. It serves two main purposes: it can strip a special character of its special
meaning, or it can give an ordinary character a new meaning.
Say we wanted to match a period inside of our text. Let us take it a little further and try to match a dollar amount. The
decimal point and the dollar sign are both special characters, so each has to be escaped (\$ and \.) to lose its normal
meaning. Can you spot any problems with a simple expression built this way? Is it really useful? Are there any cases
where it would not match a dollar amount? Aha! An expression that insists on a decimal point and plain digits will not
match things like $1,000.00 and $5. A better expression allows commas among the digits and makes the cents optional.
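One way such a pattern could be written is sketched here. It is an illustration only and not necessarily the expression the lesson had in mind; the sub name looks_like_dollars and the samples are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 if the whole string looks like a dollar amount, 0 otherwise.
# (looks_like_dollars is a hypothetical helper for this illustration.)
sub looks_like_dollars {
    my $text = shift(@_);
    # \$                a literal dollar sign
    # [0-9][0-9,]*      one digit, then any mix of digits and commas
    # (\.[0-9][0-9])?   optionally, a literal period and two digits
    return ($text =~ /^\$[0-9][0-9,]*(\.[0-9][0-9])?$/) ? 1 : 0;
}

foreach my $sample ('$19.99', '$1,000.00', '$5', '19.99', '$5.') {
    print "$sample: ", (looks_like_dollars($sample) ? "match" : "no match"), "\n";
}
```

The first three samples match and the last two do not. The pattern is still loose: it does not check that the commas fall every three digits.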
The escape character also works with quotes, semicolons, at symbols (@), etc. As mentioned, the escape character
not only removes the special meaning from some symbols, it also adds meaning to others. Below is a table of the
most commonly used of these:
Character    Matches
\w           alphanumeric, including underscore (_)
\s           whitespace
\d           numeric (a digit)
\n           a newline
\W           non-alphanumeric
\S           non-whitespace
\D           non-digit
These special characters are used in regular expressions just like any other character or group of characters. You
are doing great! See you at the next lesson.
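As a small sketch of how the characters in the table can be put to work, the script below sorts strings into rough categories. The sub name classify and the category labels are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Put a string into a rough category using \s, \d and \w.
# (classify is a hypothetical helper for this illustration.)
sub classify {
    my $text = shift(@_);
    return "blank"  if ($text =~ /^\s*$/);   # nothing but whitespace
    return "number" if ($text =~ /^\d+$/);   # digits only
    return "word"   if ($text =~ /^\w+$/);   # letters, digits, underscore
    return "other";
}

foreach my $sample ("42", "user_name1", "   ", "two words") {
    print "'$sample' is: ", classify($sample), "\n";
}
```

The order of the tests matters: every string of digits would also pass the \w test, so the narrower \d test comes first.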
More Fun with Regular Expressions
Continuing with our example, instead of stopping after the first match we can continue until the end of the string.
Perl, like Unix, is case sensitive. This means that it treats upper and lower case letters differently. This can cause
problems if we are looking for a specific word, but we do not know if it is at the beginning of a sentence or maybe even
in all upper case letters. We could use character groupings like this:
/[Ww][Oo][Rr][Dd]/
That is a little ridiculous, though, don't you think? Instead we can use a regular expression command, i, which tells
Perl to ignore case.
/word/i
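A short sketch of the two commands working together: g keeps matching to the end of the string and i ignores case. The sub names and the sample sentence are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count every occurrence of "word", ignoring case.
sub count_matches {
    my $text  = shift(@_);
    my $count = 0;
    while ($text =~ /word/gi) {   # g: resume after the previous match
        $count++;
    }
    return $count;
}

# Replace every occurrence, not just the first one.
sub replace_all {
    my $text = shift(@_);
    $text =~ s/word/term/gi;
    return $text;
}

my $sample = "Word by word, every WORD counts";
print count_matches($sample), "\n";   # prints 3
print replace_all($sample), "\n";     # prints "term by term, every term counts"
```

Without the g, the substitution would stop after the first match and only "Word" would be replaced.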
#!/usr/local/bin/perl
$oldfile = "/etc/passwd";
$newfile = "/users/username/passwd";
open(FILE,"$oldfile");
open(NEW,">$newfile");
while(<FILE>){
$_ =~ s/sbin/sysbin/;
print NEW;
}
close(FILE);
close(NEW);
Most of this program works by setting up the files and reading in the lines that we need to modify. The search and
replace is done on a single line. That is why regular expressions are so powerful. However, this is not the best
expression we could have used. Keep in mind that Perl will replace the first instance of "sbin" that it finds. What if one of
the user names was "sbinder"? We would have a problem then. The following regular expression would be a little
better:
s/\/sbin\//\/sysbin\//;
Nobody is going to have a login name of "/sbin/", that is for sure. Regular expressions are powerful, but they can also
be dangerous. For those having a difficult time reading the above regex, here is a reader-friendly view:
s/ \/sbin\/ / \/sysbin\/ /; # We are escaping the slashes of "/sbin/" and "/sysbin/"
Translate
The translation command, tr, looks like search and replace when you first see it, but it acts a lot differently. Instead of
replacing an entire pattern with something else, translation takes a list of characters and replaces them with
corresponding characters from a replacement list. Let us write a script called translate.pl that converts all of the upper
case letters from input into lower case letters.
#!/usr/local/bin/perl
while(<STDIN>){
$_ =~ tr/A-Z/a-z/;   # tr takes plain character lists, so no brackets are needed
print;
}
After the command prompt, type the following commands:
cold:~$ chmod 755 translate.pl
cold:~$ echo "ABC" |./translate.pl
abc
Here we have translated entire ranges of characters. We do not have to use only single letters. Here are more
possible translations.
tr/123/abc/
tr/\'/\"/
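The translations above can be wrapped in small subs to see what each one does. This is a sketch; the sub names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each sub applies one translation to a private copy of its argument.
sub to_lower {
    my $text = shift(@_);
    $text =~ tr/A-Z/a-z/;      # every upper case letter to lower case
    return $text;
}

sub digits_to_letters {
    my $text = shift(@_);
    $text =~ tr/123/abc/;      # 1 becomes a, 2 becomes b, 3 becomes c
    return $text;
}

sub swap_quotes {
    my $text = shift(@_);
    $text =~ tr/'/"/;          # single quotes to double quotes
    return $text;
}

print to_lower("ABC Def"), "\n";          # prints "abc def"
print digits_to_letters("2013"), "\n";    # prints "b0ac"
print swap_quotes("it's"), "\n";          # prints it"s
```

Characters that are not in the search list, such as the 0 in "2013", pass through unchanged.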
Parentheses can be used as part of a pattern to store a match in a temporary variable. Create a file called grab.pl.
#!/usr/local/bin/perl
while(<STDIN>){
if(/str([io])ng/){
$vowel = $1;
print "We matched $vowel\n";
}
}
This script tests each line of input for the given regular expression. The difference is that when there is a match, the
vowel that was part of the match gets stored in $1.
After the command prompt, type the following commands:
cold:~$ ./grab.pl
string
We matched i
Ctrl+c
The entire word "string" was matched, but only the "i" was stored in a variable. Try it again with "strong" and see what
happens. We can use as many sets of parentheses as we want, and the variables will be named in numerical order
($1, $2, $3, and so on).
Perl allows us to use a stored match in the same regular expression in which it was matched. This is done by
using the escape character and is known as a back reference.
while(<STDIN>){
if(/str([io])ng\1/){
$vowel = $1;
print "We matched $vowel\n";
}
}
We have only added one thing, but let us see how it changes the functionality of the script.
After the command prompt, type the following commands:
cold:~$ ./grab.pl
stringi
We matched i
strongo
We matched o
stringo
Ctrl+c
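Numbered captures and back references can also be checked without typing input by hand. The sketch below uses two made-up subs: date_parts shows several sets of parentheses filling $1, $2 and $3, and repeats_vowel reuses the back reference pattern from grab.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three sets of parentheses fill $1, $2 and $3 in order.
sub date_parts {
    my $text = shift(@_);
    if ($text =~ /(\d+)-(\d+)-(\d+)/) {
        return ($1, $2, $3);
    }
    return ();
}

# \1 must repeat whatever the first set of parentheses matched.
sub repeats_vowel {
    my $text = shift(@_);
    return ($text =~ /str([io])ng\1/) ? $1 : "";
}

print join(" / ", date_parts("due 2013-04-09")), "\n";   # prints "2013 / 04 / 09"
print repeats_vowel("stringi"), "\n";                    # prints "i"
print repeats_vowel("stringo"), "\n";                    # prints an empty line
```

"stringo" fails because the parentheses captured "i", so \1 insists on another "i" after "ng".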
#!/usr/local/bin/perl
while(<STDIN>){
$_ =~ s/^([^:]*:)([^:]*:)/\2\1/;
print;
}
Let's save this and make sure it works before we try to figure it out.
After the command prompt, type the following commands:
That regular expression sure looks complicated, but the trick is to break it down into pieces. Here we have the whole
thing:
s/^([^:]*:)([^:]*:)/\2\1/
The first s indicates that this is a search and replace operation. That gives us two parts to deal with: the pattern and the
replacement string.
^([^:]*:)([^:]*:)
Examining the pattern more closely, we can see three main parts. The first caret matches the beginning of the string.
Now we've got two parts that look exactly the same.
([^:]*:)
This expression matches zero or more characters that aren't colons, followed by a single colon. Then the match is
stored in a temporary variable. In this case, we do not want to use a period to match all characters because it would
have matched colons too. That is not so bad. Now let us look at the replacement string.
\2\1
This is simply two back references to temporary variables. We've reversed the order to switch them around. Let us put
it all back together again.
s/^([^:]*:)([^:]*:)/\2\1/
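To watch the swap happen on a single line, the substitution can be wrapped in a sub and fed a sample entry. This is a sketch: swap_first_two and the sample line are made up, and the replacement is written as $2$1, which does the same job as \2\1 and is the form Perl itself suggests when warnings are turned on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Swap the first two colon-terminated fields of a line.
# (swap_first_two is a hypothetical helper for this illustration.)
sub swap_first_two {
    my $line = shift(@_);
    $line =~ s/^([^:]*:)([^:]*:)/$2$1/;
    return $line;
}

# A made-up line in /etc/passwd format.
print swap_first_two("root:x:0:0:root:/root:/bin/bash"), "\n";
# prints "x:root:0:0:root:/root:/bin/bash"

# A line without two colons does not match and is returned unchanged.
print swap_first_two("no colons here"), "\n";
```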
while(<STDIN>){
if(/([ts]?he)/){
$new = ucfirst($1);
$_ =~ s/$1/$new/;
print;
}
}
The ucfirst function takes a string and capitalizes the first character. This script finds a line that contains "the", "she",
or "he". Then it capitalizes the first letter and replaces the original match with the new and improved one.
Of course, there is an easier way. We do not have to do so much work. The e command will evaluate the replacement
string as normal Perl code before using it as a replacement. Here is a better way to do the previous example:
while(<STDIN>){
$_ =~ s/([ts]?he)/ucfirst($1)/e;
print;
}
Note that when the replacement string is evaluated we cannot use the escape character back reference. We have to
use the temporary variable name.
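Because the replacement is ordinary Perl code under the e command, it can do arithmetic as well as call functions. A minimal sketch, with sub names and sample text invented for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Double every number in the text. The replacement "$1 * 2" is run as code.
sub double_numbers {
    my $text = shift(@_);
    $text =~ s/(\d+)/$1 * 2/ge;
    return $text;
}

# The ucfirst example, with g added so every match on the line is changed.
sub capitalize_pronouns {
    my $text = shift(@_);
    $text =~ s/([ts]?he)/ucfirst($1)/ge;
    return $text;
}

print double_numbers("3 apples and 10 pears"), "\n";   # prints "6 apples and 20 pears"
print capitalize_pronouns("she said the word"), "\n";  # prints "She said The word"
```

Without the e, the first substitution would insert the literal text "3 * 2" instead of 6.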
Wrap Up
The trickiest part about regular expressions is taking a problem and figuring out how to apply a regular expression to
help solve it. This is where lots of practice comes in handy.
Functions
We use functions in our scripts to perform specific tasks. We have used lots of them already, including print, split and
sort. These are each pre-defined methods for doing specific tasks. We could write the code ourselves using other
parts of Perl, but we choose to use these functions because they help to clean up the overall look of our script (that
and it's easier than writing our own).
Sometimes we have scripts that use the same section of code over and over. Other times we may have a huge piece
of code that we only use once, but it still manages to clutter up our script. In these cases (and a few others) we may
want to define our own functions or subroutines. Let's start a new script named function.pl:
CODE TO TYPE:
#!/usr/local/bin/perl
open(PASSWD,"/etc/passwd");
@passwd = <PASSWD>;
close(PASSWD);
This script takes lines from the /etc/passwd file and displays them scrambled in the console. It is not a particularly
useful script, but we are going to play with it so we can experiment with the functions. Let's move the guts of this
script into a separate Perl function.
CODE TO TYPE:
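#!/usr/local/bin/perl
open(PASSWD,"/etc/passwd");
@passwd = <PASSWD>;
close(PASSWD);
# Assumed main loop: scramble each line with one call to messup
foreach $line (@passwd){
  print messup($line),"\n";
}
sub messup {
  my $line = shift(@_);                  # first (and only) argument
  $line =~ s/^([^:]*:)([^:]*:)/\2\1/;    # swap the first two fields
  $line =~ tr/A-Z/a-z/;                  # lower case everything
  @fields = split(/:/,$line);
  chomp(@fields);
  @rfields = reverse(@fields);
  $newline = join(":",@rfields);
  return $newline;
}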
The top part of the script is unchanged except that we have replaced most of the code with a call to our new function.
The function itself starts with this line:
sub messup{
This line defines the beginning of our function (subroutine). There is a matching closing brace (}) at the end. The next
line might be the most important part:
my $line = shift(@_);
Functions would not be nearly as useful if they could not receive input and return results. When a function is invoked,
the arguments that are passed to it are stored in a list called @_. It works sort of like the temporary variable $_, but it
works like an array instead of a single variable. The shift function removes and returns the first element from the list,
which in this case is the only value we gave as an argument. Next we assign this value to $line. The addition of my makes the
variable private to the subroutine so it will not get confused with other variables of the same name found in our main
script.
return $newline;
The return statement hands the finished string back to whatever code called the function.
But let's get back to functions making programming easier. So far, this new function just seems like a lot of extra work.
That's true at first, but we can perform the same operation again with very little extra effort. Make the following change to
function.pl:
CODE TO TYPE:
#!/usr/local/bin/perl
open(PASSWD,"/etc/passwd");
@passwd = <PASSWD>;
close(PASSWD);
# Assumed main loop: run each line through messup twice
foreach $line (@passwd){
  $line = messup($line);
  print messup($line),"\n";
}
sub messup {
  my $line = shift(@_);
  $line =~ s/^([^:]*:)([^:]*:)/\2\1/;
  $line =~ tr/A-Z/a-z/;
  @fields = split(/:/,$line);
  chomp(@fields);
  @rfields = reverse(@fields);
  $newline = join(":",@rfields);
  return $newline;
}
Here we performed the same set of operations a second time. All we had to do was add an additional call to the
function instead of writing out all of that code again. Save and test this script to see the results.
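The way arguments arrive in @_ and the way my keeps variables private can be seen in a smaller example. This is a sketch; shout and join_fields are invented names, not part of function.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One argument: shift takes it off the front of @_.
sub shout {
    my $line = shift(@_);      # private to this sub
    $line =~ tr/a-z/A-Z/;
    return $line . "!";
}

# Several arguments: the first is shifted off, the rest stay in @_.
sub join_fields {
    my $separator = shift(@_);
    my @fields    = @_;
    return join($separator, @fields);
}

my $line = "hello";            # a different $line, outside the sub
my $loud = shout($line);
print "$loud\n";               # prints "HELLO!"
print "$line\n";               # prints "hello", the outer value is untouched
print join_fields(":", "a", "b", "c"), "\n";   # prints "a:b:c"
```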
Libraries
Another benefit to writing functions is that we can reuse them in future scripts. We just copy them into new scripts or put
them into a library. A library is a set of pre-written functions that are set aside in a separate file for use in other scripts.
Let's move messup from function.pl into a library. To remove the code marked for deletion, highlight it and press
[Ctrl+x].
Make the following changes to function.pl:
#!/usr/local/bin/perl
require 'mess-lib.pl';                 # new line: load the library
open(PASSWD,"/etc/passwd");
@passwd = <PASSWD>;
close(PASSWD);
# Cut everything from here to the closing brace; it moves to mess-lib.pl
sub messup{
  my $line = shift(@_);
  $line =~ s/^([^:]*:)([^:]*:)/\2\1/;
  $line =~ tr/A-Z/a-z/;
  @fields = split(/:/,$line);
  chomp(@fields);
  @rfields = reverse(@fields);
  $newline = join(":",@rfields);
  return $newline;
}
The new require line tells Perl to include the contents of mess-lib.pl (which we'll create in a moment) when it runs.
We could write lots of other scripts that use the mess-lib.pl library as well. All we need to include is the require line.
On Perl 5.26 and later the current directory is no longer searched automatically, so if Perl reports that it cannot
locate the file, write require './mess-lib.pl'; instead.
Now, create the mess-lib.pl file. The name of this file doesn't really matter (as long as our require statement above
matches it), but it is helpful if the name indicates what the file contains. Add the following code to
mess-lib.pl (first, press [Ctrl+v] to insert the code we removed from function.pl):
CODE TO TYPE:
sub messup{
my $line = shift(@_);
$line =~ s/^([^:]*:)([^:]*:)/\2\1/;
$line =~ tr/A-Z/a-z/;
@fields = split(/:/,$line);
chomp(@fields);
@rfields = reverse(@fields);
$newline = join(":",@rfields);
return $newline;
}
1;
This is the exact same function we worked on before, with the exception of that last line. Perl library files must return
"true", so every library file needs to end with a 1;. Save this file.
Also, if we have to change how messup works, we would only have to change one file instead of all of our scripts.
Good job! Let's move on to the next lesson.
Directories and Files
Reading Directories
In order to perform operations on a directory full of files, it becomes necessary to be able to list the directory's
contents. To do this we would use ls on the command line. We could do the same in Perl by using back ticks to
capture the output. But it would be better to use a few of Perl's built-in functions to read the directory. Write a script
called list.pl that will list the contents of the current directory.
#!/usr/bin/perl
opendir(CURRENT,".");
@list = readdir(CURRENT);
closedir(CURRENT);
# Print each entry on its own line
foreach $file (@list){
  print "$file\n";
}
These three new functions are fairly self-explanatory. We open the directory, read its contents into an array, and finally,
close the directory when we are finished. Save this script and test it.
After the command prompt, type the following commands:
cold:~$ ./list.pl
.
..
cgi
.emacs
html_css
.ncftp
.bash_login
.bash_history
.error_log
linux4
.ssh
There are a couple of things to notice about this output. First, the contents are not listed alphabetically. They come
back in whatever order the filesystem stores them. In most cases you do not really need an alphabetical list, so there
is no point in Perl wasting the time to sort it. The second thing to note is that dot files show up automatically. Let us
change our script so it does not list files that begin with a dot.
opendir(CURRENT,".");
@list = readdir(CURRENT);
closedir(CURRENT);
foreach $file (@list){
  print "$file\n" if($file !~ /^\./);   # skip names that begin with a dot
}
Instead of simply printing out the file names, we could open the files and perform operations on them.
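The directory reading steps can also be collected into a function that hands back a sorted list with the dot files left out. This is a sketch under a few assumptions: visible_entries is an invented name, and the code uses a lexical directory handle (my $dh) and "or die" error checks, which are common practice but go beyond what list.pl does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names in a directory, sorted, without the dot files.
# (visible_entries is a hypothetical helper for this illustration.)
sub visible_entries {
    my $dir = shift(@_);
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = readdir($dh);
    closedir($dh);

    my @visible;
    foreach my $name (@names) {
        next if ($name =~ /^\./);     # skips ".", ".." and hidden files
        push(@visible, $name);
    }
    my @sorted = sort(@visible);
    return @sorted;
}

foreach my $name (visible_entries(".")) {
    print "$name\n";
}
```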
File Tests
Once we have a filename, either by supplying it ourselves or reading it from a directory, we can use it to do a number of
tests. File test operators look a lot like Unix command line options and are often used within conditional statements.
#!/usr/bin/perl
$username = "yourusername";
$filename = "/users/".$username."/.bash_login";
if(-e $filename){
open(FILE,"$filename");
while(<FILE>){ print };
close(FILE);
}else{
print "Error: $filename doesn't exist\n";
}
Replace "yourusername" with your username when creating this script. Save and test the script. If you have a .bash_login
file, the script will display its contents; otherwise you will get an error saying the file does not exist.
Another method enables us to make sure the file has a non-zero size. Using the -s file test operator instead of -e, we
can check for a file and determine its size at the same time.
$username = "yourusername";
$filename = "/users/".$username."/.bash_login";
if($size = -s $filename){
print "$filename has a size of $size\n";
}else{
print "Error: $filename doesn't exist or has zero size.\n";
}
Change the $filename variable to a file that you know exists. Save and test this script. Additionally, the -z can be used
to test for a file that exists and has zero size.
We can test to see if a file exists and find out its size, and we can also use file tests to determine a lot of additional
information. Here is a table of a few of the different file test operators available.
File Test    Description
-e           Exists
-s           Non-zero size (returns size)
-z           Zero size
-r           Readable by effective uid/gid
-w           Writable by effective uid/gid
-x           Executable by effective uid/gid
-o           Owned by effective uid
-f           Plain file
-d           Directory
-l           Symbolic link
-M           Age of file (in days) since last modification
The effective user or group id is typically going to be the uid/gid of the user executing the script.
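Several of the tests in the table can be combined in one function. The sketch below is an illustration only; describe is an invented name and the three paths at the bottom are just examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Describe a path using the -e, -d, -z and -s file tests.
# (describe is a hypothetical helper for this illustration.)
sub describe {
    my $path = shift(@_);
    unless (-e $path) {
        return "missing";
    }
    return "directory"  if (-d $path);
    return "empty file" if (-z $path);
    my $size = -s $path;              # -s returns the size in bytes
    return "file of $size bytes";
}

# Example paths; change them to something that exists on your system.
foreach my $path ("/etc/passwd", ".", "/no/such/file") {
    print "$path: ", describe($path), "\n";
}
```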
File Manipulation
We now know quite a bit about gathering information on a file. Now let us learn how to modify things. We can change
the permissions using the chmod function.
This works just the way you would expect, except for one important difference. The numerical permissions must start
with a zero so that Perl reads them as an octal number (0755, not 755).
A file can be renamed with a function called, logically enough, rename. Using it is pretty simple.
Not everything is named quite so intuitively though. For instance, to delete a file we use unlink. Creating hard links and
symbolic links is done with link and symlink respectively.
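Here is a minimal sketch of chmod, rename and unlink working on a scratch file. To avoid touching real files it creates a temporary directory with the standard File::Temp module; that module, the stat call used to read the permissions back, the sub name file_ops_demo and the file names are all assumptions added for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a file, change its permissions, rename it, then delete it.
# Returns the permissions that were in effect, as an octal string.
sub file_ops_demo {
    my $dir = shift(@_);
    my $old = "$dir/notes.txt";
    my $new = "$dir/notes.bak";

    open(my $fh, ">", $old) or die "Cannot create $old: $!";
    print $fh "scratch data\n";
    close($fh);

    chmod(0600, $old)  or die "chmod failed: $!";    # leading 0 means octal
    rename($old, $new) or die "rename failed: $!";

    my $mode  = (stat($new))[2] & 07777;   # keep only the permission bits
    my $perms = sprintf("%04o", $mode);

    unlink($new) or die "unlink failed: $!";
    return $perms;
}

my $scratch = tempdir(CLEANUP => 1);
print "Permissions were ", file_ops_demo($scratch), "\n";   # prints "Permissions were 0600"
```

Writing chmod(600, ...) without the leading zero would be read as decimal 600, which is a very different set of permission bits.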
There seems to be something missing here though. How would we copy a file? There is not a command specifically
for this purpose built into Perl. This is one of many cases where you might find it easier to use a simple Unix
command instead. Fortunately this is not a problem.
system("find",@ARGV);
When system executes its command, the Perl script will wait for it to finish before continuing. This is necessary when
the results of your system command are needed by the rest of the script. In case you do not want to wait for the
external command to finish, you can always background it with an ampersand (&) as part of the command string.
A second, similar method is the exec function. The main difference between exec and system is that exec will
terminate the originating Perl script and replace it with the new command. Therefore, it is only useful at the end of a
script, typically for running a shell or something like that.
WARNING: Reusing user input on the command line with a system call can be dangerous in cases when the
Perl script is running as a different user than the person executing it (CGI scripts and suid scripts).
For example, if you grab user input and they input "; some command" as an argument, the
semicolon would represent the end of the first command. This would allow the second arbitrary
command to run on the system. User input should always be checked for special characters with a
regular expression before being used in a system call.
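One possible way to follow that advice is sketched below. The sub names is_safe_name and copy_file are invented, and the list of allowed characters is an assumption that should be adjusted to what a given script really needs. The copy uses the list form of system, where each argument is handed to cp directly instead of being interpreted by a shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Accept only names made of letters, digits, underscore, period,
# slash and hyphen. Anything else, such as ; or a space, is rejected.
sub is_safe_name {
    my $name = shift(@_);
    return 0 if ($name =~ /^-/);                 # would look like an option
    return ($name =~ /^[\w.\/-]+\z/) ? 1 : 0;    # \z: very end of the string
}

# Copy a file with the Unix cp command, but only if both names pass.
sub copy_file {
    my $from = shift(@_);
    my $to   = shift(@_);
    return 0 unless (is_safe_name($from) && is_safe_name($to));
    my $status = system("cp", $from, $to);       # list form: no shell involved
    return ($status == 0) ? 1 : 0;
}

foreach my $input ("report.txt", "x; rm -rf /", "-rf") {
    print "'$input' is ", (is_safe_name($input) ? "accepted" : "rejected"), "\n";
}
```

Only the first sample is accepted. Perl's standard File::Copy module is another way to copy files without calling an external command at all.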
Recursive Directory Search
Recursion
You have probably heard of "recursion" even if you are not entirely familiar with the concept. The idea behind recursion
is to focus on a problem, divide it into increasingly small problems of the same type and find solutions for the smaller
problems. At this point, you can back up and see that the entire big problem has been solved.
In Perl, a recursive subroutine calls itself over and over as it makes progress towards a base case. The base
case is the smallest part of the problem with a "known" solution. Recursion can be a difficult concept to get your mind
around at first, so let us look at an example.
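As a warm-up before the directory search, here is the idea in its smallest form. The factorial function is a standard textbook illustration of recursion chosen for this sketch; it is not part of the lesson's scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# factorial(5) is 5 * 4 * 3 * 2 * 1. Each call hands a smaller
# version of the same problem to another call of itself.
sub factorial {
    my $n = shift(@_);
    return 1 if ($n <= 1);             # base case: the known answer
    return $n * factorial($n - 1);     # smaller problem of the same type
}

print factorial(5), "\n";   # prints 120
```

Without the base case the function would keep calling itself forever, which is the same danger the directory search has to avoid.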
Where do we start? Well, before we write any code at all, let's make a plan. We know how to open a directory and read
its contents. Using file tests we can also find the names of other directories within the first one. Once we get a list of
those directories we can list their contents as well. The problem is that we do not know how far this is going to go. But
do you see the pattern? Even if we are five directories deep, we are doing the same thing: reading a directory and
getting a list of other directories within it.
Let's write a function that does just that. Create a script file called rd.pl.
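#!/usr/local/bin/perl
$startdir = "/lib";
list_dirs($startdir);
sub list_dirs{
  my $dir = shift(@_);
  opendir(TOP,$dir);
  my @files = readdir(TOP);
  closedir(TOP);
  # Print every entry that is itself a directory
  foreach my $file (@files){
    if(-d "$dir/$file"){
      print "$file\n";
    }
  }
}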
The list_dirs function reads the directory it is given to get a list of files. Then it tests those files to see which ones are
directories. Notice that when this happens we have to include the first directory as part of the test (-d "$dir/$file") so
that the path is correct. Save and test this script:
After the command prompt, type the following commands:
cold:~$ ./rd.pl
kbd
security
rtkaio
modules
terminfo
.
i686
..
udev
firmware
Excellent. Now the idea here is to take these new directories and do the same thing over and over again until there are
no more directories left. There is one problem, however. What happens if we run our function on the . and ..
directories? These are the current and parent directories. If our script is allowed to list the contents of these
directories as well, it will never end. It will keep going back and forth while the problem grows ever larger. To prevent
this, we keep . and .. from being added to the list of new directories.
Note: Remember, if you find yourself in an infinite loop, you can quit your process by hitting Ctrl+c at the
command line.
Removing . and ..
Once we have listed all of the files, we will sort them so we end up with . and .. at the top of our array. It is simple
enough to remove them before doing any more processing.
#!/usr/local/bin/perl
$startdir = "/lib";
list_dirs($startdir);
sub list_dirs{
  my $dir = shift(@_);
  opendir(TOP,$dir);
  my @files = readdir(TOP);
  closedir(TOP);
  @files = sort(@files);
  shift(@files);    # drop .
  shift(@files);    # drop ..
  foreach my $file (@files){
    if(-d "$dir/$file"){
      print "$file\n";
    }
  }
}
That is all there is to it. We no longer have to worry about those two special directories. Save and test the script to
make sure they are gone.
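The sort and shift approach relies on . and .. sorting to the top, which holds for ordinary names but not for a name that begins with a character that sorts before the period, such as a plus sign. A sketch of an alternative that compares names directly is shown here; without_dot_dirs and the sample names are invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the list without the "." and ".." entries, wherever they appear.
# (without_dot_dirs is a hypothetical helper for this illustration.)
sub without_dot_dirs {
    my @names = @_;
    my @kept;
    foreach my $name (@names) {
        next if ($name eq "." || $name eq "..");
        push(@kept, $name);
    }
    return @kept;
}

# Made-up directory listing with a name that sorts before ".".
my @names  = ("lib", "..", "+extras", ".");
my @sorted = sort(@names);
print join(" ", @sorted), "\n";                    # prints "+extras . .. lib"
print join(" ", without_dot_dirs(@sorted)), "\n";  # prints "+extras lib"
```

With this listing, two shifts would have thrown away "+extras" and "." and left ".." in the array.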
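$startdir = "/lib";
list_dirs($startdir);
sub list_dirs{
  my $dir = shift(@_);
  opendir(TOP,$dir);
  my @files = readdir(TOP);
  closedir(TOP);
  @files = sort(@files);
  shift(@files);
  shift(@files);
  foreach my $file (@files){
    if(-d "$dir/$file"){
      print "$file\n";
      list_dirs("$dir/$file");   # continue the search one level down
    }
  }
}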
All we have done is add a line that continues the search down the next branch of the directory tree. In the next branch,
the same thing will happen and the search will continue all the way down until there are no more directories.
Save and test the script. You should see a lot of directory names all in a big column. So now we have got the list of
directories, but the only problem is that it is really hard to tell what goes where. It would be nice if the subdirectories
were indented slightly from their parent directories.
$startdir = "/lib";
$level = 0;
list_dirs($startdir,$level);
sub list_dirs{
  my $dir = shift (@_);
  my $lev = shift (@_);
  opendir(TOP,$dir);
  my @files = readdir(TOP);
  closedir(TOP);
  @files = sort(@files);
  shift(@files);
  shift(@files);
  foreach my $file (@files){
    if(-d "$dir/$file"){
      spaces($lev);
      print "$file\n";
      list_dirs("$dir/$file",$lev+1);
    }
  }
}
sub spaces{
  my($num) = shift(@_);
  for($i=0;$i<$num;$i++){
    print " ";
  }
}
Test the script to see the new output. As part of a quiz you will be describing how this works. Good luck!