Practical Extraction and Reporting Language: Welcome
Welcome
Marcos Rebelo
What is Perl?
Perl Pros
Perl Cons
Hello World
In a file:
    print "Hello, World!\n"
On the shell:
    #> perl -e 'print "Hello, World!\n"'
With Perl 5.10:
    #> perl -E 'say "Hello, World!"'
Perl statements end in a semicolon.
    say("Hello, world!");
You can use parentheses for subroutine arguments or omit them according to your personal taste.
    say "Hello, world!";
Comments start with a # and run to the end of the line.
    # This is a comment
Double quotes or single quotes may be used around literal strings:
    say("Hello, world!");
    say('Hello, world!');
However, only double quotes interpolate variables and special characters such as newlines (\n):
    print("Hello, $name!\n");   # interpolate $name
    print('Hello, $name!\n');   # prints $name!\n literally
Whitespace is irrelevant:
    print(
        "Hello world");
...except inside quoted strings:
    # this would print with a line break in the middle
    print("Hello
    world");
Perl 5.10 has added a new feature:
    use feature qw(say);
    say 'Hello, world!';
Is equivalent to:
    print 'Hello, world!', "\n";
We will see later that the correct expression is:
    { local $\ = "\n"; print 'Hello, world!' }
To run a Perl program from the Unix command line:
    #> perl progname.pl
Alternatively, put this shebang in your script:
    #!/usr/bin/perl
and run your executable script:
    #> progname.pl
-e: allows you to define, on the command line, the Perl code to be executed; use -E to get the 5.10 features.
    #> perl -e 'print("Hello, World!\n")'
Scalars
    my $animal = 'camel';
    my $answer = 42;
    my $scalar_ref = \$scalar;
Arrays
    my @animals = ('camel', 'llama');
    my @numbers = (23, 42, 69);
    my @mixed = ('camel', 42, 1.23);
Associative Arrays / Hash Tables
    my %fruit_color = (
        apple  => 'red',
        banana => 'yellow'
    );
Scalar Type
    my $animal = 'camel';
    my $answer = 42;
    my $scalar_ref = \$scalar;
Perl will automatically convert between types as required.
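A minimal sketch of that automatic conversion; the variable names are only illustrative. The same scalar is used as a number by an arithmetic operator and as a string by the concatenation operator.

```perl
use strict;
use warnings;

my $count = '42';              # a string that looks like a number
my $sum   = $count + 8;        # numeric context: '42' is used as 42
my $label = $sum . ' items';   # string context: 50 is used as '50'

print "$sum\n";     # 50
print "$label\n";   # 50 items
```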
Array Type
An array represents a list of scalars. The variable name starts with a @.
    my @animals = ('camel', 'llama');
    my @numbers = (23, 42, 69);
    my @mixed = ('camel', 42, 1.23);
Arrays are zero-indexed; a negative index counts from the end of the array, starting at -1.
Array Slice
To get multiple values from an array:
    @numbers[0,1];            # gives (23, 42);
    @numbers[0..2];           # gives (23, 42, 69);
    @numbers[1..$#numbers];   # gives all but the first element
Array Type
The special variable $#array tells you the index of the last element of an array:
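The slide's own example did not survive extraction, so here is a small sketch with an illustrative array. It shows $#array next to the element count and the negative index mentioned earlier.

```perl
use strict;
use warnings;

my @animals = ('camel', 'llama', 'owl');

my $last_index = $#animals;          # 2: index of the last element
my $count      = scalar(@animals);   # 3: number of elements
my $first      = $animals[0];        # 'camel'
my $last       = $animals[-1];       # 'owl', same as $animals[$#animals]

print "$last_index $count $first $last\n";   # 2 3 camel owl
```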
A hash represents a set of key/value pairs:
You can use whitespace and the => operator to lay them out more nicely:
To get at hash elements:
You can get at lists of keys and values with keys() and values().
Hashes have no particular internal order, though you can sort the keys and loop through them.
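The points above can be sketched as follows, reusing the %fruit_color hash from the earlier slides. The @lines array is only there to make the result easy to inspect.

```perl
use strict;
use warnings;

my %fruit_color = (
    apple  => 'red',
    banana => 'yellow',
);

my $apple_color = $fruit_color{apple};   # 'red': access one element

# keys() returns the keys in no particular order, so sort them first.
my @lines;
foreach my $fruit (sort keys %fruit_color) {
    push @lines, "$fruit is $fruit_color{$fruit}";
}
print "$_\n" foreach @lines;

my $how_many = scalar(values %fruit_color);   # 2
```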
Hash Slice
To get/set multiple values from a hash:
    @fruit_color{'watermelon', 'orange'} = ('green', 'orange');
Removing repetition from an array:
    my @array = (1,2,9,5,2,5);
    my %hash;
    @hash{@array} = ();
    @array = keys(%hash);
    say "@array";   # e.g. 1 9 2 5 (hash order is not guaranteed)
Variable Reference
Scalar references
Array references
Hash references
Dereferencing a variable
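The code for these four points is missing from this slide, so the following is only a sketch with illustrative variable names. It takes a reference to each kind of variable with the backslash operator and then dereferences it.

```perl
use strict;
use warnings;

my $scalar = 'camel';
my @array  = (23, 42, 69);
my %hash   = (apple => 'red');

# Taking references with the backslash operator
my $scalar_ref = \$scalar;
my $array_ref  = \@array;
my $hash_ref   = \%hash;

# Dereferencing
my $value  = ${$scalar_ref};      # 'camel'
my @copy   = @{$array_ref};       # (23, 42, 69)
my $second = $array_ref->[1];     # 42
my $color  = $hash_ref->{apple};  # 'red'

print "$value $second $color\n";  # camel 42 red
```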
More complex data types can be constructed using references:
    my $cities = {
        IT => [ 'Milan', 'Rome', ...],
        PT => [ 'Lisbon', 'Oporto', ...],
        ...
    };
You can access it through:
    print($cities->{'IT'}->[1]);
    print($cities->{'IT'}[1]);
    my @citiesPT = @{$cities->{PT}};
Lexical Variables
Throughout the previous slides the examples have used the syntax:
    my $var = 'value';
The my is actually not required, you could use:
    $var = 'value';
However, the above usage will create global variables throughout your program.
Lexical Variables
Not having the variables scoped is usually a bad programming practice.
    my $a = 'foo';
    if ($some_condition) {
        my $b = 'bar';
        print($a);   # prints "foo"
        print($b);   # prints "bar"
    }
    print($a);   # prints "foo"
    print($b);   # prints nothing; $b has fallen out of scope
Lexical Variables
Magic variables
There are some magic variables. These special variables are used for all kinds of purposes.
Arithmetic
    +    -    *    /
    +=   -=   *=   /=
Boolean logic
    &&   ||   !
    and  or   not
Miscellaneous
Conditional constructs
Perl has most of the usual conditional constructs:
    if COND BLOCK
    if COND BLOCK else BLOCK
    if COND BLOCK elsif COND BLOCK
    if COND BLOCK elsif COND BLOCK else BLOCK
    if ( is_valid( $value ) ) { ... }
Conditional constructs
There's also a negated version of if (don't use it):
    unless ( is_valid( $value ) ) { ... }
This is provided as a 'more readable' version of
    if ( not( is_valid( $value ) ) ) { ... }
0, '0', '', () and undef are all false in a boolean context. All other values are true.
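A short sketch that checks this rule against a few values picked for illustration. Note that strings such as '0.0' and '00' are true, because only the exact string '0' is false.

```perl
use strict;
use warnings;

my @results;
foreach my $value (0, '0', '', undef, '0.0', '00', ' ', 'a') {
    my $shown = defined $value ? "'$value'" : 'undef';
    push @results, "$shown is " . ($value ? 'true' : 'false');
}
print "$_\n" foreach @results;
```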
Conditional constructs
The traditional way:
    if ($zippy) {
        print("Yow!");
    }
The Perlish postcondition way:
    print("Yow!") if $zippy;
    print("No cubes") unless $cubes;
Perl has most of the usual loop constructs.
While loop:
    while ( is_valid( $value ) ) { ... }
There's also a negated version (don't use it):
    until ( is_valid( $value ) ) { ... }
You can also use while in a postcondition:
    print("xpto\n") while condition;
Going through a hash:
    while (my ($key, $value) = each(%ENV)) {
        print "$key=$value\n";
    }
The C-style for loop is rarely needed in Perl, since Perl provides the friendlier foreach loop.
Iterating over all elements of an array; foreach is an alias for for.
    foreach my $var (@array) { ... }
    for my $value (values(%hash)) { ... }
By default the value goes into $_:
    foreach (@array) { print "$_\n" }
Changing the variable changes the value inside the array. $var is an alias.
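The aliasing described above can be sketched with an illustrative array: assigning to the loop variable updates the array elements in place.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3);

foreach my $n (@numbers) {
    $n *= 2;   # $n is an alias, so this changes the array element itself
}

print "@numbers\n";   # 2 4 6
```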
Jump on loops
    LINE: while ( defined(my $line = <>) ) {
        next LINE if not is_valid( $line );
        #...
    }
Jump on loops:
    last LABEL: immediately exits the loop in question
    next LABEL: starts the next iteration of the loop
    redo LABEL: restarts the loop block without evaluating the conditional again
If the LABEL is omitted, the command refers to the innermost enclosing loop.
With a single command
With multiple commands
Exercises 1 - Scalars
1) Implement the "Guess the lucky number" game. The program shall choose a random number between 0 and 100 and ask the user for a guess. Notify the user whether the guess is bigger than, smaller than or equal to the random number. If the guess is correct the program shall exit; otherwise it shall ask for a number again.
Exercises 1 - Array
2) Create an array that contains the names of 5 students of this class.
2a) Print the array.
2b) Create a new array, shifting the elements left by one position (element 1 goes to 0, ...) and setting the first element in the last position. Print the array.
2c) Ask the user to input a number. Print the name with that index.
Exercises 1 - Hash
3) Homer Family
my %relatives Lisa => Bart => Maggie => Marge => Homer => Santa =>
Subroutines
Subroutines
The Perl model for subroutine call and return values is simple:
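The bullet points of this slide are missing. Assuming they follow the standard Perl model (arguments arrive as one flat list in @_, and the return value is likewise one flat list), here is a sketch with hypothetical subroutine names.

```perl
use strict;
use warnings;

# All arguments arrive flattened into the single list @_.
sub count_args {
    return scalar(@_);
}

# A subroutine returns one flat list as well.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my @first  = (1, 2);
my @second = (3, 4, 5);

my $n = count_args(@first, @second);          # 5: both arrays are flattened
my ($min, $max) = min_max(@first, @second);   # 1 and 5

print "$n $min $max\n";   # 5 1 5
```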
Subroutine - example
    sub max {
        my $mval = shift(@_);
        # my ($mval, @rest) = @_;   # big copy
        foreach my $foo (@_) {
            if ( $mval < $foo ) {
                $mval = $foo;
            }
        }
        return $mval;
    }
The value of the last statement is returned by default.
Using @_ directly is dangerous, because its elements are aliases to the caller's arguments, and shall be carefully considered. It's always possible to make a copy.
    sub double {
        my ($a) = @_;
        $a *= 2;
        return $a;
    }
    my $b = 5;
    print( double( $b ) );   # prints 10
    print($b);               # prints 5
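To make the danger concrete, this sketch contrasts two hypothetical subroutines: one that writes through @_ and so changes the caller's variable, and one that works on a copy.

```perl
use strict;
use warnings;

# Writes through the alias: the caller's variable changes.
sub double_in_place {
    $_[0] *= 2;
    return;
}

# Works on a copy: the caller's variable is untouched.
sub double_copy {
    my ($n) = @_;
    $n *= 2;
    return $n;
}

my $x = 5;
double_in_place($x);             # $x is now 10

my $y = 5;
my $doubled = double_copy($y);   # $doubled is 10, $y is still 5

print "$x $y $doubled\n";        # 10 5 10
```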
Just because a lexical variable is statically scoped to its enclosing block, eval, or file.
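The rest of this slide is missing. A common way to get a persistent private variable before Perl 5.10 is to enclose both the lexical variable and the subroutine in a bare block; the names below are hypothetical.

```perl
use strict;
use warnings;

{
    my $counter = 0;   # visible only inside this block

    # The subroutine keeps access to $counter between calls.
    sub next_id {
        return ++$counter;
    }
}

my $first  = next_id();   # 1
my $second = next_id();   # 2

print "$first $second\n";   # 1 2
```

The block must run before the first call so that the initial value is assigned.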
state variables
From Perl 5.10 you may use static (state) variables.
    use feature 'state';
    sub gimmeAnother {
        state $secretVal = 0;
        return ++$secretVal;
    }
    print gimmeAnother;   # OK
Some features in Perl 5.10 have to be activated to avoid collisions with old code. Activating them all:
    use feature ':5.10';
Named Parameters
Named Parameters
We may pass the hash directly.
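The example for this slide is missing. One possible reading is sketched below: the caller passes key/value pairs and the subroutine collects them into a hash. The return value is hypothetical, chosen only so the result can be checked.

```perl
use strict;
use warnings;

# Hypothetical login(): the named parameters arrive as key/value pairs.
sub login {
    my (%param) = @_;
    return "$param{user}\@$param{host}";
}

# The order of the named parameters does not matter.
my $target = login(host => 'localhost', user => 'guest', pass => 'secret');

print "$target\n";   # guest@localhost
```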
Named Parameters
We can easily give default values by checking the hash.
    sub login {
        my ($param) = @_;
        $param->{user} = $DEFAULT_USER if not exists $param->{user};
        $param->{pass} = $DEFAULT_PASS if not exists $param->{pass};
        $param->{host} = $DEFAULT_HOST if not exists $param->{host};
        ...
    }
Named Parameters
We can easily give default values by checking the hash.
    sub login {
        my $param = {
            'user' => $DEFAULT_USER,
            'pass' => $DEFAULT_PASS,
            'host' => $DEFAULT_HOST,
            %{shift(@_)}
        };
        ...
    }
Named Parameters
We can also write the subroutine so that it accepts both named parameters and a simple list.
    sub login {
        my $param;
        if ( ref($_[0]) eq 'HASH' ) {
            $param = $_[0];
        } else {
            @{$param}{qw(user pass host)} = @_;
        }
        ...
    }
    login('Login', 'Pass');
    login({user => 'Login', pass => 'Pass'});
Exercises 2
1) Create a new subroutine that calculates the Fibonacci series. Using this subroutine, write a program that receives multiple numbers as arguments and prints the Fibonacci value of each.
    F(0) = 0
    F(1) = 1
    F(n) = F(n-1) + F(n-2)
1a) with a persistent variable
1b) with a state variable
IO
Read a File
    open(FH, '<', 'path/to/file') or die "can't open file: $!";
    while ( defined( my $line = <FH> ) ) {
        chomp($line);
    }
    close(FH);
Open a Filehandler
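Opening a file for input.
    open(INFH, "<", "input.txt") or die $!;
    open(INFH, "<input.txt") or die $!;
Opening a file for output.
    open(OUTFH, ">", "output.txt") or die $!;
    open(OUTFH, ">output.txt") or die $!;
Opening a file for appending.
    open(LOGFH, ">>", "my.log") or die $!;
    open(LOGFH, ">>my.log") or die $!;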
Open a Filehandler
You can also use a scalar variable as a filehandle:
    open(my $inFH, "input.txt") or die $!;
A lexical scalar variable closes at the end of the block if it wasn't closed before.
It's possible to pipe from a command:
    open(my $net, "netstat |") or die "Can't netstat: $!";
It's possible to pipe to a command:
    open(my $sendmail, "| sendmail -t") or die "Can't open sendmail: $!";
Write to a Filehandle
    print STDERR ('Are you there?');
    print OUTFH $record;
    print { $FH } $logMessage;
Note: There is no , between the filehandle and the text.
Closing file handles:
    close($inFH);
You can read from an open filehandle using the <> operator or the readline() subroutine.
Line by line:
    my $line = <$fh>;
    my $line = readline($fh);
Slurp:
    my @lines = <$fh>;
    my @lines = readline($fh);
    while ( defined(my $line = <$fh>) ) {
        print "Just read: $line";
    }
    foreach my $line (<$fh>) {   # slurps
        print "Just read: $line";
    }
    open(my $fh, '<', $myfile) or die $!;
    my $txt = do { local $/ = undef; <$fh> };
    close($fh);

    # or, without an explicit open:
    my $txt = do {
        local (@ARGV, $/) = ($myfile);
        readline();
    };
*DATA Filehandle
Exercises 3
The program shall be called like:
Exercises 3
To split a string use:
    split(/\t/, $str)
Regular Expressions
Regular Expressions
The simplest regex is simply a word.
    "Hello World" =~ m/World/
Or simply:
    "Hello World" =~ /World/
Expressions like this are useful in conditionals:
    print "Matches" if $str =~ /World/;
The sense of the match can be reversed by using the !~ operator:
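A minimal sketch of both operators on an illustrative string. The flags are stored in variables only so that the outcome is easy to check.

```perl
use strict;
use warnings;

my $str = 'Hello World';

my $has_world = ($str =~ /World/) ? 1 : 0;   # 1: the pattern matches
my $no_perl   = ($str !~ /Perl/)  ? 1 : 0;   # 1: true because /Perl/ does NOT match

print "Matches\n"  if $str =~ /World/;
print "No match\n" if $str !~ /Perl/;
```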
Regular Expressions
Regexps are treated mostly as double-quoted strings, so variable substitution works:
    $foo = 'house';
    'cathouse' =~ /cat$foo/;     # matches
    'housecat' =~ /${foo}cat/;   # matches
qr// compiles the regular expression.
    foreach my $regexp (@regexps) {
        my $comp = qr/$regexp/;
        foreach my $str (@strs) {
            print "$str\n" if $str =~ /$comp/;
        }
    }
To specify where the regex should match, we would use the anchor metacharacters ^ and $.
The ^ means match at the beginning of the string.
    print 'Starts with Hello' if /^Hello/;
The $ means match at the end of the string or before a newline at the end of the string.
    print 'Ends with World!' if /World!$/;
Character Classes
The special character '-' acts as a range operator, so [0123456789] becomes [0-9]:
    /item[0-9]/ matches 'item0' or ... or 'item9'
    /[0-9a-f]/i matches a hexadecimal digit
Character Classes
Character Classes
Character Classes
Perl has several abbreviations for common character classes:
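The list itself is missing from this slide. Assuming it covers the standard abbreviations \d (digit), \w (word character) and \s (whitespace), a sketch on an illustrative string:

```perl
use strict;
use warnings;

my $line = 'item42 costs 7 EUR';

my @numbers = $line =~ /(\d+)/g;     # runs of digits: ('42', '7')
my @words   = $line =~ /(\w+)/g;     # runs of word characters
my @fields  = split(/\s+/, $line);   # split on runs of whitespace

print "@numbers\n";            # 42 7
print scalar(@words), "\n";    # 4
print scalar(@fields), "\n";   # 4
```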
Character Classes
Exercises 4
Quantifiers
Quantifiers
If you want it to match the minimum number of times possible, follow the quantifier with a ?.
    'a,b,c,d' =~ /,(.+),/;   # match 'b,c'
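For comparison, here is the same string matched with the greedy quantifier and with its minimal variant; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $str = 'a,b,c,d';

my ($greedy) = $str =~ /,(.+),/;    # 'b,c': .+ takes as much as possible
my ($lazy)   = $str =~ /,(.+?),/;   # 'b':   .+? takes as little as possible

print "$greedy\n";   # b,c
print "$lazy\n";     # b
```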
Avoid unnecessary backtracking:
Even though dog is the first alternative in the second regex, cat is able to match earlier in the string.
    'cats or dogs' =~ /cat|dog/;   # matches 'cat'
    'cats or dogs' =~ /dog|cat/;   # matches 'cat'
Extracting matches
    # extract time in hh:mm:ss format
    $time =~ /(\d\d):(\d\d):(\d\d)/;
    my ($hour, $min, $sec) = ($1, $2, $3);
Extracting matches
To get a list of matches we can use:
Capture groups are numbered by the position of their opening parenthesis:
    /(ab(cd|ef)((gi)|j))/;
     1  2      34
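The expression that followed "we can use:" did not survive, so this is only a sketch of one standard way to do it: in list context, a match returns the list of captured groups, and with /g it returns the captures of every match. The input strings are illustrative.

```perl
use strict;
use warnings;

my $time = '12:34:56';

# In list context the match returns ($1, $2, $3).
my ($hour, $min, $sec) = $time =~ /(\d\d):(\d\d):(\d\d)/;

# With /g, the captures of every match are returned in one flat list.
my @pairs = 'a=1, b=2' =~ /(\w)=(\d)/g;

print "$hour $min $sec\n";   # 12 34 56
print "@pairs\n";            # a 1 b 2
```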
Named capture
    'Michael Jackson' =~ /(?<NAME>\w+)\s+(?<NAME>\w+)/
    %+ is ('NAME' => 'Michael')
    %- is ('NAME' => ['Michael','Jackson'])
    $1 is 'Michael'
    $2 is 'Jackson'
With the global modifier, s///g will search and replace all occurrences of the regex in the string:
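A small sketch on an illustrative string, comparing a single substitution with the global one. s///g also returns the number of replacements it made.

```perl
use strict;
use warnings;

my $text = 'cat, cat and cat';

(my $once = $text) =~ s/cat/dog/;    # only the first occurrence
(my $all  = $text) =~ s/cat/dog/g;   # every occurrence

my $count = ($text =~ s/cat/dog/g);  # modifies $text, returns 3

print "$once\n";    # dog, cat and cat
print "$all\n";     # dog, dog and dog
print "$count\n";   # 3
```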
If the empty regex // is used, the string is split into individual characters.
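The slide title is missing here; assuming the sentence refers to split(), a sketch with illustrative strings:

```perl
use strict;
use warnings;

my @chars  = split(//, 'camel');          # ('c', 'a', 'm', 'e', 'l')
my @fields = split(/\t/, "a\tb\tc");      # ('a', 'b', 'c')

print "@chars\n";    # c a m e l
print "@fields\n";   # a b c
```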
Magic Variables
We have already seen $1, $2, ... There is also:
These variables are read-only and dynamically scoped to the current BLOCK.
Note: they make the regexp slow.
Switch
    use feature qw(switch say);
    given ($foo) {
        when (undef)   { say '$foo is undefined' }
        when ('foo')   { say '$foo is the str "foo"' }
        when (/Milan/) { say '$foo matches /Milan/' }
        when ([1,3,5,7,9]) {
            say '$foo is an odd digit';
            continue;   # Fall through
        }
        when ($_ < 100) { say '$foo less than 100' }
        when (\&check)  { say 'check($foo) is true' }
        default { die 'what shall I do with $foo?' }
    }
Smart Matching
Smart Matching
    $a       $b        Matching Code
    Code     Code      $a == $b   # not empty prototype if any
    Any      Code      $b->($a)   # not empty prototype if any
    Hash     Hash      [sort keys %$a]~~[sort keys %$b]
    Hash     Array     grep {exists $a->{$_}} @$b
    Hash     Regex     grep /$b/, keys %$a
    Hash     Any       exists $a->{$b}
    Array    Array     arrays are identical, value by value
    Array    Regex     grep /$b/, @$a
    Array    Num       grep $_ == $b, @$a
    Array    Any       grep $_ eq $b, @$a
    Any      undef     !defined $a
    Any      Regex     $a =~ /$b/
    Code()   Code()    $a->() eq $b->()
    Any      Code()    $b->()   # ignoring $a
    Num      numish    $a == $b
    Any      Str       $a eq $b
    Any      Num       $a == $b
    Any      Any       $a eq $b
Exercises 5
regexp=REGEXP: the program shall only print the lines that match the REGEXP.
Exercises 5
    'The Beatles (White Album) - Ob-La-Di, Ob-La-Da'
    'Tel: 212945900'
    '(c) (.+)\s*\1'
RegExp
    /(\(.*\))/ and /(\(.*?\))/
    /\d{4,}/
    /(\w\w)-(\w\w)-(\w\w)/
    /\W+/
Core subroutines
lc(EXPR), lcfirst(EXPR), uc(EXPR), ucfirst(EXPR): lowercase, lowercase first, uppercase and uppercase first.
length(EXPR): returns the length in characters of the value of EXPR.
sprintf(FORMAT, LIST): returns a string formatted by the usual printf conventions of C.
abs(EXPR), cos(EXPR), int(EXPR), log(EXPR), sin(EXPR), sqrt(EXPR): normal numeric subroutines.
chop/chomp
substr
Extracts a substring out of EXPR and returns it.
    my $var = 'Good dog';
    say substr($var, 5);          # 'dog'
    substr($var, 5) = 'cat';      # 'Good cat'
    substr($var, 5, 5, 'cow');    # 'Good cow'
A leading 0x is interpreted as a hex string; a leading 0b is interpreted as a binary string.
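The slide title is missing; assuming this describes the core oct() subroutine, whose behaviour matches the sentence above, a sketch with illustrative values:

```perl
use strict;
use warnings;

my $from_hex = oct('0x1F');    # 31: the 0x prefix selects hexadecimal
my $from_bin = oct('0b101');   # 5:  the 0b prefix selects binary
my $from_oct = oct('755');     # 493: no prefix, plain octal
my $plain    = hex('1F');      # 31: hex() always reads hexadecimal

print "$from_hex $from_bin $from_oct $plain\n";   # 31 5 493 31
```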
pop(ARRAY), push(ARRAY, LIST): pops/pushes a value from/to the end of the ARRAY.
shift(ARRAY), unshift(ARRAY, LIST): pops/pushes a value from/to the start of the ARRAY.
Note: if ARRAY is omitted, pop and shift operate on the @ARGV array in the main program and on the @_ array in subroutines. Avoid relying on this.
join
join(EXPR, LIST): joins the separate strings of LIST, separated by the value of EXPR.
reverse
reverse(LIST)
In list context, returns a list value consisting of the elements of LIST in the opposite order.
Inverting the keys/values in a hash.
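Both uses can be sketched as follows, with illustrative data. Inverting a hash this way only works cleanly when the values are unique, since duplicate values would collapse into one key.

```perl
use strict;
use warnings;

my @reversed = reverse(1, 2, 3);   # (3, 2, 1)

my %color_of   = (apple => 'red', banana => 'yellow');
my %fruit_with = reverse %color_of;   # (red => 'apple', yellow => 'banana')

print "@reversed\n";           # 3 2 1
print "$fruit_with{red}\n";    # apple
```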
map
Note: $_ is an alias to the list value.
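The description of map is missing from this slide. As a sketch with illustrative data: map evaluates the block once per element, with $_ set to that element, and returns the list of results.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3);

my @squares = map { $_ * $_ } @numbers;     # (1, 4, 9)
my @labels  = map { "item$_" } @numbers;    # ('item1', 'item2', 'item3')

print "@squares\n";   # 1 4 9
print "@labels\n";    # item1 item2 item3
```

Because $_ is an alias, assigning to it inside the block would change @numbers; the block above only reads it.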
grep
Note: $_ is an alias to the list value.
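The description of grep is missing from this slide as well. As a sketch with illustrative data: grep returns the elements for which the block is true, or their count in scalar context.

```perl
use strict;
use warnings;

my @numbers = (10, 9, 20, 3);

my @big       = grep { $_ >= 10 } @numbers;   # (10, 20)
my $odd_count = grep { $_ % 2 } @numbers;     # 2 (9 and 3)

print "@big\n";        # 10 20
print "$odd_count\n";  # 2
```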
sort
sort(LIST): in list context, this sorts the LIST and returns the sorted list value.
By default it compares the elements as strings.
    sort(10,9,20)   # 10, 20, 9
Providing a comparison block, the elements come in $a and $b.
    sort {$a <=> $b} (10,9,20)   # 9, 10, 20
Schwartzian transform:
    @sorted = map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] }
              map  { [ $_, foo($_) ] }
              @unsorted;
each
exists
    my @a = (1, undef);
    $a[3] = undef;
    exists($a[1])   # true
    exists($a[3])   # true
    exists($a[2])   # false
    my %a = ('a' => 1);
    exists($a{'a'})   # true
    exists($a{'b'})   # false
delete
    my @array = (a => 1, b => 2, c => 3);
    delete($array[2]);   # ('a', 1, undef, 2, 'c', 3); the element is not removed, only reset
    my %hash = (a => 1, b => 2, c => 3);
    delete($hash{b});    # (a => 1, c => 3);
eval / die
Exercises 6
First character in uppercase. The remaining in lowercase.
Modules and OO
Package
The idea is to protect each package's variables.
    package Dog;
    our $c = 1;
    my $d = 1;
    sub inc {$c++; $d++}
    package main;
    our $c = 0;
    my $d = 0;
    sub inc {$c++; $d++}
    print("$d-$c-$Dog::d-$Dog::c\n");   # "0-0--1"
    Dog::inc();
    print("$d-$c-$Dog::d-$Dog::c\n");   # "0-0--2"
    inc();
    print("$d-$c-$Dog::d-$Dog::c\n");   # "1-1--2"
use Modules
use Modules
Write Modules
To start a traditional module create a file called Some/Module.pm and start with this template:
    package Some::Module;   # package name
    use strict;
    use warnings;           # use always
    use Exporter;
    our @ISA = qw(Exporter);
    our $VERSION = 5.22;    # no leading zero: 05 would be read as an octal literal
    our @EXPORT = qw($var1 &func1);
    our @EXPORT_OK = qw($var2 &func2);
    our ( $var1, $var2 ) = ( 1, 2 );
    sub func1() {print("func1\n");}
    sub func2() {print("func2\n");}
    1;   # has to finish with a true value
Write Modules
    use Exporter;
    our @ISA = qw(Exporter);
Import the Exporter module and derive from it. Each element of the @ISA array is just the name of another package; the packages are searched for missing methods in the order that they occur.
    use base qw(Exporter);   # is similar
    our $VERSION = 5.22;
Sets the version number. Importing like the following requires at least the given version:
    use Some::Module 6.15;
Write Modules
    our @EXPORT = qw($var1 &func1);
List of symbols that are going to be exported by default (avoid using it).
    our @EXPORT_OK = qw($var2 &func2);
List of symbols that are going to be exported on request (better practice).
    our ( $var1, $var2 ) = ( 1, 2 );
    sub func1() {print("func1\n");}
    sub func2() {print("func2\n");}
Definition of the symbols.
Write Modules
This is ugly but can be used to call a subroutine as well.
Perl Objects
There are three very simple definitions.
Object constructor
    package Animal;
    sub new { bless({}) }
That word new isn't special.
    package Animal;
    sub Animal { bless({}) }
Objects Inheritance
    package Animal;
    sub new { return bless({}, shift @_); }
    package Dog;
    use base qw(Animal);
    # use 'Animal';          # +- true
    # push @ISA, 'Animal';
This would be called like:
    my $dog = Dog->new();
get/put method
The get/put method in Perl:
    sub property {
        my ($self, $value) = @_;
        $self->{property} = $value if @_ > 1;
        return $self->{property};
    }
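A usage sketch for this accessor. The class name Counter and its constructor are hypothetical, added only so that the example is self-contained.

```perl
use strict;
use warnings;

package Counter;   # hypothetical class name

sub new {
    my ($class) = @_;
    return bless({}, $class);
}

# Combined getter/setter, as on the slide.
sub property {
    my ($self, $value) = @_;
    $self->{property} = $value if @_ > 1;
    return $self->{property};
}

package main;

my $obj = Counter->new();
$obj->property(42);              # called with an argument: sets the value
my $value = $obj->property();    # called without: just returns it

print "$value\n";   # 42
```

Testing @_ > 1 rather than defined($value) lets the caller store undef explicitly.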
Method overwriting
The most common way to call a method on an object is:
    print("Dog: ".$dog->fly()."\n");
    print("Bat: ".$bat->fly()."\n");
Perl will look at the scalar reference and see the package name of the blessed reference.
Method implementation:
    package Animal;
    sub fly { return 0; }
    package Bat;
    use base qw(Animal);
    sub fly { return 1; }
Method overwriting
If you need to, you can force Perl to start looking in some other package:
    $bat->Insect::fly();   # dangerous
    $bat->SUPER::fly();
Object Destroy
Exercises 7
Create a program to test them.
Standard Modules
pragma
The usual ones
Perl 5.10 gives new features, see perlpragma
Usual suspects
    Data::Dumper: stringified Perl data structures
    Carp: better die() and warn()
    Cwd: pathname of the current working directory
    Exporter: implements the default import method for modules
    POSIX
    IPC::Open3: open a process for reading, writing, and error handling
    Time::HiRes: high resolution alarm, sleep, gettimeofday, interval timers
World-Wide Web
World-Wide Web
Apache/mod_perl packages
Security
Test
Other
Advanced Perl
DBI
    use DBI;
    use Data::Dumper;   # provides Dumper(), used below
    my $dsn = "DBI:mysql:database=$database;" .
              "host=$hostname;port=$port";
    my $dbh = DBI->connect($dsn, $user, $password)
        or die $DBI::errstr;
    my $sth = $dbh->prepare(
        "SELECT * FROM person WHERE name = ?")
        or die $dbh->errstr;
    $sth->execute('oleber') or die $dbh->errstr;
    while (my $ref = $sth->fetchrow_hashref()) {
        print Dumper $ref;
    }
    $sth->finish();
    $dbh->disconnect;
AUTOLOAD a method
When the method isn't found, AUTOLOAD will be called.
    our $AUTOLOAD;   # set by Perl to the fully qualified method name
    sub AUTOLOAD {
        my ($self, @params) = @_;
        my $name = $AUTOLOAD;
        $name =~ s/.*://;               # strip the package name
        return if $name eq 'DESTROY';   # do not autoload the destructor
        die "Can't access '$name' field"
            if not exists $self->{_p}->{$name};
        ( $self->{$name} ) = @params if @params;
        return $self->{$name};
    }
tie variable
    package SConst;
    sub TIESCALAR {
        my ($pkg, $val) = @_;
        bless \$val, $pkg;
        return \$val;
    }
    sub FETCH {   # read
        return ${shift()};
    }
    sub STORE {   # write
        die "No way";
    }
    1;

    use SConst;
    my $var;
    tie $var, 'SConst', 5;
    print "$var\n";
    $var = 6;   # dies
Q/A