Perl
What Is Perl?
Perl stands for Practical Extraction and Report Language.
Perl has similarities with the C programming language.
Perl has similarities with shell scripting.
Perl is a linear programming language: it runs the program once from top to bottom, unlike cyclic processors such as sed and awk, which rerun the script for each line of input.
Perl has built-in commands and functions.
Perl uses modules to extend its capabilities.
Extensive documentation is available on the Comprehensive Perl Archive Network (CPAN).
H4311S B.00
Perl Statements
A statement can be
    a command or subroutine call
    an assignment
    a condition
    a block
Simple statements end with a terminating semicolon.
A block consists of
    a pair of braces
    a set of simple statements
Statements Example
    #! /opt/perl5/bin/perl
    #
    # @(#) statements.pl: Version 1.0
    # This program demonstrates Perl statements
    #
    $var1 = "I'm in the outer block.";
    print "$var1\n";
    {
        my $var1 = "I'm in the inner block.";
        print "$var1\n";
    }
    print "$var1\n";
    exit;
Variables
Perl variables are not declared with data types such as integer or string.
Perl stores data in scalars, arrays, and hashes.
Perl will convert data to the proper type for the statement.
Scalar variables
    start with $, followed by a letter, followed by alphanumeric characters
    allow underscores
    store a single value that may contain white space
Scalar Variables
Assigning scalar variables:
    $cost = 50000;
    $margin = 0.15;
    $product = "car";
Using scalar variables:
    $price = $cost + ($cost * $margin);
    $desc = "A red car with lots of extras. Only $price dollars";
    print ("The cost is: ", $cost);
Commands
Built-in commands for
    variable manipulation
    input and output
    program flow
    management of processes, users, groups
    network information
    IPC and sockets
User-defined subroutines
Operators
list operators, parentheses, braces, quotes    ( ) { }
array and hash index                           [ ] { }
dereference and method calls                   ->
increment, decrement                           ++ --
unary operators                                ! ~ + -
bit operators                                  & | ^
logical                                        && ||
comma                                          ,
logical                                        and or xor
Managing Data
STDIN     the standard input device (the keyboard)
STDOUT    the standard output device (the monitor)
STDERR    the standard error device (the monitor)
Some commands will use them as the defaults:
    <> reads from the files named on the command line, or from STDIN if none are named
    print "This is a line of output.\n";
    print STDOUT "This is a line of output.\n";
STDERR must be specified explicitly:
    print STDERR "This is an error!\n";
Opening Files
    open (filehandle, "mode filename");
    close filehandle;
Filehandle is any name you want to use.
Mode can be the following:
    omitted or <    input (reading)
    >               output (writing)
    >>              append (writing)
    +>              input and output; truncate the file if it exists
    +<              input and output; do not truncate the file
Filename, in quotes, is the pathname of the file:
    "filename"      filename alone is a file
    "| filename"    filename is a command that reads from the pipe
    "filename |"    filename is a command that writes to the pipe
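As a minimal sketch of the modes listed above, the following script writes to, appends to, and then reads a scratch file. The file name demo.txt, the handles OUT and IN, and the use of File::Temp for a throwaway directory are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory; it is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";

# ">" opens for writing and truncates any existing file.
open(OUT, "> $file") or die "cannot write $file: $!";
print OUT "first line\n";
close OUT;

# ">>" opens for appending.
open(OUT, ">> $file") or die "cannot append to $file: $!";
print OUT "second line\n";
close OUT;

# "<" (or no mode at all) opens for reading.
open(IN, "< $file") or die "cannot read $file: $!";
my @lines = <IN>;    # list context reads every remaining line
close IN;

print "read ", scalar(@lines), " lines\n";
print @lines;
```

Checking the return value of open with "or die" reports the system error held in $! when the file cannot be opened.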
H4311S B.00 2003 Hewlett-Packard Development Company, L.P.
    printf("format string", positional parameters);
The format string contains literals and field specifiers that will be replaced by the positional parameters.
    printf "The value of var is %s.\n", $var;
    printf "%d is more than %d.\n", $var+7, $var-7;
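A short sketch of field specifiers in use. It relies on sprintf, which formats exactly like printf but returns the string instead of printing it; the variable names and values are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $var = 10;

# sprintf builds the string; printf would send the same text to STDOUT.
my $text = sprintf("%d is more than %d.\n", $var + 7, $var - 7);
print $text;

# %-10s left-justifies a string in 10 columns,
# %8.2f right-justifies a number in 8 columns with 2 decimal places.
my $row = sprintf("%-10s|%8.2f|", "car", 57500);
print "$row\n";
```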
Formats
format SALARYFORM =
Employee      Salary
========================
@<<<<<<<<<<   @>>>>>>>
$name,        $salary
.
$~ = "SALARYFORM";
$name = "M Mouse";
$salary = 1000;
write;
FALSE values
    Numeric: 0, 0.0
    String: "", "0"
    Other: undef, the null (empty) list
TRUE values
    1, "hello", 3.1415926, -32, 0x0003152BF0, "0.0"
A return value of zero or null is FALSE, e.g. int(0.0).
A return value of non-zero or non-null is TRUE.
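The rules above can be checked with a small table-driven sketch. The labels and the choice of sample values are assumptions for illustration; note in particular that the number 0.0 is FALSE while the string "0.0" is TRUE.

```perl
use strict;
use warnings;

# Each entry pairs a printable label with the value under test.
my @tests = (
    [ '0'       => 0       ],
    [ '0.0'     => 0.0     ],
    [ '""'      => ""      ],
    [ '"0"'     => "0"     ],
    [ 'undef'   => undef   ],
    [ '"0.0"'   => "0.0"   ],
    [ '"hello"' => "hello" ],
    [ '-32'     => -32     ],
);

my %truth;
foreach my $t (@tests) {
    my ($label, $value) = @$t;
    $truth{$label} = $value ? "TRUE" : "FALSE";
    printf "%-8s is %s\n", $label, $truth{$label};
}
```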
Commands
if
    if (condition) { block; }

    if (condition) { block; }
    elsif (condition) { block; }
    . . .
    else { block; }
unless
    unless (condition) { block; }

    unless (condition) { block; }
    else { block; }
while Loop
    while (condition) { block; }
executes if condition is true
will execute 0 or more times
stops executing when condition is false
until Loop
    until (condition) { block; }
will execute 0 or more times
executes if condition is false
stops executing when condition is true
for Loop
    for (initializer; condition; iterator) { block }
initializer can be any valid Perl expression, but is usually a single assignment statement
condition is a relational or conditional expression to evaluate
iterator is executed at the end of each block, i.e. just before the next iteration
    for ($i = 0; $i < 10; $i++) { print $i; }
foreach Loop
    foreach $value (list) { block; }
executes the command block once for each element of the list
stops execution when no more elements are in the list
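The four loop forms can be compared side by side. This is a sketch only; the list contents and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my @list = ("red", "green", "blue");

# while: the condition is tested first, so the block runs 0 or more times.
my $i = 0;
my @seen_while;
while ($i < @list) {
    push @seen_while, $list[$i];
    $i++;
}

# until: the block runs for as long as the condition is false.
my $countdown = 3;
my @ticks;
until ($countdown == 0) {
    push @ticks, $countdown;
    $countdown--;
}

# for: initializer; condition; iterator.
my $sum = 0;
for (my $n = 1; $n <= 4; $n++) {
    $sum += $n;
}

# foreach: once per element of the list.
my @upper;
foreach my $value (@list) {
    push @upper, uc $value;
}

print "@seen_while\n@ticks\n$sum\n@upper\n";
```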
Lists
A list is an ordered set of values enclosed in parentheses.
A list has no name.
Each element in the list can be accessed by an index.
The index is enclosed in square brackets.
The members of a list can be literals, scalars, or other lists.
A list can be used as an rvalue or an lvalue.
    (1, 2, 3, 4, $varX)              a list
    ($v1, $v2) = <STDIN>             a list as an lvalue
    ($v1, $v2, $v3) = (1, 2, 3)      a list as an rvalue
Arrays
An array is a list with a name.
The name must start with @.
After the @, the name starts with a letter, followed by alphanumeric characters and underscores.
An array can be populated from a list:
    @arrayname = (list);
Array elements are accessed using an index:
    $element = $arrayname[index];
index can be any expression that evaluates to a number.
A slice is an array that is a subset of a larger array:
    @array_slice = @array[index list];
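A brief sketch of element access and slices. The array @days and its contents are assumptions for illustration.

```perl
use strict;
use warnings;

my @days = ("Mon", "Tue", "Wed", "Thu", "Fri");

my $first = $days[0];          # indexes start at 0
my $last  = $days[$#days];     # $#days is the index of the last element

my $index = 1 + 1;             # any expression that yields a number
my $mid   = $days[$index];

my @slice = @days[1, 3];       # a slice uses @ and a list of indexes
my @range = @days[0 .. 2];     # the range operator builds the index list

print "$first $last $mid\n";
print "@slice\n@range\n";
```

A single element is written with $ because the result is a scalar; a slice is written with @ because the result is a list.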
Hashes
A hash is a named list that contains key-value pairs.
The key is frequently a string.
The name starts with a %.
After the %, the first character is a letter, followed by alphanumeric characters or underscores.
Hashes may be populated from a list:
    %hash = ("key1", "value1", "key2", "value2",);
or
    %hash = (key1 => "value1", key2 => "value2",);
Access a value by specifying its key:
    $hash{key2};
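A minimal sketch of populating and querying a hash. The hash %price and its keys and values are assumptions for illustration.

```perl
use strict;
use warnings;

# The => operator quotes the bare word on its left.
my %price = (car => 50000, bike => 800, boat => 120000);

$price{kayak} = 650;               # add a new key-value pair
my $bike = $price{bike};           # look up a value by its key

delete $price{boat};               # remove a pair
my $has_boat = exists $price{boat} ? "yes" : "no";

# keys returns the keys in no particular order, so sort them for display.
my @names = sort keys %price;
foreach my $name (@names) {
    print "$name costs $price{$name}\n";
}
```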
    %new_animals = %animals;
Modifiers
Any statement may be augmented with a modifier:
    statement modifier (condition);
The condition is evaluated before executing the statement.
if and unless cause the statement to be executed once, or not at all, depending on the condition.
while and until cause the statement to be executed 0 or more times, depending on the condition.
Exception: if the statement is a do statement modified with while or until, the condition is checked after the statement is executed. Thus a do statement will be executed at least once with while and until.
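The following sketch exercises each modifier, including the do exception. Variable names and values are assumptions for illustration.

```perl
use strict;
use warnings;

my @log;
my $debug = 1;

# if / unless: the statement runs once or not at all.
push @log, "debug on"  if $debug;
push @log, "debug off" unless $debug;

# while: the condition is checked first, then the statement repeats.
my $n = 0;
$n++ while $n < 3;

# until: the condition is already true, so the statement never runs.
my $m = 10;
$m-- until $m <= 10;

# do with a modifier: the block runs before the condition is checked,
# so it executes once even though the condition is false from the start.
my $runs = 0;
do { $runs++ } while ($runs < 0);

print "@log\n$n $m $runs\n";
```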
Labels
A label provides a name by which a block of code can be referenced.
Labels can be used by redo, last, next, and goto.
A label consists of a letter or underscore, followed by zero or more alphanumeric or underscore characters.
A label is terminated by a colon.
A label is case sensitive.
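A sketch of a label controlling nested loops. The label name ROW and the loop bounds are assumptions for illustration; without the label, next and last would act on the innermost loop only.

```perl
use strict;
use warnings;

my @found;

ROW: foreach my $row (1 .. 3) {
    foreach my $col (1 .. 3) {
        next ROW if $col > $row;        # abandon this row, start the next one
        last ROW if $row * $col > 4;    # leave both loops at once
        push @found, "$row,$col";
    }
}

print "@found\n";
```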
Pattern Matching
Pattern Matching is part of the Perl language, not an add-on.
Pattern Matching uses Binding Operators, Regular Expressions (REs), Commands, and Command Modifiers.
Binding operators associate a string topic to an RE pattern:
    "Sail Away" =~ m/^Sail \w+/i;
REs express patterns using literals and special characters.
Commands specify how the pattern is used against the bound topic:
    m// (match), s/// (substitute), tr/// or y/// (transliterate)
Command Modifiers change command behaviour:
    i (ignore case), g (global), s (squash)
In list context, the binding returns () (no match) or (1) (match):
    $a = "Abe Lincoln" =~ m/Wash/;         # $a is false (the empty string)
    @arr = "Abe Lincoln" =~ m/Lincoln/;    # @arr is (1)
The m is assumed if missing:
    @arr = "Abe Lincoln" =~ /Wash/;        # @arr is ()
Topic and binding may be omitted; if so, $_ is bound:
    $_ = "Abe Lincoln";
    $a = /Lincoln/;                        # $a is 1
    $character = (/Lincoln/) ? "honest" : "cagey";
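A sketch of how the return value of a match depends on context. The variable names are assumptions for illustration; the last list-context example adds capturing parentheses, in which case the captured text is returned instead of (1).

```perl
use strict;
use warnings;

my $name = "Abe Lincoln";

# Scalar context: true or false.
my $hit  = ($name =~ m/Lincoln/);
my $miss = ($name =~ m/Wash/);

# List context: (1) on a match without captures, () on no match.
my @list_hit  = ($name =~ m/Lincoln/);
my @list_miss = ($name =~ /Wash/);

# List context with captures: the captured substrings are returned.
my @parts = ($name =~ /(\w+) (\w+)/);

# With no topic and no binding operator, the match is made against $_.
$_ = $name;
my $character = /Lincoln/ ? "honest" : "cagey";

print "hit\n" if $hit;
print "no match\n" unless $miss;
print scalar(@list_hit), " ", scalar(@list_miss), " @parts $character\n";
```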
A regular expression (RE) is
    a pattern of what to look for in a string, usually delimited with /
    interpolated before processing, just like a double-quoted string
    used with m// and s/// commands, as well as with Perl functions (e.g. split)
An RE is built from
    literal characters
    metasymbols
    /\d/, /\w+\s\d+/
Literal Matching
Most characters in an RE are matched to themselves:
    /yes: / matches "yes: 45" and "ayes: 36"
Some characters have special meaning:
    \ | ( ) [ { ^ $ . * + ?
Precede special characters with backslash (\) to match them literally:
    /hp.com/ matches "hp.com" and "hpicom"
    /hp\.com/ matches "hp.com", but not "hpicom"
The delimiter is special, but may be changed:
    m/\/usr\/tmp/    # matches /usr/tmp
    m#/usr/tmp#      # same, but easier to read
Note: the m is required when specifying a delimiter other than /
Special Characters
^, $    Anchors to the start, end of a line (or string)
[ ]     Matches one of the specified group of characters
.       Matches any single character (except newline)
\       Treat next character as literal; also, start metasymbol sequence
|       Separates alternatives
*       Matches 0 or more of the preceding RE element
+       Matches 1 or more of the preceding RE element
?       Matches 0 or 1 of the preceding RE element; also, create a minimal match for the preceding quantifier
{}      Used to specify quantifiers
()      Used to capture sub-expressions
Metasymbols
A metasymbol is a character sequence with a special meaning.
Specifying a specific, perhaps non-printable, character:
    \a, \n, \r, \t, \f, \e, \007, \x07, \cx
Specifying one of a certain type of character:
    \d, \D, \w, \W, \s, \S
Specifying an assertion / anchor / boundary:
    \b, \B, \A, \Z, \z, \G
Changing the case of the next letter:
    \l, \u
Start / end a specified case of letters:
    \L, \U, \E
\D (non-digit)    \S (non-white)    \W (non-word)
Inside [ ], order doesn't matter (except for readability!); ranges are specified using -
    [abcde], [ebdca], [a-e]    # equivalent
^, when first, means "except for"; when not first, it means itself
    [^0-9], [a-z\-0-9], [ABC^,_]
Backslash and metasymbols may also be used:
    [ \t]
Anchors
^     anchors the pattern to the start of the string or a newline.
$     anchors the pattern to the end of a string or a newline.
\b    anchors to a word boundary.
\B    anchors to a non-word boundary.
\A    anchors to the start of a string.
\Z    anchors to the end of a string or a newline at the end.
\z    anchors to the end of a string.
\G    anchors to where the previous m/RE/g finished.
    /^root/ matches "root" and "rooter", but not "chroot"
    /root$/ matches "root" and "chroot", but not "rooter"
    /^root$/ matches only "root"
Quantifiers { }
Quantifiers specify how many times a pattern should occur:
    {3,}    a minimum of three times
    *       match the preceding character 0 or more times
            /do*r/ matches dr dor door dooor
    +       match the preceding character 1 or more times
            /do+r/ matches dor door dooor
    ?       match the preceding character 0 or 1 times
            /do?r/ matches dr dor
Default is maximal match; follow with ? for a minimal match:
    *?, +?, ??
    {}?     makes the match minimal
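The difference between a maximal and a minimal match shows up clearly on a string with several possible end points. In this sketch the sample strings and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";

# Maximal (default): .+ runs to the LAST ">" in the string.
my ($greedy) = $text =~ /<(.+)>/;

# Minimal: .+? stops at the FIRST ">" that lets the match succeed.
my ($minimal) = $text =~ /<(.+?)>/;

# The same rule applies to {} quantifiers.
my ($digits) = "tel 5551234" =~ /(\d{3,})/;     # as many digits as possible
my ($three)  = "tel 5551234" =~ /(\d{3,}?)/;    # as few as allowed: three

print "$greedy\n$minimal\n$digits\n$three\n";
```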
\1, \2, \3, ... persist only during the current binding.
In subsequent statements, use $N.
In substitution and matching patterns, use \N:
    "jub-jub" =~ m/(\w+)-\1/;    # matches
    "dim-sum" =~ m/(\w+)-\1/;    # doesn't match
In substitution replacements, use either $N or \N:
    $name = "Abraham Lincoln";
    $name =~ s/(\w+) (\w+)/\2, $1/;    # mixed $N and \N
    print "$name\n";                   # prints Lincoln, Abraham
    print "$1\n";                      # prints Abraham
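A runnable sketch of both forms: \1 inside the pattern, $1 and $2 in the replacement and in later statements. The word list and variable names are assumptions for illustration, and the replacement uses only the $N form, which avoids the warning Perl can issue for \N on the right side of s///.

```perl
use strict;
use warnings;

my @words = ("jub-jub", "dim-sum", "bye-bye");

# \1 must match the same text that the first ( ) just captured.
my @doubled = grep { /^(\w+)-\1$/ } @words;

# Copy $name, then swap the two captured words in the copy.
my $name = "Abraham Lincoln";
(my $swapped = $name) =~ s/(\w+) (\w+)/$2, $1/;

# After the substitution, $1 still holds the first capture.
my $first = $1;

print "@doubled\n$swapped\n$first\n";
```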
Backreferences Examples
1. Given an array
       @pal1 = ("noon", "naan", "pip", "pie", "nine");
   Create a regular expression that will identify the four character palindromes.
   Create a regular expression that will identify the three character palindromes.
2. Given an array
       @pal2 = ("wing on wing", "dollar for dollar", "at the ball");
   Create a regular expression that will identify the three word palindromes.
3. Given a string $string = root console Mar 22 16:45
Command Modifiers
m and s patterns
    i            Ignore case.
    x            Ignore white space.
    s            Let the dot match a newline.
    m            Let anchors match a newline.
    o            Compile pattern only once.
m only
    g (list)     Find all matches.
    g (scalar)   Save position.
    cg           Do not reset search position after a failed match.
s only
    g            Global replace.
    e            Evaluate right side.
tr and y
    c            Complement the search list.
    d            Delete specified characters.
    s            Squash duplicate characters.
Module Subroutines
Scope of Variables
By default, variables in Perl have global scope.
The my and local list operators create variables of limited scope:
    Variables hide previous variables with the same name
    Variables may be initialized when created
    Variables disappear when the current block completes
The my list operator creates variables with static (lexical) scope
    Variables are accessible by code located within the current block
The local list operator creates variables with dynamic scope
    Variables are also accessible by any code called from within the current block
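The difference between static and dynamic scope becomes visible when the current block calls a subroutine. In this sketch the variable $level and the subroutine names are assumptions for illustration.

```perl
use strict;
use warnings;

our $level = "global";            # a global (package) variable

# show() is compiled here, so $level inside it means the global.
sub show { return $level; }

sub with_local {
    local $level = "local";       # temporarily replaces the global's value
    return show();                # called code sees the local value
}

sub with_my {
    my $level = "my";             # a new variable, visible in this block only
    return show();                # called code still sees the global
}

my @results = (with_local(), with_my(), show());
print "@results\n";
```

Once with_local returns, the global gets its old value back, which is why the final call to show() reports the global value again.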
[Figure: global $a and $b holding 5 and 5, beside a second $a and $b holding 7 and 7]
Prototypes
Prototypes may be used to specify the number and type of arguments a subroutine expects.
Prototypes are necessary when using forward declarations.
Use of prototypes is optional.
Example
    sub mysub ($$@);               # forward declaration
    . . .
    mysub 1, $i, @items;           # use
    . . .
    sub mysub ($$@) { . . . }      # subroutine defined
Special Variables
What Is Possible
records
    simple
    complex
linked lists
References
Array and hash element values must be scalars.
References refer to a block of memory belonging to a scalar, array, or hash (or code).
All references are scalars; what they refer to need not be.
[Figure: $sref holds SCALAR(0x4002abcd) and points to $temp, which holds warm; an array with elements 0 hot and 1 cold]
Creating References
Use \ to create a reference to a variable's memory:
    $sref = \$var;
    $aref = \@arr;
    $href = \%hsh;
The value of the reference indicates the data type and memory location:
    print $sref;    # prints SCALAR(0x4001abcd)
    print $aref;    # prints ARRAY(0x400a0010)
    print $href;    # prints HASH(0x400e00aa)
Anonymous references can be created to arrays and hashes:
    $anon_array = ["value1", "value2", "value3"];
    $anon_hash = {"key1", "value1", "key2", "value2"};
Using References
SCALAR References
    $var = "warm";
    $sref = \$var;
    print $sref;
    print $$sref;
ARRAY References
    @temps = ("hot", "cold");
    $aref = \@temps;
    print $aref;
    print @$aref;
    print $$aref[1];
    print $aref->[1];
HASH References
    %book = (Title => "Lord of the Rings", Author => "JRR Tolkien");
    $href = \%book;
    print $href;
    print %$href;
    print $href->{Title};
    print $$href{Title};
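A sketch showing that a reference and the original variable share the same memory: a change made through the reference is visible through the variable. The ref function, a Perl built-in, reports what kind of thing a reference points to; the specific updates below are assumptions for illustration.

```perl
use strict;
use warnings;

my $var   = "warm";
my @temps = ("hot", "cold");
my %book  = (Title => "Lord of the Rings", Author => "JRR Tolkien");

my $sref = \$var;
my $aref = \@temps;
my $href = \%book;

# ref() names the type of the referenced data.
print ref($sref), " ", ref($aref), " ", ref($href), "\n";

# Writing through a reference changes the original variable.
$$sref = "cool";
push @$aref, "tepid";
$href->{Author} = "J. R. R. Tolkien";

# Both element syntaxes reach the same array element.
my $same = ($$aref[1] eq $aref->[1]) ? "same" : "different";

print "$var\n@temps\n$book{Author}\n$same\n";
```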
Anonymous References
Variable associated with this block of memory:
    @seasons = ( "winter", "spring", "summer", "fall" );
    $aref = \@seasons;
    # print summer:
    print "$seasons[2]\n";
    print "$$aref[2]\n";
    print "$aref->[2]\n";
[Figure: @seasons at 0x4001abc0 with elements 0 winter, 1 spring, 2 summer, 3 fall; $aref holds ARRAY(0x4001abc0)]
Anonymous Reference:
    $aref = [ "winter", "spring", "summer", "fall" ];
    # print summer:
    print "$$aref[2]\n";
    print "$aref->[2]\n";
[Figure: unnamed array at 0x4008def0 with elements 0 to 3; $aref holds ARRAY(0x4008def0)]
Records
A record is a list of related items. The items have a name and a value. Simple records are usually hashes, occasionally arrays, with scalar data values Complex records contain arrays and hashes A record is often implemented as an anonymous hash, using the hash constructor {}. Records are often stored in arrays or hashes, i.e. references to the records are stored.
Simple Record
Hash implementation
    %book = ( Title => "Lord of the Rings", Author => "JRR Tolkien" );
Hash reference implementation
    $book = { Title => "Lord of the Rings", Author => "JRR Tolkien" };
Array implementation
    @book = ( "Lord of the Rings", "JRR Tolkien" );
Array reference implementation
    $book = [ "Lord of the Rings", "JRR Tolkien" ];
$$boat{Model}; $$boat{Options}{main};
Arrays of Arrays
Multidimensional arrays are created as arrays of references.
    @array = ( ["one", "two", "three"],
               ["dog", "cat", "bird"],
               ["golden", "tiger", "canary"] );
$array[0] refers to (one, two, three)
$array[1] refers to (dog, cat, bird)
$array[2] refers to (golden, tiger, canary)
$array[1][2] is bird
This could also be done using an anonymous array constructor instead of a list.
Arrays of Hashes
    @dogs = (
        { dog => "lab",     name => "rover",  size => "big" },
        { dog => "spaniel", name => "bowser", size => "medium" }
    );
$dogs[0]{dog} refers to lab.
Hashes of Hashes
    %pets = ( dogs     => { mine => "obedient",    yours => "untrained" },
              cats     => { mine => "independent", yours => "undisciplined" },
              hamsters => { mine => "perfect",     yours => "unmotivated" } );
$pets{cats}{mine} refers to independent.
Hashes of Arrays
    %animals = ( dogs  => ["spaniel", "poodle", "lab"],
                 cats  => ["persian", "tabby"],
                 birds => ["canary", "duck", "goose", "turkey"] );
$animals{dogs}[1] is poodle
$animals{cats}[0] is persian
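A sketch of walking a hash of arrays. Each hash value is an array reference, so it is dereferenced with @{ ... } before it can be used as an array. The added element "siamese" and the %count hash are assumptions for illustration.

```perl
use strict;
use warnings;

my %animals = (
    dogs  => ["spaniel", "poodle", "lab"],
    cats  => ["persian", "tabby"],
    birds => ["canary", "duck", "goose", "turkey"],
);

# Dereference the inner array to add an element to it.
push @{ $animals{cats} }, "siamese";

my %count;
foreach my $kind (sort keys %animals) {
    my @names = @{ $animals{$kind} };     # copy of the inner array
    $count{$kind} = scalar @names;        # number of elements
    print "$kind ($count{$kind}): @names\n";
}
```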
Linked List
    sub make_node {
        print "Enter record: ";
        chomp ($value = <STDIN>);
        my $node = {value => $value, next => $next};
        return $node;
    }
    . . .
    if (defined $head) {
        $last_node = find_last_node($head);    # see notes
        $last_node->{next} = make_node();      # $last_node is a hash reference
    }
    else {
        $head = make_node();
    }
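The slide refers to find_last_node without showing it. The sketch below is one possible implementation, not the course's own: find_last_node follows the next references until it reaches a node whose next is undefined. To keep the example self-contained, this make_node takes its value as an argument instead of reading STDIN, which is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical variant of make_node: the value is passed in.
sub make_node {
    my ($value) = @_;
    return { value => $value, next => undef };
}

# Assumed implementation: walk the chain until there is no next node.
sub find_last_node {
    my ($node) = @_;
    $node = $node->{next} while defined $node->{next};
    return $node;
}

my $head;
foreach my $value ("a", "b", "c") {
    if (defined $head) {
        my $last_node = find_last_node($head);
        $last_node->{next} = make_node($value);
    }
    else {
        $head = make_node($value);
    }
}

# Traverse the list from the head to collect the values in order.
my @values;
for (my $node = $head; defined $node; $node = $node->{next}) {
    push @values, $node->{value};
}
print "@values\n";
```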
CGI's Role
CGI is the glue that holds the web together.
Typically sandwiched between HTML forms:
    A client completes a form to provide needed information to the program running on the server.
    The CGI script is executed on the server in real time.
    Results are relayed back to the client.
A cheap disclaimer:
    We will keep HTML as simple as possible.
    The module CGI.pm will be deferred until the next unit in this course.
    This lets us get a better look at the data flow.
Creating a Form
    print "Content-Type: text/html\n\n";
    print '
    <FORM ACTION=http://www.servername.com/cgi-bin/task.cgi METHOD=POST>
    <B>Select task:</B>
    <SELECT NAME=task>
    <OPTION VALUE=check_daemons>check daemons
    <OPTION VALUE=kill_old_users>kill old users
    </SELECT>
    <INPUT TYPE=submit VALUE="submit task">
    </FORM>
    ';
    print "Content-Type: text/html\n\n";
    print '
    <FORM ACTION=http://www.servername.com/cgi-bin/task.cgi METHOD=GET>
    First Name: <INPUT TYPE="TEXT" NAME="firstname" SIZE="25"><BR>
    Last Name : <INPUT TYPE="TEXT" NAME="lastname" SIZE="25"><BR>
    <INPUT TYPE=radio NAME=job_title VALUE=S>Sysadmin
    <INPUT TYPE=radio NAME=job_title VALUE=N>Netadmin
    <INPUT TYPE=radio NAME=job_title VALUE=W>Webmaster
    <INPUT TYPE=submit VALUE=submit>
    </FORM>
    ';
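To make the data flow concrete: with METHOD=GET, the web server passes the form fields to the script in the QUERY_STRING environment variable as name=value pairs joined by &. The sketch below decodes such a string by hand. The subroutine name parse_query and the sample query are assumptions for illustration, and a real script would normally leave this work to the CGI module.

```perl
use strict;
use warnings;

# Hypothetical helper: split a query string into a hash of form fields.
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX encodes one character
        }
        $form{$name} = $value;
    }
    return %form;
}

# Use the server-supplied string if present, else a sample for testing.
my $query = $ENV{QUERY_STRING} || "firstname=Abe&lastname=Lincoln&job_title=S";
my %form  = parse_query($query);

print "Content-Type: text/plain\n\n";
foreach my $name (sort keys %form) {
    print "$name = $form{$name}\n";
}
```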
Security
Security is naturally a concern.
The ISP or webmaster will determine if and where CGI scripts will be allowed to run.
Three levels:
    /opt/apache/cgi-bin (more secure)
    allow users to maintain their own directory for CGI scripts (less secure)
    any directory; the program name must end in .cgi (insecure)
If users are allowed to maintain their own CGI scripts, a configuration change will be made to allow public_html; this path is appended to ~user.
For example, the script called by
    http://r208w100/~instr/prog.cgi
will be
    /home/instr/public_html/prog.cgi
Perl Modules
What Is a Module?
A module is
    a Perl script that another programmer wants to share
    located at CPAN web sites
    a combination of C source and header files, configuration files, documentation, and scripts
    accessed as a zipped tar file
    available for web, networking, windows, X11, etc.
    open to improvement: it can be improved on and resubmitted
UNPACK
    Use the proper unzip utility to restore the tar file.
    Untar the file.
Sockets
A socket can be a port at an IP address that receives data. A socket can be a port at which a local application receives data. A server listens at a port. A client is a program that sends information to or requests information from a server at a specific port. There are two different types of messages, streams and datagrams.
Sockets Example
SERVER
    use IO::Socket;
    $sock = IO::Socket::INET->new(
        LocalPort => 12345,
        Type      => SOCK_STREAM,
        Reuse     => 1,
        Listen    => 5) or die "message";
    while ($client = $sock->accept) {
        $line = <$client>;
        print $line;
    }
    close ($sock);
CLIENT
    use IO::Socket;
    $sock = IO::Socket::INET->new(
        PeerAddr => "hostname",
        PeerPort => 12345,
        Type     => SOCK_STREAM,
        Proto    => "tcp",) or die "message";
    while (more_to_send) {        # placeholder condition
        $line = data_to_send;     # placeholder for the data
        print $sock $line;
    }
CGI
The CGI module is a standard module.
The CGI module generates the web pages dynamically.
STDIN and STDOUT are now connected to the web browser.
The screen is created by printing the HTML commands to the browser.
The CGI methods produce HTML code dynamically.
CGI Example
    use CGI;
    $page = new CGI;
    print
        $page->start_html(),
        $page->center($page->h1("Hello World")),
        $page->start_form(),
        $page->textarea(
            -name    => "My Text Area",
            -rows    => 10,
            -columns => 40),
        . . .