Perl 4
Perl is a general-purpose programming language originally developed for text manipulation and now
used for a wide range of tasks including system administration, web development, network
programming, GUI development, and more.
If you have basic knowledge of C or the UNIX shell, then PERL is very easy to learn. If this is your first
language, then you may need one to two weeks to become fluent in PERL.
What is PERL?
Perl is a stable, cross platform programming language.
Perl stands for Practical Extraction and Report Language.
It is used for mission critical projects in the public and private sectors.
Perl is Open Source software, licensed under its Artistic License, or the GNU General Public
License (GPL).
Perl was created by Larry Wall.
Perl 1.0 was released to Usenet's alt.comp.sources in 1987.
PC Magazine named Perl a finalist for its 1998 Technical Excellence Award in the
Development Tool category.
Perl is listed in the Oxford English Dictionary.
PERL Features
Perl takes the best features from other languages, such as C, awk, sed, sh, and BASIC,
among others.
Perl's database integration interface (DBI) supports third-party databases including Oracle,
Sybase, Postgres, MySQL and others.
Perl works with HTML, XML, and other mark-up languages.
Perl supports Unicode.
Perl is Y2K compliant.
Perl supports both procedural and object-oriented programming.
Perl interfaces with external C/C++ libraries through XS or SWIG.
Perl is extensible. There are over 500 third party modules available.
Is PERL Compiled or Interpreted?
Good question. The short answer is interpreted, which means that your code can be run as is,
without a compilation stage that creates a nonportable executable program.
Traditional compilers convert programs into machine language. When you run a Perl program, it's
first compiled into a bytecode, which is then converted (as the program runs) into machine
instructions. So it is not quite the same as shells, or Tcl, which are "strictly" interpreted without an
intermediate representation. Nor is it like most versions of C or C++, which are compiled directly
into a machine dependent format. It is somewhere in between, along with Python and awk and
Emacs .elc files.
SYNTAX OVERVIEW
A Perl script or program consists of one or more statements. These statements are simply written in
the script in a straightforward fashion. There is no need to have a main() function or anything of
that kind.
Comments start with a hash symbol and run to the end of the line:
# This is a comment
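Whitespace is irrelevant outside of quoted strings.
Literal strings may be enclosed in double quotes or single quotes. However, only double
quotes "interpolate" variables and special characters such as newlines (\n).
Numbers don't need quotes around them:
print 42;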
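The quoting rules above can be seen side by side in a short sketch. The variable $name and its value are assumptions made only for illustration.

```perl
use strict;
use warnings;

my $name = "Larry";    # hypothetical value, for illustration only

# Whitespace between tokens is ignored, so this is a single statement.
print
    "Hello, world\n"
    ;

# Double quotes interpolate variables and escapes such as \n.
my $double = "Hello, $name\n";

# Single quotes keep the text exactly as written.
my $single = 'Hello, $name\n';

print $double;
print $single, "\n";
```

The second print shows the variable name and the backslash sequence literally, because nothing inside single quotes is interpolated.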
You can use parentheses for functions' arguments or omit them according to your
personal taste. They are only required occasionally to clarify issues of precedence.
The following two statements produce the same result.
print("Hello, world\n");
print "Hello, world\n";
Regardless of the editor you choose to use, a PERL file is conventionally saved with a .pl (.PL) file
extension so that it is recognized as a PERL script. File names can contain numbers,
symbols, and letters; avoid spaces, and use an underscore (_) in place of them.
Assuming you are already at the Unix $ prompt, open a text file named hello.pl using the vi or
vim editor and put the following lines inside your file.
#!/usr/bin/perl

print "Hello, World\n";
Now, to run the hello.pl program from the Unix command line, issue the following
command at your Unix $ prompt:
$ perl hello.pl
Hello, World
To check which version of Perl is installed, use the -v option:
$ perl -v
You can use the -e option to execute Perl statements directly from the
command line.
Perl has three built-in variable types:
Scalar
Array
Hash
A scalar represents a single value. Scalars are usually declared with my, a keyword which is
explained in this same section at the bottom.
Scalar values can be strings, integers or floating point numbers, and Perl will
automatically convert between them as required. There is no need to pre-declare your
variable types. Scalar values can be used in various ways:
my $animal = "camel";
my $answer = 42;
print $animal;
print "The animal is $animal\n";
print "The square of $answer is ", $answer * $answer, "\n";
There are a number of "magic" scalars with names that look like punctuation or line noise. These
special variables are used for all kinds of purposes and they will be discussed in the Special Variables
section. The only one you need to know about for now is $_, which is the "default variable". It's
used as the default argument to a number of functions in Perl, and it's set implicitly by
certain looping constructs.
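An array represents an ordered list of values. Arrays are zero-indexed. Older versions of Perl
let you change the base index through the special variable $[ (or $ARRAY_BASE), but this is
deprecated and recent Perl releases no longer allow it. Individual elements are accessed with a $
sigil and a bracketed index, such as $animals[0].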
The special variable $#array tells you the index of the last element of an array.
You might be tempted to use $#array + 1 to tell you how many items there are in an
array. Don't bother. As it happens, using @array where Perl expects to find a scalar value
("in scalar context") will give you the number of elements in the array:
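A minimal sketch of element access, $#array, and scalar context follows. The @animals array and its contents are sample data assumed for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original text.
my @animals = ("camel", "llama", "owl");

my $first      = $animals[0];           # indices start at 0
my $last_index = $#animals;             # index of the last element
my $last       = $animals[$#animals];   # the last element itself
my $count      = @animals;              # array in scalar context: element count

print "first=$first last=$last last_index=$last_index count=$count\n";

# A numeric comparison also supplies scalar context.
if (@animals < 5) {
    print "There are fewer than 5 animals\n";
}
```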
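The elements we're getting from the array start with a $ because we're getting just a single value
out of the array -- you ask for a scalar, you get a scalar.
To get multiple values from an array, use the @ sigil with a list of indices, such as @animals[0,1].
This is called an "array slice". You can also do various useful things to lists, such as sorting and
reversing them.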
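As a sketch of slices and common list operations, assuming the same hypothetical @animals sample data:

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");    # hypothetical sample data

my @first_two = @animals[0, 1];             # slice by explicit indices
my @rest      = @animals[1 .. $#animals];   # everything except the first element
my @sorted    = sort @animals;              # string order
my @backwards = reverse @animals;           # reversed copy; @animals is unchanged

print "first two: @first_two\n";
print "rest:      @rest\n";
print "sorted:    @sorted\n";
print "backwards: @backwards\n";
```

A slice uses the @ sigil because it returns a list, even when that list has only one element.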
There are a couple of special arrays too, such as @ARGV (the command line arguments to your
script) and @_ (the arguments passed to a subroutine). These are documented in the next section,
"Special Variables".
A hash represents a set of key/value pairs. You can use whitespace and the => operator to lay
them out more nicely:
my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);
You can get at lists of keys and values with the keys() and values() built-in functions.
Hashes have no particular internal order, though you can sort the keys and loop through them. Just
like special scalars and arrays, there are also special hashes. The most well known of these is %ENV
which contains environment variables.
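The following sketch pulls these hash operations together. The %fruit_color data is sample data, and the fallback text for a missing PATH is an assumption for illustration.

```perl
use strict;
use warnings;

my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);

my $apple_color = $fruit_color{"apple"};    # look up one value by its key
my @fruits = sort keys %fruit_color;        # keys, sorted for a stable order
my @colors = sort values %fruit_color;      # values, sorted for a stable order

foreach my $fruit (@fruits) {
    print "$fruit is $fruit_color{$fruit}\n";
}

# %ENV is a special hash holding the environment variables.
my $path = exists $ENV{PATH} ? $ENV{PATH} : "(PATH is not set)";
print "PATH: $path\n";
```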
More complex data types can be constructed using references, which allow you to build lists and
hashes within lists and hashes. A reference is a scalar value and can refer to any other Perl data
type. So by storing a reference as the value of an array or hash element, you can easily create lists
and hashes within lists and hashes.
The following example shows a 2 level hash of hash structure using anonymous hash
references.
my $variables = {
    scalar => {
        description => "single item",
        sigil => '$',
    },
    array => {
        description => "ordered list of items",
        sigil => '@',
    },
};
The following line will print $:
print "$variables->{'scalar'}->{'sigil'}\n" ;
Variable Context:
Here @animals is an array, but when it is used in scalar context it returns the number
of elements it contains.
Another example:
my $number = 30;
Here $number is a scalar that contains a number. Context works in the other direction too: when a
string is used with a numeric operator such as +, Perl converts it to a number, and a string that
does not begin with a number becomes 0 instead of staying a string.
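A small sketch of both kinds of context follows. The sample array and the string "This is " are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");    # hypothetical sample data

my $count   = @animals;     # scalar context: the number of elements
my ($first) = @animals;     # list context: the first element

my $number = 30;
my $text   = "This is " . $number;      # string context: the number becomes "30"

my $sum;
{
    no warnings 'numeric';              # "This is " is not numeric; silence the warning
    $sum = "This is " + $number;        # numeric context: the string counts as 0
}

print "count=$count first=$first text='$text' sum=$sum\n";
```

The parentheses around $first are what turn the assignment into a list assignment.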
Escaping Characters:
In PERL we use the backslash (\) character to escape any type of character that might
interfere with our code, such as a quote, $ or @ inside a double-quoted string.
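A minimal sketch of escaping inside double-quoted strings; the strings and the address are made-up examples.

```perl
use strict;
use warnings;

my $price = 5;    # hypothetical value

# \$ and \@ stop Perl from treating the characters as sigils.
my $escaped = "The price is \$$price, e-mail sales\@example.com";

# \" places a double quote inside a double-quoted string.
my $quoted = "She said \"hello\"";

# \\ produces a single literal backslash.
my $path = "C:\\perl\\bin";

print "$escaped\n$quoted\n$path\n";
```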
Case Sensitivity:
Variable names are case sensitive; $foo, $FOO, and $fOo are all separate variables as far as Perl is
concerned.
Variable Scoping:
Throughout the previous section all the examples have used the following syntax:
my $var = "value";
The my is actually not required; you could just write:
$var = "value";
However, the above usage will create global variables throughout your program, which is bad
programming practice. my creates lexically scoped variables instead. The variables are
scoped to the block (i.e. a bunch of statements surrounded by curly braces) in which they
are defined. Have a look at the following example:
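my $x = "foo";
my $some_condition = 1;
if ($some_condition) {
    my $y = "bar";
    print $x;           # prints "foo"
    print $y;           # prints "bar"
}
print $x;               # prints "foo"
print $y;               # prints nothing; $y has fallen out of scope
Using my in combination with a use strict; at the top of your Perl scripts means that the interpreter
will pick up certain common programming errors. For instance, in the example above, the final print
$y would cause a compile-time error and prevent you from running the program. Using strict is
highly recommended. Following is the usage:
use strict;
my $x = "foo";
my $some_condition = 1;
if ($some_condition) {
    my $y = "bar";
    print $x;           # prints "foo"
    print $y;           # prints "bar"
}
print $x;               # prints "foo"
print $y;               # compile-time error under use strict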
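The my operator declares the listed variables to be lexically confined to one of the following:
Enclosing block
Conditional (if/unless/elsif/else)
Loop (for/foreach/while/until/continue)
Subroutine
eval or do/require/use'd file
If more than one value is listed, the list must be placed in parentheses. All listed elements
must be legal lvalues. Only alphanumeric identifiers may be lexically scoped; magical
built-ins like $/ must currently be localized with local instead.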
Lexical scopes of control structures are not bounded precisely by the braces that delimit
their controlled blocks; control expressions are part of that scope, too. Thus, in a loop such as
while (my $line = <>) { ... } continue { ... }, the scope of $line extends from its declaration
throughout the rest of the loop construct (including the continue clause), but not beyond it.
Similarly, in a conditional such as if ((my $answer = <STDIN>) =~ /^yes$/i) { ... }, the scope of
$answer extends from its declaration through the rest of that conditional, including any elsif
and else clauses, but not beyond it.
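The two scoping rules above can be sketched as follows. To keep the example self-contained, the input comes from a hypothetical @queue array and an $input string rather than from a file or the keyboard.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for a file.
my @queue = ("First\n", "Second\n");
my $out = '';

while (my $line = shift @queue) {
    $line = lc $line;
} continue {
    $out .= $line;          # $line is still visible in the continue block
}
# $line is out of scope here; using it would fail under "use strict".

my $input = "yes";          # hypothetical user reply
my $verdict;
if ((my $answer = $input) =~ /^yes$/i) {
    $verdict = "affirmative";
} elsif ($answer =~ /^no$/i) {
    $verdict = "negative";
} else {
    $verdict = "'$answer' is neither yes nor no";
}

print $out, "verdict: $verdict\n";
```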
We encourage the use of lexically scoped variables. Put use strict; at the top of your
program file; it will remind you to scope each variable using the local or my keyword.
A local gives temporary values to global variables for the duration of the enclosing block, which
is known as dynamic scoping. Lexical scoping is done with my, which works more like C's auto
declarations.
Because local is a run-time operator, it gets executed each time through a loop. Consequently, it's
more efficient to localize your variables outside the loop.
If you localize a special variable, you'll be giving a new value to it, but its magic won't go
away. That means that all side-effects related to this magic still work with the localized
value. This feature allows code like this to work :
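One common use is reading a whole file in a single operation by localizing $/. The sketch below assumes a Perl build with in-memory filehandles (standard since Perl 5.8) and uses a made-up string in place of a real file.

```perl
use strict;
use warnings;

# Hypothetical file contents, held in memory so the example is self-contained.
my $text = "line one\nline two\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $slurp;
{
    local $/ = undef;   # no input record separator inside this block
    $slurp = <$fh>;     # so a single read returns the entire contents
}
close($fh);

# Outside the block, $/ is back to its previous value (a newline by default).
print "read ", length($slurp), " characters\n";
```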
PERL SCALARS
Scalar variables are simple variables containing only one element--a string or a number. Strings
may contain any symbol, letter, or number. Numbers may contain exponents, integers, or decimal
values. The bottom line here with scalar variables is that they contain only one single piece of data.
What you see is what you get with scalar variables.
#!/usr/bin/perl
$number = 5;                 # an integer
$exponent = 2 ** 8;          # exponentiation, evaluates to 256
$string = "Hello, PERL!";    # a string
$float = 12.39;              # a floating point number
Scalar Strings
Strings are scalars, as we mentioned previously. There is no fixed limit to the size of a string; any
number of characters, symbols, or words can make up your strings.
When defining a string you may use single or double quotation marks. You may also define
strings with the q and qq operators.
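The different ways of defining a string can be sketched like this; the variable names and text are assumptions for illustration.

```perl
use strict;
use warnings;

my $name = "Perl";    # hypothetical value

my $single = 'Hello, $name';               # single quotes: no interpolation
my $double = "Hello, $name";               # double quotes: interpolation
my $q      = q(It's "quoted" freely);      # q() behaves like single quotes
my $qq     = qq{Say "hi" to $name};        # qq{} behaves like double quotes

print "$single\n$double\n$q\n$qq\n";
```

The q and qq forms are convenient when the text itself contains quotation marks, since nothing needs escaping.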
Strings can be formatted to your liking using formatting characters. Some of these
characters also work to format files created in PERL. Think of these characters as
miniature functions.
Character   Description
\L          Transform all following letters to lowercase
\l          Transform the next letter to lowercase
\U          Transform all following letters to uppercase
\u          Transform the next letter to uppercase
\n          Begin on a new line
\r          Apply a carriage return
\t          Apply a tab to the string
\f          Apply a form feed to the string
\b          Backspace
\a          Bell
\e          Escape character (ASCII ESC)
\0nn        The character whose code is the octal number nn
\xnn        The character whose code is the hexadecimal number nn
\cX         Control character, where X may be any character
\Q          Backslash (escape) all following non-alphanumeric characters
\E          End a \U, \L, or \Q transformation
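$newline = "Welcome to\nPerl!";
$small = "\uthis, here t will become capital!";
$ALLCAPS = "\UThis all will become capital!";
$PARTIALCAPS = "This few will become \Ucapital\E!";
$backslash = "\Q'Hello World'\E!";
print "$newline\n";
print "$small\n";
print "$ALLCAPS\n";
print "$PARTIALCAPS\n";
print "$backslash\n";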
The substr() function extracts or replaces part of a string. At least two arguments must be sent
with our substr() function: the string you wish to index and the index number. If only these two
arguments are sent, PERL assumes that you are replacing every character from that index number
to the end.
#!/usr/bin/perl
$mystring = "Hello, PERL!";
substr($mystring, 7) = "World!";
print "After replacement : $mystring\n";    # Hello, World!
Because we only specified one numeric parameter for the string, PERL assumed we
wanted to replace every character after the 7th with our new string. If we throw a third
parameter in our function we can replace only a chunk of our string with a new string.
#!/usr/bin/perl
$mystring = "Hello, PERL!";
print "Before replacement : $mystring\n";
substr($mystring, 3, 6) = "World!";
print "After replacement : $mystring\n";    # HelWorld!RL!
Multiline Strings
If you want to introduce multiline strings into your programs, you can use standard
quotes:
$string = 'This is
a multiline
string';
But this is messy and is subject to the same basic laws regarding interpolation and quote usage. We
could get around it using the q// or qq// operator with different delimiters. Another option is a
here document, which prints everything up to the terminating marker:
print <<EOF;
This is
a multiline
string
EOF
V-Strings
V-strings can be a useful way of introducing version numbers and IP addresses into Perl.
They are any literal that begins with a v and is followed by one or more dot-separated
elements. For example:
$name = v77.97.114.116.105.110;
Numeric Scalars
Perl supports a number of fairly basic methods for specifying a numeric literal in
decimal, and it also accepts hexadecimal, octal, and binary literals.
Note that the numbers are not stored internally using these bases. Perl converts each literal
representation into the same internal numeric value and prints it in decimal by default.
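A short sketch of the literal forms; the particular values are arbitrary examples.

```perl
use strict;
use warnings;

my $integer    = 1234;          # plain integer
my $negative   = -100;          # negative integer
my $float      = 12.39;         # floating point
my $scientific = 1.2e3;         # scientific notation
my $readable   = 1_000_000;     # underscores are ignored and aid readability
my $hex        = 0xff;          # hexadecimal
my $octal      = 0377;          # octal (leading zero)
my $binary     = 0b11111111;    # binary

# All of these print in decimal, whatever base the literal was written in.
print "$integer $negative $float $scientific $readable $hex $octal $binary\n";
```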
PERL ARRAYS
An array is just a set of scalars. It's made up of a list of individual scalars that are stored within a
single variable. You can refer to each scalar within that list using a numerical index.
Array Creation:
Array variables are prefixed with the @ sign and are populated using either parentheses
or the qw operator. For example:
@array = (1, 2, 'Hello');
@array = qw/This is an array/;
The second line uses the qw// operator, which returns a list of strings, separating the
delimited string by white space. In this example, this leads to a four-element array; the
first element is 'This' and the last (fourth) is 'array'. This means that you can use newlines
within the specification:
@days = qw/Monday
Tuesday
...
Sunday/;
You can also populate an array by assigning each value individually:
$array[0] = 'Monday';
...
$array[6] = 'Sunday';
When extracting individual elements from an array, you must prefix the variable with a
dollar sign and then append the element index within square brackets after the name. For
example:
#!/usr/bin/perl
@shortdays = qw/Mon Tue Wed Thu Fri Sat Sun/;
print $shortdays[1];
Array indices start at zero, so in the preceding example we've actually printed "Tue". You
can also give a negative index, in which case you select the element from the end, rather
than the beginning, of the array. This means that:
print $shortdays[0]; # Outputs Mon
print $shortdays[6]; # Outputs Sun
print $shortdays[-1]; # Also outputs Sun
print $shortdays[-7]; # Outputs Mon
#!/usr/bin/perl
# The .. range operator builds lists of sequential numbers or letters
@ten = (1 .. 10);
@hundred = (1 .. 100);
@thousand = (100 .. 1000);
@abc = ('a' .. 'z');
Array Size
The size of an array can be determined using scalar context on the array - the returned
value will be the number of elements in the array:
@array = (1,2,3);
print "Size: ",scalar @array,"\n";
The value returned will always be the physical size of the array, not the number of valid
elements. You can demonstrate this, and the difference between scalar @array and
$#array, using this fragment:
#!/usr/bin/perl
@array = (1,2,3);
$array[50] = 4;
print "Size: ", scalar(@array), "\n";
print "Max Index: ", $#array, "\n";
There are only four elements in the array that contain information, but the array is 51
elements long, with a highest index of 50.
Here the scalar function is used to enforce scalar context so that @array returns the size of the
array; otherwise @array would return a list of all the elements contained in it.
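When adding elements using push() or unshift() you must specify two arguments, first the
array name and second the element to add. Removing an element with pop()
or shift() only requires that you send the array as an argument. push() and pop() work on the end
of the array; unshift() and shift() work on the beginning.
#!/usr/bin/perl
# Define an array
@coins = ("Quarter","Dime","Nickel");
print "First Statement : @coins\n";
# Add one element at the end of the array
push(@coins, "Penny");
print "Second Statement : @coins\n";
# Remove one element from the beginning of the array
shift(@coins);
print "Third Statement : @coins\n";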
You can also extract a "slice" from an array - that is, you can select more than one item
from an array in order to produce another array.
@weekdays = @shortdays[0,1,2,3,4];
The specification for a slice must be a list of valid indices, either positive or negative, each separated
by a comma. To save typing, you can also use the .. range operator:
@weekdays = @shortdays[0..4];
Ranges also work in lists:
@weekdays = @shortdays[0..2,5,6];
Replacing elements is possible with the splice() function. splice() requires a handful of
arguments and the formula reads:
splice(ARRAY, OFFSET, LENGTH, LIST)
Essentially, you send PERL an array to splice, then direct it to the starting element, count
through how many elements to replace, and then fill in the missing elements with new
information.
#!/usr/bin/perl
@nums = (1..20);
splice(@nums, 5,5,21..25);
print "@nums";
Here actual replacement begins after the 5th element, starting with the number 6. Five elements
are then replaced from 6-10 with the numbers 21-25
With the split function, it is possible to transform a string into an array. To do this, simply
define an array and set it equal to the result of split. The split function takes two
arguments: first the delimiter on which to split, and then the string to be split.
#!/usr/bin/perl
# Define Strings
$astring = "Rain-Drops-On-Roses-And-Whiskers-On-Kittens";
$namelist = "Larry,David,Roger,Ken,Michael,Tom";
# Strings are now arrays. Here '-' and ',' work as delimiters
@array = split('-',$astring);
@names = split(',',$namelist);
Likewise, we can use the join() function to rejoin the array elements and form one long,
scalar string.
#!/usr/bin/perl
# Define Strings
$astring = "Rain-Drops-On-Roses-And-Whiskers-On-Kittens";
$namelist = "Larry,David,Roger,Ken,Michael,Tom";
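# Strings are now arrays. Here '-' and ',' work as delimiters
@array = split('-',$astring);
@names = split(',',$namelist);
# Rejoin the array elements into strings, swapping the delimiters
$string1 = join(",",@array);
$string2 = join("-",@names);
print $string1;
print "\n" ;
print $string2;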
Sorting Arrays
The sort() function sorts the elements of an array in ASCII (string) order and returns the sorted list.
#!/usr/bin/perl
# Define an array
@foods = qw(pizza steak chicken burgers);
print "Before sorting: @foods\n";
@foods = sort(@foods);
print "After sorting: @foods\n";
Please note that sorting is performed based on the ASCII value of each character, so uppercase
letters sort before lowercase ones. If the array mixes upper and lower case, the best option is to
transform every element to lowercase for the comparison and then perform the sort.
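A sketch of the difference, using a made-up mixed-case list. Supplying a comparison block to sort is one way to compare lowercase forms without changing the stored elements; the numeric example is an extra illustration of the same mechanism.

```perl
use strict;
use warnings;

my @foods = qw(pizza Steak chicken Burgers);    # hypothetical mixed-case data

# Default sort: ASCII order, so capitalized words come first.
my @ascii = sort @foods;

# Compare lowercase copies so that case no longer affects the order.
my @folded = sort { lc($a) cmp lc($b) } @foods;

# Numbers need <=> for numeric order; the default would compare them as strings.
my @numbers = sort { $a <=> $b } (10, 9, 100);

print "ASCII order:      @ascii\n";
print "Case-insensitive: @folded\n";
print "Numeric:          @numbers\n";
```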
The Lists
Lists are really a special type of array. Essentially, a list is a temporary construct that
holds a series of values. The list can be "hand" generated using parentheses and the
comma operator,
@array = (1,2,3);
or it can be the value returned by a function or variable when evaluated in list context:
print join(',', @array);
Here, the @array is being evaluated in list context because the join function is expecting a list.
Because a list is just a comma-separated sequence of values, you can combine lists
together:
@numbers = (1,3,(4,5,6));
The embedded list just becomes part of the main list. This also means that we can combine
arrays together:
@numbers = (@odd,@even);
Functions that return lists can also be embedded to produce a single, final list:
@numbers = (primes(),squares());
The list notation is identical to that for arrays. You can extract an element from a list
by appending square brackets to the list and giving one or more indices:
#!/usr/bin/perl
$one = (5,4,3,2,1)[4];
Similarly, we can extract slices, although without the requirement for a leading @
character:
#!/usr/bin/perl
@newlist = (5,4,3,2,1)[1..3];
PERL HASHES
Hashes are an advanced form of array. One of the limitations of an array is that the information
contained within it can be difficult to get to. For example, imagine that you have a list of people and
their ages. Stored in an array, a person's age can only be reached through a numeric index, so you
first have to find out which index belongs to which person.
The hash solves this problem very neatly by allowing us to access that @ages array not
by an index, but by a scalar key. For example, to store the ages of different people we can use
their names as keys to define a hash.
Creation of Hash
Hashes are created in one of two ways. In the first, you assign a value to a named key on
a one-by-one basis:
$ages{Martin} = 28;
In the second case, you use a list, which is converted by taking individual pairs from the
list: the first element of the pair is used as the key, and the second, as the value. For
example,
%hash = ('Fred', 'Flintstone', 'Barney', 'Rubble');
For clarity, you can use => as an alias for , to indicate the key/value pairs:
%hash = ('Fred' => 'Flintstone',
'Barney' => 'Rubble');
You can extract individual elements from a hash by specifying the key for the value that
you want within braces:
print $hash{Fred};
Extracting Slices
You can extract slices out of a hash just as you can extract slices from an array. You do,
however, need to use the @ prefix because the return value will be a list of corresponding
values:
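A minimal sketch of hash slices, using an %ages hash like the one in the other examples of this section; the updated ages are made up.

```perl
use strict;
use warnings;

my %ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);

# A hash slice uses @ and braces and returns the values for the listed keys.
my @two = @ages{'Martin', 'Rikke'};

# A slice can also be assigned to, updating several keys at once.
@ages{'Sharon', 'Rikke'} = (36, 30);

print "slice: @two\n";
print "Sharon is now $ages{Sharon}, Rikke is now $ages{Rikke}\n";
```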
You can get a list of all of the keys from a hash by using keys, and a list of all of the values by
using values:
#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
print "The following are in the DB: ",join(', ',keys %ages),"\n";
These can be useful in loops when you want to print all of the contents of a hash:
#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
foreach $key (%ages)
{
print "$key is $ages{$key} years old\n";
}
The problem with this approach is that (%ages) returns a flat list of alternating keys and values, so
the loop also treats every value as if it were a key. To resolve this, iterate over keys %ages, or use
the each function, which returns one key and value pair at a time, as given below:
#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
while (($key, $value) = each %ages)
{
    print "$key is $value years old\n";
}
You can check whether a key is present in a hash with the exists function:
#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
if (exists($ages{"mohammad"}))
{
    print "mohammad is $ages{'mohammad'} years old\n";
}
else
{
    print "I don't know the age of mohammad\n";
}
Sorting/Ordering Hashes
There is no simple way to guarantee that the order in which a list of keys, values, or
key/value pairs is returned will always be the same. In fact, it's best not even to rely on the order
between two sequential evaluations.
If you want to guarantee the order, use sort on the list of keys.
If you are accessing a hash a number of times and want to use the same order, consider
creating a single array to hold the sorted sequence, and then use the array (which will
remain in sorted order) to iterate over the hash.
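The approach can be sketched as follows, reusing the %ages sample data; the array names are assumptions for illustration.

```perl
use strict;
use warnings;

my %ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);

# Sort the keys once and keep the result for repeated use.
my @sorted_names = sort keys %ages;

# The same idea, ordered by value instead of by key.
my @by_age = sort { $ages{$a} <=> $ages{$b} } keys %ages;

foreach my $name (@sorted_names) {
    print "$name is $ages{$name} years old\n";
}
print "Youngest to oldest: @by_age\n";
```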
Hash Size
You get the size - that is, the number of elements - from a hash by using scalar context on
either keys or values:
#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
print "Hash size: ",scalar keys %ages,"\n";
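#!/usr/bin/perl
%ages = ('Martin' => 28, 'Sharon' => 35, 'Rikke' => 29);
$ages{'John'} = 40;          # add a new key/value pair by simple assignment
delete($ages{'Sharon'});     # remove a key/value pair with delete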
Some variables have a predefined and special meaning in Perl. They are the variables that use
punctuation characters after the usual variable indicator ($, @, or %), such as $_ ( explained
below ).
Most of the special variables have an English-like long name; for example, the operating system
error variable $! can be written as $OS_ERROR. If you are going to use the English-like names, you
have to put the line "use English;" at the top of your program file. This tells the interpreter to pick
up the exact meaning of the variable.
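A short sketch of the English module in use. The file path is hypothetical and is assumed not to exist; the -no_match_vars import is an optional habit that avoids loading the regular expression match aliases.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# Hypothetical path that is assumed not to exist.
my $missing = '/hypothetical/no/such/file.txt';

my $opened = open(my $fh, '<', $missing);

# $! and $OS_ERROR are two names for the same variable.
my ($short_form, $long_form) = ("$!", "$OS_ERROR");

print "open failed: $long_form\n" unless $opened;
print "process id: $PROCESS_ID\n";      # the same value as $$
```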
The most commonly used special variable is $_, which contains the default input and
pattern-searching string. For example, in the following lines:
#!/usr/bin/perl
foreach ('hickory','dickory','doc') {
print;
print "\n";
}
The first time the loop is executed, "hickory" is printed. The second time around, "dickory" is
printed, and the third time, "doc" is printed. That's because in each iteration of the loop, the current
string is placed in $_, and is used by default by print. Here are the places where Perl will assume $_
even if you don't specify it:
Various unary functions, including functions like ord and int, as well as all the file tests (-f, -d)
except for -t, which defaults to STDIN.
Various list functions like print and unlink.
The pattern-matching operations m//, s///, and tr/// when used without an =~ operator.
The default iterator variable in a foreach loop if no other variable is supplied.
The implicit iterator variable in the grep and map functions.
The default place to put an input record when a line-input operation's result is tested by
itself as the sole criterion of a while test (i.e., while (<FH>)). Note that outside of a while test,
this will not happen.
Here is the list of the special variables. The corresponding English-like names are listed
along with the symbolic names.
Global Scalar Special Variables
$_   $ARG
     The default input and pattern-searching space.
$.   $NR
     The current input line number of the last filehandle that was read. An explicit close on
     the filehandle resets the line number.
$/   $RS
     The input record separator; newline by default. If set to the null string, it treats blank
     lines as delimiters.
$,   $OFS
     The output field separator for the print operator.
$\   $ORS
     The output record separator for the print operator.
$"   $LIST_SEPARATOR
     Like "$," except that it applies to list values interpolated into a double-quoted string (or
     similar interpreted string). Default is a space.
$;   $SUBSCRIPT_SEPARATOR
     The subscript separator for multidimensional array emulation. Default is "\034".
$^L  $FORMAT_FORMFEED
     What a format outputs to perform a formfeed. Default is "\f".
$:   $FORMAT_LINE_BREAK_CHARACTERS
     The current set of characters after which a string may be broken to fill continuation fields
     in a format. Default is a space, a newline, and a hyphen.
$^A  $ACCUMULATOR
     The current value of the write accumulator for format lines.
$#   $OFMT
     Contains the output format for printed numbers (deprecated).
$?   $CHILD_ERROR
     The status returned by the last pipe close, backtick (``) command, or system operator.
$!   $OS_ERROR or $ERRNO
     If used in a numeric context, yields the current value of the errno variable, identifying the
     last system call error. If used in a string context, yields the corresponding system error
     string.
$@   $EVAL_ERROR
     The Perl syntax error message from the last eval command.
$$   $PROCESS_ID or $PID
     The process ID of the Perl process running this script.
$<   $REAL_USER_ID or $UID
     The real user ID (uid) of this process.
$>   $EFFECTIVE_USER_ID or $EUID
     The effective user ID of this process.
$(   $REAL_GROUP_ID or $GID
     The real group ID (gid) of this process.
$)   $EFFECTIVE_GROUP_ID or $EGID
     The effective gid of this process.
$0   $PROGRAM_NAME
     Contains the name of the file containing the Perl script being executed.
$[
     The index of the first element in an array and of the first character in a substring.
     Default is 0.
$]
     The version plus patchlevel of the Perl interpreter, as a decimal number. Older releases of
     the English module call this $PERL_VERSION; current releases call it $OLD_PERL_VERSION
     and use $PERL_VERSION for $^V.
$^D  $DEBUGGING
     The current value of the debugging flags.
$^E  $EXTENDED_OS_ERROR
     Extended error message on some platforms.
$^F  $SYSTEM_FD_MAX
     The maximum system file descriptor, ordinarily 2.
$^I  $INPLACE_EDIT
     The current value of the inplace-edit extension. Use undef to disable inplace editing.
$^O  $OSNAME
     Contains the name of the operating system that the current Perl binary was compiled for.
$^P  $PERLDB
     The internal flag that the debugger clears so that it doesn't debug itself.
$^T  $BASETIME
     The time at which the script began running, in seconds since the epoch.
$^W  $WARNING
     The current value of the warning switch, either true or false.
$^X  $EXECUTABLE_NAME
     The name that the Perl binary itself was executed as.
Global Array Special Variables
@F
     The array into which the input lines are split when the -a command-line switch is given.
Global Special Filehandles
DATA
     The special filehandle that refers to anything following the __END__ token in the file
     containing the script. Or, the special filehandle for anything following the __DATA__ token
     in a required file, as long as you're reading data in the same package __DATA__ was found in.
_ (underscore)
     The special filehandle used to cache the information from the last stat, lstat, or file test
     operator.
Global Special Constants
__FILE__
     Represents the filename at the point in your program where it's used. Not interpolated
     into strings.
__LINE__
     Represents the current line number. Not interpolated into strings.
Regular Expression Special Variables
$&   $MATCH
     The string matched by the last successful pattern match.
$`   $PREMATCH
     The string preceding whatever was matched by the last successful pattern match.
$'   $POSTMATCH
     The string following whatever was matched by the last successful pattern match.
$+   $LAST_PAREN_MATCH
     The last bracket matched by the last search pattern. This is useful if you don't know which
     of a set of alternative patterns was matched.
     For example: /Version: (.*)|Revision: (.*)/ && ($rev = $+);
Filehandle Special Variables
$|   $OUTPUT_AUTOFLUSH
     If set to nonzero, forces a flush after every write or print on the currently selected output
     channel.
$=   $FORMAT_LINES_PER_PAGE
     The current page length (printable lines) of the currently selected output channel.
     Default is 60.
$-   $FORMAT_LINES_LEFT
     The number of lines left on the page of the currently selected output channel.
$~   $FORMAT_NAME
     The name of the current report format for the currently selected output channel. Default
     is the name of the filehandle.
$^   $FORMAT_TOP_NAME
     The name of the current top-of-page format for the currently selected output channel.
     Default is the name of the filehandle with _TOP appended.
The conditional statements are if and unless, and they allow you to control the execution of your
script. There are five different formats for the if statement:
STATEMENT if (EXPR)
if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ...
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
The first format is classed as a simple statement, since it can be used at the end of another
statement without requiring a block, as in:
print "Happy Birthday!\n" if ($date == $today);
In this instance, the message will only be printed if the expression evaluates to a true value.
The second format is the more familiar conditional statement that you may have come
across in other languages:
if ($date == $today)
{
print "Happy Birthday!\n";
}
The third format allows for exceptions. If the expression evaluates to true, then the first
block is executed; otherwise (else), the second block is executed:
if ($date == $today)
{
print "Happy Birthday!\n";
}
else
{
print "Happy Unbirthday!\n";
}
The fourth form allows for additional tests if the first expression does not return true. The
elsif can be repeated any number of times to test as many different alternatives as
are required:
if ($date == $today)
{
print "Happy Birthday!\n";
}
elsif ($date == $christmas)
{
print "Happy Christmas!\n";
}
The fifth form allows for both additional tests and a final exception if all the other tests
fail:
if ($date == $today)
{
print "Happy Birthday!\n";
}
elsif ($date == $christmas)
{
print "Happy Christmas!\n";
}
else
{
print "Happy Unbirthday!\n";
}
The unless statement automatically implies the logical opposite of if, so unless the EXPR is true,
execute the block. This means that the statement
print "Happy Unbirthday!\n" unless ($date == $today);
is equivalent to
print "Happy Unbirthday!\n" if ($date != $today);
unless can also take an else block, but the result is a less elegant version of the preceding
if...else that merely achieves the same result with the two branches swapped.
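The equivalence can be sketched as follows. The numeric values standing in for dates are assumptions for illustration.

```perl
use strict;
use warnings;

my ($date, $today) = (1225, 1031);    # hypothetical numeric dates

my ($with_unless, $with_if);

# unless ... else: the first block runs when the test is false.
unless ($date == $today) {
    $with_unless = "Happy Unbirthday!";
} else {
    $with_unless = "Happy Birthday!";
}

# The same decision written with if ... else, branches swapped.
if ($date == $today) {
    $with_if = "Happy Birthday!";
} else {
    $with_if = "Happy Unbirthday!";
}

print "$with_unless\n$with_if\n";
```

Both forms choose the same message; the if version is usually easier to read because the test is not negated.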
PERL LOOPINGS
Perl supports four main loop types:
1. while
2. for
3. until
4. foreach
In each case, the execution of the loop continues until the evaluation of the supplied expression
changes.
In the case of a while loop execution continues while the expression evaluates to true.
The until loop executes while the loop expression is false and only stops when the
expression evaluates to a true value.
The list forms of the for and foreach loop are special cases. They continue until the end of
the supplied list is reached.
while Loops
The while loop has three forms:
while EXPR
LABEL while (EXPR) BLOCK
LABEL while (EXPR) BLOCK continue BLOCK
In the first form, the expression is evaluated first, and then the statement to which it applies is
evaluated. For example, the following line increases the value of $linecount as long as we continue
to read lines from a given file:
$linecount++ while (<FILE>);
To create a loop that executes statements first, and then tests an expression, you need to combine
while with a preceding do {} statement. For example:
do
{
$calc += ($fact*$ivalue);
} while $calc <100;
In this case, the code block is executed first, and the conditional expression is only evaluated at the
end of each loop iteration.
The second two forms of the while loop repeatedly execute the code block as long as the
result from the conditional expression is true. For example, you could rewrite the
preceding example as:
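A sketch of that rewrite, next to the do {} while form for comparison. The values of $fact and $ivalue are assumptions, and $calc deliberately starts at 100 to show where the two forms differ.

```perl
use strict;
use warnings;

my ($fact, $ivalue) = (2, 5);    # hypothetical values

# do {} while: the block runs once before the condition is tested.
my $calc = 100;
do {
    $calc += ($fact * $ivalue);
} while ($calc < 100);
my $after_do = $calc;

# while: the condition is tested first, so the block may never run.
$calc = 100;
while ($calc < 100) {
    $calc += ($fact * $ivalue);
}
my $after_while = $calc;

print "do-while: $after_do, while: $after_while\n";
```

When $calc starts below 100, both loops end with the same value; they only differ when the condition is already false on entry.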
The continue block is executed immediately after the main block and is primarily used as
a method for executing a given statement (or statements) for each iteration, irrespective
of how the current iteration terminated. It is roughly equivalent to the third expression of a for loop:
{
my $i = 0;
while ($i <100)
{ ... }
continue
{
$i++;
}
}
This is equivalent to
for (my $i = 0; $i < 100; $i++)
{ ... }
for Loops
A for loop is basically a while loop with an additional expression used to update the loop
variables before the conditional expression is reevaluated. The basic format is:
LABEL for (EXPR; EXPR; EXPR) BLOCK
The first EXPR is the initialization - the value of the variables before the loop starts iterating. The
second is the expression to be executed for each iteration of the loop as a test. The third expression
is executed for each iteration and should be a modifier for the loop variables.
Thus, you can write a loop to iterate 100 times like this:
for ($i=0;$i<100;$i++)
{
...
}
You can place multiple variables into the expressions using the standard list operator (the
comma):
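A minimal sketch with two loop variables moving toward each other; the variable names and bounds are arbitrary.

```perl
use strict;
use warnings;

my ($i, $j);
my @pairs;

# Both the initialization and the update use the comma operator.
for ($i = 0, $j = 10; $i < $j; $i++, $j--) {
    push @pairs, "$i:$j";
}

print "@pairs\n";
```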
All three expressions are optional, so you can create an infinite loop like this:
for(;;)
{
...
}
until Loops
The inverse of the while loop is the until loop, which evaluates the conditional expression and
reiterates over the loop only when the expression returns false. Once the expression returns true,
the loop ends.
In the case of a do...until loop, the conditional expression is only evaluated at the end of
the code block. In an until (EXPR) BLOCK loop, the expression is evaluated before the
block executes. Using an until loop, you could rewrite the previous example as:
do
{
$calc += ($fact*$ivalue);
} until $calc >= 100;
This is equivalent to
do
{
$calc += ($fact*$ivalue);
} while $calc <100;
foreach Loops
The last loop type is the foreach loop, which has a format like this:
Using a for loop, you can iterate through the list using:
for ($index=0;$index<@months;$index++)
{
print "$months[$index]\n";
}
This is messy, because you're manually selecting the individual elements from the array
and using an additional variable, $index, to extract the information. Using a foreach loop,
you can simplify the process:
foreach (@months)
{
print "$_\n";
}
The foreach loop can even be used to iterate through a hash, providing you return the list
of values or keys from the hash as the list:
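A minimal sketch of iterating through a hash with foreach is shown below. The hash %ages and its contents are hypothetical sample data, and the keys are sorted only so that the output order is predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %ages = (Tom => 19, Ann => 23);   # hypothetical sample data

my @lines;
# keys returns the list of hash keys; foreach walks that list
foreach my $name (sort keys %ages)
{
    push @lines, "$name is $ages{$name}";
}
print "$_\n" foreach @lines;
```

Using values %ages instead of keys %ages would iterate over the stored values only.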
Labeled Loops
Labels can be applied to any block, but they make the most sense on loops. By giving
your loop a name, you allow the loop control keywords to specify which loop their
operation should be applied to. The format for a labeled loop is:
LABEL: loop (EXPR) BLOCK
The next keyword skips the remainder of the code block, forcing the loop to proceed to
the next value in the loop. For example:
while (<DATA>)
{
next if /^#/;
}
The above code would skip lines from the file if they started with a hash symbol. If there is
a continue block, it is executed before execution proceeds to the next iteration of the loop.
The last keyword ends the loop entirely, skipping the remaining statements in the code
block, as well as dropping out of the loop. The last keyword is therefore identical to the
break keyword in C and shell scripts. For example:
while (<DATA>)
{
last if ($found);
}
This would exit the loop if the value of $found was true, whether the end of the file had actually been
reached or not. The continue block is not executed.
The redo keyword reexecutes the code block without reevaluating the conditional expression for the
loop. It skips the remainder of the code block, and the continue block, if there is one, is not
executed before the block restarts. For example, the following code would read the next line from
a file if the current line terminates with a backslash:
while(<DATA>)
{
if (s#\\$##)
{
$_ .= <DATA>;
redo;
}
}
Here is an example showing how labels are used in inner and outer loops:
OUTER:
while(<DATA>)
{
chomp;
@linearray=split;
foreach $word (@linearray)
{
next OUTER if ($word =~ /next/i)
}
}
goto Statement
There are three basic forms: goto LABEL, goto EXPR, and goto &NAME. In each case, execution
is moved from the current location to the destination.
In the case of goto LABEL, execution stops at the current point and resumes at the point of the
label specified.
The goto &NAME statement is more complex. It allows you to replace the currently executing
subroutine with a call to the specified subroutine instead.
PERL OPERATORS
There are many Perl operators but here are a few of the most common ones:
Arithmetic Operators
+ addition
- subtraction
* multiplication
/ division
(Why do we have separate numeric and string comparisons? Because we don't have special variable
types, and Perl needs to know whether to sort numerically (where 99 is less than 100) or
alphabetically (where 100 comes before 99).)
(and, or and not aren't just in the above table as descriptions of the operators -- they're also
supported as operators in their own right. They're more readable than the C-style operators, but
have different precedence to && and friends.)
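The difference between numeric and string comparison can be sketched as follows. The sample values are assumptions chosen to match the 99 versus 100 remark above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @values = (100, 99, 9);

# <=> compares as numbers, cmp compares as strings
my @numeric    = sort { $a <=> $b } @values;
my @alphabetic = sort { $a cmp $b } @values;

# == and eq can disagree about the same pair of values
my $same_number = ("1.0" == 1)   ? "yes" : "no";
my $same_string = ("1.0" eq "1") ? "yes" : "no";

print "numeric: @numeric\n";
print "alphabetic: @alphabetic\n";
print "same number: $same_number, same string: $same_string\n";
```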
Miscellaneous Operators
= assignment
. string concatenation
x string multiplication
.. range operator (creates a list of numbers)
$a += 1; # same as $a = $a + 1
$a -= 1; # same as $a = $a - 1
$a .= "\n"; # same as $a = $a . "\n";
The basics of handling files are simple: you associate a filehandle with an external entity (usually a
file) and then use a variety of operators and functions within Perl to read and update the data stored
within the data stream associated with the filehandle.
A filehandle is a named internal Perl structure that associates a physical file with a name. All
filehandles are capable of read/write access, so you can read from and update any file or device
associated with a filehandle. However, when you associate a filehandle, you can specify the mode in
which the filehandle is opened.
The open function takes the form open FILEHANDLE, EXPR. Here FILEHANDLE is the filehandle that open
associates with the file, and EXPR is an expression giving the file name and the mode of opening the file.
Following is the syntax to open file.txt in read-only mode. Here the less than sign (<) indicates
that the file has to be opened in read-only mode:
open(DATA, "<file.txt");
Here DATA is the filehandle which will be used to read the file. Here is an example
which opens a file and prints its contents to the screen.
#!/usr/bin/perl
open(DATA, "<file.txt");
while(<DATA>)
{
print "$_";
}
Open Function
Following is the syntax to open file.txt in writing mode. Here the greater than sign (>) indicates
that the file has to be opened in writing mode:
open(DATA, ">file.txt");
This example actually truncates (empties) the file before opening it for writing, which may not be the
desired effect. If you want to open a file for reading and writing, you can put a plus sign before the >
or < characters.
open(DATA, "+<file.txt");
You can also open a file in append mode by using >> in place of >. In this mode the writing
point is set to the end of the file.
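A minimal sketch of append mode follows. It is an illustration under assumptions: it uses a scratch file from the core File::Temp module instead of file.txt, lexical filehandles instead of a bareword such as DATA, and the three-argument form of open, which is equivalent to open(DATA, ">>file.txt").

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the sketch; removed automatically when the script exits
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first line\n";
close($tmp);

# ">>" opens for appending: existing contents are kept
open(my $out, '>>', $path) or die "Couldn't open $path: $!";
print {$out} "second line\n";
close($out);

# Read the file back to confirm both lines are present
open(my $in, '<', $path) or die "Couldn't open $path: $!";
my @lines = <$in>;
close($in);
print @lines;
```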
Entities Definition
< or r Read Only Access
> or w Creates, Writes, and Truncates
>> or a Writes, Appends, and Creates
+< or r+ Reads and Writes
+> or w+ Reads, Writes, Creates, and Truncates
+>> or a+ Reads, Writes, Appends, and Creates
Sysopen Function
The sysopen function is similar to the main open function, except that it uses the
system open() function, using the parameters supplied to it as the parameters for the system
function:
sysopen FILEHANDLE, FILENAME, MODE, PERMS
sysopen FILEHANDLE, FILENAME, MODE
For example, to open a file for updating, emulating the +<filename format from open:
sysopen(DATA, "file.txt", O_RDWR);
You can use O_CREAT to create a new file, O_WRONLY to open a file in write-only mode, and
O_RDONLY to open a file in read-only mode.
The PERMS argument specifies the file permissions for the file if it has to be created. By
default it takes 0666 (an octal value).
Value Definition
O_RDWR Read and Write
O_RDONLY Read Only
O_WRONLY Write Only
O_CREAT Create the file
O_APPEND Append the file
O_TRUNC Truncate the file
O_EXCL Stops if file already exists
O_NONBLOCK Non-Blocking usability
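The flags listed above are exported by the core Fcntl module and are combined with the bitwise OR operator. The following is a minimal sketch, assuming a scratch directory from File::Temp and lexical filehandles; the file name is only an example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                       # exports O_RDWR, O_CREAT, O_WRONLY, ...
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory for the sketch
my $path = "$dir/file.txt";

# Create the file if necessary and open it write-only; 0666 is octal
sysopen(my $out, $path, O_WRONLY | O_CREAT | O_TRUNC, 0666)
    or die "Couldn't create $path: $!";
print {$out} "hello\n";
close($out);

# Open for reading and writing, like "+<file.txt" with open
sysopen(my $fh, $path, O_RDWR) or die "Couldn't open $path: $!";
my $first = <$fh>;
close($fh);
print $first;
```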
Close Function
To close a filehandle, and therefore disassociate the filehandle from the corresponding file, you use
the close function. This flushes the filehandle's buffers and closes the system's file descriptor.
close FILEHANDLE
close
The main method of reading the information from an open filehandle is the
<FILEHANDLE> operator. In a scalar context it returns a single line from the filehandle.
For example:
#!/usr/bin/perl
When you use the <FILEHANDLE> operator in a list context, it returns a list of lines from
the specified filehandle. For example, to import all the lines from a file into an array:
#!/usr/bin/perl
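Both contexts can be sketched with one short script. To keep the example self-contained it reads from an in-memory filehandle (a reference to the string $text), which is an assumption of this sketch; the same operator works on a filehandle opened on a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";          # stands in for the file contents
open(my $fh, '<', \$text) or die "Couldn't open in-memory file: $!";

my $first = <$fh>;     # scalar context: a single line
my @rest  = <$fh>;     # list context: all remaining lines
close($fh);

chomp($first, @rest);  # strip the trailing newlines
print "First: $first\n";
print "Rest: @rest\n";
```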
getc Function
The getc function returns a single character from the specified FILEHANDLE, or STDIN if none is
specified:
getc FILEHANDLE
getc
If there was an error, or the filehandle is at end of file, then undef is returned instead.
read Function
The read function reads a block of information from the buffered filehandle. This function is used to
read binary data from the file:
read FILEHANDLE, SCALAR, LENGTH, OFFSET
read FILEHANDLE, SCALAR, LENGTH
The length of the data read is defined by LENGTH, and the data is placed at the start of SCALAR if no
OFFSET is specified. Otherwise data is placed after OFFSET bytes in SCALAR. The function returns the
number of bytes read on success, zero at end of file, or undef if there was an error.
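The return values described above can be observed with a small sketch. The in-memory filehandle and the ten-byte sample string are assumptions made so that the example runs without an external file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = "ABCDEFGHIJ";                 # ten bytes of sample data
open(my $fh, '<', \$data) or die "Couldn't open in-memory file: $!";
binmode($fh);

my $buffer = '';
my $got  = read($fh, $buffer, 4);                    # fills $buffer from the start
my $more = read($fh, $buffer, 4, length($buffer));   # OFFSET appends to $buffer
my $last = read($fh, $buffer, 4, length($buffer));   # only two bytes are left
my $eof  = read($fh, $buffer, 4, length($buffer));   # zero at end of file
close($fh);

print "Read $got, $more, $last, $eof bytes: $buffer\n";
```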
print Function
For all the different methods used for reading information from filehandles, the main function for
writing information back is the print function.
The print function prints the evaluated value of LIST to FILEHANDLE, or to the current
output filehandle (STDOUT by default). For example:
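A minimal sketch of both forms of print follows. The in-memory filehandle $out and the variable $report are assumptions used to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $report = '';
open(my $out, '>', \$report) or die "Couldn't open in-memory file: $!";

# With an explicit filehandle there is no comma after the handle
print {$out} "Total: ", 3 + 4, "\n";
close($out);

# Without a filehandle, print writes to the current output handle (STDOUT)
print "Hello, world\n";
print $report;
```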
Copying Files
Here is an example which opens an existing file, file1.txt, reads it line by line, and
generates a copy, file2.txt.
#!/usr/bin/perl
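A line-by-line copy can be sketched as follows. The scratch directory from File::Temp and the two sample lines written into file1.txt are assumptions so that the script can run on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);              # scratch directory for the sketch
my ($src, $dst) = ("$dir/file1.txt", "$dir/file2.txt");

# Create a small source file to copy
open(my $seed, '>', $src) or die "Couldn't create $src: $!";
print {$seed} "line 1\nline 2\n";
close($seed);

# Copy: read one line at a time and write it to the new file
open(my $in,  '<', $src) or die "Couldn't open $src: $!";
open(my $out, '>', $dst) or die "Couldn't open $dst: $!";
while (my $line = <$in>)
{
    print {$out} $line;
}
close($in);
close($out);

print "Copied ", -s $dst, " bytes\n";
```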
Renaming a file
Here is an example which shows how we can rename a file file1.txt to file2.txt, assuming
the file is available in the /usr/test directory.
#!/usr/bin/perl
The rename function takes two arguments, the existing name and the new name, and simply renames the existing file.
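A minimal sketch of rename, assuming a scratch directory instead of /usr/test so that the script can be run anywhere:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);              # scratch directory for the sketch
my ($old, $new) = ("$dir/file1.txt", "$dir/file2.txt");

# Create an empty file to rename
open(my $fh, '>', $old) or die "Couldn't create $old: $!";
close($fh);

# rename returns true on success and sets $! on failure
rename($old, $new) or die "Couldn't rename $old to $new: $!";

my $renamed = (-e $new && !-e $old) ? 1 : 0;
print "Renamed: $renamed\n";
```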
Here is an example which shows how to delete a file file1.txt using the unlink function.
#!/usr/bin/perl
unlink ("/usr/test/file1.txt");
tell Function
The first requirement is to find your position within a file, which you do using the tell function:
tell FILEHANDLE
tell
This returns the position of the file pointer, in bytes, within FILEHANDLE if specified, or the current
default selected filehandle if none is specified.
seek Function
The seek function positions the file pointer to the specified number of bytes within a file:
seek FILEHANDLE, POSITION, WHENCE
The function uses the fseek system function, and you have the same ability to position relative to
three different points: the start, the end, and the current position. You do this by specifying a value
for WHENCE.
Zero sets the positioning relative to the start of the file. For example, the line
seek DATA, 256, 0; sets the file pointer to byte offset 256 in the file.
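The three values of WHENCE can be sketched together with tell. The constants SEEK_SET, SEEK_CUR, and SEEK_END come from the core Fcntl module and stand for the start, the current position, and the end; the in-memory filehandle holding the alphabet is an assumption of this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:seek);             # SEEK_SET, SEEK_CUR, SEEK_END

my $data = join '', 'a' .. 'z';  # 26 bytes of sample data
open(my $fh, '<', \$data) or die "Couldn't open in-memory file: $!";

seek($fh, 10, SEEK_SET);         # 10 bytes from the start
my $from_start = getc($fh);
my $position   = tell($fh);      # one byte further on after getc

seek($fh, -1, SEEK_END);         # last byte of the file
my $from_end = getc($fh);

seek($fh, -3, SEEK_CUR);         # three bytes back from the current position
my $from_current = getc($fh);
close($fh);

print "$from_start $position $from_end $from_current\n";
```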
For example, to perform a quick test of the various permissions on a file, you might use a
script like this:
#!/usr/bin/perl
my $file = "/usr/test/file1.txt"; # file to test (example path)
my (@description,$size);
if (-e $file)
{
push @description, 'binary' if (-B _);
push @description, 'a socket' if (-S _);
push @description, 'a text file' if (-T _);
push @description, 'a block special file' if (-b _);
push @description, 'a character special file' if (-c _);
push @description, 'a directory' if (-d _);
push @description, 'executable' if (-x _);
push @description, (($size = -s _)) ? "$size bytes" : 'empty';
print "$file is ", join(', ',@description),"\n";
}
Here is the list of features which you can check for a file
Operator Description
-A Age of file (at script startup) in days since last access.
-B Is it a binary file?
-C Age of file (at script startup) in days since last inode change.
-M Age of file (at script startup) in days since modification.
-O Is the file owned by the real user ID?
-R Is the file readable by the real user ID or real group?
-S Is the file a socket?
-T Is it a text file?
-W Is the file writable by the real user ID or real group?
-X Is the file executable by the real user ID or real group?
-b Is it a block special file?
-c Is it a character special file?
-d Is the file a directory?
-e Does the file exist?
-f Is it a plain file?
-g Does the file have the setgid bit set?
-k Does the file have the sticky bit set?
-l Is the file a symbolic link?
-o Is the file owned by the effective user ID?
-p Is the file a named pipe?
-r Is the file readable by the effective user or group ID?
-s Returns the size of the file, zero size = empty file.
-t Is the filehandle opened by a TTY (terminal)?
-u Does the file have the setuid bit set?
-w Is the file writable by the effective user or group ID?
-x Is the file executable by the effective user or group ID?
-z Is the file size zero?
Working with Directories
Here is an example which opens a directory and lists all the files available inside this
directory.
#!/usr/bin/perl
As another example, to print the list of C source code files, you might use:
#!/usr/bin/perl
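Both directory tasks can be sketched with opendir, readdir, and closedir. The scratch directory and the three file names created in it are assumptions so that the listing has something to show.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);          # scratch directory for the sketch
foreach my $name (qw(main.c util.c notes.txt))
{
    open(my $fh, '>', "$dir/$name") or die "Couldn't create $name: $!";
    close($fh);
}

# List every entry, leaving out the special entries . and ..
opendir(my $dh, $dir) or die "Couldn't open directory $dir: $!";
my @entries = grep { !/^\.\.?$/ } readdir($dh);
closedir($dh);
@entries = sort @entries;

# Keep only the C source files
my @c_files = grep { /\.c$/ } @entries;

print "All files: @entries\n";
print "C files: @c_files\n";
```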
REGULAR EXPRESSIONS
A regular expression is a string of characters that defines the pattern or patterns you are looking for.
The syntax of regular expressions in Perl is very similar to what you will find within other regular
expression-supporting programs, such as sed, grep, and awk.
The basic method for applying a regular expression is to use the pattern binding operators =~ and
!~. The first operator is a test and assignment operator.
There are three regular expression operators within Perl:
Match Regular Expression - m//
Substitute Regular Expression - s///
Transliterate Regular Expression - tr///
The forward slashes in each case act as delimiters for the regular expression (regex) that you are
specifying. If you are comfortable with any other delimiter then you can use it in place of the forward
slash.
The match operator, m//, is used to match a string or statement to a regular expression.
For example, to match the character sequence "foo" against the scalar $bar, you might
use a statement like this:
if ($bar =~ /foo/)
The m// actually works in the same fashion as the q// operator series: you can use any combination
of naturally matching characters to act as delimiters for the expression. For example, m{}, m(),
and m<> are all valid.
You can omit the m from m// if the delimiters are forward slashes, but for all other delimiters you
must use the m prefix.
Note that the entire match expression, that is, the expression on the left of =~ or !~ and the
match operator, returns true (in a scalar context) if the expression matches. Therefore the
statement:
$true = ($foo =~ m/foo/);
will set $true to 1 if $foo matches the regex, or to a false value (the empty string) if the match fails.
In a list context, the match returns the contents of any grouped expressions. For example,
when extracting the hours, minutes, and seconds from a time string, we can use:
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);
The match operator supports its own set of modifiers. The /g modifier allows for global
matching. The /i modifier will make the match case insensitive. Here is the complete list
of modifiers
Modifier  Description
i         Makes the match case insensitive
m         Specifies that if the string has newline or carriage return characters, the ^ and $
          operators will now match against a newline boundary, instead of a string boundary
o         Evaluates the expression only once
s         Allows use of . to match a newline character
x         Allows you to use white space in the expression for clarity
g         Globally finds all matches
cg        Allows the search to continue even after a global match fails
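The match-once operator, m?PATTERN?, matches only once between calls to reset. For example,
you can use it to get the first and last matching elements within a list:
#!/usr/bin/perl
@list = qw/food foosball subeo footnote terfoot canic footbridge/; # sample data (illustrative)
foreach (@list)
{
$first = $1 if m?(foo.*)?;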
$last = $1 if /(foo.*)/;
}
print "First: $first, Last: $last\n";
The substitution operator, s///, is really just an extension of the match operator that allows
you to replace the text matched with some new text. The basic form of the operator is:
s/PATTERN/REPLACEMENT/;
The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is a
specification for the text or regular expression that we want to use to replace the found text with.
For example, we can replace "dog" with "cat" (the first occurrence only, unless the /g modifier is added) using
$string =~ s/dog/cat/;
Another example:
#!/usr/bin/perl
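A few substitutions can be sketched as follows. The sample strings are assumptions; each substitution works on a copy so that the original $string stays unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "The cat sat on the mat";

# Replace the first match only
(my $first = $string) =~ s/cat/dog/;

# /g replaces every match, /i ignores case
(my $global = $string) =~ s/the/a/gi;

# /e evaluates the replacement as Perl code
(my $doubled = "3 apples") =~ s/(\d+)/$1 * 2/e;

print "$first\n$global\n$doubled\n";
```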
Modifier  Description
i         Makes the match case insensitive
m         Specifies that if the string has newline or carriage return characters, the ^ and $
          operators will now match against a newline boundary, instead of a string boundary
o         Evaluates the expression only once
s         Allows use of . to match a newline character
x         Allows you to use white space in the expression for clarity
g         Replaces all occurrences of the found expression with the replacement text
e         Evaluates the replacement as if it were a Perl statement, and uses its return value
          as the replacement text
Translation
Translation is similar, but not identical, to substitution. Unlike
substitution, translation (or transliteration) does not use regular expressions for its search
or replacement values. The translation operators are:
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
The translation replaces all occurrences of the characters in SEARCHLIST with the
corresponding characters in REPLACEMENTLIST. For example, using the "The cat sat
on the mat." string we have been using in this chapter:
#!/usr/bin/perl
print "$string\n";
Modifier Description
c Complement SEARCHLIST.
d Delete found but unreplaced characters.
s Squash duplicate replaced characters.
The /d modifier deletes the characters matching SEARCHLIST that do not have a
corresponding entry in REPLACEMENTLIST. For example:
#!/usr/bin/perl
print "$string\n";
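The basic translation and the /d modifier can be sketched as follows. The particular character lists are assumptions chosen for illustration; each translation works on a copy of the sample string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'The cat sat on the mat.';

# Replace every "a" with "o"
(my $swapped = $string) =~ tr/a/o/;

# Ranges work too: lowercase letters to uppercase
(my $upper = $string) =~ tr/a-z/A-Z/;

# /d with an empty REPLACEMENTLIST deletes the listed characters
(my $no_vowels = $string) =~ tr/aeiou//d;

# Without /d and without a REPLACEMENTLIST, tr just counts matches
my $lowercase = ($string =~ tr/a-z//);

print "$swapped\n$upper\n$no_vowels\n$lowercase\n";
```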
The last modifier, /s, removes the duplicate sequences of characters that were replaced,
so:
#!/usr/bin/perl
$string = 'food';
$string =~ tr/a-z/a-z/s;
print $string;
You don't just have to match on fixed strings. In fact, you can match on just about
anything you could dream of by using more complex regular expressions. Here's a quick
cheat sheet:
Character Description
. a single character
\s a whitespace character (space, tab, newline)
\S non-whitespace character
\d a digit (0-9)
\D a non-digit
\w a word character (a-z, A-Z, 0-9, _)
\W a non-word character
[aeiou] matches a single character in the given set
[^aeiou] matches a single character outside the given set
(foo|bar|baz) matches any of the alternatives specified
Quantifiers can be used to specify how many of the previous thing you want to match on,
where "thing" means either a literal character, one of the metacharacters listed above, or a
group of characters or metacharacters in parentheses.
Character Description
* zero or more of the previous thing
+ one or more of the previous thing
? zero or one of the previous thing
{3} matches exactly 3 of the previous thing
{3,6} matches between 3 and 6 of the previous thing
{3,} matches 3 or more of the previous thing
The ^ metacharacter matches the beginning of the string and the $ metasymbol matches the end of
the string.
#!/usr/bin/perl
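The two anchors can be sketched with grep over a short list. The sample strings in @names are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @names = ("Perl script", "a Perl", "Perl");   # sample strings

my @starts = grep { /^Perl/ }  @names;   # begins with "Perl"
my @ends   = grep { /Perl$/ }  @names;   # ends with "Perl"
my @whole  = grep { /^Perl$/ } @names;   # is exactly "Perl"

print scalar(@starts), " start, ", scalar(@ends), " end, ",
      scalar(@whole), " match the whole string\n";
```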
Matching Boundaries
The \b matches at any word boundary, as defined by the difference between the \w class and the
\W class. Because \w includes the characters for a word, and \W the opposite, this normally means
the start or the termination of a word. The \B assertion matches any position that is not a word
boundary. For example:
/\bcat\b/ # Matches 'the cat sat' but not 'catatonic'
/\Bcat\B/ # Matches 'verification' but not 'the cat on the mat'
/\bcat\B/ # Matches 'catatonic' but not 'polecat'
/\Bcat\b/ # Matches 'polecat' but not 'catatonic'
Selecting Alternatives
The | character is just like the standard or bitwise OR within Perl. It specifies alternate
matches within a regular expression or group. For example, to match "cat" or "dog" in an
expression, you might use this:
if ($string =~ /cat|dog/)
You can group individual elements of an expression together in order to support complex
matches. Searching for two people's names could be achieved with two separate tests,
like this:
Grouping Matching
$string =~ /(\S+)\s+(\S+)/;
and
$string =~ /\S+\s+\S+/;
However, the benefit of grouping is that it allows us to extract a sequence from a regular
expression. Groupings are returned as a list in the order in which they appear in the
original. For example, in the following fragment we have pulled out the hours, minutes,
and seconds from a string.
$time =~ m/(\d+):(\d+):(\d+)/;
my ($hours, $minutes, $seconds) = ($1, $2, $3);
When groups are used in substitution expressions, the $x syntax can be used in the
replacement text. Thus, we could reformat a date string using this:
#!/usr/bin/perl
$date = '03/26/1999';
$date =~ s#(\d+)/(\d+)/(\d+)#$3/$1/$2#;
print "$date";
For example, in the following code we have used \G so that we can search to the correct
position and then extract some information, without having to create a more complex,
single regular expression:
#!/usr/bin/perl
$string = "The time is: 12:31:02 on 4/12/00"; # sample input (illustrative)
$string =~ /:\s+/g;
($time) = ($string =~ /\G(\d+:\d+:\d+)/);
$string =~ /.+\s+/g;
($date) = ($string =~ m{\G(\d+/\d+/\d+)});
The \G assertion is actually just the metasymbol equivalent of the pos function, so between regular
expression calls you can continue to use pos, and even modify the value of pos (and therefore \G)
by using pos as an lvalue subroutine:
#!/usr/bin/perl
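Reading and assigning pos can be sketched as follows. The sample string is an assumption; the offsets in the comments follow from it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "The time is: 12:31:02 on 4/12/00";   # sample input (assumption)

# A scalar-context /g match leaves pos() just after the matched ": "
$string =~ /:\s+/g;
my $after_colon = pos($string);

# \G anchors the next match at that position
my ($time) = $string =~ /\G(\d+:\d+:\d+)/;

# Assigning to pos moves \G; here it is rewound to the start of the string
pos($string) = 0;
my ($first_word) = $string =~ /\G(\w+)/;

print "pos: $after_colon, time: $time, first word: $first_word\n";
```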
PERL SUBROUTINES
The two terms function and subroutine are used interchangeably in Perl. A function is a named code
block that is generally intended to process specified input values into an output value, although this
is not always the case. For example, the print function takes variables and static text and prints the
values on the screen.
Subroutines, like variables, can be declared (without defining what they do) or declared and
defined. To simply declare a subroutine, you use one of the following forms:
sub NAME
sub NAME PROTO
sub NAME ATTRS
sub NAME PROTO ATTRS
where NAME is the name of the subroutine you are creating, PROTO is the prototype for the
arguments the subroutine should expect when called, and ATTRS is a list of attributes that the
subroutine exhibits.
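If you want to declare and define a function, then you need to include the BLOCK that defines its
operation:
sub NAME BLOCK
sub NAME PROTO BLOCK
sub NAME ATTRS BLOCK
sub NAME PROTO ATTRS BLOCK
You can also create an anonymous subroutine by omitting the NAME component (sub BLOCK).
To call a function, you use one of the following forms:
NAME
NAME LIST
NAME (LIST)
&NAME
For example, here is a simple subroutine: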
sub message
{
print "Hello!\n";
}
Function Arguments
The first argument you pass to the subroutine is available within the function as $_[0], the
second argument is $_[1], and so on. For example, this simple function adds two numbers
and prints the result:
sub add
{
$result = $_[0] + $_[1];
print "The result was: $result\n";
}
add(1,2);
The preceding subroutine is fairly simple, but what if you wanted to have named
arguments? The simple answer is to assign the values of @_ to a list of variables:
sub add
{
($numbera, $numberb) = @_;
$result = $numbera + $numberb;
print "The result was: $result\n";
}
The shift function is one of the "stack" operators supported by Perl. The shift function
returns (and removes) the first element of an array; inside a subroutine, shift with no
argument operates on @_. For example:
sub add
{
my $numbera = shift;
my $numberb = shift;
my $result = $numbera + $numberb;
print "The result was: $result\n";
}
The effect is exactly the same as we have shown earlier but we have just obtained the arguments in
a different way.
The return value of any block, including those used in subroutines, is taken as the value
of the last evaluated expression. For example, the return value here is the result of the
calculation:
sub myfunc
{
$_[0]+$_[1];
}
You can also explicitly return a value using the return keyword:
sub myfunc
{
if (@_)
{
return $_[0]+$_[1];
}
else
{
return 0;
}
}
When called, return immediately terminates the current subroutine and returns the value to the
caller. If you don't specify a value then the return value is undef.
$name = getpwent();
($name, $passwd, $uid, $gid, $quota,
$comment, $gcos, $dir, $shell) = getpwent();
In the first case, the user expects a scalar value to be returned by the function, because that is
what the return value is being assigned to. In the second case, the user expects a list as the
return value, again because a list of scalars has been specified for the information to be inserted
into.
Here's another example, again from the built-in Perl functions, that shows the flexibility:
my $timestr = localtime(time);
In this example, the value of $timestr is now a string made up of the current date and
time, for example, Thu Nov 30 15:21:33 2000. Conversely:
($sec,$min,$hour,$mday,$mon,
$year,$wday,$yday,$isdst) = localtime(time);
Now the individual variables contain the corresponding values returned by localtime.
Lvalue subroutines
WARNING: Lvalue subroutines are still experimental and the implementation may change in future
versions of Perl.
my $val;
sub canmod : lvalue {
# return $val; this doesn't work, don't say "return"
$val;
}
sub nomod {
$val;
}
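Using the lvalue subroutine declared above can be sketched as follows. The sketch repeats the declaration so that it runs on its own; the variable $after_assign is an assumption added to record the result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $val = 0;

# The last expression evaluated is the thing that gets assigned to
sub canmod : lvalue {
    $val;
}

canmod() = 5;              # assigns 5 to $val through the subroutine
my $after_assign = $val;

print "val is now $after_assign\n";
```

Attempting the same assignment with a subroutine that lacks the lvalue attribute, such as nomod above, is a compile-time error.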
mysub(1,2,3);
@args = (2,3);
mysub(1,@args);
@args = (1,2,3);
mysub(@args);
In each case the values end up merged into the single @_ array inside the subroutine, so we cannot
tell whether we passed one array, two arrays, or a plain list of values.
If you want to work with and identify the individual lists passed to Perl, then you need to use
references:
simplesort(\@lista, \@listb);
The leading \ character tells Perl to supply a reference, or pointer, to the array. A
reference is actually just a scalar, so we can identify each list by assigning the reference
to each array within our subroutine. Now you can write your subroutine as follows:
sub simplesort
{
my ($listaref, $listbref ) = @_;
...
}
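The difference between passing arrays and passing references can be sketched as follows. The body of simplesort shown here (sorting each list and returning two references) is an assumption for illustration, as are the helper count_args and the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived in @_
sub count_args { return scalar @_; }

# Assumed body: sort each list separately and return two references
sub simplesort
{
    my ($listaref, $listbref) = @_;
    my @lista = sort @$listaref;    # de-reference with @$
    my @listb = sort @$listbref;
    return (\@lista, \@listb);
}

my @first  = qw(pear apple);
my @second = qw(kiwi fig);

my $flattened = count_args(@first, @second);     # arrays merge into one list
my $by_ref    = count_args(\@first, \@second);   # two scalars, one per array

my ($sorteda, $sortedb) = simplesort(\@first, \@second);
print "$flattened arguments flattened, $by_ref by reference\n";
print "@$sorteda | @$sortedb\n";
```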
When you supply a hash to a subroutine or operator that accepts a list, the hash is
automatically translated into a list of key/value pairs. For example:
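A minimal sketch, assuming a small hash with the keys name and age; the extra variables @flat and %copy are assumptions added to show the conversion in both directions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash = ('name' => 'Tom', 'age' => 19);

# print accepts a list, so the hash is flattened into keys and values
print %hash;
print "\n";

my @flat = %hash;    # four elements: two key/value pairs
my %copy = @flat;    # a list of pairs can be turned back into a hash
```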
This will output "nameTomage19" (the pairs may appear in a different order, because hashes are
unordered). However, the same process works in reverse, so we
can extract a list and convert it to a hash:
sub display_hash
{
my (%hash) = @_;
foreach (keys %hash)
{
print "$_ => $hash{$_}\n";
}
}
In this case, we output the key/value pairs of the hash properly, displaying each pair on
its own line. As with arrays, care needs to be taken if you expect to pick out a single hash
from a list of arguments. The following will work because we extract the hash last:
sub display_has_regexp
{
my ($regex, %hash) = @_;
...
}
while this one won't because we try to extract the hash first (the hash absorbs every
remaining argument, so $regex is left undefined):
sub display_has_regexp
{
my (%hash, $regex) = @_;
...
}
If you want to work with multiple hashes, then use references. For example, the
following subroutine returns the key intersection of two hashes:
sub intersection
{
my ($hasha, $hashb) = @_;
my %newhash;
foreach my $key (keys %{$hasha})
{
$newhash{$key} = $$hasha{$key} if (exists $$hashb{$key});
}
return %newhash;
}
PERL FORMATS
As stated earlier, Perl stands for Practical Extraction and Report Language, and we'll now
discuss using Perl to write reports.
Perl uses a writing template called a 'format' to output reports. To use the format feature of Perl,
you must:
Define a Format
Pass the data that will be displayed on the format
Invoke the Format
Define a Format
format FormatName =
fieldline
value_one, value_two, value_three
fieldline
value_one, value_two
.
FormatName represents the name of the format. The fieldline is the specific way the data should be
formatted. The values lines represent the values that will be entered into the field line. You end the
format with a single period.
fieldline can contain any text or fieldholders. Fieldholders hold space for data that will be
placed there at a later date. A fieldholder has the format:
@<<<<
This fieldholder is left-justified, with a field space of 5. You must count the @ sign and
the < signs to know the number of spaces in the field. Other field holders include:
@>>>> right-justified
@|||| centered
@####.## numeric field holder
@* multiline field holder
format EMPLOYEE =
===================================
@<<<<<<<<<<<<<<<<<<<<<< @<<
$name, $age
@#####.##
$salary
===================================
.
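In this example $name is written left-justified in the first field, whose width is the @ sign plus
the < signs that follow it, and $age is written in the three-character field after it. $salary is
written in the numeric field on the next line, with two decimal places.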
The problem is that the format name is usually the name of an open file handle, and the
write statement will send the output to this file handle. As we want the data sent to
STDOUT, we must associate EMPLOYEE with the STDOUT filehandle. First, however,
we must make sure that STDOUT is our selected file handle, using the select()
function:
select(STDOUT);
We would then associate EMPLOYEE with STDOUT by setting the new format name
with STDOUT, using the special variable $~
$~ = "EMPLOYEE";
When we now do a write(), the data will be sent to STDOUT. Remember: if STDOUT was not
already your default file handle, you can revert to the original file handle by saving
the return value of select in a scalar variable and calling select with that variable
once the format name has been assigned to $~.
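The steps above (define the format, select STDOUT, set $~, then write) can be sketched as one script. This is a simplified illustration: the format holds only the name and age fields, the records reuse the names from the sample report, and the package variables declared with our are an assumption needed under use strict.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our ($name, $age);            # the format reads these package variables

format EMPLOYEE =
@<<<<<<<<<<<<<<<<<<<<<< @<<
$name,                  $age
.

select(STDOUT);               # make STDOUT the current output handle
$~ = "EMPLOYEE";              # attach the EMPLOYEE format to it

my @records = (["Kirsten", 12], ["Mohammad", 35], ["Suhi", 15], ["Namrat", 10]);
foreach my $record (@records)
{
    ($name, $age) = @$record;
    write;                    # prints one formatted line per record
}
```

Each call to write fills the field line with the current values of $name and $age, so the loop only has to update those variables.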
Kirsten 12
Mohammad 35
Suhi 15
Namrat 10
Everything looks fine, but you may want to add a header to your report.
This header will be printed at the top of each page. Apart from
defining a template, you define a header format which has the same name
followed by _TOP, as follows:
format EMPLOYEE_TOP =
------------------------
Name Age
------------------------
.
------------------------
Name Age
------------------------
Kirsten 12
Mohammad 35
Suhi 15
Namrat 10
What if your report takes more than one page? There is a solution for that: use the $%
variable (the current page number) in the header, as follows:
format EMPLOYEE_TOP =
------------------------
Name Age Page @<
$%
------------------------
.
------------------------
Name Age Page 1
------------------------
Kirsten 12
Mohammad 35
Suhi 15
Namrat 10
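You can set the number of lines per page using the special variable $= (or $FORMAT_LINES_PER_PAGE).
By default $= is 60.
One final thing is left, which is the footer. Unlike the header, Perl has no built-in mechanism that
writes a footer automatically, so a format named with a _BOTTOM suffix is not invoked by write on its
own. You can still define such a format and write it yourself at the end of each page, for example by
checking $- (the number of lines left on the page) and selecting the footer format with $~.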
format EMPLOYEE_BOTTOM =
End of Page @<
$%
.
------------------------
Name Age Page 1
------------------------
Kirsten 12
Mohammad 35
Suhi 15
Namrat 10
End of Page 1
For a complete set of variables related to formatting, please refer to the Perl Special Variables section.
You can identify and trap an error in a number of different ways. It is very easy to trap errors in Perl
and then handle them properly. Here are a few methods which can be used.
Using if
The if statement is the obvious choice when you need to check the return value from a
statement; for example:
if (open(DATA,$file))
{
...
}
else
{
die "Error: Couldn't open the file $!";
}
Alternatively, we can reduce the statement to one line in situations where it makes sense
to do so; for example:
Using unless
The unless statement is the logical opposite to if: statements can completely bypass the
success status and only be executed if the expression returns false. For example:
unless(chdir("/etc"))
{
die "Error: Can't change directory!: $!";
}
The unless statement is best used when you want to raise an error or alternative only if the
expression fails. The statement also makes sense when used in a single-line statement:
For very short tests, you can use the conditional operator:
It's not quite so clear here what we are trying to achieve, but the effect is the same as using an if or
unless statement. The conditional operator is best used when you want to quickly return one of two
values within an expression or statement.
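The single-line forms discussed in this section can be sketched together. The path /no/such/dir is a hypothetical location assumed not to exist, the eval block is an assumption added so that die does not end the sketch, and %hash is sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = "/no/such/dir/file.txt";   # hypothetical path assumed not to exist

# One-line test with ||; eval traps the die so the sketch can carry on
my $opened = eval {
    open(my $fh, '<', $file) || die "Error: Couldn't open the file $!\n";
    1;
};
my $error = $@;

# unless as a statement modifier
my @errors;
push @errors, "Can't change directory!: $!" unless chdir("/no/such/dir");

# Conditional operator: quickly pick one of two values
my %hash = (value => 1);
my $answer = exists($hash{value}) ? 'There' : 'Missing';

print $error;
print "$_\n" foreach @errors;
print "$answer\n";
```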
The warn function just raises a warning, a message is printed to STDERR, but no further
action is taken.
The die function works just like warn, except that it also calls exit. Within a normal
script, this function has the effect of immediately terminating execution.
Reporting an error in a module that quotes the module's filename and line number - this is
useful when debugging a module, or when you specifically want to raise a module-related,
rather than script-related, error.
Reporting an error within a module that quotes the caller's information so that you can
debug the line within the script that caused the error. Errors raised in this fashion are useful
to the end-user, because they highlight the error in relation to the calling script's
origination line.
The warn and die functions work slightly differently than you would expect when called
from within a module. For example, the simple module:
package T;
require Exporter;
@ISA = qw/Exporter/;
@EXPORT = qw/function/;
use Carp;
sub function
{
warn "Error in module!";
}
1;
use T;
function();
This is more or less what you might expect, but not necessarily what you want. From a module
programmer's perspective, the information is useful because it helps to point to a bug within the
module itself. For an end-user, the information provided is fairly useless, and for all but the
hardened programmer, it is completely pointless.
The solution for such problems is the Carp module, which provides a simplified method for reporting
errors within modules that return information about the calling script. The Carp module provides
four functions: carp, cluck, croak, and confess. These functions are discussed below
The carp function is the basic equivalent of warn and prints the message to STDERR
without actually exiting the script and printing the script name.
The cluck function is a sort of supercharged carp: it follows the same basic principle but
also prints a stack trace of all the modules that led to the function being called, including
information on the original script.
The croak function is the equivalent of die, except that it reports the caller one level up.
Like die, this function also exits the script after reporting the error to STDERR:
croak "Definitely didn't work";
As with carp, the same basic rules apply regarding the including of line and file information
according to the warn and die functions.
The confess function is like cluck; it calls die and then prints a stack trace all the way up
to the origination script.
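The difference in reporting can be sketched in a single file. Here the package T is declared inline rather than in its own module file, which is an assumption made to keep the example self-contained, and eval is used to trap the error raised by croak. The subroutine names function and noisy are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package T;
use Carp;

# croak dies, but reports the location of the caller
sub function
{
    croak "Definitely didn't work";
}

# carp warns on STDERR, also from the caller's point of view
sub noisy
{
    carp "Error in module!";
    return 1;
}

package main;

my $survived = T::noisy();          # warning goes to STDERR, script continues

my $ok = eval { T::function(); 1 }; # trap the error raised by croak
my $message = $@;
print "Trapped: $message";
```

The trapped message ends with the file name and line number of the call inside the eval block, not the line inside package T.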
Each programmer will, of course, have his or her own preferences in regards to formatting, but
there are some general guidelines that will make your programs easier to read, understand, and
maintain
The most important thing is to run your programs under the -w flag at all times. You may turn it off
explicitly for particular portions of code via the no warnings pragma or the $^W variable if you
must. You should also always run under use strict or know the reason why not. The use sigtrap and
even use diagnostics pragmas may also prove useful.
Regarding aesthetics of code lay out, about the only thing Larry cares strongly about is that the
closing curly bracket of a multi-line BLOCK should line up with the keyword that started the
construct. Beyond that, he has other preferences that aren't so strong:
4-column indent.
Opening curly on same line as keyword, if possible, otherwise line up.
Space before the opening curly of a multi-line BLOCK.
One-line BLOCK may be put on one line, including curlies.
No space before the semicolon.
Semicolon omitted in "short" one-line BLOCK.
Space around most operators.
Space around a "complex" subscript (inside brackets).
Blank lines between chunks that do different things.
Uncuddled elses.
No space between function name and its opening parenthesis.
Space after each comma.
Long lines broken after an operator (except and and or).
Space after last parenthesis matching on current line.
Line up corresponding items vertically.
Omit redundant punctuation as long as clarity doesn't suffer.
Here are some other more substantive style issues to think about:
Just because you CAN do something a particular way doesn't mean that you
SHOULD do it that way. Perl is designed to give you several ways to do
anything, so consider picking the most readable one. For instance
open(FOO,$foo) || die "Can't open $foo: $!";
is better than
die "Can't open $foo: $!" unless open(FOO,$foo);
because the second way hides the main point of the statement in a modifier. On
the other hand
print "Starting analysis\n" if $verbose;
is better than
$verbose && print "Starting analysis\n";
because the main point isn't whether the user typed -v or not.
Don't go through silly contortions to exit a loop at the top or the bottom, when
Perl provides the last operator so you can exit in the middle. Just "outdent" it a
little to make it more visible:
LINE:
    for (;;) {
        statements;
      last LINE if $foo;
        next LINE if /^#/;
        statements;
    }
Don't be afraid to use loop labels--they're there to enhance readability as well as to allow
multilevel loop breaks. See the previous example.
Avoid using grep() (or map()) or `backticks` in a void context, that is, when you just throw
away their return values. Those functions all have return values, so use them. Otherwise
use a foreach() loop or the system() function instead.
For portability, when using features that may not be implemented on every machine, test
the construct in an eval to see if it fails. If you know what version or patchlevel a particular
feature was implemented, you can test $] ($PERL_VERSION in English) to see if it will be
there. The Config module will also let you interrogate values determined by the Configure
program when Perl was installed.
Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a
problem.
While short identifiers like $gotit are probably ok, use underscores to separate words in
longer identifiers. It is generally easier to read $var_names_like_this than
$VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule
that works consistently with VAR_NAMES_LIKE_THIS.
Package names are sometimes an exception to this rule. Perl informally reserves lowercase
module names for "pragma" modules like integer and strict. Other modules should begin
with a capital letter and use mixed case, but probably without underscores due to
limitations in primitive file systems' representations of module names as files that must fit
into a few sparse bytes.
If you have a really hairy regular expression, use the /x modifier and put in some
whitespace to make it look a little less like line noise. Don't use slash as a delimiter when
your regexp has slashes or backslashes.
Always check the return codes of system calls. Good error messages should go to
STDERR, include which program caused the problem, what the failed system call
and arguments were, and (VERY IMPORTANT) should contain the standard
system error message for what went wrong. Here's a simple but sufficient
example:
opendir(D, $dir) or die "can't opendir $dir: $!";
Think about reusability. Why waste brainpower on a one-shot when you might want to do
something like it again? Consider generalizing your code. Consider writing a module or
object class. Consider making your code run cleanly with use strict and use warnings (or -w)
in effect. Consider giving away your code. Consider changing your whole world view.
Consider... oh, never mind.
Be consistent.
Be nice.
ADVANCED PERL
PERL SOCKETS
NOTE: If you are already familiar with UNIX sockets, you can skip the introduction.
What is a socket?
Just another bit of computer jargon? Going back a little into networking history, it is a Berkeley
UNIX mechanism of creating a virtual duplex connection between processes. This was later ported
on to every known OS enabling communication between systems across geographical location
running on different OS software. If not for the socket, most of the network communication between
systems would never ever have happened.
Taking a closer look: a typical computer system on a network receives and sends information as
desired by the various applications running on it. This information is routed to the system, since a
unique IP address is designated to it. On the system, this information is given to the relevant
applications, which listen on different ports. For example, a web server listens on port 80 for
requests from browsers. We can also write applications which listen and send information on a
specific port number.
For now, let's sum up that a socket is an IP address and a port, enabling connection.
To explain the socket we will take an example of Client - Server Programming. To complete a client
server architecture we would have to go through the following steps
Creating A Server
Create a socket with socket call.
Bind the socket to a port address with bind call.
Listen to the socket at the port address with listen call.
Accept client connections with accept call.
Creating A Client
Create a socket with socket call.
Connect (the socket) to the remote machine with connect call.
Create a socket
The first step in establishing a network connection is creating a socket, with the socket()
function.
socket() creates a SOCKET. The other three arguments (domain, type, and protocol) are integers; for
TCP/IP connections they are normally PF_INET, SOCK_STREAM, and the protocol number returned by
getprotobyname('tcp').
A server then binds the socket with bind(), which takes the socket and a socket address. For
TCP/IP the address contains three elements:
1. The address family (For TCP/IP, that's AF_INET, probably 2 on your system )
2. The port number ( for example 21 )
3. The internet address of the computer ( for example 10.12.12.168 )
Because bind() is used by a server, which does not need to know its own address, the
argument list looks like this:
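A minimal sketch of the socket() and bind() calls, assuming the standard Socket module. It uses sockaddr_in() to build the packed address instead of a hand-written pack() template, and port 0 (an assumption made so the sketch runs anywhere) asks the operating system for any free port; a real server would use a fixed port number.

```perl
use strict;
use warnings;
use Socket;

my $port  = 0;                        # assumption: 0 lets the OS pick a free port
my $proto = getprotobyname('tcp');

# socket(SOCKET, DOMAIN, TYPE, PROTOCOL)
socket(my $server, PF_INET, SOCK_STREAM, $proto)
    or die "Can't open socket: $!";

# allow the port to be reused soon after the server exits
setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
    or die "Can't set SO_REUSEADDR: $!";

# bind(SOCKET, ADDRESS); INADDR_ANY stands for any address of this machine
bind($server, sockaddr_in($port, INADDR_ANY))
    or die "Can't bind to port $port: $!";

# ask the socket which port it actually received
my ($bound_port) = sockaddr_in(getsockname($server));
print "bound to port $bound_port\n";
```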
The or die clause is very important: if a server dies without outstanding connections, the port won't
be immediately reusable unless you set the SO_REUSEADDR option using the setsockopt() function.
Here the pack() function is being used to pack all the data into binary format.
If this is a server program then it is required to issue a call to listen() on the specified port.
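A short sketch of the listen() call, again assuming the Socket module. The queue size of 5 and the use of the loopback address with an OS-chosen port are assumptions made for the example.

```perl
use strict;
use warnings;
use Socket;

my $QUEUESIZE = 5;   # assumption: a small backlog is enough for a sketch

socket(my $server, PF_INET, SOCK_STREAM, getprotobyname('tcp'))
    or die "Can't open socket: $!";
bind($server, sockaddr_in(0, INADDR_LOOPBACK))   # port 0: let the OS choose
    or die "Can't bind: $!";

# listen(SOCKET, QUEUESIZE) marks the socket as ready to accept connections
my $listening = listen($server, $QUEUESIZE)
    or die "Can't listen: $!";
print "listening\n";
```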
The above call is mandatory for all servers; here QUEUESIZE is the maximum number of
outstanding connection requests allowed. Generally, listen() is called once and the server then
accepts connections in an infinite loop: as soon as one connection arrives the server deals with it
and then goes back to wait for more connections.
Accepting connections
If this is a server program then it is required to issue a call to the accept() function to accept the
incoming connections.
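The accept() call can be sketched as follows. To keep the example runnable as a single script, the client lives in the same process and connects over the loopback interface; that arrangement is an assumption for demonstration only, since a real client would be a separate program.

```perl
use strict;
use warnings;
use Socket;

my $proto = getprotobyname('tcp');

# server side: socket, bind to a free loopback port, listen
socket(my $server, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
bind($server, sockaddr_in(0, INADDR_LOOPBACK))   or die "bind: $!";
listen($server, 5)                               or die "listen: $!";
my ($port) = sockaddr_in(getsockname($server));

# client side (same process, for the demo only): connect to that port
socket(my $client, PF_INET, SOCK_STREAM, $proto)      or die "socket: $!";
connect($client, sockaddr_in($port, INADDR_LOOPBACK)) or die "connect: $!";

# accept(NEW_SOCKET, SOCKET) returns the packed address of the peer
my $client_addr = accept(my $new_socket, $server) or die "accept: $!";

# talk over the new socket; $server stays free for further accept calls
print {$new_socket} "Smile from the server\n";
close $new_socket;

my $line = <$client>;
close $client;
print "client received: $line";
```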
The accept call receives the SOCKET descriptor returned by the socket() function. Upon successful
completion of this call, a new socket descriptor is returned. All future communication between client
and server then takes place over NEW_SOCKET, and SOCKET returns to what it does best:
listen()ing for a new connection. If the accept() call fails, it returns a false value.
You will often see accept() used inside a while loop as follows:
while(1) {
    accept( NEW_SOCKET, SOCKET );
    .......
}
Now all the calls related to the server are covered; let us look at the call required by the client.
Connection Establishment
If you are going to prepare a client program, then after the socket() call you would have to use
another call, connect( SOCKET, ADDRESS ), to connect to the server.
Here ADDRESS is a socket address similar to the one used in the bind call, except that it contains
the IP address of the remote server.
$port = 21; # the ftp port
$server_ip_address = "10.12.12.168";
# inet_aton() converts the dotted address and pack_sockaddr_in() builds ADDRESS
connect( SOCKET, pack_sockaddr_in( $port, inet_aton($server_ip_address) ))
    or die "Can't connect to port $port! \n";
If you connect to the server successfully then you can start sending your commands to the server
using SOCKET descriptor.
use strict;
use Socket;
# accepting a connection (SOCKET must already be created, bound, and listening)
my $client_addr;
while ($client_addr = accept(NEW_SOCKET, SOCKET)) {
    # send them a message, close connection
    print NEW_SOCKET "Smile from the server";
    close NEW_SOCKET;
}
To run the server above in background mode, issue the following command at the Unix prompt:
$ perl server.pl &
A client that has connected with connect() can then read what the server sends:
use strict;
use Socket;
my $line;
while ($line = <SOCKET>) {
print "$line\n";
}
close SOCKET or die "close: $!";
Packages enable the construction of modules which, when used, won't clobber variables
and functions outside of the module's own namespace.
If the named package does not exist, a new namespace is first created.
The package stays in effect until either another package statement is invoked, or
until the end of the current block or file.
You can explicitly refer to variables within a package using the :: package qualifier:
$PACKAGE_NAME::VARIABLE_NAME
For Example:
$i = 1; print "$i\n"; # Prints "1"
package foo;
$i = 2; print "$i\n"; # Prints "2"
package main;
print "$i\n"; # Prints "1"
You may define any number of code blocks named BEGIN and END which act as
constructors and destructors respectively.
BEGIN { ... }
END { ... }
BEGIN { ... }
END { ... }
Every BEGIN block is executed during compilation, as soon as the block itself has been parsed,
and therefore before any ordinary statement of the script is executed.
Every END block is executed just before the perl interpreter exits.
The BEGIN and END blocks are particularly useful when creating Perl modules.
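The ordering can be checked with a small sketch; the array @order and the messages are illustrative names only. Although the BEGIN block appears after the first push in the source, it runs first because it executes at compile time.

```perl
use strict;
use warnings;

our @order;

push @order, 'main code';            # runs at run time

END   { print "END block runs last\n" }
BEGIN { push @order, 'BEGIN block' } # runs during compilation

print join(' -> ', @order), "\n";    # prints "BEGIN block -> main code"
```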
#!/usr/bin/perl
package Foo;
sub bar {
print "Hello $_[0]\n"
}
sub blat {
print "World $_[0]\n"
}
1;
The 1; at the bottom causes eval to evaluate to TRUE (and thus not fail)
#!/usr/bin/perl
require Foo;
Foo::bar( "a" );
Foo::blat( "b" );
Notice above that the subroutine names must be fully qualified (because they are isolated
in their own package)
It would be nice to enable the functions bar and blat to be imported into our own
namespace so we wouldn't have to use the Foo:: qualifier.
#!/usr/bin/perl
use Foo;
bar( "a" );
blat( "b" );
Notice that we didn't have to fully qualify the package's function names.
The use function will import a list of symbols from a module, given a few added
statements inside the module:
require Exporter;
@ISA = qw(Exporter);
Then, provide a list of symbols (scalars, lists, hashes, subroutines, etc) by filling the list
variable named @EXPORT: For Example
package Module;
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(bar blat);
1;
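Putting the pieces together, here is a single-file sketch of a module that exports bar and blat. Normally Foo would live in its own Foo.pm; inlining it in a BEGIN block and marking it as loaded in %INC is an assumption made only so the example runs as one script.

```perl
use strict;
use warnings;

# Foo would normally be in Foo.pm; it is inlined here so the sketch runs as one file.
BEGIN {
    package Foo;
    require Exporter;
    our @ISA    = qw(Exporter);
    our @EXPORT = qw(bar blat);

    sub bar  { return "Hello $_[0]" }
    sub blat { return "World $_[0]" }

    $INC{'Foo.pm'} = __FILE__;   # tell "use Foo" the module is already loaded
}

use Foo;   # imports bar and blat into package main

my $greeting = bar("a");
my $world    = blat("b");
print "$greeting\n$world\n";
```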
The h2xs utility that ships with Perl can generate a skeleton for a new module, for example with
h2xs -AX -n Person. Its options include:
-A omits the Autoloader code (best used by modules that define a large number of
infrequently used subroutines)
-X omits XS elements (eXternal Subroutine, where eXternal means external to
Perl, i.e. C)
The command creates the following structure inside the Person directory:
Changes
Makefile.PL
README
t/ (test files)
Finally, you tar this directory structure into a file Person.tar and you can ship it. You
would have to update the README file with the proper instructions. You can also provide some
test example files in the t directory.
Installing a Perl Module is very easy. Use the following sequence to install any Perl
Module.
perl Makefile.PL
make
make install
The Perl interpreter has a list of directories in which it searches for modules (the global array
@INC).
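A brief sketch of inspecting and extending the search path. The directory /tmp/my_perl_modules is a made-up path used purely for illustration; the lib pragma adds it to the front of @INC whether or not it exists.

```perl
use strict;
use warnings;

# "use lib" prepends a directory to @INC at compile time.
# The path below is hypothetical.
use lib '/tmp/my_perl_modules';

my $first = $INC[0];
print "$_\n" for @INC;   # one search directory per line
```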
Before we start on the object-oriented concepts of Perl, let's understand references and anonymous
arrays and hashes.
References
A reference is, exactly as the name suggests, a reference or pointer to another object.
There are two types of references: symbolic and hard.
A symbolic reference enables you to refer to a variable by name, using the value of another
variable. For example, if the variable $foo contains the string "bar", the symbolic reference
to $foo refers to the variable $bar.
A hard reference, in contrast, refers directly to the data itself and is created with the
backslash operator:
$foo = 'Bill';
$fooref = \$foo;
The $fooref variable now contains a hard reference to the $foo variable. You can do the
same with other variables:
$array = \@ARGV;
$hash = \%ENV;
$glob = \*STDOUT;
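The difference between the two kinds of reference can be sketched as follows. The variable names are illustrative; note that symbolic references only reach package variables and must be explicitly allowed when use strict is in effect.

```perl
use strict;
use warnings;

our $bar = "package variable";
my  $foo = "bar";

my $via_symbolic;
{
    no strict 'refs';          # symbolic references are forbidden under strict refs
    $via_symbolic = ${$foo};   # $foo holds the name "bar", so this reads $main::bar
}

my $name     = 'Bill';
my $nameref  = \$name;         # hard reference to $name
my $via_hard = $$nameref;      # follow the reference back to the value

print "symbolic: $via_symbolic\n";
print "hard:     $via_hard\n";
```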
Anonymous Arrays
When you create a reference to an array directly - that is, without creating an intervening named
array - you are creating an anonymous array.
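A minimal sketch of creating an anonymous array; the names in the list are sample values chosen for illustration.

```perl
use strict;
use warnings;

# square brackets build an anonymous array and return a reference to it
my $array = [ 'Bill', 'Ben', 'Mary' ];

print "first element: $array->[0]\n";
print "element count: ", scalar(@$array), "\n";
```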
This line assigns an array, indicated by the enclosing square brackets instead of the normal
parentheses, to the scalar $array. The values on the right side of the assignment make up the
array, and the left side contains the reference to this array.
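Anonymous arrays can also be nested inside an ordinary array. The sketch below, with illustrative values, builds an array named @arrayarray whose last element is itself an array reference.

```perl
use strict;
use warnings;

# two plain scalars followed by a reference to an anonymous array
my @arrayarray = ( 1, 2, [ 1, 2, 3 ] );

my $count  = scalar @arrayarray;    # number of top-level elements
my $nested = $arrayarray[2][1];     # second element of the inner array

print "top-level elements: $count\n";
print "inner element: $nested\n";
```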
The @arrayarray now contains three elements; the third element is a reference to an anonymous
array of three elements.
Anonymous Hashes
Anonymous hashes are similarly easy to create, except you use braces instead of square
brackets:
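For instance, a sketch with made-up keys and values:

```perl
use strict;
use warnings;

# braces build an anonymous hash and return a reference to it
my $hash = { 'Man' => 'Bill', 'Woman' => 'Mary', 'Dog' => 'Ben' };

print "Man is $hash->{'Man'}\n";
print "keys: ", join(', ', sort keys %$hash), "\n";
```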
The most direct way of dereferencing a reference is to prepend the corresponding data
type character ($ for scalars, @ for arrays, % for hashes, and & for subroutines) that you
are expecting in front of the scalar variable containing the reference. For example, to
dereference a scalar reference $foo, you would access the data as $$foo. Other examples
are:
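A short sketch covering each kind of dereference, using illustrative variable names. The arrow forms at the end are an equivalent and often more readable way to reach a single element.

```perl
use strict;
use warnings;

my $scalar_ref = \"hello";
my $array_ref  = [ 1, 2, 3 ];
my $hash_ref   = { one => 1, two => 2 };
my $code_ref   = sub { return "called with @_" };

my $scalar = $$scalar_ref;          # scalar behind the reference
my @array  = @$array_ref;           # copy of the whole array
my %hash   = %$hash_ref;            # copy of the whole hash
my $result = &$code_ref("x");       # call the subroutine

my $second = $array_ref->[1];       # single array element
my $two    = $hash_ref->{two};      # single hash value

print "$scalar, @array, $result, $second, $two\n";
```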
Object Basics
There are three main terms, explained from the point of view of how Perl handles objects. The
terms are object, class, and method.
Within Perl, an object is merely a reference to a data type that knows what class it belongs
to. The object is stored as a reference in a scalar variable. Because a scalar only contains a
reference to the object, the same scalar can hold different objects in different classes.
A class within Perl is a package that contains the corresponding methods required to create
and manipulate objects.
A method within Perl is a subroutine, defined with the package. The first argument to the
method is an object reference or a package name, depending on whether the method
affects the current object or the class.
Perl provides a bless() function which associates a reference with a class (package) and returns that reference, which thereby becomes an object.
Defining a Class
It's very simple to define a class. In Perl, a class corresponds to a package. To create a class in
Perl, we first build a package. A package is a self-contained unit of user-defined variables and
subroutines, which can be re-used over and over again. Packages provide a separate namespace within
a Perl program that keeps subroutines and variables from conflicting with those in other packages.
package Person;
The scope of the package definition extends to the end of the file, or until another package keyword
is encountered.
One can use any kind of Perl variable as an object in Perl. Most Perl programmers choose either
references to arrays or hashes.
Let's create our constructor for our Person class using a Perl hash reference;
When creating an object, you need to supply a constructor. This is a subroutine within a
package that returns an object reference. The object reference is created by blessing a
reference to the package's class. For example:
package Person;
sub new
{
my $class = shift;
my $self = {
_firstName => shift,
_lastName => shift,
_ssn => shift,
};
# Print all the values just for clarification.
print "First Name is $self->{_firstName}\n";
print "Last Name is $self->{_lastName}\n";
print "SSN is $self->{_ssn}\n";
bless $self, $class;
return $self;
}
A class method such as new receives the class name as its first argument, so in the above example
$class would be "Person". You can try this out by printing the value of $class. The remaining
arguments are the ones passed to the method by the caller.
You can bless a reference to an empty hash in your constructor if you don't want to assign a value
to any attribute. For example
package Person;
sub new
{
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
Defining Methods
Other object-oriented languages have the concept of data security to prevent a programmer from
changing object data directly, and so provide accessor methods to modify object data. Perl does
not have private variables, but we can still use helper methods and ask
programmers not to mess with our object innards.
sub getFirstName {
    my( $self ) = @_;   # the object is always the first argument
    return $self->{_firstName};
}
sub setFirstName {
my ( $self, $firstName ) = @_;
$self->{_firstName} = $firstName if defined($firstName);
return $self->{_firstName};
}
Let's have a look at a complete example. Keep the Person package and helper functions in the
Person.pm file:
#!/usr/bin/perl
package Person;
sub new
{
my $class = shift;
my $self = {
_firstName => shift,
_lastName => shift,
_ssn => shift,
};
# Print all the values just for clarification.
print "First Name is $self->{_firstName}\n";
print "Last Name is $self->{_lastName}\n";
print "SSN is $self->{_ssn}\n";
bless $self, $class;
return $self;
}
sub setFirstName {
my ( $self, $firstName ) = @_;
$self->{_firstName} = $firstName if defined($firstName);
return $self->{_firstName};
}
sub getFirstName {
my( $self ) = @_;
return $self->{_firstName};
}
1;
Now create a Person object in the main.pl file as follows:
#!/usr/bin/perl
use Person;
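As a sketch of how the class might be used, the following inlines a trimmed copy of Person so it runs as a single file. The sample name, the number passed as SSN, and the variable names are invented for illustration.

```perl
use strict;
use warnings;

package Person;

sub new {
    my $class = shift;
    my $self  = {
        _firstName => shift,
        _lastName  => shift,
        _ssn       => shift,
    };
    return bless $self, $class;
}

sub setFirstName {
    my ($self, $firstName) = @_;
    $self->{_firstName} = $firstName if defined $firstName;
    return $self->{_firstName};
}

sub getFirstName {
    my ($self) = @_;
    return $self->{_firstName};
}

package main;

# hypothetical sample data
my $object = Person->new("Mohammad", "Saleem", 23234345);
my $before = $object->getFirstName();

$object->setFirstName("Mohd.");
my $after = $object->getFirstName();

print "Before setting first name: $before\n";
print "After setting first name:  $after\n";
```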
Inheritance
Object-oriented programming sometimes involves inheritance. Inheritance simply means allowing
one class, called the Child, to inherit methods and attributes from another, called the Parent, so you
don't have to write the same code again and again. For example, we can have a class Employee
which inherits from Person. This is referred to as an "isa" relationship because an employee is a
person. Perl has a special variable, @ISA, to help with this. @ISA governs (method) inheritance.
When a method is called on an object, Perl looks for it as follows:
Perl searches the class of the specified object for the specified method.
Perl searches the classes defined in the object class's @ISA array.
If a matching method still cannot be found, then Perl searches for the method within the
UNIVERSAL class (package) that comes as part of the standard Perl library.
If the method still hasn't been found, then Perl gives up and raises a runtime exception.
So to create a new Employee class that will inherit methods and attributes from our Person class,
we simply code the following. Keep this code in Employee.pm:
#!/usr/bin/perl
package Employee;
use Person;
use strict;
our @ISA = qw(Person); # inherits from Person
Now Employee Class has all the methods and attributes inherited from Person class and
you can use it as follows: Use main.pl file to test it
#!/usr/bin/perl
use Employee;
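A single-file sketch of the inheritance relationship, with cut-down versions of both classes inlined so it can be run directly. The shortened Person constructor and the sample names are assumptions made for brevity.

```perl
use strict;
use warnings;

package Person;

sub new {
    my ($class, $first, $last) = @_;
    return bless { _firstName => $first, _lastName => $last }, $class;
}

sub getFirstName {
    my ($self) = @_;
    return $self->{_firstName};
}

package Employee;
our @ISA = ('Person');    # inherits from Person; defines no methods of its own

package main;

# new and getFirstName are both found in Person through @ISA
my $employee  = Employee->new("Mohammad", "Saleem");
my $class     = ref $employee;
my $first     = $employee->getFirstName();
my $is_person = $employee->isa('Person');

print "class: $class, first name: $first\n";
```

Because Person::new blesses into whatever class name it receives, the object ends up in class Employee even though Employee has no constructor of its own.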
Method Overriding
The child class Employee inherits all the methods from the parent class Person. If you
would like to override those methods in your child class, you can do so by giving your own
implementation. You can also add additional functions in the child class. It can be done as
follows: modify the Employee.pm file
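#!/usr/bin/perl
package Employee;
use Person;
use strict;
our @ISA = qw(Person); # inherits from Person
# Override constructor
sub new {
    my ($class, @args) = @_;
    # Let the parent class build the object; because $class is Employee,
    # Person::new blesses the new hash into Employee.
    my $self = $class->SUPER::new(@args);
    return $self;
}
# Add a method that Person does not provide
sub getLastName {
    my( $self ) = @_;
    return $self->{_lastName};
}
1;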
#!/usr/bin/perl
use Employee;
Default Autoloading
Perl offers a feature which you would not find in many other programming languages: a default
subroutine.
If you define a function called AUTOLOAD() then any calls to undefined subroutines
will call the AUTOLOAD() function. The name of the missing subroutine is accessible
within this subroutine as $AUTOLOAD. This function is very useful for error handling
purposes. Here is an example implementation of AUTOLOAD; you can implement this
function in your own way.
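use Carp;
our $AUTOLOAD;
sub AUTOLOAD
{
    my $self = shift;
    my $type = ref ($self) || croak "$self is not an object";
    my $field = $AUTOLOAD;
    $field =~ s/.*://;                 # strip the package qualifier
    return if $field eq 'DESTROY';     # object destruction is not a field access
    unless (exists $self->{$field})
    {
        croak "$field does not exist in object/class $type";
    }
    if (@_)
    {
        return $self->{$field} = shift;   # called with a value: act as a setter
    }
    else
    {
        return $self->{$field};           # called without a value: act as a getter
    }
}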
In case you want to implement a destructor which takes care of closing files or doing
some extra processing, you need to define a special method called DESTROY. This method will
be called on the object just before Perl frees the memory allocated to it. In all other respects, the
DESTROY method is just like any other, and you can do anything you like with the object in order to
close it properly.
A destructor method is simply a member function (subroutine) named DESTROY, which is called
automatically when the last reference to the object goes away (for example when the variable
holding it goes out of scope or is undefined) or when the program exits.
For Example:
package MyClass;
...
sub DESTROY
{
print " MyClass::DESTROY called\n";
}
Here is another nice example which will help you to understand Object Oriented
Concepts of Perl. Put this source code into any file and execute it.
#!/usr/bin/perl
# Implementation of a simple class
package MyClass;
sub new
{
    print " MyClass::new called\n";
    my $type = shift; # The package/type name
    my $self = {}; # Reference to empty hash
    return bless $self, $type;
}
sub DESTROY
{
    print " MyClass::DESTROY called\n";
}
sub MyMethod
{
    print " MyClass::MyMethod called!\n";
}
# Implementation of a subclass
package MySubClass;
@ISA = qw( MyClass );
sub new
{
    print " MySubClass::new called\n";
    my $type = shift; # The package/type name
    my $self = MyClass->new; # Object built by the parent constructor
    return bless $self, $type; # Re-bless it into the subclass
}
sub DESTROY
{
    print " MySubClass::DESTROY called\n";
}
sub MyMethod
{
    my $self = shift;
    $self->SUPER::MyMethod();
    print " MySubClass::MyMethod called!\n";
}
# Main program using the classes above
package main;
$myObject = MyClass->new();
$myObject->MyMethod();
$myObject2 = MySubClass->new();
$myObject2->MyMethod();
DATABASE MANAGEMENT
This chapter will teach you how to access Oracle and other databases using Perl.
Starting from Perl 5 it has become very easy to write database applications using DBI. DBI stands
for Database Independent Interface for Perl, which means DBI provides an abstraction layer between
the Perl code and the underlying database, allowing you to switch database implementations really
easily.
The DBI is a database access module for the Perl programming language. It defines a set of
methods, variables, and conventions that provide a consistent database interface, independent of
the actual database being used.
DBI is independent of any database available in the backend. You can use DBI whether you are working
with Oracle, MySQL, Informix, etc. This is clear from the following architecture diagram.
Here DBI is responsible for taking all SQL commands through the API (Application Programming
Interface) and dispatching them to the appropriate driver for actual execution. Finally, DBI is
responsible for taking the results from the driver and giving them back to the calling script.
Throughout this chapter following notations will be used and it is recommended that you
should also follow the same convention.
Database Connection
We assume that we are going to work with a MySQL database. The examples use a database named
TESTDB, accessed as user testuser, containing a table TEST_TABLE.
This table has the fields FIRST_NAME, LAST_NAME, AGE, SEX and INCOME.
#!/usr/bin/perl
use DBI;
use strict;
my $driver = "mysql";
my $database = "TESTDB";
my $dsn = "DBI:$driver:database=$database";
my $userid = "testuser";
my $password = "test123";
my $dbh = DBI->connect($dsn, $userid, $password )
or die $DBI::errstr;
If a connection is established with the datasource then a Database Handle is returned and saved
into $dbh for further use otherwise $dbh is set to undef value and $DBI::errstr returns an error
string.
INSERT Operation
INSERT operation is required when you want to create records in TEST_TABLE. Once our
database connection is established, we are ready to create records in TEST_TABLE. Following is
the procedure to create a single record in TEST_TABLE. You can create many records in a similar
fashion.
Preparing the SQL statement with INSERT. This will be done using the prepare() API.
Executing the SQL statement to insert the record into the database. This will be done
using the execute() API.
Releasing the statement handle. This will be done using the finish() API.
If everything goes fine then commit this operation; otherwise you can roll back the complete
transaction. Commit and Rollback are explained in the next sections.
There may be a case when the values to be entered are not given in advance. In such a case
bind values are used. A question mark is used in place of each actual value, and the actual
values are then passed through the execute() API. Following is the example.
my $first_name = "john";
my $last_name = "poul";
my $sex = "M";
my $income = 13000;
my $age = 30;
my $sth = $dbh->prepare("INSERT INTO TEST_TABLE
                         (FIRST_NAME, LAST_NAME, SEX, AGE, INCOME )
                         values
                         (?,?,?,?,?)");   # one placeholder per column
$sth->execute($first_name,$last_name,$sex, $age, $income)
or die $DBI::errstr;
$sth->finish();
$dbh->commit or die $DBI::errstr;
READ Operation
READ operation on any database means fetching some useful information from the database. Once
our database connection is established, we are ready to query this database.
Following is the procedure to query all the records having AGE greater than 20. This will take four
steps:
Preparing the SQL query based on required conditions. This will be done using the prepare() API.
Executing the SQL query to select all the results from the database. This will be done
using the execute() API.
Fetching all the results one by one and printing those results. This will be done
using the fetchrow_array() API.
Releasing the statement handle. This will be done using the finish() API.
There may be a case when the condition is not given in advance. In such a case bind values are used.
A question mark is used in place of the actual value, and the actual value is then passed
through the execute() API. Following is the example.
my $age = 20;
my $sth = $dbh->prepare("SELECT FIRST_NAME, LAST_NAME
                         FROM TEST_TABLE
                         WHERE AGE > ?");
$sth->execute( $age ) or die $DBI::errstr;
# "." concatenates strings in Perl; "+" would add the values numerically
print "Number of rows found :" . $sth->rows . "\n";
while (my @row = $sth->fetchrow_array()) {
my ($first_name, $last_name ) = @row;
print "First Name = $first_name, Last Name = $last_name\n";
}
$sth->finish();
UPDATE Operation
UPDATE operation on any database means updating one or more records already available in the
database. Following is the procedure to update all the records having SEX as 'M'. Here we will
increase AGE of all the males by one year. This will take three steps:
Preparing the SQL query based on required conditions. This will be done using the prepare() API.
Executing the SQL query to update the matching records in the database. This will be done
using the execute() API.
If everything goes fine then commit this operation; otherwise you can roll back the complete
transaction. See the next section for the commit and rollback APIs.
There may be a case when the condition is not given in advance. In such a case bind values are used.
A question mark is used in place of the actual value, and the actual value is then passed
through the execute() API. Following is the example.
my $sex = 'M';
my $sth = $dbh->prepare("UPDATE TEST_TABLE
                         SET AGE = AGE + 1
                         WHERE SEX = ?");
# pass the variable itself; '$sex' in single quotes would bind the literal text $sex
$sth->execute($sex) or die $DBI::errstr;
print "Number of rows updated :" . $sth->rows . "\n";
$sth->finish();
$dbh->commit or die $DBI::errstr;
In some cases you would like to set a value which is not given in advance, so you can use
a bind value as follows. In this example the income of all males will be set to 10000.
my $sex = 'M';
my $income = 10000;
my $sth = $dbh->prepare("UPDATE TEST_TABLE
                         SET INCOME = ?
                         WHERE SEX = ?");
$sth->execute( $income, $sex ) or die $DBI::errstr;
print "Number of rows updated :" . $sth->rows . "\n";
$sth->finish();
DELETE Operation
DELETE operation is required when you want to delete some records from your database. Following
is the procedure to delete all the records from TEST_TABLE where AGE is equal to 30. This
operation will take the following steps.
Preparing the SQL query based on required conditions. This will be done using the prepare() API.
Executing the SQL query to delete the required records from the database. This will be done
using the execute() API.
If everything goes fine then commit this operation; otherwise you can roll back the complete
transaction.
my $age = 30;
my $sth = $dbh->prepare("DELETE FROM TEST_TABLE
                         WHERE AGE = ?");
$sth->execute( $age ) or die $DBI::errstr;
print "Number of rows deleted :" . $sth->rows . "\n";
$sth->finish();
$dbh->commit or die $DBI::errstr;
Using do Statement
If you're doing an UPDATE, INSERT, or DELETE there is no data that comes back from the database,
so there is a shortcut to perform this operation. You can use the do statement to execute any of
these commands as follows.
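A minimal sketch of the do method, written as a helper so it can be shown without an open connection. The helper name delete_by_age is hypothetical, and $dbh is assumed to be a connected DBI database handle such as the one created in the connection example.

```perl
use strict;
use warnings;

# $dbh is assumed to be a connected DBI database handle;
# delete_by_age is a hypothetical helper name.
sub delete_by_age {
    my ($dbh, $age) = @_;

    # do(STATEMENT, \%attr, @bind_values) prepares and executes in one call
    my $rows = $dbh->do('DELETE FROM TEST_TABLE WHERE AGE = ?', undef, $age)
        or die $dbh->errstr;

    return $rows;   # number of affected rows
}
```

With a live handle this would be called as delete_by_age($dbh, 30). When no rows match, DBI returns the string "0E0", which is numerically zero but still counts as true, so the or die branch is taken only on a real error.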
do returns a true value if it succeeded, and a false value if it failed. Actually, if it succeeds it returns
the number of affected rows. In the example it would return the number of rows that were actually
deleted.
COMMIT Operation
Commit is the operation which gives a green signal to database to finalize the changes and after
this operation no change can be reverted back.
ROLLBACK Operation
If you are not satisfied with the changes and want to revert them, use the
rollback API.
Begin Transaction
Many databases support transactions. This means that you can make a whole bunch of queries
which would modify the databases, but none of the changes are actually made. Then at the end you
issue the special SQL query COMMIT, and all the changes are made simultaneously. Alternatively,
you can issue the query ROLLBACK, in which case all the queries are thrown away.
begin_work API enables transactions (by turning AutoCommit off) until the next call to
commit or rollback. After the next commit or rollback, AutoCommit will automatically
be turned on again.
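The begin_work, commit, and rollback calls can be combined as in the sketch below. The helper name raise_income and the UPDATE statement are illustrative assumptions, and $dbh is assumed to be a connected DBI handle whose database supports transactions.

```perl
use strict;
use warnings;

# $dbh: connected DBI handle (assumed); raise_income is a hypothetical helper.
sub raise_income {
    my ($dbh, $amount) = @_;

    $dbh->begin_work or die $dbh->errstr;    # AutoCommit is off until commit/rollback

    my $ok = eval {
        $dbh->do('UPDATE TEST_TABLE SET INCOME = INCOME + ?', undef, $amount)
            or die $dbh->errstr;
        $dbh->commit or die $dbh->errstr;    # make the change permanent
        1;
    };

    unless ($ok) {
        my $error = $@;
        $dbh->rollback;                      # undo everything since begin_work
        die "transaction rolled back: $error";
    }
    return 1;
}
```

Wrapping the work in eval means any failure between begin_work and commit leads to a rollback, so the table is never left half-updated.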
AutoCommit Option
If your transactions are simple, you can save yourself the trouble of having to issue a lot
of commits. When you make the connect call, you can specify an AutoCommit option
which will perform an automatic commit operation after every successful query. Here's
what it looks like:
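Attributes such as AutoCommit are passed to connect as a hash reference in the fourth argument. The sketch below wraps the call in a hypothetical helper, connect_autocommit, and loads DBI only when the helper is called so that the definition itself needs no database.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a handle that commits after every statement.
sub connect_autocommit {
    my ($dsn, $userid, $password) = @_;

    require DBI;   # loaded on first call

    # the fourth argument is a hash reference of connection attributes
    return DBI->connect($dsn, $userid, $password, { AutoCommit => 1 });
}
```

connect returns undef on failure, so a caller would typically follow the call with or die $DBI::errstr. Passing AutoCommit => 0 instead turns automatic commits off, in which case changes are kept only after an explicit commit.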
When you make the connect call, you can specify a RaiseError option that handles errors
for you automatically. When an error occurs, DBI will abort your program instead of
returning a failure code. If all you want is to abort the program on an error, this can be
convenient. Here's what it looks like:
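A sketch of the RaiseError attribute, together with one way to trap the resulting exception when aborting is not wanted. The helper name connect_or_report is hypothetical; switching PrintError off is a choice made here so the same error is not also printed as a warning.

```perl
use strict;
use warnings;

# attributes passed as the fourth argument to connect
my %attr = ( RaiseError => 1, PrintError => 0 );

# Hypothetical helper: with RaiseError set, a failure makes DBI die,
# so eval is used to trap it and report it as a warning instead.
sub connect_or_report {
    my ($dsn, $userid, $password) = @_;

    require DBI;   # loaded on first call

    my $dbh = eval { DBI->connect($dsn, $userid, $password, \%attr) };
    warn "could not connect: $@" unless $dbh;
    return $dbh;
}
```

Without the eval, the same failure would simply end the program with the DBI error message, which is the behaviour described above.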
Disconnecting Database
To disconnect Database connection, use disconnect API.
The transaction behaviour of the disconnect method is, sadly, undefined. Some database systems
(such as Oracle and Ingres) will automatically commit any outstanding changes, but others (such as
Informix) will rollback any outstanding changes. Applications not using AutoCommit should explicitly
call commit or rollback before calling disconnect.
Undefined values, or undef, are used to indicate NULL values. You can insert and update
columns with a NULL value as you would a non-NULL value. These examples insert and
update the column age with a NULL value:
$sth = $dbh->prepare(qq{
INSERT INTO TEST_TABLE (FIRST_NAME, AGE) VALUES (?, ?)
});
$sth->execute("Joe", undef);
However, care must be taken when trying to use NULL values in a WHERE clause.
Consider a query such as SELECT FIRST_NAME FROM TEST_TABLE WHERE AGE = ? executed with undef as
the bind value.
Binding an undef (NULL) to the placeholder will not select rows which have a NULL age, at least for
database engines that conform to the SQL standard. Refer to the SQL manual for your database
engine or any SQL book for the reasons for this. To explicitly select NULLs you have to say "WHERE
age IS NULL".
A common issue is to have a code fragment handle a value that could be either defined or
undef (non-NULL or NULL) at runtime. A simple technique is to prepare the appropriate
statement as needed, and substitute the placeholder for non-NULL cases:
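One way to sketch this technique is a small helper that returns both the SQL text and the list of bind values. The helper name age_query is hypothetical, and the sketch only builds the statement, so it runs without a database.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the SQL text followed by the bind values
# that should be passed to execute().
sub age_query {
    my ($age) = @_;
    my $sql = 'SELECT FIRST_NAME FROM TEST_TABLE WHERE ';

    if (defined $age) {
        return ($sql . 'AGE = ?', $age);   # ordinary placeholder
    }
    return ($sql . 'AGE IS NULL');         # NULL needs IS NULL and no bind value
}

my ($sql_null, @bind_null) = age_query(undef);
my ($sql_30,   @bind_30)   = age_query(30);

print "$sql_null\n";
print "$sql_30 (bind: @bind_30)\n";
```

With a live handle, the two return values would be used as $sth = $dbh->prepare($sql) followed by $sth->execute(@bind).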
available_drivers
@ary = DBI->available_drivers;
@ary = DBI->available_drivers($quiet);
Returns a list of all available drivers by searching for DBD::* modules through the directories in
@INC. By default, a warning is given if some drivers are hidden by others of the same name in
earlier directories. Passing a true value for $quiet will inhibit the warning.
installed_drivers
%drivers = DBI->installed_drivers();
Returns a list of driver name and driver handle pairs for all drivers 'installed' (loaded) into the
current process. The driver name does not include the 'DBD::' prefix.
data_sources
@ary = DBI->data_sources($driver);
Returns a list of data sources (databases) available via the named driver. If $driver is empty or
undef, then the value of the DBI_DRIVER environment variable is used.
quote
$sql = $dbh->quote($value);
$sql = $dbh->quote($value, $data_type);
Quote a string literal for use as a literal value in an SQL statement, by escaping any
special characters (such as quotation marks) contained within the string and adding the
required type of outer quotation marks.
For most database types, quoting the string Don't would return 'Don''t' (including the outer
quotation marks). It is valid for the quote() method to return an SQL expression that evaluates to
the desired string, which a driver may do for a value such as:
$quoted = $dbh->quote("one\ntwo\0three");
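To make the Don't case concrete, here is a rough stand-in for what quote does on most databases. The function sql_quote_sketch is hypothetical and only doubles single quotation marks; real code should call $dbh->quote, because the correct escaping depends on the driver.

```perl
use strict;
use warnings;

# Hypothetical illustration only, not DBI's implementation.
sub sql_quote_sketch {
    my ($value) = @_;
    return 'NULL' unless defined $value;    # DBI's quote also returns NULL for undef
    (my $escaped = $value) =~ s/'/''/g;     # double each embedded single quote
    return "'$escaped'";                    # add the outer quotation marks
}

print sql_quote_sketch("Don't"), "\n";      # prints 'Don''t'
```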
err
$rv = $h->err;
or
$rv = $DBI::err
Returns the native database engine error code from the last driver method called. The code is
typically an integer but you should not assume that. Calling $h->err is equivalent to reading $DBI::err.
errstr
$str = $h->errstr;
or
$str = $DBI::errstr
Returns the native database engine error message from the last DBI method called. This has the
same lifespan issues as the "err" method described above. Calling $h->errstr is equivalent to
reading $DBI::errstr.
rows
$rv = $h->rows;
or
$rv = $DBI::rows
This returns the number of rows affected by the previous SQL statement and is equivalent to $DBI::rows.
trace
$h->trace($trace_settings);
DBI sports an extremely useful ability to generate runtime tracing information of what it's doing,
which can be a huge time-saver when trying to track down strange problems in your DBI programs.
You can use different values to set the trace level. Commonly used levels range from 0 to 4: the
value 0 disables tracing, and higher values generate progressively more detailed trace output.
First, prepare calls can take a long time. The database server has to compile the SQL and
figure out how it is going to run the query. If you have many similar queries, that is a
waste of time.
Second, it will not work if $first_name contains a name like O'Brien or D'Fecto or some
other name with an apostrophe ('). The ' has a special meaning in SQL, and the database will not
understand the statement when you ask it to prepare it.
Finally, if you are going to construct your query from user input, then
it is unsafe to simply interpolate the input directly into the query, because the user
can construct a strange input in an attempt to trick your program into doing
something it didn't expect. For example, suppose the user enters a bizarre value for
$input, so that the query becomes:
SELECT *
FROM TEST_TABLE
WHERE first_name = 'x'
or first_name = first_name or first_name = 'y'
The part of this query that our sneaky user is interested in is the second or clause. This
clause selects all the records for which first_name is equal to first_name; that is, all of
them.
So do not build statements by interpolating values into them; use bind values to prepare
dynamic SQL statements instead.
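The difference can be seen without a database. In this sketch the value of $input is reconstructed from the query shown above, and the DBI calls appear only as comments because they assume a connected handle $dbh.

```perl
use strict;
use warnings;

# The bizarre input, reconstructed from the resulting query shown above.
my $input = "x' or first_name = first_name or first_name = 'y";

# Unsafe: the input becomes part of the SQL text itself.
my $unsafe_sql = "SELECT * FROM TEST_TABLE WHERE first_name = '$input'";
print "$unsafe_sql\n";

# Safer: the SQL text is fixed and the value travels separately.
my $safe_sql    = "SELECT * FROM TEST_TABLE WHERE first_name = ?";
my @bind_values = ($input);
print "$safe_sql\n";

# Assumed usage with a connected handle $dbh:
#   my $sth = $dbh->prepare($safe_sql);
#   $sth->execute(@bind_values);
```

With the placeholder, the database compares first_name against the whole input string, so the extra or clauses never become part of the statement.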
CGI PROGRAMMING
What is CGI?
The Common Gateway Interface, or CGI, is a set of standards that define how information
is exchanged between the web server and a custom script.
The CGI specification was originally maintained by the NCSA, which defines CGI as follows:
The Common Gateway Interface, or CGI, is a standard for external gateway programs to
interface with information servers such as HTTP servers.
Web Browsing
To understand the concept of CGI, let's see what happens when we click a hyperlink to browse a
particular web page or URL.
Your browser contacts the HTTP web server and requests the URL, i.e. the filename.
The web server parses the URL and looks for the file. If it finds the file, it sends it
back to the browser; otherwise it sends an error message indicating that you have requested a
wrong file.
The web browser takes the response from the web server and displays either the received file or the
error message.
However, it is possible to set up the HTTP server so that whenever a file in a certain directory is
requested, that file is not sent back; instead it is executed as a program, and whatever that program
outputs is sent back for your browser to display. This function is called the Common Gateway
Interface or CGI, and the programs are called CGI scripts. These CGI programs can be Perl scripts,
shell scripts, C or C++ programs, etc.
Before you proceed with CGI programming, make sure that your web server supports CGI and is
configured to handle CGI programs. All the CGI programs to be executed by the HTTP server are kept
in a pre-configured directory. This directory is called the CGI directory and by convention it is
named /cgi-bin. By convention, Perl CGI files have the extension .cgi.
Here is a simple CGI script called hello.cgi. This file is kept in the /cgi-bin/ directory and has
the following content. Before running your CGI program, make sure you have changed the mode of the
file using the UNIX command chmod 755 hello.cgi.
#!/usr/bin/perl
print "Content-type:text/html\r\n\r\n";
print '<html>';
print '<head>';
print '<title>Hello World - First CGI Program</title>';
print '</head>';
print '<body>';
print '<h2>Hello World! This is my first CGI program</h2>';
print '</body>';
print '</html>';
1;
This hello.cgi script is a simple Perl script that writes its output to STDOUT, i.e. the screen.
One important extra feature is the first line to be printed,
Content-type:text/html\r\n\r\n. This line is sent back to the browser and specifies the content type
to be displayed on the browser screen. Now you should understand the basic concept of CGI, and you
can write more complicated CGI programs using Perl. Such a script can also interact with an
external system, such as an RDBMS, to exchange information.
HTTP Header
For example:
Content-type:text/html\r\n\r\n
There are a few other important HTTP headers that you will use frequently in your CGI
programming.
Header                    Description
Content-type: String      A MIME string defining the format of the file being returned.
                          Example: Content-type:text/html
Expires: Date String      The date the information becomes invalid. The browser should use
                          this to decide when a page needs to be refreshed. A valid date
                          string should be in the format 01 Jan 1998 12:00:00 GMT.
Location: URL String      The URL that should be returned instead of the URL requested. You
                          can use this field to redirect a request to any file.
Content-length: String    The length, in bytes, of the data being returned. The browser uses
                          this value to report the estimated download time for a file.
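As an illustration of these headers, the sketch below builds a header block in Perl. The helper names expires_date and http_headers are hypothetical, and the one-hour lifetime is an arbitrary choice; the date layout follows the format given in the table.

```perl
use strict;
use warnings;

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch time like "01 Jan 1998 12:00:00 GMT".
sub expires_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
    return sprintf("%02d %s %04d %02d:%02d:%02d GMT",
                   $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec);
}

# Hypothetical helper: end every header line with \r\n and close the
# header block with an empty line.
sub http_headers {
    my (@headers) = @_;
    return join("", map { "$_\r\n" } @headers) . "\r\n";
}

# A page that becomes invalid one hour from now.
print http_headers(
    "Content-type:text/html",
    "Expires: " . expires_date(time + 3600),
);
```

A redirect would use the same helper with a single Location header, for example http_headers("Location: /cgi-bin/hello.cgi").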
Every CGI program has access to the following environment variables. These
variables play an important role when writing any CGI program.
Variable Name       Description
CONTENT_TYPE        The data type of the content. Used when the client is sending attached
                    content to the server, for example a file upload.
CONTENT_LENGTH      The length of the query information. It is available only for POST requests.
HTTP_COOKIE         Returns the cookies that have been set, as key and value pairs.
HTTP_USER_AGENT     The User-Agent request-header field contains information about the user
                    agent originating the request. It is the name of the web browser.
QUERY_STRING        The URL-encoded information that is sent with a GET method request.
REMOTE_ADDR         The IP address of the remote host making the request. This can be useful
                    for logging or for authentication purposes.
REMOTE_HOST         The fully qualified name of the host making the request. If this information
                    is not available, then REMOTE_ADDR can be used to get the IP address.
REQUEST_METHOD      The method used to make the request. The most common methods are GET
                    and POST.
SERVER_SOFTWARE     The name and version of the software the server is running.
Here is a small CGI program that lists all the CGI variables.
#!/usr/bin/perl
print "Content-type:text/html\r\n\r\n";
print "<b>$_</b>: $ENV{$_}<br>\n" foreach sort keys %ENV;
1;
To make the browser offer a file for download, the HTTP header is different from the header
mentioned in the previous section.
For example, if you want to make a file named FileName downloadable from a given link, the
syntax is as follows.
#!/usr/bin/perl
# HTTP Header
print "Content-Type:application/octet-stream; name=\"FileName\"\r\n";
print "Content-Disposition: attachment; filename=\"FileName\"\r\n\r\n";
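The header alone does not send any data. A sketch of how the file contents might follow it is shown below; send_download is a hypothetical helper, and the path in the usage comment is a placeholder rather than a real location.

```perl
use strict;
use warnings;

# Hypothetical helper: write the download headers to $out, followed by
# the bytes of the file at $path.
sub send_download {
    my ($out, $path, $download_name) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    binmode($in);     # read the file as raw bytes
    binmode($out);    # and write it without any translation
    print {$out} "Content-Type:application/octet-stream; name=\"$download_name\"\r\n";
    print {$out} "Content-Disposition: attachment; filename=\"$download_name\"\r\n\r\n";
    my $buffer;
    while (read($in, $buffer, 8192)) {
        print {$out} $buffer;
    }
    close($in);
    return;
}

# Assumed usage inside a CGI script:
#   send_download(\*STDOUT, "/path/to/FileName", "FileName");
```

Reading in fixed-size chunks keeps memory use small even when the file is large.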
The GET method sends the encoded user information appended to the page request. The
page and the encoded information are separated by the ? character as follows:
http://www.test.com/cgi-bin/hello.cgi?key1=value1&key2=value2
The GET method is the default method for passing information from the browser to the web server, and
it produces a long string that appears in your browser's Location box. Never use the GET method if
you have a password or other sensitive information to pass to the server. The GET method also has a
size limitation: traditionally only 1024 characters can be in a request string, although the actual
limit depends on the browser and the server.
This information is passed in the query string of the URL and is accessible in your CGI program
through the QUERY_STRING environment variable.
You can pass information by simply appending key and value pairs to any URL, or you
can use HTML <FORM> tags to pass information using the GET method.
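Appending key and value pairs by hand only works while the values contain no special characters. The sketch below shows one way to encode them; url_encode and build_get_url are hypothetical helpers, and the input is assumed to be plain bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character that is not an
# unreserved URL character.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical helper: append encoded key=value pairs to a base URL.
sub build_get_url {
    my ($base, @pairs) = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, url_encode($key) . '=' . url_encode($value);
    }
    return $base . '?' . join('&', @parts);
}

print build_get_url('/cgi-bin/hello_get.cgi',
                    first_name => 'ZARA', last_name => 'ALI'), "\n";
# prints /cgi-bin/hello_get.cgi?first_name=ZARA&last_name=ALI
```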
Here is a simple URL which passes two values to the hello_get.cgi program using the GET method.
http://www.tutorialspoint.com/cgi-bin/hello_get.cgi?first_name=ZARA&last_name=ALI
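#!/usr/bin/perl
# Decode the GET parameters from QUERY_STRING into %FORM.
foreach $pair (split(/&/, $ENV{'QUERY_STRING'} || '')) {
   ($name, $value) = split(/=/, $pair, 2);
   $value =~ tr/+/ /;
   $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
   $FORM{$name} = $value;
}
$first_name = $FORM{first_name};
$last_name = $FORM{last_name};
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Hello - Second CGI Program</title>";
print "</head>";
print "<body>";
print "<h2>Hello $first_name $last_name - Second CGI Program</h2>";
print "</body>";
print "</html>";
1;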
Here is a simple example which passes two values using an HTML FORM and a submit
button. The same CGI script, hello_get.cgi, handles this input.
The form has two text fields, First Name and Last Name. Enter both names and then click the
submit button to see the result.
Below is the hello_post.cgi script to handle input sent by the web browser. This script
handles the GET method as well as the POST method.
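#!/usr/bin/perl
# Read the form data: request body for POST, otherwise QUERY_STRING.
$buffer = $ENV{'QUERY_STRING'} || '';
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'} || 0) if uc($ENV{'REQUEST_METHOD'} || '') eq 'POST';
foreach $pair (split(/&/, $buffer)) {
   ($name, $value) = split(/=/, $pair, 2);
   $value =~ tr/+/ /;
   $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
   $FORM{$name} = $value;
}
$first_name = $FORM{first_name};
$last_name = $FORM{last_name};
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Hello - Second CGI Program</title>";
print "</head>";
print "<body>";
print "<h2>Hello $first_name $last_name - Second CGI Program</h2>";
print "</body>";
print "</html>";
1;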
Let us again take the same example as above, which passes two values using an HTML FORM
and a submit button. This time the CGI script hello_post.cgi handles the input.
The form has two text fields, First Name and Last Name. Enter both names and then click the
submit button to see the result.
The example form has two checkboxes, Maths and Physics.
Below is the checkbox.cgi script to handle the checkbox input sent by the web browser.
#!/usr/bin/perl
# $maths_flag and $physics_flag are assumed to have been set from the submitted form data.
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Checkbox - Third CGI Program</title>";
print "</head>";
print "<body>";
print "<h2> CheckBox Maths is : $maths_flag</h2>";
print "<h2> CheckBox Physics is : $physics_flag</h2>";
print "</body>";
print "</html>";
1;
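The script above does not show how the two flags are derived. One possible sketch is given here, assuming the form data has already been decoded into a hash and that the checkbox fields are named maths and physics; both the field names and the helper checkbox_flag are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: a browser only sends a checkbox field when the
# box is checked, so the presence of the key decides the flag.
sub checkbox_flag {
    my ($form, $field) = @_;
    return exists $form->{$field} ? "ON" : "OFF";
}

# Sample data, as if only the Maths box had been checked.
my %FORM = (maths => 'on');

my $maths_flag   = checkbox_flag(\%FORM, 'maths');
my $physics_flag = checkbox_flag(\%FORM, 'physics');
print "Maths: $maths_flag, Physics: $physics_flag\n";
# prints Maths: ON, Physics: OFF
```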
Passing Radio Button Data to CGI Program
Radio Buttons are used when only one option is required to be selected.
The example form has two radio buttons, Maths and Physics.
Below is the radiobutton.cgi script to handle the radio button input sent by the web browser.
#!/usr/bin/perl
# $subject is assumed to hold the selected radio button value from the submitted form data.
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Radio - Fourth CGI Program</title>";
print "</head>";
print "<body>";
print "<h2> Selected Subject is $subject</h2>";
print "</body>";
print "</html>";
1;
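Below is a script to handle text area input sent by the web browser.
#!/usr/bin/perl
# $text_content is assumed to hold the text area value from the submitted form data.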
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Text Area - Fifth CGI Program</title>";
print "</head>";
print "<body>";
print "<h2> Entered Text Content is $text_content</h2>";
print "</body>";
print "</html>";
1;
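Below is a script to handle drop-down box input sent by the web browser.
#!/usr/bin/perl
# $subject is assumed to hold the selected drop-down value from the submitted form data.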
print "Content-type:text/html\r\n\r\n";
print "<html>";
print "<head>";
print "<title>Dropdown Box - Sixth CGI Program</title>";
print "</head>";
print "<body>";
print "<h2> Selected Subject is $subject</h2>";
print "</body>";
print "</html>";
1;
How It Works
Your server sends some data to the visitor's browser in the form of a cookie. The browser may
accept the cookie. If it does, it is stored as a plain text record on the visitor's hard drive. Now, when
the visitor arrives at another page on your site, the cookie is available for retrieval. Once retrieved,
your server knows/remembers what was stored.
Expires : The date the cookie will expire. If this is blank, the cookie will expire when the
visitor quits the browser.
Domain : The domain name of your site.
Path : The path to the directory or web page that set the cookie. This may be blank if you
want to retrieve the cookie from any directory or page.
Secure : If this field contains the word "secure" then the cookie may only be retrieved with
a secure server. If this field is blank, no such restriction exists.
Name=Value : Cookies are set and retrieved in the form of key and value pairs.
Setting up Cookies
It is very easy to send cookies to the browser. Cookies are sent along with the HTTP
header. Assuming you want to set UserID and Password as cookies, it is done as
follows:
#!/usr/bin/perl
# Expires, Domain and Path are attributes of a cookie, so they are
# appended to the Set-Cookie header of the cookie they apply to.
$attributes = "Expires=Monday, 31-Dec-2007 23:12:40 GMT; " .
              "Domain=www.tutorialspoint.com; Path=/perl";
print "Set-Cookie:UserID=XYZ; $attributes\r\n";
print "Set-Cookie:Password=XYZ123; $attributes\r\n";
print "Content-type:text/html\r\n\r\n";
...........Rest of the HTML Content....
From this example you can see how to set cookies: we use the Set-Cookie HTTP header.
Setting cookie attributes such as Expires, Domain, and Path is optional. Note that cookies
are set before sending the magic line "Content-type:text/html\r\n\r\n".
Retrieving Cookies
It is very easy to retrieve all the cookies that have been set. Cookies are stored in the CGI
environment variable HTTP_COOKIE and have the following form.
key1=value1;key2=value2;key3=value3....
#!/usr/bin/perl
$rcvd_cookies = $ENV{'HTTP_COOKIE'};
@cookies = split /;/, $rcvd_cookies;
foreach $cookie ( @cookies ){
($key, $val) = split(/=/, $cookie, 2); # the limit of 2 splits on the first = only
$key =~ s/^\s+//;
$val =~ s/^\s+//;
$key =~ s/\s+$//;
$val =~ s/\s+$//;
if( $key eq "UserID" ){
$user_id = $val;
}elsif($key eq "Password"){
$password = $val;
}
}
print "User ID = $user_id\n";
print "Password = $password\n";
CGI Module
Berkeley cgi-lib.pl