Perl

Perl is a programming language designed for text processing, with a straightforward syntax that does not require a main() function. It supports scalar, array, and hash variable types, along with various control structures like loops and conditional statements. The document also covers file handling, including opening, reading, writing, and manipulating files in Perl.

Uploaded by

aashiyan.shaan

Perl is a programming language developed by Larry Wall, designed especially for processing text. The name is commonly expanded as Practical Extraction and Report Language.

PERL Syntax Overview


A Perl script or program consists of one or more statements.
These statements are simply written in the script in a
straightforward fashion. There is no need to have a main()
function or anything of that kind.

Perl statements end in a semi-colon:


print "Hello, world";

Comments start with a hash symbol and run to the end of the
line:
# This is a comment
Whitespace is irrelevant:
print
    "Hello, world"
    ;

... except inside quoted strings:


# this would print with a linebreak in the middle
print "Hello
world";

Double quotes or single quotes may be used around literal strings:


print "Hello, world";
print 'Hello, world';

However, only double quotes "interpolate" variables and special characters such as newlines (\n):
print "Hello, $name\n"; # works fine
print 'Hello, $name\n'; # prints $name\n literally
• Numbers don't need quotes around them:
print 42;

• You can use parentheses around a function's arguments or omit them according to your personal taste. They are only required occasionally to clarify issues of precedence. The following two statements produce the same result:
print("Hello, world\n");
print "Hello, world\n";
• A Perl script can be created in any plain-text editor. Editors are available for every platform, and many editors designed for programmers can be downloaded from the web.
• Whichever editor you choose, Perl scripts are conventionally saved with a .pl file extension, although the perl interpreter itself does not require any particular extension. File names may contain letters, numbers, and symbols. Spaces are allowed on most systems but are inconvenient on the command line, so an underscore (_) is commonly used in place of a space.
• First PERL Program:
Assuming you are already at the Unix $ prompt, open a text file hello.pl using the vi or vim editor and put the following lines in it.
#!/usr/bin/perl
# This will print "Hello, world" on the screen
print "Hello, world";

Here /usr/bin/perl is the path to the Perl interpreter; adjust it if Perl is installed elsewhere on your system.

• Execute PERL Script:
If you want to run the script directly (as ./hello.pl), first change the mode of the script file to give it execute privilege; a setting of 0755 generally works well.
To run the hello.pl program from the Unix command line by passing it to the interpreter, issue the following command at your Unix $ prompt:
$ perl hello.pl
This will produce the following result:
Hello, world
PERL Variable Types
Perl has three built in variable types:
• Scalar
• Array
• Hash

• Scalar variable type
A scalar represents a single value as follows:
my $animal = "camel";
my $answer = 42;
Here my is a keyword that declares a lexically scoped variable.
Scalar values can be strings, integers, or floating-point numbers, and Perl will automatically convert between them as required. There is no need to pre-declare your variable types. Scalar values can be used in various ways:
print $animal;
print "The animal is $animal\n";
print "The square of $answer is ", $answer * $answer, "\n";
There is also a special scalar, $_, known as the "default variable". It is used as the default argument to a number of functions in Perl, and it is set implicitly by certain looping constructs.
print; # prints contents of $_ by default
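The automatic conversion and the default variable described above can be sketched as follows. This is a minimal illustration; the variable names $count, $total, $label, and $joined are made up for the example.

```perl
use strict;
use warnings;

# Perl converts between strings and numbers as the operator requires.
my $count = "10";          # a string that looks like a number
my $total = $count + 5;    # numeric context: 15
my $label = $total . "%";  # string context: "15%"
print "$label\n";

# foreach sets $_ for each element when no loop variable is named,
# and print with no arguments prints $_.
my $joined = "";
foreach ("camel", "llama") {
    $joined .= $_;
    print;
    print "\n";
}
```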

Array variable type:
An array represents a list of values:
my @animals = ("camel", "llama", "owl");
my @numbers = (23, 42, 69);
my @mixed = ("camel", 42, 1.23);
Arrays are zero-indexed. Older versions of Perl allowed the base index to be changed through the special variable $[ (or $ARRAY_BASE), but this is deprecated and should not be used in new code; since Perl 5.30 it can no longer be set to a non-zero value. Here's how you get at elements in an array:
print $animals[0]; # prints "camel"
print $animals[1]; # prints "llama"
The special variable $#array tells you the index of the last element of an array:
print $mixed[$#mixed]; # last element, prints 1.23
An array used in scalar context evaluates to the number of elements it holds:
if (@animals < 5) { ... } # here @animals evaluates to 3

• To get multiple values from an array, use a slice:
@animals[0,1]; # gives ("camel", "llama");
@animals[0..2]; # gives ("camel", "llama", "owl");
@animals[1..$#animals]; # gives all except the first element
• You can sort and reverse arrays with Perl's built-in sort and reverse functions:
my @sorted = sort @animals;
my @backwards = reverse @numbers;
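The array operations above can be combined in a short sketch. One detail worth noting is that sort compares elements as strings by default, so the example passes a numeric comparison block for @numbers. The names $count, $last_index, @rest, and @descending are illustrative only.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");
my @numbers = (23, 42, 69);

my $count      = scalar @animals;           # array in scalar context: 3
my $last_index = $#animals;                 # index of the last element: 2
my @rest       = @animals[1 .. $#animals];  # slice: all except the first

# sort compares as strings unless told otherwise; <=> compares numerically.
my @descending = reverse sort { $a <=> $b } @numbers;

print "count=$count last_index=$last_index\n";
print "rest=@rest\n";
print "descending=@descending\n";
```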
• Hash variable type:
A hash represents a set of key/value pairs. A hash is similar to an array, except that each value is looked up by a key, which Perl treats as a string, rather than by a numeric position. Hash variables are prefixed by a % sign as follows:
my %fruit_color = ("apple", "red", "banana", "yellow");
You can use whitespace and the => operator to lay them out more nicely:
my %fruit_color = (
apple => "red",
banana => "yellow",
);
To get at a hash element:
$fruit_color{"apple"}; # gives "red"
• You can get at lists of keys and values with the keys() and values() built-in functions.
my @fruits = keys %fruit_color;
my @colors = values %fruit_color;
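As a small sketch of working with a hash beyond simple lookup, the following adds an entry, tests for a key with exists, removes an entry with delete, and walks the keys in sorted order. The cherry and kiwi entries and the @report array are made up for illustration.

```perl
use strict;
use warnings;

my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);

# Add an entry, test for a key, and remove one.
$fruit_color{cherry} = "dark red";
my $has_kiwi = exists $fruit_color{kiwi} ? 1 : 0;
delete $fruit_color{banana};

# keys returns the keys in no particular order, so sort them for stable output.
my @report;
foreach my $fruit (sort keys %fruit_color) {
    push @report, "$fruit is $fruit_color{$fruit}";
}
print "$_\n" for @report;
```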

PERL Conditional Statements
print "Happy Birthday!\n" if ($date == $today);
In this instance, the message will only be printed if the expression evaluates
to a true value.
if ($date == $today) { print "Happy Birthday!\n"; }

if ($date == $today) { print "Happy Birthday!\n"; } else { print "Happy Unbirthday!\n"; }

if ($date == $today) { print "Happy Birthday!\n"; } elsif ($date == $christmas) { print "Happy Christmas!\n"; }

if ($date == $today) { print "Happy Birthday!\n"; } elsif ($date == $christmas) { print "Happy Christmas!\n"; } else { print "Happy Unbirthday!\n"; }

($date == $today) ? print "Happy B.Day!\n" : print "Happy Day!\n";



PERL Loops
• while Loops
In the first form, the expression is evaluated first, and then the statement to which it applies is evaluated. For example, the following line increases the value of $linecount as long as we continue to read lines from a file (here through the filehandle DATA):
$linecount++ while (<DATA>);

• To create a loop that executes statements first, and then tests an expression, you need to combine while with a preceding do {} statement. For example:
do
{
$calc += ($fact*$ivalue);
} while $calc <100;
If the block does not need to run before the first test, you could rewrite the preceding example as:
while($calc < 100)
{
$calc += ($fact*$ivalue);
}
The two forms differ only when $calc is already 100 or more on entry: the do {} while form still runs the block once.

for Loops
You can write a loop to iterate 100 times like this:
for ($i=0;$i<100;$i++) { ... }

You can place multiple variables into the expressions using the standard list operator (the comma):
for ($i=0, $j=0;$i<100;$i++,$j++) { ... }

You can create an infinite loop like this:
for(;;)
{
...
}
until Loops
• In the case of a do...until loop, the conditional expression is only evaluated at the end of the code block. In an until (EXPR) BLOCK loop, the expression is evaluated before the block executes. Using an until loop, you could rewrite the earlier do {} while example as:
do
{
$calc += ($fact*$ivalue);
} until $calc >= 100;
This is equivalent to
do
{
$calc += ($fact*$ivalue);
} while $calc <100;
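The difference between testing before the block and testing after it can be seen in a minimal sketch. The starting values for $fact, $ivalue, and $calc are chosen purely for illustration, so that the condition is already false on entry.

```perl
use strict;
use warnings;

my ($fact, $ivalue) = (2, 10);   # illustrative values only

# $calc already fails the test, so the while loop body never runs.
my $calc = 150;
while ($calc < 100) {
    $calc += ($fact * $ivalue);
}
my $after_while = $calc;

# The do block runs once before the test is checked.
$calc = 150;
do {
    $calc += ($fact * $ivalue);
} while $calc < 100;
my $after_do = $calc;

print "while: $after_while, do-while: $after_do\n";
```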
Loop Control - next, last and redo
There are three loop control keywords: next, last, and redo.
The next keyword skips the remainder of the code block, forcing the loop to proceed to the next value in the loop. For example:
while (<DATA>)
{
next if /^#/;
}
The above code skips lines from the file that start with a hash symbol.

• The last keyword ends the loop entirely, skipping the remaining statements in the code block and dropping out of the loop. The last keyword is therefore comparable to the break keyword in C and in shell scripts. For example:
while (<DATA>)
{
last if ($found);
}
This exits the loop if the value of $found is true, whether or not the end of the file has actually been reached. The continue block is not executed.

• The redo keyword reexecutes the code block without reevaluating the conditional statement for the loop. This skips the remainder of the code block and also the continue block before the main code block is reexecuted. For example, the following code appends the next line from a file if the current line terminates with a backslash:
while(<DATA>)
{
if (s#\\$##)
{
$_ .= <DATA>; redo;
}
}
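A self-contained sketch of next and last, using an in-memory list in place of a file so it can be run as is. The contents of @lines and the STOP marker are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input lines, standing in for lines read from a file.
my @lines = ("# header", "alpha", "# note", "beta", "STOP", "gamma");

my @kept;
foreach my $line (@lines) {
    next if $line =~ /^#/;      # skip comment lines
    last if $line eq "STOP";    # leave the loop entirely
    push @kept, $line;
}
print "@kept\n";
```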
PERL Built-in Operators
• Arithmetic Operators
+    addition
-    subtraction
*    multiplication
/    division
• Numeric Comparison Operators
==   equality
!=   inequality
<    less than
>    greater than
<=   less than or equal
>=   greater than or equal
• String Comparison Operators
eq   equality
ne   inequality
lt   less than
gt   greater than
le   less than or equal
ge   greater than or equal
• Boolean Logic Operators
&&   and
||   or
!    not
• Miscellaneous Operators
=    assignment
.    string concatenation
x    string multiplication
..   range operator (creates a list of numbers)
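The split between numeric and string comparison is easy to get wrong, so here is a brief sketch of a few of the operators listed above. The values are arbitrary.

```perl
use strict;
use warnings;

# == compares numbers, eq compares strings.
my $numeric_equal = ("1.0" == 1)   ? "yes" : "no";
my $string_equal  = ("1.0" eq "1") ? "yes" : "no";

my $greeting = "Hello" . ", " . "world";   # concatenation
my $rule     = "-" x 10;                   # repetition: ten dashes
my @digits   = (1 .. 5);                   # range: 1, 2, 3, 4, 5

print "$numeric_equal $string_equal\n";
print "$greeting\n$rule\n@digits\n";
```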
PERL Files & I/O
The following is the syntax to open file.txt in read-only mode. Here the less-than sign (<) indicates that the file is to be opened in read-only mode:
open(DATA, "<file.txt");
Here DATA is the filehandle which will be used to read the file. The following example opens a file and prints its contents to the screen.
#!/usr/bin/perl
open(DATA, "<file.txt") or die "Couldn't open file file.txt, $!";
while(<DATA>)
{
print "$_";
}
• Open Function
The following is the syntax to open file.txt in writing mode. Here the greater-than sign (>) indicates that the file is to be opened in writing mode; an existing file is truncated:
open(DATA, ">file.txt");

• To open a file for updating without truncating it:
open(DATA, "+<file.txt");
• You can also open a file in append mode. In this mode the writing point is set to the end of the file:
open(DATA,">>file.txt") || die "Couldn't open file file.txt, $!";
A double >> opens the file for appending, placing the file pointer at the end, so that you can immediately start appending information. However, you can't read from it unless you also place a plus sign in front of it:
open(DATA,"+>>file.txt") || die "Couldn't open file file.txt, $!";
• The following table gives the possible values of the different modes:
Mode         Definition
< or r       Read Only Access
> or w       Creates, Writes, and Truncates
>> or a      Writes, Appends, and Creates
+< or r+     Reads and Writes
+> or w+     Reads, Writes, Creates, and Truncates
+>> or a+    Reads, Writes, Appends, and Creates
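The examples in this section use the two-argument form of open with a bareword filehandle. As an alternative sketch, the same modes can be passed as a separate argument together with a lexical filehandle, which is a widely used style in current Perl code. A temporary file created with the core File::Temp module stands in for file.txt here; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for file.txt.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" creates or truncates the file.
open(my $out, '>', $path) or die "Couldn't open $path for writing: $!";
print $out "first line\n";
close($out) or die "Couldn't close $path: $!";

# ">>" appends to the end.
open(my $app, '>>', $path) or die "Couldn't open $path for appending: $!";
print $app "second line\n";
close($app) or die "Couldn't close $path: $!";

# "<" opens read-only.
open(my $in, '<', $path) or die "Couldn't open $path for reading: $!";
my @lines = <$in>;
close($in);

print "Read ", scalar(@lines), " lines\n";
```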
• Sysopen Function
The sysopen function is similar to the main open function, except that it uses the system open() function, using the parameters supplied to it as the parameters for the system function:
sysopen FILEHANDLE, FILENAME, MODE, PERMS
sysopen FILEHANDLE, FILENAME, MODE
The O_ constants used for MODE are provided by the Fcntl module, so the script needs use Fcntl; before they can be used.
For example, to open a file for updating, emulating the +<filename format from open:
sysopen(DATA, "file.txt", O_RDWR);
or to truncate the file before updating:
sysopen(DATA, "file.txt", O_RDWR|O_TRUNC );

• You can use O_CREAT to create a new file, O_WRONLY to open a file in write-only mode, and O_RDONLY to open a file in read-only mode.
• The PERMS argument specifies the file permissions for the file if it has to be created. By default it takes the octal value 0666.
• Following is the table which gives possible values of MODE
Value Definition
O_RDWR Read and Write
O_RDONLY Read Only
O_WRONLY Write Only
O_CREAT Create the file
O_APPEND Append the file
O_TRUNC Truncate the file
O_EXCL Stops if file already exists
O_NONBLOCK Non-Blocking usability
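A minimal sketch of sysopen with combined MODE flags, assuming a Unix-like system. It creates a file with O_CREAT and O_EXCL, then shows that a second exclusive create of the same path fails. The temporary directory, the 0644 permission value, and the variable names are choices made for this example.

```perl
use strict;
use warnings;
use Fcntl;                       # exports the O_* constants
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/file.txt";

# Create the file, failing if it already exists; 0644 is the requested
# permission value (further restricted by the process umask).
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0644)
    or die "Couldn't create $path: $!";
print $fh "created\n";
close($fh) or die "Couldn't close $path: $!";

# A second attempt with O_EXCL fails because the file now exists.
my $second_ok = sysopen(my $again, $path, O_WRONLY | O_CREAT | O_EXCL, 0644) ? 1 : 0;
print "second attempt succeeded? $second_ok\n";
```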

• Close Function
To close a filehandle, and therefore disassociate the filehandle from the
corresponding file, you use the close function. This flushes the
filehandle's buffers and closes the system's file descriptor.
close FILEHANDLE
close
If no FILEHANDLE is specified, then it closes the currently selected
filehandle. It returns true only if it could successfully flush the buffers and
close the file.
close(DATA) || die "Couldn't close file properly";
• Reading and Writing Filehandles
Once you have an open filehandle, you need to be able to read and write
information. There are a number of different ways of reading and writing
data into the file.
The <FILEHANDLE> Operator
The main method of reading the information from an open filehandle is the
<FILEHANDLE> operator. In a scalar context it returns a single line from
the filehandle. For example:
#!/usr/bin/perl
print "What is your name?\n";
$name = <STDIN>;
print "Hello $name\n";
When you use the <FILEHANDLE> operator in a list context, it returns a list
of lines from the specified filehandle. For example, to import all the lines
from a file into an array:
#!/usr/bin/perl
open(DATA,"<import.txt") or die "Can't open data";
@lines = <DATA>;
close(DATA);
• getc Function
The getc function returns a single character from the specified FILEHANDLE, or STDIN if none is specified:
getc FILEHANDLE
getc
If there was an error, or the filehandle is at end of file, then undef is returned instead.
• print Function
For all the different methods used for reading information from filehandles,
the main function for writing information back is the print function.
print FILEHANDLE LIST
print LIST
print
The print function prints the evaluated value of LIST to FILEHANDLE, or to
the current output filehandle (STDOUT by default). For example:
print "Hello World!\n";

• Renaming a file
Here is an example which shows how to rename a file file1.txt to file2.txt, assuming the file is available in the /usr/test directory.
#!/usr/bin/perl
rename ("/usr/test/file1.txt", "/usr/test/file2.txt" );
The rename function takes two arguments, the old name and the new name, and simply renames the existing file.
• Copying Files
Here is an example which opens an existing file file1.txt, reads it line by line, and writes a copy to file2.txt:
#!/usr/bin/perl
# Open file to read
open(DATA1, "<file1.txt");
# Open new file to write
open(DATA2, ">file2.txt");
# Copy data from one file to another.
while(<DATA1>)
{
print DATA2 $_;
}
close( DATA1 );
close( DATA2 );
• Deleting an existing file
Here is an example which shows how to delete a file file1.txt using the unlink function.
#!/usr/bin/perl
unlink ("/usr/test/file1.txt");
Locating Your Position Within a File
You can use the tell function to find the current position in a file and the seek function to move to a particular position inside the file.
tell Function
The first requirement is to find your position within a file, which you do using the tell function:
tell FILEHANDLE
tell
This returns the position of the file pointer, in bytes, within FILEHANDLE if specified, or the current default selected filehandle if none is specified.
• seek Function
The seek function positions the file pointer at the specified number of bytes within a file:
seek FILEHANDLE, POSITION, WHENCE
The function uses the fseek system function, and you have the same ability to position relative to three different points: the start, the end, and the current position. You do this by specifying a value for WHENCE.
Zero sets the positioning relative to the start of the file, 1 relative to the current position, and 2 relative to the end of the file. For example, the following line moves the file pointer to byte offset 256, that is, 256 bytes from the start of the file:
seek DATA, 256, 0;
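A short sketch of seek and tell working together on a small file. It assumes a temporary file from the core File::Temp module, which is opened for both reading and writing; the ten-digit content and variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);   # opened for reading and writing
print $fh "0123456789";

seek($fh, 3, 0) or die "seek failed: $!";  # WHENCE 0: 3 bytes from the start
my $pos_start = tell($fh);
read($fh, my $chunk, 4);                   # read 4 bytes from that position
my $pos_after = tell($fh);

seek($fh, -2, 2) or die "seek failed: $!"; # WHENCE 2: 2 bytes before the end
read($fh, my $tail, 2);

print "$pos_start $chunk $pos_after $tail\n";
close($fh);
```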
PERL Regular Expressions
• A regular expression is a string of characters that defines the pattern or patterns you want to match. The syntax of regular expressions in Perl is very similar to what you will find within other regular expression-supporting programs, such as sed, grep, and awk.
• The basic method for applying a regular expression is to use the pattern binding operators =~ and !~. The first operator is a test and assignment operator; the second, !~, works the same way but negates the result of the test.

There are three regular expression operators within Perl:
• Match Regular Expression - m//
• Substitute Regular Expression - s///
• Transliterate Regular Expression - tr///
The forward slashes in each case act as delimiters for the regular expression (regex) that you are specifying. If you are more comfortable with another delimiter, you can use it in place of the forward slash.
• The Match Operator
The match operator, m//, is used to match a string or statement to a regular expression. For example, to match the character sequence "foo" against the scalar $bar, you might use a statement like this:
if ($bar =~ /foo/)

The m// operator actually works in the same fashion as the q// operator series: you can use any combination of naturally matching characters to act as delimiters for the expression. For example, m{}, m(), and m<> are all valid.

• You can omit the m from m// if the delimiters are forward slashes, but for all other delimiters you must use the m prefix.
• Note that the entire match expression, that is, the expression on the left of =~ or !~ and the match operator, returns true (in a scalar context) if the expression matches. Therefore the statement:
$true = ($foo =~ m/foo/);
will set $true to 1 if $foo matches the regex, or to an empty string (a false value) if the match fails.

• In a list context, the match returns the contents of any grouped expressions. For example, when extracting the hours, minutes, and seconds from a time string, we can use:
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);
• Match Operator Modifiers
The match operator supports its own set of modifiers. The /g modifier allows for global matching. The /i modifier will make the match case insensitive. Here is the complete list of modifiers:
Modifier   Description
i          Makes the match case insensitive
m          Specifies that if the string has newline or carriage return characters, the ^ and $ operators will now match against a newline boundary, instead of a string boundary
o          Evaluates the expression only once
s          Allows use of . to match a newline character
x          Allows you to use white space in the expression for clarity
g          Globally finds all matches
cg         Allows the search to continue even after a global match fails
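A brief sketch of three of the modifiers in the table: /i and /g combined in list context, and /x used to spread a pattern over several commented lines. The sample strings and variable names are invented for the example.

```perl
use strict;
use warnings;

my $text = "Cat, cat and CAT";

# /i ignores case; /g in list context returns every match.
my @cats = ($text =~ /cat/gi);

# /x lets the pattern carry whitespace and comments.
my $time = "12:31:02";
my ($h, $m, $s) = $time =~ /
    (\d+) :     # hours
    (\d+) :     # minutes
    (\d+)       # seconds
/x;

print scalar(@cats), " matches; $h h $m m $s s\n";
```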
• Matching Only Once
There is also a simpler version of the match operator, written m?PATTERN?. Older versions of Perl also accepted the bare form ?PATTERN?, but since Perl 5.22 the leading m is required. This is basically identical to the m// operator except that it only matches once within the string you are searching between each call to reset.
For example, you can use this to get the first and last matching elements within a list:
#!/usr/bin/perl
@list = qw/food foosball subeo footnote terfoot canic footbrdige/;
foreach (@list)
{
$first = $1 if m?(foo.*)?;
$last = $1 if /(foo.*)/;
}
print "First: $first, Last: $last\n";

This will produce the following result:
First: food, Last: footbrdige
• The Substitution Operator
The substitution operator, s///, is really just an extension of the match operator that allows you to replace the text matched with some new text. The basic form of the operator is:
s/PATTERN/REPLACEMENT/;
The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is a specification for the text or expression that we want to use to replace the found text.
For example, we can replace the first occurrence of "dog" with "cat" using
$string =~ s/dog/cat/;
To replace every occurrence rather than only the first, add the /g modifier: s/dog/cat/g.

• Another example:
#!/usr/bin/perl
$string = 'The cat sat on the mat';
$string =~ s/cat/dog/;
print "Final Result is $string\n";
This will produce the following result:
Final Result is The dog sat on the mat
• Translation
Translation is similar, but not identical, to substitution: unlike substitution, translation (or transliteration) does not use regular expressions for its search or replacement values. The translation operators are:
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
The translation replaces all occurrences of the characters in SEARCHLIST with the corresponding characters in REPLACEMENTLIST. For example, using the "The cat sat on the mat" string from the previous example:
#!/usr/bin/perl
$string = 'The cat sat on the mat';
$string =~ tr/a/o/;
print "$string\n";

This will produce the following result:
The cot sot on the mot
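Beyond one-for-one replacement, tr has a few other common uses. The sketch below counts characters, maps a range, and deletes characters with the d modifier. The variable names are illustrative.

```perl
use strict;
use warnings;

my $string = 'The cat sat on the mat';

# tr returns the number of characters it matched; with an empty
# replacement list the string is left unchanged, so this just counts.
my $a_count = ($string =~ tr/a//);

# Ranges map character by character; this upper-cases ASCII letters.
(my $upper = $string) =~ tr/a-z/A-Z/;

# The d modifier deletes matched characters that have no replacement.
(my $no_spaces = $string) =~ tr/ //d;

print "$a_count\n$upper\n$no_spaces\n";
```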
• Matching Boundaries
The \b assertion matches at any word boundary, as defined by the difference between the \w class and the \W class. Because \w includes the characters for a word, and \W the opposite, this normally means the start or the end of a word. The \B assertion matches any position that is not a word boundary. For example:
/\bcat\b/ # Matches 'the cat sat' but not 'catatonic' or 'polecat'
/\Bcat\B/ # Matches 'verification' but not 'the cat on the mat'
/\bcat\B/ # Matches 'catatonic' but not 'polecat'
/\Bcat\b/ # Matches 'polecat' but not 'catatonic'
• Selecting Alternatives
The | character works much like the logical or bitwise OR elsewhere in Perl. It specifies alternate matches within a regular expression or group. For example, to match "cat" or "dog" in an expression, you might use this:
if ($string =~ /cat|dog/)
You can group individual elements of an expression together in order to support complex matches. Searching for two people's names could be achieved with two separate tests, like this:
if (($string =~ /Martin Brown/) || ($string =~ /Sharon Brown/))

This could be written as follows:
if ($string =~ /(Martin|Sharon) Brown/)
• Grouping Matching
From a regular-expression point of view, there is no difference between
$string =~ /(\S+)\s+(\S+)/;
and
$string =~ /\S+\s+\S+/;
except, perhaps, that the former is slightly clearer.
However, the benefit of grouping is that it allows us to extract a sequence from a regular expression. Groupings are returned as a list in the order in which they appear in the original. For example, in the following fragment we have pulled out the hours, minutes, and seconds from a string.
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);

• As well as this direct method, matched groups are also available within
the special $x variables, where x is the number of the group within the
regular expression. We could therefore rewrite the preceding example as
follows:
$time =~ m/(\d+):(\d+):(\d+)/;
my ($hours, $minutes, $seconds) = ($1, $2, $3);
• When groups are used in substitution expressions, the $x syntax can be used in
the replacement text. Thus, we could reformat a date string using this:
#!/usr/bin/perl
$date = '03/26/1999';
$date =~ s#(\d+)/(\d+)/(\d+)#$3/$1/$2#;
print "$date";
This will produce the following result:
1999/03/26
• Using the \G Assertion
The \G assertion allows you to continue searching from the point where the
last match occurred.
For example, in the following code we have used \G so that we can
search to the correct position and then extract some information, without
having to create a more complex, single regular expression:
#!/usr/bin/perl
$string = "The time is: 12:31:02 on 4/12/00";
$string =~ /:\s+/g;
($time) = ($string =~ /\G(\d+:\d+:\d+)/);
$string =~ /.+\s+/g;
($date) = ($string =~ m{\G(\d+/\d+/\d+)});
print "Time: $time, Date: $date\n";
This will produce the following result:
Time: 12:31:02, Date: 4/12/00

The \G assertion is actually just the metasymbol equivalent of the pos function, so between regular expression calls you can continue to use pos, and even modify the value of pos (and therefore \G) by using pos as an lvalue subroutine.
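As a minimal sketch of reading and assigning pos, the following first lets a scalar-context /g match set the position, then moves it by hand so that the next \G match starts elsewhere. The sample string and the offset 8 are chosen for the example.

```perl
use strict;
use warnings;

my $string = "aaa bbb ccc";

$string =~ /\w+/g;                # scalar-context /g match stops after "aaa"
my $after_first = pos($string);   # offset just past the first word

pos($string) = 8;                 # assign to pos: \G now anchors at "ccc"
my $word;
$word = $1 if $string =~ /\G(\w+)/g;

print "pos after first match: $after_first, word at new pos: $word\n";
```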
• Regular Expression Variables
Regular expression variables include $+, which contains whatever the last grouping match matched; $&, which contains the entire matched string; $`, which contains everything before the matched string; and $', which contains everything after the matched string.
The following code demonstrates the result:
#!/usr/bin/perl
$string = "The food is in the salad bar";
$string =~ m/foo/;
print "Before: $`\n";
print "Matched: $&\n";
print "After: $'\n";

This code prints the following when executed:
Before: The
Matched: foo
After: d is in the salad bar
