Introduction to Perl: part 1

EPFL Bioinformatics II 13 Mar 2006


What is Perl?
How is Perl used?
Why is Perl useful in biology?
Basics of Perl
Simple text file parsing
Arrays

By the way: Perl stands for Practical Extraction and Report Language.
What is Perl?
Like a Unix shell: standard Unix commands are accessible from within Perl code.
An interpreted programming language: the whole script is compiled on the fly each time it is run.
A complicated and rich programming language: it contains many commands, and each command may do many different things at the same time.
A powerful but dangerous language: you are allowed to do almost anything you want, at your own risk. Typos often lead to unexpected actions rather than compilation errors.
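The last point can be illustrated with a minimal sketch. The variable names are made up for illustration, and the "no strict" and "no warnings" lines only spell out Perl's default behaviour.

```perl
#!/usr/local/bin/perl
# Sketch of how a typo goes unnoticed by default.
# "no strict" and "no warnings" are Perl's defaults; they are written out
# here only to make that explicit.
no strict;
no warnings;

$sequence = "ATGC";
$extended = $sequnce . "TTT";    # typo: $sequnce instead of $sequence
print "extended sequence: $extended\n";    # prints "extended sequence: TTT"
```

The misspelled variable is silently created as an empty value, so the script runs but produces the wrong result. Putting "use strict;" and "use warnings;" at the top of a script, and declaring variables with "my", turns this kind of typo into a compile-time error.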

Pros and Cons of Perl
Advantages:
Powerful: you can program a complicated task with very few lines of code. It often takes less than an hour to write a Perl script.
Advanced support for regular expressions (string matching operations). This is especially useful for text file parsing.
Easy access to system commands from within Perl.
Support for web-based applications through the CGI interface.
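A minimal sketch of two of these points, system command access and regular expressions. It assumes a Unix-like system where the echo command is available; the entry name TEST_HUMAN is invented for illustration.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Backticks run a system command and return what it wrote to standard output.
my $output = `echo "ID   TEST_HUMAN"`;

# A regular expression with two capture groups splits the entry name
# at the underscore.
my ($name, $species) = $output =~ /^ID\s+([A-Z0-9]+)_([A-Z0-9]+)/;

print "name: $name, species code: $species\n";    # name: TEST, species code: HUMAN
```

In a real script the echo command would be replaced by whatever program produces the text to be parsed.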

Disadvantages:
Code is very concise and often difficult to read.
Slow for computationally intensive tasks.
Little control over system resources.
Not very transparent: many trivial tasks, such as the initialization of variables, are done automatically, but you don't know exactly how.

For what purposes is Perl used in biology?
To parse and reformat structured text files, e.g. nucleotide sequence entry files.
For web-based program interfaces.
For implementing computationally inexpensive algorithms.
For testing computationally expensive algorithms on small data sets.
For piloting complex data processing pipelines that invoke several compiled programs in succession.
For automating any simple repetitive task.
For educational purposes: you will not escape writing the Smith-Waterman algorithm in Perl.
How is Perl used?
On a Unix system, write a text file named e.g. myprog.pl:
#!/usr/local/bin/perl
#
print "Hello!\n";
Then make this file executable:
% chmod +x myprog.pl
and call it like any Unix command:
% myprog.pl
(Use ./myprog.pl if the current directory is not in your PATH.)
Note: the character sequence \n represents a newline (line feed) character.
Alternatively, Perl commands can be submitted directly from the Unix command line:
% perl -e 'print "hello!\n"'
Note: the first line of the script, #!/usr/local/bin/perl, indicates the location of the Perl interpreter on the local machine.
Basics of Perl
Lines starting with # are ignored (comment lines).
Individual commands are separated by ;
Scalar variable names start with a $
Blocks of commands are enclosed in {}
Example of a Perl script which computes and prints the square roots of the integers 1 to 10:
#!/usr/local/bin/perl
#
# Print the square root of each integer from 1 to 10.
for ($i = 1; $i <= 10; $i = $i + 1) {
    $sqrt = $i ** 0.5;    # ** is the exponentiation operator
    print "square-root of $i is $sqrt\n";
}
Result
square-root of 1 is 1
square-root of 2 is 1.4142135623731
square-root of 3 is 1.73205080756888
square-root of 4 is 2
square-root of 5 is 2.23606797749979
square-root of 6 is 2.44948974278318
square-root of 7 is 2.64575131106459
square-root of 8 is 2.82842712474619
square-root of 9 is 3
square-root of 10 is 3.16227766016838
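The same table can also be produced with Perl's built-in sqrt function and a fixed number of decimals via sprintf. This is a sketch of an alternative, not the script from the slide; the helper name format_root is hypothetical.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: format one output line with four decimal places.
sub format_root {
    my ($n) = @_;
    return sprintf("square-root of %2d is %.4f", $n, sqrt($n));
}

# The range operator 1 .. 10 replaces the C-style loop counter.
for my $i (1 .. 10) {
    print format_root($i), "\n";
}
```

Fixing the number of decimals keeps the columns aligned, at the cost of hiding the full precision shown in the output above.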
Example of a simple text parsing script:
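#!/usr/local/bin/perl
#
$prt = 0;
while(<STDIN>) {
    if(/^ID/) {$text = "$_"; $prt = 0}       # new entry: restart text, reset flag
    if(/^OS.*Homo sapiens/) {$prt = 1}       # human entry: mark it for printing
    if(/^DE/) {$text = $text . "$_"}         # append the description line
    if(/^\/\// and $prt) {print "$text"}     # end of entry: print if marked
}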
This script scans a Swiss-Prot sequence library file and prints the ID and DE lines of each human entry. The Swiss-Prot library file is read from standard input. The script may be called as follows:

% text_parsing.pl < swissprot.dat
Explanations:
while(expr){block}   repeats block as long as expr returns true (1).
<STDIN>              reads one line from standard input; used alone as a while condition, it stores the line in the predefined variable $_.
/^string/            returns true (1) if string is found at the beginning (^) of $_.
$text . "$_"         concatenates two character strings.
\/\/                 backslashes are needed to force the slashes to be interpreted as ordinary characters without special meaning.
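The same logic can be written with explicit variable names and tried without a Swiss-Prot file. The sketch below is a variant under stated assumptions, not the course script: the two entries are invented sample data in a Swiss-Prot-like layout, and the sub name human_entries is hypothetical. It reads from an in-memory filehandle instead of STDIN and uses m{^//} so that the slashes need no backslashes.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Invented sample in a Swiss-Prot-like layout (not real database entries).
my $sample = <<'END';
ID   EXA1_HUMAN
DE   Hypothetical example protein 1.
OS   Homo sapiens (Human).
//
ID   EXA2_MOUSE
DE   Hypothetical example protein 2.
OS   Mus musculus (Mouse).
//
END

# Collect the ID and DE lines of every entry whose OS line names Homo sapiens.
sub human_entries {
    my ($fh) = @_;
    my ($text, $is_human, $result) = ("", 0, "");
    while (my $line = <$fh>) {
        if ($line =~ /^ID/)                { $text = $line; $is_human = 0 }
        if ($line =~ /^OS.*Homo sapiens/)  { $is_human = 1 }
        if ($line =~ /^DE/)                { $text .= $line }
        if ($line =~ m{^//} and $is_human) { $result .= $text }
    }
    return $result;
}

# Open the string as if it were a file and print the selected lines.
open(my $fh, '<', \$sample) or die "cannot open in-memory sample: $!";
print human_entries($fh);
close($fh);
```

With the sample above, only the ID and DE lines of the first entry are printed, because the second entry has a different OS line.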
Arrays:
Example:
#!/usr/local/bin/perl
#
@numbers = ("one", "two", "three", "four");
print scalar(@numbers) . "\n";
print "$numbers[0]\n";
Output:
4
one
Notes:
Array names start with @
References to array elements start with $
The numbering of array elements starts with 0, as in C.
The function scalar evaluates the array in scalar context, which yields the number of elements.
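A short sketch that puts these notes to work. It also uses $#numbers (the index of the last element) and the built-in push function, neither of which appears in the example above.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

my @numbers = ("one", "two", "three", "four");

my $size = scalar(@numbers);    # number of elements: 4
my $last = $#numbers;           # index of the last element: 3

push(@numbers, "five");         # append an element at the end

# Visit every element by index, from 0 to the last index.
for my $i (0 .. $#numbers) {
    print "$i: $numbers[$i]\n";
}
```

Because indices start at 0, the last index is always one less than the number of elements.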
Two-dimensional array:
Example:
#!/usr/local/bin/perl
#
@table = ( ["aa", "ab", "ac"], ["ba", "bb", "bc"] );
print scalar(@table) . "\n";
print "$#table\n";
print "$#{$table[0]}\n";
print "$table[1][2]\n";
Output:
2
1
2
bc
Note:
$#table returns the subscript of the last element of @table; $#{$table[0]} does the same for the row that $table[0] refers to. The older form $#{@array} is rejected with an error by recent Perl versions (5.22 and later).
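A minimal sketch of how the last-subscript notation can drive a loop over all rows and columns of such a table. The array @cells is a hypothetical helper that collects the elements in the order they are visited.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

my @table = ( ["aa", "ab", "ac"], ["ba", "bb", "bc"] );
my @cells;    # all elements in row-by-row order

for my $row (0 .. $#table) {                  # $#table is the last row index
    for my $col (0 .. $#{ $table[$row] }) {   # last index within this row
        push(@cells, $table[$row][$col]);
        printf "row %d, column %d: %s\n", $row, $col, $table[$row][$col];
    }
}
```

Taking the last index of each row separately means the loop also works when rows have different lengths.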
