Lecture14-Perl in Bioinformatics

This document discusses using the Perl programming language for bioinformatics tasks. It covers storing and manipulating DNA sequences, transcribing DNA to RNA, finding the reverse complement of a DNA strand, reading and writing protein sequences from files, using regular expressions to search strings, counting nucleotides, translating DNA to proteins, and simulating mutations. It also describes passing data between subroutines, scoping variables, and parsing annotation from FASTA format files. The final section outlines a lab assignment involving getting sequence data from a user, checking for valid FASTA format, and modeling the DNA replication, transcription and translation processes.

Uploaded by

noor ulain

BIO400

Lecture 13 & 14

Perl in Bioinformatics

Tamsila Parveen

Perl & Bioinformatics
• Perl is a popular programming language that's extensively used in areas such as bioinformatics and web programming.
• Biological data is proliferating rapidly.
• GenBank and the Protein Data Bank have been growing exponentially for some time now.
• Bioperl is an open source bioinformatics toolkit used by researchers all over the world.

Storing a DNA Sequence & Concatenating Two DNA Fragments
• Let's write a small program that stores some DNA in a variable and prints it to the screen.
• Two DNA fragments can then be concatenated (joined end to end) into a single sequence.
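
The slide's own program is not preserved in this text, so the following is only a minimal sketch of the idea. The two sequences and the variable names ($dna1, $dna2, $dna3, $dna4) are made up for illustration.

```perl
use strict;
use warnings;

# Two short, made-up DNA fragments (illustrative values only).
my $dna1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
my $dna2 = 'ATAGTGCCGTGAGAGTGATGTAGTA';

# Store each fragment in a scalar variable and print it to the screen.
print "First fragment:  $dna1\n";
print "Second fragment: $dna2\n";

# Concatenate with the dot operator ...
my $dna3 = $dna1 . $dna2;

# ... or by interpolating both variables into one double-quoted string.
my $dna4 = "$dna1$dna2";

print "Concatenated:    $dna3\n";
```

Both approaches give the same string; the dot operator is usually the clearer choice when no other text is mixed in.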

Transcription: DNA to RNA
• In the cell, transcription of DNA to RNA is the outcome of the workings of a delicate, complex, and error-correcting molecular machinery.
• In a program it's a simple substitution: when DNA is transcribed to RNA, all the T's are changed to U's.
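
A minimal sketch of that substitution, assuming the DNA is given as the coding strand. The subroutine name transcribe and the example sequence are hypothetical; tr/// is used here, and s/T/U/g would work equally well for uppercase input.

```perl
use strict;
use warnings;

# Transcribe a DNA coding strand into RNA by replacing every T with U.
sub transcribe {
    my ($dna) = @_;
    (my $rna = $dna) =~ tr/Tt/Uu/;   # copy first, so the caller's DNA is untouched
    return $rna;
}

my $dna = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';   # made-up example sequence
my $rna = transcribe($dna);

print "DNA: $dna\n";
print "RNA: $rna\n";
```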

Calculating the reverse complement of a strand of DNA
• The reverse function does exactly what's needed. It's designed to reverse the order of elements (in scalar context, the characters of a string).
• Next, substitute each base with its complement: A->T, T->A, G->C, C->G.
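
The two steps can be sketched as follows. The subroutine name revcom is an assumption for illustration. Note the use of tr/// for the complement step: running four separate s///g substitutions one after another would not work, because the A->T step followed by the T->A step would turn every original A back into A.

```perl
use strict;
use warnings;

# Return the reverse complement of a DNA string.
sub revcom {
    my ($dna) = @_;

    # Step 1: reverse the string (assignment to a scalar gives scalar context).
    my $revcom = reverse $dna;

    # Step 2: complement all bases in a single pass, keeping upper/lower case.
    $revcom =~ tr/ACGTacgt/TGCAtgca/;

    return $revcom;
}

my $dna = 'ACGGT';   # made-up example
print "Strand:             $dna\n";
print "Reverse complement: ", revcom($dna), "\n";
```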

Check if two strands of DNA are reverse complements of each other
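
One possible way to perform this check, sketched under the assumption that both strands are plain strings of A, C, G, and T. The subroutine names revcom and are_reverse_complements are hypothetical.

```perl
use strict;
use warnings;

# Reverse complement of a DNA string (uppercase input assumed here).
sub revcom {
    my ($dna) = @_;
    my $revcom = reverse $dna;
    $revcom =~ tr/ACGT/TGCA/;
    return $revcom;
}

# True if the second strand is the reverse complement of the first.
# Both strands are uppercased first so that case differences are ignored.
sub are_reverse_complements {
    my ($strand1, $strand2) = @_;
    return uc($strand1) eq revcom(uc($strand2)) ? 1 : 0;
}

for my $pair (['ACGGT', 'ACCGT'], ['ACGGT', 'ACGGT']) {
    my ($s1, $s2) = @$pair;
    my $verdict = are_reverse_complements($s1, $s2) ? 'are' : 'are not';
    print "$s1 and $s2 $verdict reverse complements\n";
}
```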

Flow Control
• Flow control is the order in which the statements of a program are executed.
• A program executes from the first statement at the top of the program to the last statement at the bottom, in order, unless told to do otherwise.
• There are two ways to tell a program to do otherwise: conditional statements and loops.
• A conditional statement executes a group of statements only if the conditional test succeeds; otherwise, it just skips the group of statements.
• A loop repeats a group of statements until an associated test fails.
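
As an illustration of both constructs, the sketch below uses a conditional to test for a start codon and a while loop to collect codons until a stop codon or the end of the sequence is reached. The sequence and variable names are made up.

```perl
use strict;
use warnings;

my $dna = 'ATGGCCTGATTT';   # made-up example sequence

# Conditional: the block runs only if the test succeeds.
if ($dna =~ /^ATG/) {
    print "Sequence begins with the start codon ATG\n";
}
else {
    print "Sequence does not begin with ATG\n";
}

# Loop: repeats while a full codon is still available.
my @codons;
my $position = 0;
while ($position + 3 <= length($dna)) {
    my $codon = substr($dna, $position, 3);
    last if $codon =~ /^(?:TAA|TAG|TGA)$/;   # leave the loop at a stop codon
    push @codons, $codon;
    $position += 3;
}

print "Codons before the first stop: @codons\n";
```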

Reading Proteins in Files
• First, create a file on your computer (use your text editor) and put some protein sequence data into it, or download a file that contains protein sequence data.
• Then let's develop a program that reads the protein sequence data from the file and stores it in a variable.
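
A minimal sketch of such a program. So that it runs on its own, it first writes a small temporary file using the core File::Temp module; in practice you would pass the name of your own file. The subroutine name read_sequence and the protein fragment are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file (made-up protein fragment) so the sketch is self-contained.
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "MNIDDKLEGL\n", "FLKCGGIDEM\n";
close $out or die "Cannot close '$filename': $!";

# Read every line of the file and join them into one sequence string.
sub read_sequence {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    my $sequence = '';
    while (my $line = <$fh>) {
        $line =~ s/\s+//g;    # drop the newline and any other whitespace
        $sequence .= $line;
    }
    close $fh;
    return $sequence;
}

my $protein = read_sequence($filename);
print "Protein: $protein\n";
```

Checking the result of open and reporting $! makes a missing or unreadable file easy to diagnose.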

Regular Expressions
• Regular expressions let you easily manipulate strings of all sorts, such as DNA and protein sequence data.
• What's great about regular expressions is that if there's something you want to do with a string, you usually can do it with Perl regular expressions.
• Some regular expressions are very simple. For instance, you can just use the exact text of what you're searching for as a regular expression.
• For example, to look for the word "bioinformatics" in a text, you could use the regular expression: /bioinformatics/
• Some regular expressions can be more complex, however.
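
The sketch below shows the simple literal pattern from the slide, followed by a slightly more complex pattern that uses a character class. The sample text, the DNA string, and the pattern /GC[AT]/ are made up for illustration.

```perl
use strict;
use warnings;

# A literal pattern: matches if the exact text occurs anywhere in the string.
my $text = 'Perl is extensively used in bioinformatics.';
if ($text =~ /bioinformatics/) {
    print "Found the word 'bioinformatics'\n";
}

# A more complex pattern: GC followed by either A or T.
# With the /g modifier in list context, every match is returned.
my $dna  = 'ACGGCATTAGC';
my @hits = $dna =~ /GC[AT]/g;
print "Matches of /GC[AT]/: @hits\n";
```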

Searching for motifs
• One of the most common things we do in bioinformatics is to look for motifs.
• This program finds motifs that the user types in at the keyboard.
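
The original program is not preserved in this text, so the following is only a sketch of one way to do it. The subroutine name find_motif, the protein fragment, and the default motif are assumptions. The motif is treated as a regular expression, and a default is used when the script is not run from an interactive terminal.

```perl
use strict;
use warnings;

# Return the 1-based start positions of all (non-overlapping) motif matches.
sub find_motif {
    my ($sequence, $motif) = @_;
    my $regex = eval { qr/$motif/i };       # the motif may not be a valid pattern
    return () unless defined $regex;
    my @positions;
    while ($sequence =~ /$regex/g) {
        push @positions, $-[0] + 1;         # $-[0] is the 0-based start of the match
    }
    return @positions;
}

my $protein = 'MNIDDKLEGLFLKCGGIDEM';       # made-up protein fragment
my $motif   = 'ID';                         # default used when not run interactively

if (-t STDIN) {
    print "Enter a motif to search for: ";
    my $typed = <STDIN>;
    if (defined $typed) {
        chomp $typed;
        $motif = $typed if length $typed;
    }
}

my @found = find_motif($protein, $motif);
if (@found) {
    print "Motif '$motif' found at position(s): @found\n";
}
else {
    print "Motif '$motif' not found\n";
}
```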

Counting Nucleotides
• There are many things you might want to know about a piece of DNA, such as how many of each nucleotide it contains.
• Two main steps to count each type of nucleotide in a DNA sequence:
1. Split the DNA into an array of single bases.
2. In a loop, look at each base in turn, counting as you iterate over the positions in the string of DNA.
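
The two steps above can be sketched as follows. The subroutine name count_bases and the "other" bucket for unexpected characters are assumptions made for this example.

```perl
use strict;
use warnings;

# Count A, C, G, and T in a DNA string; anything else is counted as "other".
sub count_bases {
    my ($dna) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0, other => 0);

    # Step 1: split the string into an array of single bases.
    my @bases = split //, uc($dna);

    # Step 2: look at each base in turn and update the matching counter.
    for my $base (@bases) {
        if ($base =~ /^[ACGT]$/) {
            $count{$base}++;
        }
        else {
            $count{other}++;
        }
    }
    return %count;
}

my %count = count_bases('AACGTTTN');   # made-up example
print "$_: $count{$_}\n" for qw(A C G T other);
```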

A subroutine to get Seq Data
• Pass data into the subroutine as arguments, and then collect the return value(s) of the subroutine.
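
A minimal sketch of this pattern, assuming the sequence data arrives as a list of FASTA-style lines. The subroutine name extract_sequence and the sample lines are hypothetical. Arguments arrive in the special array @_, and the result goes back to the caller through return.

```perl
use strict;
use warnings;

# Take a list of lines as arguments and return the sequence they contain.
# Header lines (starting with ">") and blank lines are skipped.
sub extract_sequence {
    my (@lines) = @_;            # the arguments arrive in @_
    my $sequence = '';
    for my $line (@lines) {
        next if $line =~ /^>/;       # FASTA header
        next if $line =~ /^\s*$/;    # blank line
        (my $clean = $line) =~ s/\s+//g;
        $sequence .= $clean;
    }
    return $sequence;            # the return value goes back to the caller
}

my @file_data = (">seq1 demo\n", "ACGT\n", "\n", "TTGA\n");   # made-up data
my $dna = extract_sequence(@file_data);
print "Sequence: $dna\n";
```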

Scoping
• By keeping all variables a subroutine uses active only within the subroutine, you can make it safe to call the subroutines from anywhere.
• You make the variables specific only to the subroutine by declaring them as my variables.
• my is a keyword defined in Perl that limits variables to the block in which they are declared.
• Hiding variables, and making them local to only a restricted part of a program, is called scoping.
• In Perl, using my variables is known as lexical scoping, and it's an essential part of modularizing programs.
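
The effect of my can be seen in a small sketch like the one below (names and values are made up). The $dna inside the subroutine is a separate variable from the file-level $dna, so changing it does not affect the caller.

```perl
use strict;
use warnings;

my $dna = 'ACGT';        # file-level variable

sub lowercase_copy {
    my ($dna) = @_;      # a new, separate $dna that exists only inside this subroutine
    $dna = lc $dna;      # changes the inner copy only
    return $dna;
}

my $lower = lowercase_copy($dna);

{
    my $temp = 'visible only inside this block';
    print "$temp\n";
}
# $temp no longer exists here; under "use strict", referring to it
# at this point would be a compile-time error.

print "outer: $dna, returned: $lower\n";
```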
Translation
• Translation is the process by which a protein is formed from an mRNA template.
• A subroutine can translate a DNA sequence into a peptide.
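
The slide's subroutine is not preserved in this text; the sketch below is one possible version, assuming the standard genetic code and reading frame 1 of the coding strand. The names %genetic_code and dna2peptide are hypothetical. The codon table is built from a 64-letter string listing the amino acids in T, C, A, G order; stop codons are written as "*" and unrecognized codons as "X".

```perl
use strict;
use warnings;

# Build the standard genetic code: codons in T, C, A, G order at each position.
my @bases       = qw(T C A G);
my $amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %genetic_code;
my $index = 0;
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            $genetic_code{"$first$second$third"} = substr($amino_acids, $index++, 1);
        }
    }
}

# Translate a DNA coding strand into a peptide, three bases at a time.
# A trailing partial codon is ignored.
sub dna2peptide {
    my ($dna) = @_;
    $dna = uc $dna;
    my $peptide = '';
    for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
        my $codon = substr($dna, $pos, 3);
        $peptide .= exists $genetic_code{$codon} ? $genetic_code{$codon} : 'X';
    }
    return $peptide;
}

print dna2peptide('ATGGCCTAA'), "\n";   # made-up example sequence
```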

Parsing Annotations
• Annotated sequence records carry much more than the sequence itself, so it's interesting to think about how to extract the useful information.
• The FEATURES table of a GenBank record is certainly a key part of the story.
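
As a rough sketch, the code below pulls the feature keys and their locations out of a tiny, made-up record laid out like a GenBank flat file (feature keys indented by five spaces, qualifiers indented further). The subroutine name parse_features and the record contents are assumptions; real records also have multi-line locations and qualifier values that this sketch does not handle.

```perl
use strict;
use warnings;

# A tiny, made-up record fragment in GenBank-like layout.
my $record = join "\n",
    'FEATURES             Location/Qualifiers',
    '     source          1..12',
    '                     /organism="Hypothetical organism"',
    '     gene            1..12',
    '                     /gene="demo"',
    '     CDS             1..12',
    '                     /product="demo protein"',
    'ORIGIN',
    '        1 atggcctaat ga',
    '//';

# Return a list of { key => ..., location => ... } hashes, one per feature.
sub parse_features {
    my ($text) = @_;
    my @features;
    my $in_features = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /^FEATURES/) { $in_features = 1; next; }
        next unless $in_features;
        last if $line =~ /^\S/;              # next top-level section, e.g. ORIGIN
        if ($line =~ /^ {5}(\S+)\s+(\S+)/) { # feature key lines start in column 6
            push @features, { key => $1, location => $2 };
        }
    }
    return @features;
}

for my $feature (parse_features($record)) {
    print "$feature->{key}\t$feature->{location}\n";
}
```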

Random Nucleotide Generation
• Mutation is a fundamental topic in biology. Some mutations affect proteins and may result in diseases such as cancer.
• Using randomization, it's possible to simulate and investigate the mechanisms of mutations in DNA and their effect upon the biological activity of their associated proteins.
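
A minimal sketch of a simulated point mutation, using Perl's built-in rand. The subroutine names random_nucleotide and mutate are hypothetical, and the model (one random position, replaced by a different random base) is a deliberately simple assumption.

```perl
use strict;
use warnings;

# Pick one of the four nucleotides at random.
sub random_nucleotide {
    my @nucleotides = qw(A C G T);
    return $nucleotides[ int rand(scalar @nucleotides) ];
}

# Return a copy of the DNA with one randomly chosen base changed to a different base.
sub mutate {
    my ($dna) = @_;
    return $dna unless length $dna;          # nothing to mutate

    my $position = int rand(length($dna));
    my $old_base = substr($dna, $position, 1);

    my $new_base;
    do { $new_base = random_nucleotide() } until $new_base ne $old_base;

    substr($dna, $position, 1, $new_base);   # four-argument substr replaces in place
    return $dna;
}

my $dna    = 'AAAAAAAAAAAAAAAAAAAA';         # made-up example
my $mutant = mutate($dna);
print "Original: $dna\n";
print "Mutant:   $mutant\n";
```

Calling mutate repeatedly on its own output simulates an accumulation of point mutations, and the result could then be translated to see the effect on the peptide.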

Lab Assignment 4
• Get the sequence file from the user
1. Check carefully whether it is a FASTA file (use error handling; if the file is not in FASTA format, print “Sorry this is not FASTA file”)
2. Check the type of sequence data (DNA, RNA, or protein)
3. Use annotation parsing to get all features from the Feature Table, e.g. mRNA, CDS, and gene
4. Code the complete process (that is, get a DNA sequence and model its replication, transcription, and translation)
5. Report the number of nucleotides and amino acids.
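
As a starting point for steps 1 and 2 only, the checks could be sketched like this. The subroutine names looks_like_fasta and guess_type are hypothetical, and both tests are simple heuristics: for example, a short protein made only of the letters A, C, G, and T would be reported as DNA.

```perl
use strict;
use warnings;

# Heuristic FASTA check: first non-blank line is a ">" header,
# and every other non-header line contains only sequence characters.
sub looks_like_fasta {
    my (@lines) = @_;
    my @nonblank = grep { /\S/ } @lines;
    return 0 unless @nonblank >= 2;
    return 0 unless $nonblank[0] =~ /^>/;
    for my $line (@nonblank[1 .. $#nonblank]) {
        next if $line =~ /^>/;                        # further records
        return 0 unless $line =~ /^[A-Za-z*\-\s]+$/;
    }
    return 1;
}

# Heuristic guess of the sequence type from its alphabet.
sub guess_type {
    my ($sequence) = @_;
    $sequence = uc $sequence;
    return 'DNA'     if $sequence =~ /^[ACGTN]+$/;
    return 'RNA'     if $sequence =~ /^[ACGUN]+$/;
    return 'Protein' if $sequence =~ /^[A-Z*]+$/;
    return 'Unknown';
}

my @lines = (">seq1 demo\n", "ATGGCCTAA\n");          # made-up file contents
if (looks_like_fasta(@lines)) {
    my $sequence = join '', grep { !/^>/ } @lines;
    $sequence =~ s/\s+//g;
    print "Sequence type: ", guess_type($sequence), "\n";
}
else {
    print "Sorry this is not FASTA file\n";
}
```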