Lec 2 PDF

This document provides an overview of Biopython, a widely used Python package for bioinformatics. It discusses why Python is well-suited for bioinformatics applications due to its cross-platform use, built-in features, dynamic and modular nature. The document then describes several popular Python tools for bioinformatics including Biopython, PyMOL, Scikit-learn, and NumPy. Biopython is highlighted as an open-source collection of Python modules for biological computations that can work with DNA, RNA, protein sequences and structures. The document also discusses common data types used as inputs for Biopython.

Uploaded by

ziadmohamad3412

BioPython

Edited by
Python Programming
in bioinformatics
Why Python?
• Python can be installed and used on different platforms, including Windows, Mac, and Linux.
• Python has several built-in features that make it well-suited for bioinformatics applications.
• Python's dynamic and modular nature allows researchers to reuse and share code, reducing development time and increasing productivity.
• Python has a relatively simple syntax, making it easy to learn and use.
• Python is a high-level language that offers advanced data structures and functions that make it easy to work with complex biological data.
Tools for Python Programming in
Bioinformatics

1. Biopython
• Biopython is one of the most widely used bioinformatics packages for Python. It is an open-source collection of Python modules that provides powerful, easy-to-use tools for biological computation.
• Biopython requires very little code to carry out common tasks.
• Some of the tasks Biopython supports are:
  ◦ Working with DNA, RNA, and protein sequences, including sequence alignment, motif and pattern matching, and translation of nucleotide sequences into protein sequences.
  ◦ Working with protein structures, such as parsing and manipulating PDB files and performing structure comparisons.
  ◦ Reading file formats commonly used in bioinformatics, such as FASTA, GenBank, and BLAST output.
  ◦ Visualizing biological data, such as sequence alignment plots and phylogenetic trees.
  ◦ BioSQL: a standard set of SQL tables for storing sequences together with their features and annotations.
2. PyMOL
• PyMOL is a molecular visualization program used in bioinformatics, and an open-source version is available. It creates high-quality images and animations of molecular structures, which are useful in applications such as drug discovery, protein engineering, and molecular biology research.
• PyMOL can be scripted in Python and integrates easily with other Python-based tools and libraries.
3. Scikit-learn
• Scikit-learn is a Python library for machine learning. It provides a wide range of algorithms and tools that can be used to analyze complex biological datasets and make predictions about biological systems.
• Some uses of Scikit-learn in bioinformatics are:
  ◦ Classifying biological samples based on gene expression data or proteomics data.
  ◦ Clustering biological samples or reducing the dimensionality of large datasets.
  ◦ Developing machine learning models that predict protein structure and protein-protein interactions from amino acid sequences.
4. NumPy (Numerical Python)
• NumPy is a Python library for working with numerical data. Pandas, SciPy, Matplotlib, Scikit-learn, and many other scientific Python packages build on it. NumPy provides a multidimensional array object called 'ndarray' and supports a wide range of mathematical operations on arrays.
To install and import Biopython:
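The installation step is not shown in the slide text. As a minimal sketch, assuming pip is available, Biopython is installed under the package name `biopython` and imported under the name `Bio`. The check below only reports whether the import would succeed; the variable name `biopython_version` is illustrative.

```python
# Install from the command line first (assumes pip is available):
#     pip install biopython
import importlib.util

# The package is installed as "biopython" but imported as "Bio".
if importlib.util.find_spec("Bio") is None:
    biopython_version = None
    print("Biopython is not installed; run: pip install biopython")
else:
    import Bio
    biopython_version = Bio.__version__
    print("Biopython version:", biopython_version)
```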
What are the input data types for Biopython?
• Text file:
  1. Sequence file (sequence.txt)
  2. Cell Microarray
• CSV file
• FASTA file
• Other file formats, such as:
  ◦ BLAST output
  ◦ GenBank
  ◦ PubMed and Medline
  ◦ SCOP, including 'dom' and 'lin' files
  ◦ UniGene
  ◦ SwissProt
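To make the FASTA input type concrete, here is a small sketch that reads FASTA-formatted text using plain Python only. The record names, sequences, and the helper `read_fasta` are hypothetical and chosen for illustration; in practice Biopython's `Bio.SeqIO` module provides a parser for this format.

```python
import io

# Hypothetical FASTA content; a real file would be opened with open().
fasta_text = """>seq1 example record
ATGGCCATTG
TAATGGGCC
>seq2 another record
GGGTTTAAAC
"""

def read_fasta(handle):
    """Return a dict mapping each record ID to its full sequence."""
    records = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The ID is the first word after the '>' marker.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are joined until the next '>' header.
            records[current_id] += line
    return records

records = read_fasta(io.StringIO(fasta_text))
for name, seq in records.items():
    print(name, len(seq), seq)
```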
Overview on

Some key notes in Python

Data Types: Number Types

int, float, complex

1. Integer numbers:
>>> type(4)
<class 'int'>

2. Real numbers:
>>> type(4.5)
<class 'float'>

3. Complex numbers:
>>> type(3+2j)
<class 'complex'>

>>> (2+1j)**2
(3+4j)
>>> 17/5
3.4
>>> 17//5
3
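Data Types: Strings

• Single quotes:
>>> 'atg'
'atg'

• Double quotes:
>>> "atg"
'atg'

• A single quote inside a single-quoted string ends the string early:
>>> 'This is a codon, isn't it?'
SyntaxError

• Use double quotes around the string, or escape the inner quote with a backslash:
>>> "This is a codon, isn't it?"
"This is a codon, isn't it?"
>>> 'This is a codon, isn\'t it?'
"This is a codon, isn't it?"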

String Operators
• Escape character: the backslash '\' gives a special meaning to the character that follows it.
• To produce more readable output, use print().

Construct   Meaning
\n          Newline
\t          Tab
\\          Backslash
\"          Double quote

Operator    Meaning
+           Concatenate
*           Copy or replicate
in          Checks if the first string IS in the second string
not in      Checks if the first string IS NOT in the second string

>>> 'atg' + 'gcc'
'atggcc'
>>> 'atg' * 3
'atgatgatg'
>>> 'tg' in 'atgatgatg'
True
>>> 'tc' in 'atgatgatg'
False
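The escape constructs and operators above can be combined in a short sketch. The variable names (`codon`, `report`, `note`) are illustrative only.

```python
codon = "atg"

# \t inserts a tab and \n starts a new line when the string is printed.
report = "codon:\t" + codon + "\nrepeated:\t" + codon * 3
print(report)

# repr() shows the escape sequences as written instead of applying them.
print(repr(report))

# \\ produces one backslash; \" puts a double quote inside a double-quoted string.
note = "Backslash: \\ and quote: \""
print(note)

# The membership operators return booleans.
has_atg = "atg" in report
no_taa = "taa" not in report
print(has_atg, no_taa)
```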
Variables
• Variables are containers that store numbers, strings, and other data types and structures.
• Variables are names given to values; the value a name refers to can be changed.
• Variables are assigned values using the equal sign (=).
>>> codon = 'tag'
>>> dna_sequence = "gtcgcctaaccgtatatttttcccgt"
• Using a variable that has not been assigned a value raises an error:
>>> dna
NameError: name 'dna' is not defined

Variables
• Naming
  ◦ Select meaningful names: dnaSequence is better than s.
  ◦ Follow the naming rules:
    - Names are case-sensitive, so these are three different variables:
      DnaSequence = 1
      DNASEQUENCE = 2
      Dnasequence = 3
    - Names consist of letters, digits, and underscores: Dna1, dna_1, dnaSeq.
    - A name must not start with a digit. Invalid: 1dna
    - No other special characters are allowed. Invalid: dna#, dna@1
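The naming rules above can be checked programmatically. This sketch uses the built-in `str.isidentifier()` method and the standard `keyword` module; the helper `is_valid_name` is hypothetical, and the reserved-word check (for example `class`) is an addition not covered in the slides.

```python
import keyword

candidates = ["DnaSequence", "dna_1", "dnaSeq", "1dna", "dna#", "dna@1", "class"]

def is_valid_name(name):
    """True if name follows Python's identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_valid_name(name) for name in candidates}
for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```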
String Operators
• [i]: returns the character at index i of a string (indexing).
• [i:j]: returns the substring from index i up to, but not including, index j (slicing).
>>> dna = "gatcccccgatattatttgc"
>>> dna[0]       # the first position in a string is index 0
'g'
>>> dna[-1]      # negative indices count from the right, starting at -1
'c'
>>> dna[-2]
'g'
>>> dna[0:3]     # in slices, the start index is included and the end index is excluded
'gat'
>>> dna[:3]      # omitting the start index uses the default, 0
'gat'
>>> dna[2:]      # omitting the end index uses the default, the end of the string
'tcccccgatattatttgc'
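As a sketch of how slicing is used in practice, the same `dna` string can be cut into three-letter codons. The names `codons` and `leftover` are illustrative.

```python
dna = "gatcccccgatattatttgc"

# Step through the sequence three letters at a time; dna[i:i + 3] includes
# index i and excludes index i + 3.
codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
print(codons)

# Whatever does not fill a complete codon is left at the end.
leftover = dna[len(codons) * 3:]
print(leftover)
```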
Strings as Objects
• String variables are objects that can perform specific actions using built-in methods; built-in functions such as len() also accept strings:
>>> dna = "gatcccccgatattatttgc"
>>> len(dna)
20
>>> dna.count('t')     # count characters
7
>>> dna.count('ga')    # count substrings
2
String Functions
>>> dna = "gatcccccgatattatttgc"
>>> dna.upper()             # convert to upper case; lower() converts to lower case
'GATCCCCCGATATTATTTGC'
>>> dna.find('ga')          # index of the first occurrence of 'ga', -1 if not found
0
>>> dna.find('at', 5)       # index of the first occurrence of 'at' at or after index 5
9
>>> dna.rfind('ga')         # index of the last occurrence of 'ga', -1 if not found
8
>>> dna.islower()           # True if all letters are lower case
True
>>> dna.isupper()
False
>>> dna.replace('a', 'A')   # returns a copy with every 'a' replaced by 'A'
'gAtcccccgAtAttAtttgc'
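The methods above are enough for simple sequence statistics. The sketch below, using the same `dna` string, computes GC content with `count()` and lists every position of a motif with repeated `find()` calls. The variable names are illustrative.

```python
dna = "gatcccccgatattatttgc"
seq = dna.upper()

# GC content: fraction of letters that are G or C.
gc_content = (seq.count("G") + seq.count("C")) / len(seq)
print(f"GC content: {gc_content:.2f}")

# Collect every start index of the motif by restarting find() just past
# the previous hit; find() returns -1 when there are no more matches.
motif = "AT"
positions = []
start = seq.find(motif)
while start != -1:
    positions.append(start)
    start = seq.find(motif, start + 1)
print(positions)
```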
Inputs

>>> dna = input("Enter a DNA sequence, please: ")
Enter a DNA sequence, please: agtagcatgaggagggacttc
>>> dna
'agtagcatgaggagggacttc'
Examples:
• Create a random DNA sequence of length 10:
import random

alphabet = "AGCT"
sequence = ""
for i in range(10):
    # randint(0, 3) returns a random index from 0 to 3, both ends included
    index = random.randint(0, 3)
    sequence = sequence + alphabet[index]
print(sequence)
Read from a text file
• read(x): reads up to x characters (bytes in binary mode) from the file. If you don't supply a size, it reads the entire file and returns it as a single string.
• readline(x): reads up to x characters from the current line. If you don't supply a size, it reads until it reaches a newline (\n) or the end of the file.
• readlines(): reads all remaining lines and returns them as a list of strings, one per line.
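A minimal sketch comparing the reading methods. The file name `sequence.txt` and its contents are assumptions made for the example, and the file is created in a temporary directory so the code is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "sequence.txt")
with open(path, "w") as f:
    f.write("atggcc\nttaacg\nggctaa\n")

with open(path) as f:
    whole = f.read()              # entire file as one string

with open(path) as f:
    first_four = f.read(4)        # only the first 4 characters
    rest_of_line = f.readline()   # continues to the end of the current line

with open(path) as f:
    lines = f.readlines()         # list with one string per line

print(repr(whole))
print(repr(first_four), repr(rest_of_line))
print(lines)
```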
Write a text file
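The slide's own example is not present in the text, so the following is only a sketch of writing a text file with `open()` in 'w' mode. The file name `output.txt` and the sequences are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.txt")
sequences = ["atggcc", "ttaacg", "ggctaa"]

# 'w' creates the file (or erases an existing one); write() does not add
# a newline on its own, so "\n" is included explicitly.
with open(path, "w") as f:
    for number, seq in enumerate(sequences, start=1):
        f.write(f"seq{number}\t{seq}\n")

# Read the file back to confirm what was written.
with open(path) as f:
    content = f.read()
print(content)
```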
Notes about file modes

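What does + mean in open()?

• The + adds the missing direction (reading or writing) to an existing open mode (update mode).
• r opens the file for reading; r+ opens it for reading and writing. The file must already exist, and its contents are kept.
• w opens the file for writing; w+ opens it for writing and reading. Both create the file if needed and erase any existing contents.
• a opens the file for writing in append mode; a+ opens it for reading and writing in append mode. Writes always go to the end of the file.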
Examples:
• Difference between r and r+ in open()

Mode r (read only):
with open('file.txt', 'r') as f:
    print(f.read())
Output (terminal):
welcome to python 1
welcome to python 2
welcome to python 3
welcome to python 4

Mode r+ (read and write):
with open('file.txt', 'r+') as f:
    f.write("new line \n")
Output (file.txt):
The write starts at the beginning of the file, so "new line" replaces the first characters of file.txt instead of being inserted before them. The remaining lines are unchanged.

Writing in mode r fails:
with open('file.txt', 'r') as f:
    f.write("test \n")
io.UnsupportedOperation: not writable
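The behaviour of r and r+ can be verified with a self-contained sketch. It assumes a file holding the four "welcome to python" lines and creates it in a temporary directory; the variable names are illustrative.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.txt")
original = "".join(f"welcome to python {i}\n" for i in range(1, 5))
with open(path, "w") as f:
    f.write(original)

# 'r' is read-only: an attempt to write raises io.UnsupportedOperation.
try:
    with open(path, "r") as f:
        f.write("test \n")
    write_failed = False
except io.UnsupportedOperation as err:
    write_failed = True
    print("r mode:", err)

# 'r+' starts at position 0 and overwrites the characters already there.
with open(path, "r+") as f:
    f.write("new line \n")

with open(path) as f:
    after = f.read()
print(after)
```

The first line of the original file is partly overwritten rather than pushed down, which is why r+ is usually combined with `seek()` or used only when overwriting in place is intended.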
Examples:
• Difference between w and w+ in open()

Mode w (write only):
with open('file.txt', 'w') as f:
    f.write("test 1\n")
    f.write("test 2\n")
    f.write("test 3\n")
Output (file.txt):
test 1
test 2
test 3

Mode w+ (write and read):
with open('file.txt', 'w+') as f:
    f.write("test 1\n")
    f.write("test 2\n")
    f.write("test 3\n")
    f.seek(0)
    lines = f.read()
    print(lines)
Output (terminal):
test 1
test 2
test 3

Note: f.seek(0) moves the file pointer to the beginning of the file.

Examples:
• Difference between a and a+ in open()

Mode a (append only):
with open('file.txt', 'a') as f:
    f.write("3")
Output (file.txt):
welcome to python 1
welcome to python 2
welcome to python 3
welcome to python 4
3

Mode a+ (append and read):
with open('file.txt', 'a+') as f:
    f.seek(0)
    lines = f.readlines()
    f.write("\n" + str(len(lines)))
Output (file.txt):
welcome to python 1
welcome to python 2
welcome to python 3
welcome to python 4
4
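The append modes can be tried out with a self-contained sketch. The file contents are assumptions for the example, and the file lives in a temporary directory. The point illustrated is that in a+ the file pointer starts at the end, so `seek(0)` is needed before reading, while writes still land at the end.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as f:
    f.write("welcome to python 1\nwelcome to python 2\n")

# 'a' can only add to the end of the file.
with open(path, "a") as f:
    f.write("welcome to python 3\n")

# 'a+' can also read, but reading right after opening returns nothing
# because the pointer is already at the end of the file.
with open(path, "a+") as f:
    at_open = f.read()
    f.seek(0)                          # go back to the beginning
    lines = f.readlines()
    f.write(str(len(lines)) + "\n")    # appended at the end regardless

with open(path) as f:
    final = f.read()
print(final)
```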
Assignment
• Apply all the functions discussed above to a text file that you create yourself.
