
Chapter 01

Bioinformatics

Biological Sequences

Wiley Publishing. 2007. All Rights Reserved.

Learning Objectives
Crash course in molecular biology
Knowing the basic properties of the main
biological sequences: DNA, RNA, and
proteins

Outline

1. Protein sequences
2. DNA sequences
3. RNA sequences
4. Entire genomes

Proteins
Proteins are like small machines in the cell.
Proteins carry out most of the work in a cell.
Proteins are synthesized from RNA sequences.

Amino Acids
Proteins are made of 20 amino acids.
Each amino acid is a small molecule made up of fewer than
100 atoms.
The 20 amino acids have similar terminations; they can be
chained to one another like Lego bricks.

Protein Sequences
Proteins are made of amino acids chained by peptide bonds.
Protein sequences are written from the N to the C-terminus.
An average protein is about 400 amino acids long.
The longest known proteins exceed 30,000 amino acids.

Protein Structures
Proteins have well-defined 3-dimensional structures.
Hydrophobic amino acids are in the protein's core.
Hydrophilic amino acids are on the protein's surface.

Techniques for Bioinformatic Analysis of Proteins

Browsing protein databases (Ch. 4)
Predicting transmembrane segments (Ch. 6)
Predicting secondary structural elements (Ch. 11)
Predicting the 3D structure (Ch. 11)
Predicting the domains (Ch. 6)
Predicting the physico-chemical properties (Ch. 6):
  Weight
  Isoelectric point (pI)
  Digestion patterns

Bioinformatic Analysis of Proteins: More Techniques
Searching protein databases with BLAST (Ch. 7)
Comparing protein sequences with alignments (Ch. 8)
Making multiple-sequence alignments to look for conserved patterns (Ch. 9)
Visualizing 3D protein structures using the PDB database (Ch. 11)
Reconstructing the phylogenetic tree of a protein family (Ch. 13)

DNA
DeoxyriboNucleic Acid
Genomes and genes are made of DNA
DNA is the main support of heredity

DNA Sequences
DNA sequences are made of 4 nucleotides:
Adenine (A)
Guanine (G)
Cytosine (C)
Thymine (T)

DNA sequences can be very long:
Human chromosomes contain hundreds of millions of nucleotides
A tiny bacterium can contain a genome of several million nucleotides

Nucleotides
Nucleotides have similar terminations.
Nucleotides are meant to be chained like Lego bricks.
Nucleotides can interact with each other:
Adenine with thymine (A with T)
Guanine with cytosine (G with C)

Double-strand DNA
DNA sequences always come in two strands.
The strands are complementary and opposite in orientation.
By convention, biologists write only one strand, in the 5' to 3' direction.
Database-search programs search both strands automatically.
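
The pairing rules above (A with T, G with C) are enough to recover the strand that is not written down. A minimal Python sketch follows; the function name `reverse_complement` and the example sequence are illustrative and do not come from the slides.

```python
# Pairing rules from the slides: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, also written 5' to 3'.

    The sequence is read backwards because the two strands run in
    opposite orientations, and each base is swapped for its partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

written_strand = "ATGCCGTT"  # made-up example sequence
other_strand = reverse_complement(written_strand)
print(written_strand, other_strand)
```

Applying the function twice gives back the original sequence, which is a quick sanity check on the pairing table.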

RNA
RiboNucleic Acid
RNA is a close relative of DNA
RNA has many functions
Provides coding for proteins
Helps synthesize proteins
Helps many basic processes in the cell

RNA is not very stable
RNA is continually synthesized and is often degraded quickly
DNA, by contrast, is very stable

The RNA Sequence


RNA contains 4 nucleotides: A, G, C, U
U is Uracil

RNA does not contain Thymine (T)
Uracil replaces Thymine in RNA

RNA is single-stranded
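
Because uracil takes the place of thymine, the RNA copy of a gene can be written from the DNA sequence by swapping T for U. The Python sketch below assumes the input is the DNA strand that reads the same as the RNA (often called the coding strand); the function name `dna_to_rna` is illustrative.

```python
def dna_to_rna(coding_strand: str) -> str:
    """Write the RNA version of a DNA sequence: same letters, U instead of T.

    Assumption: the input is the strand that matches the RNA,
    not the complementary strand that the cell actually copies.
    """
    return coding_strand.upper().replace("T", "U")

print(dna_to_rna("TCTTATGCGTAA"))
```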

RNA Secondary Structures


RNA can make secondary structures
A single RNA strand can fold back and pair with itself to form a secondary structure
Secondary structures are made of stems and loops
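
As a toy illustration of a stem and a loop, the Python sketch below counts how many bases pair up when the two ends of an RNA sequence fold onto each other. It is a deliberate simplification: real prediction tools use energy models, and the minimum loop size of 3, the function name `hairpin_stem_length`, and the example sequence are assumptions made for this sketch. Only A-U and G-C pairs are considered.

```python
# Base pairs considered here: A with U, G with C (other pairings are ignored).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def hairpin_stem_length(rna: str, min_loop: int = 3) -> int:
    """Count consecutive base pairs formed from the two ends inward.

    Pairing stops at the first mismatch, or when fewer than
    `min_loop` unpaired bases would remain in the loop.
    """
    rna = rna.upper()
    left, right = 0, len(rna) - 1
    stem = 0
    while right - left - 1 >= min_loop and (rna[left], rna[right]) in PAIRS:
        stem += 1
        left += 1
        right -= 1
    return stem

# GGG pairs with CCC (the stem); AAAU stays unpaired (the loop).
print(hairpin_stem_length("GGGAAAUCCC"))
```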

What Is the Length of My Sequence?
Protein sizes are expressed in amino acids or in Daltons
115 Daltons ~ 1 amino acid

DNA and RNA sequence lengths are expressed in base pairs (bp)
One Kbp or Kb: 1 thousand base pairs
One Mbp or Mb: 1 million base pairs
One Gbp or Gb: 1 billion base pairs

The following terms often have the same meaning:

Base
Base-pair (bp)
Nucleotide (nt)
Positions, nucleotides, residues
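
The unit conventions above can be turned into two small helpers. This Python sketch uses the slide's rule of thumb of 115 Daltons per amino acid; the function names and the example values are illustrative only.

```python
DALTONS_PER_AMINO_ACID = 115  # rule of thumb from the slide, not an exact mass

def approx_protein_mass(n_amino_acids: int) -> int:
    """Rough protein mass in Daltons from its length in amino acids."""
    return n_amino_acids * DALTONS_PER_AMINO_ACID

def format_length(n_bp: int) -> str:
    """Express a DNA or RNA length in bp, Kb, Mb, or Gb."""
    for factor, unit in ((10**9, "Gb"), (10**6, "Mb"), (10**3, "Kb")):
        if n_bp >= factor:
            return f"{n_bp / factor:g} {unit}"
    return f"{n_bp} bp"

print(approx_protein_mass(400))   # a 400-residue protein
print(format_length(4_600_000))   # a genome of several million nucleotides
```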

Turning DNA into Proteins: The Genetic Code
DNA gets transcribed into RNA using nucleotide complementarity.
RNA gets translated into proteins using the genetic code:

UCU UAU GCG UAA
SER-TYR-ALA-STOP
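
The example above can be reproduced with a lookup table that reads the RNA three nucleotides at a time. The Python sketch below contains only the four codons shown on the slide, so it is intentionally incomplete (the full genetic code has 64 codons); the names `CODON_TABLE` and `translate` are illustrative.

```python
# Partial table: only the four codons from the slide's example.
CODON_TABLE = {"UCU": "SER", "UAU": "TYR", "GCG": "ALA", "UAA": "STOP"}

def translate(rna: str) -> str:
    """Translate an RNA sequence codon by codon, stopping at a stop codon.

    Raises KeyError for any codon missing from the partial table.
    """
    rna = rna.upper().replace(" ", "")
    residues = []
    for start in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[start:start + 3]]
        residues.append(residue)
        if residue == "STOP":
            break
    return "-".join(residues)

print(translate("UCU UAU GCG UAA"))
```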

Some Bioinformatics Applications for RNA and DNA

Using DNA-sequence databases (Ch. 2, 3)
Identifying restriction sites (Ch. 5)
Designing PCR primers (Ch. 5)
Predicting RNA secondary structures (Ch. 12)
Comparing DNA sequences (Ch. 7, 8, and 9)

Some Bioinformatics Applications for Entire Genomes

Finding your favorite genome online in public databases (Ch. 3)
Comparing your genome with others (Ch. 3, Ch. 7)
Predicting the genes in your genome (Ch. 5)
Building a phylogeny with your genome (Ch. 3)
