What is proteomics?
Proteomics is the analysis of the
protein complement of the genome
Gene → Transcript → Protein
Genomics studies the genes; proteomics studies the proteins
Proteomics is multidisciplinary
Proteomics draws on:
• Protein Biochemistry
• Analytical Chemistry
• Biology
• Molecular Biology
• Bioinformatics
Proteomics Research
•Basic research:
To understand the molecular mechanisms
underlying life.
•Applied research:
Clinical testing for proteins associated with
pathological states (e.g. cancer).
Applications of Proteomics
• Protein Expression Profiling: Medical Microbiology, Signal Transduction, Disease Mechanisms
• Proteome Mining: Drug Discovery, Target ID, Differential Display
• Post-translational Modifications: Glycosylation, Phosphorylation, Proteolysis
• Functional Proteomics: Yeast Genomics, Affinity Purified Protein Complexes, Mouse Knockouts
• Protein-protein Interactions: Yeast two-hybrid, Co-precipitation, Phage Display
• Structural Proteomics: Organelle Composition, Subproteome Isolation, Protein Complexes
For example: Hemoglobin
Picks up oxygen in the lungs, travels through
the blood, and delivers it to the cells.
[Figure: hemoglobin, made of two Hbα and two Hbβ chains, carrying O2]
Sickle cell disease is caused
by a single amino acid change.
Normal Hbβ:  ATG GTG CAC CTG ACT CCT GAG GAG …
             M   V   H   L   T   P   E   E …
Mutated Hbβ: ATG GTG CAC CTG ACT CCT GTG GAG …
             M   V   H   L   T   P   V   E …
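The comparison above can be sketched in a few lines of Python. This is a minimal illustration, assuming a codon table that holds only the codons shown on the slide; the function names are hypothetical and chosen for this example.

```python
# Codon table restricted to the codons that appear on the slide (not a full genetic code).
CODON_TO_AA = {
    "ATG": "M", "GTG": "V", "CAC": "H", "CTG": "L",
    "ACT": "T", "CCT": "P", "GAG": "E",
}

NORMAL = "ATG GTG CAC CTG ACT CCT GAG GAG".split()
MUTATED = "ATG GTG CAC CTG ACT CCT GTG GAG".split()


def translate(codons):
    """Translate a list of DNA codons into a one-letter amino acid string."""
    return "".join(CODON_TO_AA[codon] for codon in codons)


def differences(codons_a, codons_b):
    """Return (position, codon_a, codon_b) for every codon that differs (1-based)."""
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(codons_a, codons_b), start=1)
        if a != b
    ]


normal_protein = translate(NORMAL)
mutated_protein = translate(MUTATED)
changes = differences(NORMAL, MUTATED)

print(normal_protein, mutated_protein, changes)
```

A single changed codon (GAG to GTG) swaps glutamate (E) for valine (V), which is the one amino acid change the slide refers to.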
Summary – what is proteomics?
•Involves the study of proteins
•Proteomics is multidisciplinary
•Proteomics is being applied to both basic and clinical
research
Why study proteins?
What are PROTEINS?
Proteins are large, complex molecules
that serve diverse functional and
structural roles within cells.
Proteins do most of the work
in the cell
• Enzyme: Protease - degrades protein
• Transport: Hemoglobin - carries O2
• Motion: Actin - contracts muscles
• Regulation: Insulin - controls blood glucose
• Support: Keratin - forms hair and nails
• Defense: Antibody - fights viruses
Proteins are comprised of amino
acid building blocks
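[Figure: general amino acid structure, with an amino group (H2N, base), a carboxyl group (COOH, acid), and a variable side chain (R) on the central carbon]
[Figure: amino acid 1 (R1) + amino acid 2 (R2) → dipeptide + H2O; the new bond between the C=O of amino acid 1 and the N-H of amino acid 2 is the peptide bond]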
Each amino acid has unique
chemical properties.
• basic: Histidine, Lysine, Arginine
• acidic: Aspartate, Glutamate
• non-polar hydrophobic: Valine, Proline, Alanine, Isoleucine, Phenylalanine, Leucine, Methionine, Tryptophan
• polar hydrophilic: Serine, Cysteine, Glutamine, Glycine, Tyrosine, Asparagine, Threonine
Proteins are chains of amino acids.
Short chains of amino acids are
called peptides.
Proteins are polypeptide molecules
that contain many amino acids joined by peptide bonds.
[Figure: translation. A gene in the nucleus gives rise to messenger ribonucleic acid (mRNA), which the ribosome (large and small subunits) reads from 5’ to 3’ in the cytoplasm. Amino acid-carrying transfer RNAs (tRNAs) deliver Met, Ala, and Trp; empty tRNAs leave the ribosome.]
Ribonucleotides: A G C U
mRNA: A U G  G C C  U G G  U A G
Codon 1 A U G = Methionine
Codon 2 G C C = Alanine
Codon 3 U G G = Tryptophan
Codon 4 U A G = Stop
Translation is the synthesis of proteins in the cell.
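As a rough sketch of how the ribosome reads the message, the snippet below walks along the slide's mRNA three ribonucleotides at a time and stops at the stop codon. The codon table holds only the four codons from the slide, and the function name is an assumption made for this example.

```python
# Only the four codons shown on the slide; a real codon table has 64 entries.
CODONS = {"AUG": "Met", "GCC": "Ala", "UGG": "Trp", "UAG": "Stop"}


def translate_mrna(mrna):
    """Read an mRNA string codon by codon and return the amino acids before the stop codon."""
    residues = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[start:start + 3]]
        if amino_acid == "Stop":
            break  # the stop codon ends translation and adds no amino acid
        residues.append(amino_acid)
    return residues


peptide = translate_mrna("AUGGCCUGGUAG")
print("-".join(peptide))
```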
Proteins have specific architecture
http://www.path.cam.ac.uk/~mrc7/igs/mikeimages.html
Proteins arrive at their final
structure in an ordered fashion
J. E. Wampler, 1996, http://bmbiris.bmb.uga.edu/wampler/tutorial/prot0.html
Summary – why study proteins?
•Biological workhorses that carry out most of the
functions within the cell
•Serve diverse functional and structural roles
•Composed of amino acids that are covalently
linked by peptide bonds
•Synthesized during the translation process
•Must fold correctly to perform their functions
Proteomic tools and methods
Proteomic tools to study proteins
• Protein isolation
• Protein separation
• Protein identification
Protein Isolation
How are proteins isolated?
• Mechanical Methods
– grinding – breaks open cells
– centrifugation – removes insoluble debris
• Chemical Methods
– detergent – breaks open cell compartments
– reducing agent – breaks disulfide bonds within and between proteins
– heat – disrupts the non-covalent bonds that hold the folded shape, to “linearize” the
protein (peptide bonds stay intact)
Protein isolation procedure
• Find a sample and pick it
• Transfer to tube
• Grind sample in buffer
• Heat the sample
• Centrifuge to remove insoluble material
• Recover supernatant (the “pure” protein solution)
• Keep solution for gel analysis
The “pure” protein solution contains the isolated Protein X.
Summary – protein isolation
•Proteins can be isolated from a variety of samples
•Proteomics includes the use of both mechanical and
chemical methods to isolate proteins
•Opening cell or cellular compartments
•Breaking bonds and “linearizing” proteins
•Removing cell debris
Protein Separation
SDS-PAGE
Why separate proteins?
“PURE” Protein Solution
• Tube 1: increased complexity, decreased protein ID
• Tube 2: decreased complexity, increased protein ID
How to separate proteins?
Intact proteins are separated by taking
advantage of their diverse
physical properties, especially
isoelectric point and molecular weight
Methods of Protein Separation
• Sodium Dodecyl Sulfate –
Polyacrylamide Gel Electrophoresis
(SDS-PAGE)
• Isoelectric Focusing (IEF)
SDS-PolyAcrylamide Gel
Electrophoresis (SDS-PAGE) is a
widely used technique to separate
proteins in solution
SDS-PAGE separates only by
molecular weight
• Molecular weight is the mass of one molecule
• The Dalton (Da) is a small unit of mass
used to express atomic and molecular
masses.
PAGE is widely used in
• Proteomics
• Biochemistry
• Forensics
• Genetics
• Molecular biology
Polyacrylamide gels separate
proteins and small pieces of DNA
Major components of polyacrylamide gels:
• Acrylamide - matrix material (NEUROTOXIN)
• Bis-acrylamide - cross-linking agent (NEUROTOXIN)
• TEMED - catalyst
• Ammonium persulfate - free radical initiator
[Figure: polymerization. Acrylamide (matrix material) and bis-acrylamide (cross-linking agent) polymerize in the presence of TEMED (catalyst) and ammonium persulfate (free radical initiator) to form polyacrylamide (non-toxic), in which bis-acrylamide cross links join the polyacrylamide chains.]
Sodium dodecyl sulfate - SDS
The anionic detergent SDS unfolds or
denatures proteins
• Uniform linear shape
• Uniform charge/mass
ratio
One-dimensional polyacrylamide
gel electrophoresis (SDS-PAGE)
[Figure: gel with lanes Standard, Sample1, and Sample2; cathode (-) at the top, anode (+) at the bottom]
During SDS-PAGE proteins separate
according to their molecular weight
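[Figure: gel with lanes Standard, Sample1, and Sample2, running from cathode (-) at the top to anode (+) at the bottom. Standard bands: 150, 100, 75, 50, 37, 25, and 20 kDa. The bromophenol blue dye front runs nearest the anode.]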
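One common way to read a size off such a gel is to fit a straight line to log10(molecular weight) against the relative migration of the standard bands, then read unknown bands off that line. The sketch below assumes this log-linear calibration; the ladder sizes come from the slide, but the relative migration values are hypothetical numbers invented for illustration, not measurements.

```python
import math

# (size in kDa from the slide, HYPOTHETICAL relative migration: 0 = well, 1 = dye front)
LADDER = [
    (150, 0.10), (100, 0.22), (75, 0.31), (50, 0.43),
    (37, 0.52), (25, 0.64), (20, 0.71),
]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Calibrate: x is relative migration, y is log10 of the size in kDa.
SLOPE, INTERCEPT = fit_line([(rf, math.log10(kda)) for kda, rf in LADDER])


def estimate_kda(relative_migration):
    """Estimate the size (kDa) of a band from its relative migration."""
    return 10 ** (SLOPE * relative_migration + INTERCEPT)


print(round(estimate_kda(0.64), 1))
```

The slope is negative because smaller proteins travel further through the gel toward the anode.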
Image of a real SDS-PAGE gel
[Figure: stained gel running from cathode (top) to anode (bottom), with standards at 250, 150, 100, 75, 50, 37, 25, and 20 kDa (kiloDaltons)]
Separation of Protein X
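[Figure: gel with lanes Standard, Sample1, and Sample2, running from cathode (-) to anode (+). Standard bands: 150, 100, 75, 50, 37, 25, 20, and 11 kDa, with the bromophenol blue dye front nearest the anode. Protein X runs level with the 25 kDa standard.]
Protein X: 25 kDa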
Two-dimensional gel
electrophoresis (2-DGE)
1st dimension - isoelectric focusing
2nd dimension - SDS-PAGE
Most widely used protein separation technique in
proteomics
Capable of resolving thousands of proteins from a
complex sample (e.g. blood, organs, tissue…)
1st Dimension - Isoelectric
Focusing
Isoelectric focusing (IEF) is the separation of
proteins according to their native charge.
Isoelectric point (pI) - the pH at which a protein's net charge is zero
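The definition above can be turned into a small calculation: sum the charges of the ionizable groups at a given pH, then search for the pH where that sum is zero. This is a simplified sketch. The pKa values are approximate textbook-style numbers chosen for illustration (an assumption; published pKa sets differ), and the function names are hypothetical.

```python
# Approximate pKa values (ASSUMED for illustration; exact values vary between sources).
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}            # groups that carry +1 when protonated
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # groups that carry -1 when deprotonated


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch approximation)."""
    positive = [PKA_N_TERM] + [PKA_POSITIVE[aa] for aa in sequence if aa in PKA_POSITIVE]
    negative = [PKA_C_TERM] + [PKA_NEGATIVE[aa] for aa in sequence if aa in PKA_NEGATIVE]
    charge = sum(1 / (1 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1 / (1 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH where the net charge is zero by bisection (charge falls as pH rises)."""
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


print(round(isoelectric_point("KKKK"), 2), round(isoelectric_point("DDDD"), 2))
```

A lysine-rich peptide comes out with a high pI and an aspartate-rich peptide with a low pI, which is why the two would settle at opposite ends of the pH gradient during IEF.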
2-DGE
[Figure: 2-DGE schematic. 1st dimension: IEF of the protein samples across a pH gradient from 3 to 10 (label on the figure: “Neutral at pH 3”). 2nd dimension: SDS-PAGE, with standards at 150, 100, 75, 50, 37, 25, 20, and 11 kDa.]
2-DGE
[Figure: 2-DGE gel of Arabidopsis developing leaf. Horizontal axis: pI from 3 to 10. Vertical axis: mass in kDa, marked at 100, 75, 50, and 25.]
2-DGE
[Figure: 2-DGE schematic with pI 3 to 10 across the top and SDS-PAGE standards (150, 100, 75, 50, 37, 25, 20, and 11 kDa) in the 2nd dimension. Protein X appears as a single spot.]
Protein X: 25 kDa, pI 5
1-DGE vs. 2-DGE
1-DGE (SDS-PAGE)
• High reproducibility
• Quick/Easy
• Separates solely based on size
• Modest resolution, dependent on complexity of sample
2-DGE
• Modest reproducibility
• Slow/Demanding
• Separates based on pI and size
• High resolution, not dependent on complexity of sample
Summary – protein separation
•Protein separation takes advantage of physical
properties such as isoelectric point and molecular
weight
•SDS-PAGE is a widely used technique to separate
proteins
•1-DGE is a quick and easy method to separate proteins
by size only
•2-DGE combines isoelectric focusing (IEF) and SDS-PAGE
to separate proteins by pI and size
Protein identification
mass spectrometry
Peptide mass fingerprinting
• Make proteolytic peptide fragments - digest the protein into peptides (using trypsin)
• Measure peptide masses - “weigh” the peptides in a mass spectrometer
• Match peptide masses to a protein or nucleotide sequence database - compare the data to known proteins and look for a match
[Figure: intact protein X → protein digestion → mass spectrometry → spectrum of intensity against m/z → list of masses (952.0984, 1895.9057, 1345.6342, 899.8743, 2794.9761) → Protein ID]
Protein digestion
We use the enzyme TRYPSIN to digest (cut) proteins
into peptides – trypsin cuts after Lysine (K) and
Arginine (R)
Protein X
????????K?????R????????
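The cutting rule on this slide is easy to simulate. The sketch below applies only the rule stated here (cut after every K and R) and ignores any further exceptions real trypsin may have. The example sequence is assembled from the three peptides that the later slides identify for Protein X, and the function name is hypothetical.

```python
def tryptic_digest(protein):
    """Split a protein sequence after every lysine (K) and arginine (R)."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in "KR":
            peptides.append(current)  # cut after K or R
            current = ""
    if current:
        peptides.append(current)  # C-terminal peptide with no K or R at its end
    return peptides


protein_x = "WEGETMILKASTEERMANYCQWS"
peptides = tryptic_digest(protein_x)
print(peptides)
```

The output has the same shape as the slide's ????????K?????R???????? pattern: two peptides ending in K and R, plus one final peptide.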
How does mass spectrometry
identify unknown proteins?
Basics of mass spectrometry
• Determines the mass-to-charge ratio
(m/z)
• Mass spectrometer = very accurate
weighing scales
– accurate to the third or fourth decimal place
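To make the mass-to-charge ratio concrete, the sketch below converts a peptide mass into the m/z a spectrometer would report. It assumes positive-ion mode, where each charge comes from one added proton, and it treats the 1106.55 Da value used in these slides as the neutral peptide mass; both are assumptions for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of the proton added for each positive charge


def mass_to_charge(neutral_mass, charge):
    """m/z of a peptide carrying `charge` protons: (M + z * proton) / z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


for z in (1, 2, 3):
    print(z, round(mass_to_charge(1106.55, z), 4))
```

The same peptide therefore appears at a different m/z for each charge state, which is why the instrument measures m/z and the mass is worked out from it.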
We then “weigh” these peptides
with a Mass Spectrometer
????????K 1106.55 Da
?????R 692.31 Da
???????? 1002.37 Da
Mass of peptides should be compared to
theoretical masses of known peptides
????????K = 1106.55 Da
?????R = 692.31 Da
???????? = 1002.37 Da
Computation of theoretical masses of
known peptides
Proteome = all protein sequences
Digest Proteome with simulated Trypsin
Computer Peptides (theoretical masses in Da)
• WEGETMILK 1106.55
• ADEMTYEK 1105.23
• PLMEHGAK 1089.50
• LMEHHH 782.25
• ASTEER 692.31
• DMGEYIILES 1056.92
• EGEDMPAFY 1002.35
• CYHGMEI 984.36
• EFPKLYSEK 900.56
• YSEPYSSIIR 1102.34
• IESPLMIA 864.35
• AEFLYSR 600.21
• DLMILIYR 864.97
• METHIPEEK 795.36
• KISSMER 513.21
• PEPTIDEK 456.23
• MANYCQWS 1002.37
• TYSMEDGHK 678.46
• YMEPSATFGHR 995.46
• GHLMEDFSAC 896.35
• HHFAASTR 564.88
• ALPMESS 469.12
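A theoretical peptide mass is computed by adding up the masses of its amino acid residues plus one water. The sketch below uses standard monoisotopic residue masses, which are supplied here as an assumption and do not come from the slides; the numbers in the slide's list should be read as teaching values, so this calculation is not expected to reproduce them exactly.

```python
# Standard monoisotopic residue masses in Da (supplied for this example, not from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # Da, one H2O for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of a neutral peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


for peptide in ("ASTEER", "WEGETMILK"):
    print(peptide, round(peptide_mass(peptide), 4))
```

Running this over every peptide produced by a simulated trypsin digest of every protein sequence gives the kind of list the computer searches against.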
Mass of peptides compared to
theoretical masses of all known peptides,
using a computer program.
????????K = 1106.55 Da
?????R = 692.31 Da
???????? = 1002.37 Da
Mass of peptides matched to
theoretical masses of known peptides,
using a computer program.
????????K = 1106.55 Da matches WEGETMILK 1106.55
?????R = 692.31 Da matches ASTEER 692.31
???????? = 1002.37 Da matches MANYCQWS 1002.37
The unknown peptides have been
identified
????????K = 1106.55 Da WEGETMILK
?????R = 692.31 Da ASTEER
???????? = 1002.37 Da MANYCQWS
Protein X has been identified
????????K?????R????????
WEGETMILK ASTEER MANYCQWS
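The matching step can be sketched as a lookup with a mass tolerance. The peptide list and masses below are the teaching values from these slides, with MANYCQWS taken as 1002.37 as in the final matching slide; the tolerance of 0.01 Da and the function name are assumptions for illustration.

```python
# Theoretical peptide masses (Da) as listed in the slides.
DATABASE = {
    "WEGETMILK": 1106.55, "ADEMTYEK": 1105.23, "PLMEHGAK": 1089.50,
    "LMEHHH": 782.25, "ASTEER": 692.31, "DMGEYIILES": 1056.92,
    "EGEDMPAFY": 1002.35, "CYHGMEI": 984.36, "EFPKLYSEK": 900.56,
    "YSEPYSSIIR": 1102.34, "IESPLMIA": 864.35, "AEFLYSR": 600.21,
    "DLMILIYR": 864.97, "METHIPEEK": 795.36, "KISSMER": 513.21,
    "PEPTIDEK": 456.23, "MANYCQWS": 1002.37, "TYSMEDGHK": 678.46,
    "YMEPSATFGHR": 995.46, "GHLMEDFSAC": 896.35, "HHFAASTR": 564.88,
    "ALPMESS": 469.12,
}

OBSERVED = [1106.55, 692.31, 1002.37]  # masses "weighed" for Protein X


def match_mass(observed, database, tolerance=0.01):
    """Return every peptide whose theoretical mass lies within `tolerance` Da of `observed`."""
    return [
        peptide for peptide, mass in database.items()
        if abs(mass - observed) <= tolerance
    ]


matches = [match_mass(mass, DATABASE) for mass in OBSERVED]
identified = [hits[0] for hits in matches if len(hits) == 1]
print(identified)
```

Loosening the tolerance to 0.05 Da makes 1002.37 match both MANYCQWS (1002.37) and EGEDMPAFY (1002.35), which shows why weighing to several decimal places matters for an unambiguous identification.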
Summary – tools to study proteins?
•Proteins are digested into peptides
•Peptides are analyzed with a mass spectrometer
•Match observed peptide masses to theoretical
masses of all peptides in database
•Assemble those peptide matches into a protein
identification
Concluding points about Proteomics
-Proteomics is the analysis of all proteins
-Interdisciplinary research
-Essential to both basic and clinical research
-Proteins are the workhorses of the cell
-Discovery research – drugs and diseases
-Proteomics tools allow identification of proteins