Practical On Protein Bioinformatics Practical
View the shape of a similar protein for which there is a crystal structure
1. Search for homologous proteins for which there are crystal structures.
You are provided with the amino acid sequence (named ‘Plasmodium AA’).
In a browser, go to the NCBI home page (https://www.ncbi.nlm.nih.gov/) and copy and
paste the sequence into the sequence box.
Select Protein BLAST and change the database box so that the Protein Data Bank
(PDB) is searched. It will be searched for structures of proteins with similar sequences.
Click BLAST.
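The same search can also be prepared from a script. The sketch below only builds the request for NCBI's BLAST URL API and does not send it. The function name is illustrative, the database name `pdb` is an assumption about how the Protein Data Bank set is named on the server, and the peptide is a placeholder, not the ‘Plasmodium AA’ sequence.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(sequence: str, database: str = "pdb") -> str:
    """Return the URL-encoded parameters for a protein BLAST submission."""
    # Remove the whitespace and line breaks that come along with copy and paste.
    cleaned = "".join(sequence.split()).upper()
    if not cleaned.isalpha():
        raise ValueError("sequence should contain only amino acid letters")
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastp",   # protein query against a protein database
        "DATABASE": database,  # assumed name of the PDB protein database
        "QUERY": cleaned,
    }
    return urlencode(params)


# Placeholder peptide, NOT the sequence provided for this practical.
body = build_blast_submission("MKTAYIAK\nQRQISFVK")
print(BLAST_URL + "?" + body)
```

For this practical the web form is all you need; a script like this is mainly useful when many sequences have to be searched.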
a) Now write the following information based on the topmost BLAST hit (14 marks).
Accession number: 3ZH2_A
E-value: 0.0
Protein name: L-lactate Dehydrogenase
Organism: Plasmodium falciparum 3D7
Total Score: 637
Query cover: 100%
Percent identity: 99.37%
2. Download the 3D Cartesian coordinates of the atom positions in the protein and
visualize them using PyMOL.
The search returns a list of protein structures in order of decreasing sequence similarity.
You can see that several protein structures are available for the protein. Select the
entry with the accession number ‘1T24_A’ and click on the accession number.
This takes you to a window showing information about the selected protein sequence.
There, an icon of a protein 3D structure can be seen on the right of the window. Click
on the 3D structure, then click on the first link in the list of structures on the
resulting page. Alternatively, click on the ‘Structure’ link under the ‘Related Information’ tab (make
sure you select the structure with PDB ID 1T24).
This brings up an MMDB (Molecular Modeling Database) Structure Summary window.
Click on the PDB ID shown at the top right of the page.
This takes you to the Protein Data Bank website. Here, click on Download Files in the
right-hand margin and select the PDB format.
o Make sure to avoid downloading the compressed file (‘PDB format (gz)’)
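If you prefer to fetch the file from a script, the uncompressed PDB file can be requested from the RCSB file server. This is a minimal sketch: the function names are illustrative, and it assumes the entry is 1T24, since four-character PDB IDs begin with a digit.

```python
import re
import urllib.request


def pdb_download_url(pdb_id: str) -> str:
    """Build the download URL for the uncompressed PDB-format file."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are one digit followed by three letters or digits.
    if not re.fullmatch(r"[1-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a valid four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def download_pdb(pdb_id: str, destination: str = "") -> str:
    """Download the file (needs a network connection) and return its path."""
    url = pdb_download_url(pdb_id)
    destination = destination or f"{pdb_id.strip().upper()}.pdb"
    urllib.request.urlretrieve(url, destination)
    return destination


print(pdb_download_url("1t24"))
```

Calling `download_pdb("1T24")` would save the file in the current folder; the validation step rejects a mistyped ID such as `IT24` (letter I instead of digit 1) before any request is made.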
MAM 5108: Microbial Bioinformatics
Dr. Pasan Fernando
It would be a good idea to look in the PDB file for references to journal articles. You can find this
information in the PDB entry as well. Reading these papers is often essential for understanding
what you are looking at! For example, how much of the intact molecule is represented in the
PDB file? How was it prepared? How important is the protein's function for the organism?
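The journal reference is stored in the JRNL records of the PDB file, so it can be pulled out with a few lines of code. The sketch below reads the fixed columns of those records. The function and helper names are illustrative, and the sample records are made up purely to show the layout; they are not taken from any real entry.

```python
from collections import defaultdict


def journal_records(pdb_text: str) -> dict:
    """Collect JRNL sub-records (AUTH, TITL, REF, ...) from PDB-format text."""
    fields = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith("JRNL"):
            key = line[12:16].strip()   # sub-record name, columns 13-16
            value = line[19:].strip()   # free text, from column 20 onwards
            if key and value:
                fields[key].append(value)
    # Continuation lines of the same sub-record are joined with a space.
    return {key: " ".join(parts) for key, parts in fields.items()}


def _jrnl(sub: str, continuation: str, text: str) -> str:
    """Lay out one made-up JRNL line in the fixed PDB columns."""
    return f"JRNL{'':8}{sub:<4}{continuation:>2} {text}"


# Made-up sample records, for illustration only.
SAMPLE = "\n".join([
    _jrnl("AUTH", "", "A.EXAMPLE,B.EXAMPLE"),
    _jrnl("TITL", "", "A MADE-UP TITLE USED ONLY TO"),
    _jrnl("TITL", "2", "ILLUSTRATE THE RECORD LAYOUT"),
])

for key, value in journal_records(SAMPLE).items():
    print(key, ":", value)
```

To use it on the downloaded file, read the file into a string first, for example with `open("1T24.pdb").read()` (the file name is an assumption).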
Several programs exist to look at the structures, but we’re going to use PyMOL.
The image of the 3D structure will appear. When the image appears, you can rotate the
molecule by dragging the mouse with the left button pressed. You can zoom in and out
by holding the right mouse button and simultaneously moving the cursor within the
viewer. You can also click on the molecule and ligands, which displays information about
them in the external GUI window. Take a screenshot of the visualization or save the image using
the following command.
File > Export Image As > PNG… > Draw antialiased OpenGL image
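The image can also be saved from PyMOL's Python interface. This is a minimal sketch, assuming that drawing with antialiasing and then writing a PNG is a reasonable scripted counterpart of the menu entry; the function name, file name and image size are illustrative. The function takes PyMOL's `cmd` object as an argument.

```python
def export_png(cmd, filename="structure.png", width=1200, height=900):
    """Draw the current view with OpenGL antialiasing and save it as PNG."""
    # Render the scene off-screen at the requested size.
    cmd.draw(width, height, antialias=2)
    # Write the rendered image to disk.
    cmd.png(filename)


# Inside PyMOL (or a Python session with PyMOL installed):
#     from pymol import cmd
#     export_png(cmd, "chains.png")
```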
Most of the commands that are required for this practical can be accessed through the
Object Menu Panel, on which you find the A (Action), S (Show), H (Hide), L (Label), and C
(Color) icons as shown below. You can apply them to ‘all’ or to the object named after the PDB ID;
the result is the same because there is only one protein in the view.
In the Object Menu Panel, start by clicking on the letter C (Color) of either object:
C > by chain > by chain (elem C)
Then label the chains by:
L > chains
b) Again, save the image and paste it below. How many chains are found in the structure (6
marks)? 2 chains
Yes.
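You can cross-check the number of chains directly from the PDB file, because every ATOM record carries a chain identifier in column 22. The sketch below is one way to do this; the function and helper names are illustrative and the sample records are made up, not copied from the downloaded file.

```python
def chain_ids(pdb_text: str) -> list:
    """Return the distinct chain identifiers found in ATOM records, in order."""
    chains = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]  # column 22 holds the chain identifier
            if chain not in chains:
                chains.append(chain)
    return chains


def _atom(serial: int, name: str, residue: str, chain: str, number: int) -> str:
    """Lay out the first fields of a made-up ATOM line in the fixed columns."""
    return f"ATOM  {serial:>5} {name:<4} {residue:>3} {chain}{number:>4}"


# Made-up sample records, for illustration only.
SAMPLE = "\n".join([
    _atom(1, " N", "MET", "A", 1),
    _atom(2, " CA", "MET", "A", 1),
    _atom(3, " N", "MET", "B", 1),
])

print(len(chain_ids(SAMPLE)), "chains:", chain_ids(SAMPLE))
```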
Often the view is cluttered with water oxygens (red by default). Protein crystals are
quite “wet” and gelatinous, and the structures obtained from crystals agree well with structures
obtained from proteins in solution by NMR. Most of the water molecules in crystals diffuse
randomly, making them “blurry” and invisible. The rare visible water molecules are tightly
bound and immobilized. Hydrogens usually cannot be resolved by X-ray crystallography; hence, they
need to be added manually most of the time.
You can clearly observe the hydrogen atoms added to the water oxygen atoms (they are
added to the amino acids of the protein as well, but the change is easiest to see in the water molecules).
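For reference, adding hydrogens and hiding the waters can also be scripted through PyMOL's Python interface. This is a small sketch with illustrative function names; both functions take PyMOL's `cmd` object as an argument.

```python
def add_hydrogens(cmd, selection="all"):
    """Add hydrogens to every atom in the selection, waters included."""
    cmd.h_add(selection)


def hide_waters(cmd):
    """Declutter the view by hiding all water molecules."""
    cmd.hide("everything", "solvent")


# Inside PyMOL (or a Python session with PyMOL installed):
#     from pymol import cmd
#     add_hydrogens(cmd)
#     hide_waters(cmd)   # only once the image with waters has been saved
```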
c) Now, save the image with the water molecules and paste it below (5 marks):
In the Object Menu Panel, start by clicking on letter C (color) of either object
C > by ss > Helix (red) Sheet (yellow) loop (green)
Alpha helices are red, beta strands are yellow, and loops are green. Unfortunately, in this
case the ligands are also colored green, because this PyMOL coloring scheme does not distinguish
them from loops.
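The same coloring can be reproduced with PyMOL's Python interface, which also makes it easy to give the ligands their own color. This is a sketch rather than what the menu does internally; the function name and the optional `ligand_color` argument are illustrative.

```python
def color_by_secondary_structure(cmd, selection="all", ligand_color=None):
    """Color helices red, strands yellow and everything else green."""
    # Start with green everywhere: loops (and ligands) keep this color.
    cmd.color("green", selection)
    cmd.color("red", f"({selection}) and ss h")     # alpha helices
    cmd.color("yellow", f"({selection}) and ss s")  # beta strands
    if ligand_color is not None:
        # Optional: recolor small-molecule ligands so they stand out.
        cmd.color(ligand_color, f"({selection}) and organic")


# Inside PyMOL (or a Python session with PyMOL installed):
#     from pymol import cmd
#     color_by_secondary_structure(cmd, ligand_color="magenta")
```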
In the Object Menu Panel, start by clicking on letter C (color) of either object
C > spectrum > rainbow (elem C)
Each chain should begin blue, changing color through a rainbow series (green, yellow, orange)
and end in red.
Here are some mnemonics. Synthesis begins at the old end; new residues are added to the new end.
Blue = cold = old (N-terminus of proteins, 5’ end of nucleic acids)
Red = hot = new (C-terminus of proteins, 3’ end of nucleic acids)
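A rainbow that restarts at blue for every chain can be scripted by coloring the chains one at a time. The sketch below is one way to do this with PyMOL's Python interface; the function name is illustrative and the function takes PyMOL's `cmd` object as an argument.

```python
def rainbow_by_chain(cmd, obj="all"):
    """Color the carbons of each chain from blue (N-terminus) to red (C-terminus)."""
    for chain in cmd.get_chains(obj):
        if not chain:
            continue  # skip atoms without a chain identifier
        # "count" colors atoms by their order in the selection,
        # so the gradient follows the chain from start to end.
        cmd.spectrum("count", "rainbow",
                     f"({obj}) and chain {chain} and elem C")


# Inside PyMOL (or a Python session with PyMOL installed):
#     from pymol import cmd
#     rainbow_by_chain(cmd)
```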
e) Save the visualization and paste it below. Label the N and C termini in the structure (6
marks).
N termini
C termini
You can confirm the termini by labeling the amino acid residues
L > residues (oneletter)
You can see each amino acid represented with its one letter code and a number. The numbering
should start with the N-terminus and end with the C-terminus. Are the rainbow colors consistent
with the numbers?
Yes
What are the ligands and where do they bind with the protein?
In the Object Menu Panel, start by clicking on the letter A (Action) of the ‘all’ object.
A > preset > ligand sites > cartoon
This will show the ligands and interacting amino acid atoms in stick representation while
maintaining cartoon representation for the rest of the protein.
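A similar view can be approximated with a few selections in PyMOL's Python interface. This sketch is not the preset's own implementation: the 4 Å cutoff, the function name and the use of the `organic` selection to find ligands are assumptions made for illustration.

```python
def show_ligand_sites(cmd, obj="all", cutoff=4.0):
    """Show ligands and nearby residues as sticks, the rest as cartoon."""
    ligands = f"({obj}) and organic"
    # Whole residues of the protein with any atom within the cutoff of a ligand.
    pocket = f"byres ((({obj}) and polymer) within {cutoff} of ({ligands}))"
    cmd.hide("everything", obj)
    cmd.show("cartoon", f"({obj}) and polymer")
    cmd.show("sticks", ligands)
    cmd.show("sticks", pocket)
    cmd.zoom(ligands)
    return ligands, pocket


# Inside PyMOL (or a Python session with PyMOL installed):
#     from pymol import cmd
#     show_ligand_sites(cmd)
```

To zoom in on one ligand at a time, you could pass a narrower selection to `cmd.zoom`, for example one restricted to a single chain.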
f) Zoom in on each of the 3 ligands and take screenshots of each binding site below. Label
the ligands (9 marks).
You can also try other ligand site representations, such as surface, which are widely used in
publications.
PyMOL is also very popular for generating publication-quality images. By following the command
sequence below, you can generate different representations of the protein.
A > preset > publication (with/without solvent) or pretty (with/without solvent)