
PyContact - A Tool for Analysis of Non-Covalent Interactions in MD Trajectories

Tutorial by M. Scheurer and P. Rodenkirch

Main Developers: Maximilian Scheurer and Peter Rodenkirch

University of Illinois at Urbana-Champaign, Theoretical and Computational Biophysics Group

University of Heidelberg, Biochemistry Center

July 2, 2017

Version: 1.0.1

Tutorial Version: 1.0.1

Code: https://github.com/maxscheurer/pycontact

Project Website: https://pycontact.github.io


1 Introduction
Non-covalent interactions of biomolecules are the cornerstones of biochemical processes: they
govern molecular recognition, induce conformational changes in proteins and fulfill a plethora of
other key functions in the cell. For example, cellular signaling works only as a result of
specifically evolved interactions between biomolecules.
Atomic interactions arising from electrostatics, hydrogen bonds or hydrophobic properties of a
biomolecule are invisible to common microscopes, but they become visible through the computational
microscope: molecular dynamics (MD) simulations aim at describing these interactions at an atomic
level of detail, thereby also yielding the dynamics of proteins, DNA, ligands, membranes or other
small molecules. Thus, MD is the gateway to studying non-bonded interactions with high spatial and
temporal resolution. The result of a plain MD simulation is the position of every atom at every
timestep, called the trajectory. Depending on the system and simulation time, this amounts to a
large volume of data that needs to be analyzed. Furthermore, the different types of interactions
have to be distinguished, and the level of detail that can be studied ranges from interactions
between individual atoms to complete protein chains. How fine the "resolution" of the analysis
ought to be depends on the scientific question.
To address these tasks, we provide a novel tool, PyContact, for non-covalent interaction (or
contact) analysis of MD simulation trajectories. It offers high flexibility and can be used
without any programming experience, as it is primarily a GUI (graphical user interface)
application. In the following sections, we will examine interactions between two key players in
proteasome-guided protein degradation, i.e. Ubiquitin (Ub) and Rpn11, a metalloprotease residing
in the lid of the 26S proteasome.
A short sample trajectory file is provided with this tutorial. If you plan to start learning how
to perform MD simulations yourself, QwikMD (http://www.ks.uiuc.edu/Research/vmd/plugins/qwikmd)
is a good starting point.

User Contributions
PyContact is a very new tool and therefore under constant development. The source code is publicly
available on GitHub (https://github.com/maxscheurer/pycontact). Your ideas, contributions
of new features and bug reports are very much appreciated. If you want to contribute to the
project, opening Pull Requests or Issues directly on GitHub is the most convenient way. By
watching the project on GitHub, you can also follow the development of PyContact in a very
transparent manner.

How to cite
Available soon.

2 Installation and required software


PyContact is written in Python and thus requires a working Python installation on your system.
Currently, one dependency (MDAnalysis) does not completely support Python 3; therefore, PyContact
supports only Python 2.7.
As soon as MDAnalysis supports Python 3, PyContact will as well.

Windows installation: Currently, PyContact does not run on Windows computers, as MDAnalysis, a
key dependency, does not work with the Windows operating system.

2.1 Installation in the Anaconda environment


Anaconda is a nice platform to manage Python packages. It is very useful for installation of
dependencies and brings its own Python binary, so that the system Python environment is not
affected.

1 Installing Anaconda and PyQt5


Go to the website https://conda.io/docs/install/full.html and download the Anaconda
installation script for your operating system (note that this website might not work in
older browser versions).
Run
bash <downloaded_script>.sh -b -p /target/folder/conda
To activate Anaconda in the current session, execute
source /target/folder/conda/bin/activate
Now install PyQt5 with the following command:
conda install pyqt

2 Installing PyContact
To build the C/C++ modules, first install Cython by running
pip install cython
PyContact itself is available via pip. To install it, just run

pip install pycontact

in the terminal. This will also download and install the remaining dependencies that are
available via pip.
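
After installation, a quick check that the dependencies named in this section can be imported may save time before starting the GUI. The following is a minimal sketch, not part of PyContact; the helper name check_modules is made up for illustration, and the import names Cython, PyQt5 and MDAnalysis are the usual ones for these packages.

```python
import importlib
import sys


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Any failure during import is reported as "not usable".
            status[name] = False
    return status


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    # Import names of the dependencies mentioned in this section.
    report = check_modules(["Cython", "PyQt5", "MDAnalysis"])
    for name in sorted(report):
        print("%-12s %s" % (name, "ok" if report[name] else "MISSING"))
```

If one of the modules is reported as missing, repeat the corresponding installation step inside the activated Anaconda environment.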

3 Basic analysis of protein interactions


Before taking off, open the PyContact tool by executing pycontact in your terminal.

1 Loading an MD trajectory
First, we need to load a trajectory (i.e. the coordinates for every simulation timestep) together
with the topology (information about bonds etc.) into the tool. To do so, click on File
→ Load Trajectory Data (or hit Ctrl+I / ⌘+I). Then, click on the Topology button and
select the rpn11_ubq.psf file in the tutorial folder. Afterwards, click on Trajectory and select
the trajectory file rpn11_ubq.dcd, also residing in the tutorial folder. As we want to elucidate
the interactions between Rpn11 ("segid RN11") and Ubiquitin ("segid UBQ"), we put
those in the input selections 1 and 2, respectively. See Table 1 for a detailed description of
the input fields.

Protein-internal contacts. If you want to scan for protein-internal
interactions, type "self" into the selection 2 text field. PyContact
will then find contacts between amino acids in selection 1 that are
more than 4 residues apart.

Finally, click on OK to load the trajectory into the program and run the atom-atom contact
analysis.
When the task has finished, the Status field should say "50 frames loaded".

Table 1: Description of input arguments for trajectory loading

Input            Description                                             Tutorial Value
distance cutoff  maximal atom-atom distance (in Å) for contact scoring   5.0
angle cutoff     cutoff angle for hydrogen bonds (in deg)                120.0
acc-h cutoff     distance cutoff between the hydrogen bond acceptor      2.5
                 and the H-atom
selection 1      atom selection text for the first selection (contacts   segid RN11
                 are mapped between first and second selection) (A)
selection 2      atom selection text for the second selection            segid UBQ

(A) The MDAnalysis selection language is used:
https://pythonhosted.org/MDAnalysis/documentation_pages/selections.html
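
To make the three cutoffs in Table 1 more concrete, here is a minimal geometric sketch in plain Python. It is not PyContact's scoring code: the function names are invented for illustration, the coordinates are toy values, and the angle cutoff is assumed to be a minimum donor-H-acceptor angle.

```python
import math

DISTANCE_CUTOFF = 5.0  # "distance cutoff" from Table 1
ANGLE_CUTOFF = 120.0   # "angle cutoff" from Table 1, in degrees
ACC_H_CUTOFF = 2.5     # "acc-h cutoff" from Table 1


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = distance(a, vertex) * distance(c, vertex)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


def atom_contacts(sel1, sel2, cutoff=DISTANCE_CUTOFF):
    """Return (i, j, d) for every atom pair no farther apart than the cutoff."""
    pairs = []
    for i, a in enumerate(sel1):
        for j, b in enumerate(sel2):
            d = distance(a, b)
            if d <= cutoff:
                pairs.append((i, j, d))
    return pairs


def looks_like_hbond(donor, hydrogen, acceptor):
    """Assumed criterion: H-acceptor distance and donor-H-acceptor angle."""
    return (distance(hydrogen, acceptor) <= ACC_H_CUTOFF
            and angle_deg(donor, hydrogen, acceptor) >= ANGLE_CUTOFF)


# Toy coordinates: only the first atom of each selection is within 5.0.
print(atom_contacts([(0, 0, 0), (10, 0, 0)], [(3, 0, 0), (0, 20, 0)]))
# Linear donor-H-acceptor arrangement with an H-acceptor distance of 2.0.
print(looks_like_hbond((0, 0, 0), (1, 0, 0), (3, 0, 0)))
```

The double loop scales with the product of the selection sizes, which is acceptable for a sketch but is one reason why real tools restrict the selections or use compiled neighbor searches.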

2 Visualizing interactions between amino acids


Now that the initial analysis of atomic contacts has been run, we need to accumulate the
atom-atom scores to the corresponding amino acids. Please click on Accumulate Scores and check
resid and resname for both selections. PyContact will then accumulate the individual atom-atom
interaction scores per amino acid, identified by the residue name and the residue ID.
Click on OK to run this task. After the process is finished (indicated by the progress
bar), you should see colored boxes in the timeline. You do? Great, so let's discuss what
all these colors mean. First, have a look at the first column of the timeline: it shows
the title of each contact, such as "ASN133-THR9". To the right of the title, you will
see small boxes with different colors and color intensities. Each box corresponds
to one frame of the trajectory. The color intensity shows the contact score, and the color
itself indicates whether the interaction is established by sidechain or backbone atoms
(helpful for protein and DNA interactions; see footnote 1). Play around with the Accumulate
Scores menu, try different selections for the accumulation and see what you get.
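
The accumulation step can be pictured as grouping atom-atom scores by a residue-level key and summing them per frame. The sketch below assumes a simple sum and uses made-up scores and atom names; only the contact title format "ASN133-THR9" is taken from the text, and the function name accumulate is hypothetical.

```python
from collections import defaultdict

# Hypothetical atom-atom results: (frame, atom1, atom2, score),
# where each atom is described as (resname, resid, atom name).
atom_scores = [
    (0, ("ASN", 133, "ND2"), ("THR", 9, "OG1"), 0.75),
    (0, ("ASN", 133, "CB"), ("THR", 9, "CG2"), 0.25),
    (1, ("ASN", 133, "ND2"), ("THR", 9, "OG1"), 0.5),
    (1, ("GLU", 120, "OE1"), ("LYS", 11, "NZ"), 1.0),
]


def accumulate(scores, n_frames):
    """Sum atom-atom scores per frame, keyed by resname and resid of both partners."""
    timelines = defaultdict(lambda: [0.0] * n_frames)
    for frame, atom1, atom2, score in scores:
        key = "%s%d-%s%d" % (atom1[0], atom1[1], atom2[0], atom2[1])
        timelines[key][frame] += score
    return dict(timelines)


timelines = accumulate(atom_scores, n_frames=2)
for title in sorted(timelines):
    print(title, timelines[title])
```

Each resulting list corresponds to one row of the timeline, with one entry per frame. Choosing a different key, for example the resname alone, would merge all residues of the same type, which mirrors what the checkboxes in the Accumulate Scores menu control.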

3 Basic contact filtering

Table 2: Filtering Options in PyContact

Input                        Description                        Comment
Frame Stride                 Stride of the frames being
                             displayed in the timeline
Timeline Range               Timeline frames will be            If Filter Range box is checked,
                             displayed only in the given        only the displayed frames of the
                             range                              contacts are taken into account
                                                                for further filtering.
Total Time                   Total contact lifetime (B)
Mean/Median/HB %             Mean/Median contact score or
                             hydrogen bond percentage (HB %)
                             must be greater/smaller than
                             the given value (B)
Sort by: mean, median,       Contacts are sorted ascending
bb/sc type, contact type,    or descending according to the
total time, mean lifetime,   selected property (B)
median lifetime
Show only: hbonds,           Show contacts with the selected
hydrophobic, saltbridges,    type only (B)
unknown

(B) Only applied if the corresponding checkbox is checked.
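
The interplay of some of the options in Table 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not PyContact's implementation: the timelines are invented, "total time" is approximated as the number of frames with a non-zero score, and only the "greater than" direction of the mean filter is shown.

```python
# Hypothetical per-frame contact scores (one list entry per frame).
timelines = {
    "ASN133-THR9": [1.0, 0.5, 0.0, 0.5],
    "GLU120-LYS11": [0.0, 1.0, 2.0, 3.0],
    "LEU50-ILE44": [0.0, 0.0, 0.5, 0.0],
}


def mean_score(scores):
    return sum(scores) / float(len(scores))


def total_time(scores):
    """Number of frames in which the contact has a non-zero score."""
    return sum(1 for s in scores if s > 0.0)


def filter_contacts(data, frame_range=None, stride=1,
                    min_mean=None, min_total_time=None, descending=True):
    """Apply range, stride and threshold filters, then sort by mean score."""
    start, stop = frame_range if frame_range else (0, None)
    selected = []
    for name, scores in data.items():
        window = scores[start:stop:stride]  # frames that remain displayed
        if not window:
            continue
        if min_mean is not None and mean_score(window) < min_mean:
            continue
        if min_total_time is not None and total_time(window) < min_total_time:
            continue
        selected.append((name, mean_score(window)))
    return sorted(selected, key=lambda item: item[1], reverse=descending)


print(filter_contacts(timelines, min_mean=0.25))
print(filter_contacts(timelines, frame_range=(0, 2), min_total_time=1))
```

The second call restricts the statistics to the first two frames, which corresponds to checking the Filter Range box: only the displayed frames enter the subsequent filters.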

1 Green: sidechain-sidechain interaction
  Yellow: backbone-sidechain interaction
  Blue: backbone-backbone interaction

4 Exporting Contact Data

For the various contact data export tasks, we provide a tool that is accessible in the menu under
Tools → Export Contact Data or by hitting Ctrl+E / ⌘+E. The first tab allows you to save
the current contact view from the main window to a file. The format can be chosen in the dropdown
menu on the left, where the common png format as well as the svg vector graphics format are available.
The next tab brings us to the histogram tool, which allows a fast and clear visualization of useful
properties like Mean Score, Mean Lifetime or Hbond percentage (selectable on the right hand side
of the widget). Two major histogram options are available, General Histogram and Bin per Contact.
The former groups the corresponding property values in numerical bins. If you wish to use the
analyzed contacts themselves as bins, choose Bin per Contact.
To update your choice and draw the histogram, simply click on Show Preview. When using the Bin per
Contact option, you may want to adjust the font size of the bin labels, which can be done
with the bin per contact font size text field. To save the histogram, pick a suitable file
format from the menu on the right and press Save Histogram to choose the file location.
Another way to visualize the data can be found in the third tab, called Contact Map. Here,
the same properties that are provided in the histogram tool can be plotted as a grayscale matrix,
with the two selections on the two axes. As in the previous tools, the view can be
updated via the Show Preview button and saved by clicking on Save Map.
In order to view the trajectory data with VMD, go to the VMD tab and let the tool create a
suitable tcl script for you. If you wish to differentiate between single contacts, you may tick the
Split selections for each contact checkbox. Additional selection texts can be given in the two text
fields below.
Finally, for further data analysis we provide a raw data export functionality in
the last tab, titled Plain Text. Pick the combination of properties you want to export
with the checkboxes on the left and click on Export to text, which will open the common file dialogue.
The data will be written to a tab-separated ASCII text file.
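
A tab-separated text file of this kind can be read back for further analysis with Python's standard csv module. The sketch below uses an in-memory stand-in for the exported file; the presence of a header row and the column names contact, mean_score and hbond_percentage are assumptions made for illustration and may differ from what the tool writes.

```python
import csv
import io

# Stand-in for an exported file; header row and column names are hypothetical.
exported = (
    "contact\tmean_score\thbond_percentage\n"
    "ASN133-THR9\t0.50\t25.0\n"
    "GLU120-LYS11\t1.50\t75.0\n"
)


def read_export(handle):
    """Parse tab-separated contact data into a list of dicts with numeric values."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "contact": row["contact"],
            "mean_score": float(row["mean_score"]),
            "hbond_percentage": float(row["hbond_percentage"]),
        })
    return rows


rows = read_export(io.StringIO(exported))
strongest = max(rows, key=lambda r: r["mean_score"])
print(strongest["contact"])
```

For a real export, replace the io.StringIO object with an opened file handle and adapt the column names to the properties that were ticked in the Plain Text tab.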

4 Contact area calculation


To decipher the contact area between Rpn11 and Ubiquitin in a time-dependent manner, we can
use the built-in SASA widget in PyContact. Open the widget by clicking on Tools → Contact Area
Calculation or hit Ctrl+A / ⌘+A.

1 Selecting the trajectory


First, click on Load Data and select the topology and trajectory files of the Rpn11-Ubq
simulation as described in section 3.

2 Atom selections
To specify the atom selections for contact area calculations, one needs to understand the
underlying principle of solvent-accessible surface area (SASA) calculations to some extent.
(...)
Let's say we are interested in the contact area of Rpn11 to Ubq. Then we specify "segid
RN11" in the Selection text field. We want to shrink the selection to the Ubq interface, so we
only select those residues within 5 Å of Ubq by typing "segid RN11 and around 5 segid UBQ"
in the Restriction text field. Finally, the program needs to subtract the SASA of protein
residues outside the interface. Type "protein" in the Selection 2 text field and check
the contact checkbox.

3 Running the calculation
Choose the number of cores to run the calculation on. Then click
on Calculate to run the SASA/contact area calculations. The results will be plotted directly
in the GUI and are available for export. See Fig. 1 for an example output.

Figure 1: Contact area calculation example. As explained in section 4, we calculated the time-
dependent evolution of the contact area between Rpn11 and Ubiquitin.

5 Python scripting for job automation


For users who want to automate the analysis of their trajectories with PyContact, we provide an
easy-to-use interface for the most important features. We suggest the following workflow:

1. Write the Python script:

• define the job configuration for loading the trajectory and for atom-atom contact scoring
• define the parameters for the subsequent score accumulation
• optionally set the number of cores
• save the job results as a .session file

2. Load the .session file into PyContact and proceed as usual

This scripting capability speeds up contact analyses, for example when several trajectories are
to be analyzed with identical parameters. Furthermore, it is meant to motivate users to understand
how PyContact works, dig into the code and possibly come up with their own feature ideas and code.
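
The idea of reusing identical parameters for several trajectories can be sketched as a small batch plan. The code below does not call PyContact and does not reproduce its scripting interface, which is documented in the source repository. JobSettings, TUTORIAL_SETTINGS and plan_jobs are hypothetical names, the parameter values are those of Table 1, and the number of cores is an arbitrary example.

```python
import os
from collections import namedtuple

# Hypothetical container for the parameters listed in Table 1.
JobSettings = namedtuple(
    "JobSettings",
    ["distance_cutoff", "angle_cutoff", "acc_h_cutoff",
     "selection1", "selection2", "cores"],
)

TUTORIAL_SETTINGS = JobSettings(5.0, 120.0, 2.5, "segid RN11", "segid UBQ", 4)


def plan_jobs(file_pairs, settings, out_dir):
    """Pair each (topology, trajectory) with shared settings and a .session file name."""
    jobs = []
    for topology, trajectory in file_pairs:
        stem = os.path.splitext(os.path.basename(trajectory))[0]
        jobs.append({
            "topology": topology,
            "trajectory": trajectory,
            "settings": settings,
            "session_file": os.path.join(out_dir, stem + ".session"),
        })
    return jobs


for job in plan_jobs([("rpn11_ubq.psf", "rpn11_ubq.dcd")], TUTORIAL_SETTINGS, "results"):
    # A real script would hand these values to PyContact's scripting interface here.
    print(job["trajectory"], "->", job["session_file"])
```

Keeping the settings in one immutable object makes it obvious that every trajectory in the batch is analyzed with the same cutoffs and selections, which is the precondition for comparing the resulting sessions.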
