2022 Beyond Rg Workshop

Tutorial 3: Ab initio reconstructions and model fitting with SAS data

Introduction:
This tutorial covers the following advanced processing of SAXS data:
• Evaluating ambiguity of 3D shape reconstructions - AMBIMETER
• 3D reconstruction of bead models - DAMMIF/N and DAMAVER
• Comparing 3D bead model reconstructions and crystal structures (real space)
– SUPCOMB and ChimeraX
• Comparing scattering data and crystal structures (q space) - CRYSOL and
FoXS
• 3D reconstruction of electron density – DENSS
• Comparing 3D electron density reconstructions and crystal structures (real
space) – ChimeraX and PyMOL

Requirements:
1. BioXTAS RAW, version 2.1.4 (newest).
• Install instructions are available from:
https://bioxtas-raw.readthedocs.io/en/latest/install.html
• This tutorial assumes you are familiar with RAW.
2. ATSAS programs, version >=3.0.1.
• Download and install instructions are available from:
http://www.embl-hamburg.de/biosaxs/download.html
• Requires a free registration for academic users. Industrial users must
pay to use.
3. ChimeraX.
• Download and install instructions are available from:
https://www.cgl.ucsf.edu/chimera/download.html
4. PyMOL, version >=2.0.
• A free trial download is available from: https://pymol.org/2/
5. Internet connection.

Other useful materials:


1. There are RAW tutorial videos, which can be viewed here:
https://bioxtas-raw.readthedocs.io/en/latest/videos.html
2. ATSAS
• Manuals: http://www.embl-hamburg.de/biosaxs/manuals/
• User forum: http://www.saxier.org/forum/
3. DENSS
• Instructions and tips: http://denss.org/
4. Chimera references:
• http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/print.html#tips
• https://www.cgl.ucsf.edu/Outreach/Tutorials/GettingStarted.html

Page 1 of 41 Jesse Hopkins



Part 1. Assessing ambiguity of 3D shape information - AMBIMETER in RAW
Reconstructing a three-dimensional shape from a scattering profile is not guaranteed
to yield a unique solution. This makes it important to determine what degree of ambiguity
might be expected in our reconstructions. The program AMBIMETER from the ATSAS
package does this by comparing the measured scattering profile to a library of
scattering profiles from relatively simple shapes. The more possible shapes that
could have generated the scattering profile, the greater ambiguity there will be in the
reconstruction. We will use RAW to run AMBIMETER.

1. Clear all of the data in RAW. Load the glucose_isomerase.out file that you
saved in the reconstruction_data folder in a previous part of the tutorial.
• Note: If you haven’t done the previous part of the tutorial, or forgot to
save the results, you can find the glucose_isomerase.out file in the
reconstruction_data/gi_complete folder.

2. Right click on the glucose_isomerase.out item in the IFT list. Select the
“AMBIMETER” option.
3. The new window will show the results of AMBIMETER. It includes the number
of shape categories that are compatible with the scattering profile, the
ambiguity score (also called an “a-score”, which is log base 10 of the number
of shape categories), and the AMBIMETER interpretation of whether or not
you can obtain a unique 3D reconstruction.


• According to the original paper
(https://doi.org/10.1107/S1399004715002576), “an a-score below
1.5 practically guarantees a unique ab initio shape determination,
whereas when the a-score is in the range 1.5–2.5 care should be
taken, perhaps involving cluster analysis, and for a-scores exceeding
2.5 unambiguous reconstruction without restrictions (for example, on
symmetry and/or anisometry) is highly unlikely.”
• Note: AMBIMETER can also save the compatible shapes (either all or
just the best fit). You can do that by selecting the output shapes to
save, giving it a save directory, and clicking run. We won’t be using
those shapes in this tutorial.

4. Click “OK” to exit the AMBIMETER window.


• Tip: After exiting the AMBIMETER window the results can be seen in
the Information panel when the IFT is selected in the IFTs list.
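The a-score definition and the thresholds quoted from the AMBIMETER paper can be turned into a quick check. The sketch below is only an illustration: the function names and the short category labels are made up for this example and are not AMBIMETER output, and in practice you would read the a-score directly from RAW.

```python
import math


def a_score(n_shape_categories: int) -> float:
    """Ambiguity score: log base 10 of the number of compatible shape categories."""
    if n_shape_categories < 1:
        raise ValueError("need at least one compatible shape category")
    return math.log10(n_shape_categories)


def interpret_a_score(score: float) -> str:
    """Map an a-score onto the three ranges quoted from the AMBIMETER paper.

    The returned labels are hypothetical shorthand, not AMBIMETER's wording.
    """
    if score < 1.5:
        return "likely unique"
    if score <= 2.5:
        return "take care"
    return "highly ambiguous"


if __name__ == "__main__":
    # Illustrative inputs only, not results from the tutorial data.
    for n in (10, 100, 1000):
        print(n, a_score(n), interpret_a_score(a_score(n)))
```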


Part 2. 3D reconstruction with bead models – DAMMIF/N and DAMAVER in RAW
Shape reconstruction in SAXS is typically done using bead models (also called
dummy atom models, or DAMs). The most common program used to generate these
shapes is DAMMIF (and, to a lesser degree, DAMMIN) from the ATSAS package. We
will use RAW to run DAMMIF/N. Because the shape reconstruction is not unique, a
number of distinct reconstructions are generated, and then a consensus shape is
made from the average of these reconstructions. The program DAMAVER from the
ATSAS package is the most commonly used program for building consensus shapes.
Note that this tutorial uses ATSAS 3.1.1; some pieces may be slightly different with older
versions of ATSAS.

1. Right click on the glucose_isomerase.out item in the IFT list. Select the
“Bead Model (DAMMIF/N)” option.
• Note: If necessary, load the glucose_isomerase.out file that you
saved in the reconstruction_data folder in a previous part of the
tutorial. If you haven’t done the previous part of the tutorial, or forgot
to save the results, you can find the glucose_isomerase.out file in
the reconstruction_data/gi_complete folder.
2. Running DAMMIF generates a lot of files. Click the “Select” button for the
output directory, make a new folder in the reconstruction_data directory
called gi_dammif and select that folder.
3. Change the number of reconstructions to 5 and the Mode to Fast (if
necessary).
• Note: It is generally recommended that you do 15-20 reconstructions.
However, for the purposes of this exercise, or for obtaining an initial
quick look at results, 3-5 are enough.
• Note: For final reconstructions for a paper, DAMMIF should be run in
Slow mode. For this tutorial, or for obtaining an initial quick look at
results, Fast mode is fine.
4. Uncheck the “Refine average with dammin” checkbox.
• Note: For final reconstructions for a paper, DAMMIN refinement should
be done. However, it is quite slow, so for the purposes of this tutorial
we won't do it.
5. RAW can align the DAMMIF/N output with a PDB/mmCIF structure using
CIFSUP from the ATSAS package. To do so, check the ‘Align output to
PDB/mmCIF’ box and select the 1XIB_4mer.pdb file in the
reconstruction_data/gi_complete folder.
• Tip: If you’re not sure if you selected the correct file, hovering your
mouse over the filename will show the full path.


6. Click the “Start” button.


• Note: The status panel will show you the overall status of the
reconstructions. You can look at the detailed status of each run by
clicking the appropriate tab in the log panel.
7. Note that by default the envelopes are aligned, clustered and averaged using
DAMAVER and then the aligned and averaged model is refined using DAMMIN.


• Note: Some settings are accessible in the panel, and all settings can
be changed in the advanced settings panel.
8. Wait for all of the DAMMIF runs, DAMAVER and alignment to finish. Depending
on the speed of your computer this could take a bit.
• Question: Based on the AMBIMETER results from the previous part of
the tutorial, how good a reconstruction do you expect?
9. Once the reconstructions are finished, the window should automatically switch
to the results tab. If it doesn’t, click on the results tab.


10.The results panel summarizes the results of the reconstruction run. At the top
of the panel there is the AMBIMETER evaluation of how ambiguous the
reconstructions might be. If DAMAVER was run, there are results from the
normalized spatial discrepancy (NSD), showing the mean and standard
deviation of the NSD, as well as how many of the reconstructions were
included in the average. If DAMAVER was run on 3 or more reconstructions,
and ATSAS >=2.8.0 is installed, there will be the output of SASRES which
provides information on the resolution of the reconstruction. If DAMAVER
found more than one cluster, the number of clusters and information on each
cluster is shown. Note that DAMCLUST (ATSAS <=3.1.0) provided more
information about the clusters, so some fields will be blank with ATSAS
>=3.1.1.
11.Information on each individual model is shown at the bottom. The summary
tab gives the model χ², Rg, Dmax, excluded volume, molecular weight
estimated from the excluded volume, and, if appropriate, mean NSD of the
model.
• Any models that are rejected from the average by DAMAVER will be shown
in red in the models list.
• The model highlighted in blue is the ‘most probable’ model. This can be
used as your final bead model instead of doing a DAMMIN refinement.
12.Also, each individual model has a tab which shows the data, the model fit,
and the residuals. Check that for each model the visual fit is good, and the
residuals are flat and randomly distributed about zero.

13.All of this information can be used to evaluate whether the reconstructions
were successful. See the section below for a discussion on how to evaluate
bead models.
• Question: Is this a trustworthy reconstruction?


14.The results summary is automatically saved in a
<prefix>_dammif_results.csv file, where <prefix> is the name of the
reconstructions, in this case glucose_isomerase_dammif_results.csv. The
plots of the fit and residuals are saved in a .pdf file of the same name.
• Try: Open these files separately to see the results.
15.Click on the Viewer tab to open the model viewer.
• Note: The model viewer is intended for a fast first look at the results.
It is not up to the standards of a program like PyMOL.

16.Click and drag the model to spin it.


• Note: For glucose isomerase, the model should look more or less like a
flattened sphere.
17.Right click and drag the model to zoom in and out.
18.Use the “Model to display” menu in the Viewer Controls box to change which
reconstruction is displayed.
19.Click the “Close” button when you are finished looking at the results and
reconstructions.
20.The results from individual DAMMIF runs are saved in the selected output
folder with the name <prefix>_xx, where xx is the run number: 01, 02, etc.
For this tutorial, that would be glucose_isomerase_01,
glucose_isomerase_02, and so on. The different files produced are
described in the DAMMIF manual
(https://www.embl-hamburg.de/biosaxs/manuals/dammif.html#output).


• Note: Generally, the file of interest is the -1.cif file, in this case
glucose_isomerase_01-1.cif, glucose_isomerase_02-1.cif, etc.
21.If averaging was done with DAMAVER, the results are saved in the selected
output folder with the given prefix, in this case glucose_isomerase. The
output files generated are described in the DAMAVER manual
(https://www.embl-hamburg.de/biosaxs/manuals/damaver.html).
• Note: Generally, the file of interest is the generated damfilt mmCIF:
<prefix>_damfilt.cif. For this tutorial, that would be
glucose_isomerase_damfilt.cif.
22.If multiple clusters were found, the results are saved in the selected output
folder with the given prefix (for this tutorial, glucose_isomerase). The files
generated are described in the DAMAVER manual
(https://www.embl-hamburg.de/biosaxs/manuals/damaver.html).
23.If refinement was done with DAMMIN, the results are saved in the selected
output folder as refine_<prefix>, e.g. for this tutorial
refine_glucose_isomerase. The files generated are described in the
DAMMIN manual
(https://www.embl-hamburg.de/biosaxs/manuals/dammin.html#output).
• Note: Generally, the file of interest is the -1.cif file, in this case
refine_glucose_isomerase-1.cif.
24.If alignment to a reference PDB was done with SUPCOMB/CIFSUP, the files
aligned depend on what other processing was done.
• If refinement was done, then there will be a single file named
refine_<prefix>-1_aligned.cif. For this tutorial,
refine_glucose_isomerase-1_aligned.cif.
• If no refinement is done but averaging is done, then the damaver and
damfilt results are aligned, as well as the most probable model (the
blue highlighted model in the summary panel). The associated
filenames would be <prefix>_damaver_aligned.cif,
<prefix>_damfilt_aligned.cif, and <prefix>_##-1_aligned.cif
where ## is the model number of the most probable model. For this
tutorial, glucose_isomerase_damaver_aligned.cif,
glucose_isomerase_damfilt_aligned.cif, and
glucose_isomerase_##-1_aligned.cif.
• If no refinement is done but clustering is done, then the representative
model of each cluster is aligned. The associated filenames would be
<prefix>_##-1_aligned.cif where ## is the model number of the
representative model. For this tutorial, that is
glucose_isomerase_##-1_aligned.cif.
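As a rough illustration of the naming scheme described in steps 20-23, the sketch below builds the file names of interest for a given prefix. The function name and its arguments are hypothetical, it only covers the files named above, and the exact set of files written depends on your ATSAS version and the options you selected.

```python
def files_of_interest(prefix, n_runs, averaged=True, refined=False):
    """List the model files the tutorial points to for one reconstruction run.

    prefix   -- the output prefix, e.g. "glucose_isomerase"
    n_runs   -- number of individual DAMMIF reconstructions
    averaged -- True if DAMAVER averaging was done
    refined  -- True if DAMMIN refinement was done
    """
    # Individual DAMMIF runs: <prefix>_01-1.cif, <prefix>_02-1.cif, ...
    files = [f"{prefix}_{run:02d}-1.cif" for run in range(1, n_runs + 1)]
    if averaged:
        # Filtered average from DAMAVER
        files.append(f"{prefix}_damfilt.cif")
    if refined:
        # DAMMIN refinement of the average
        files.append(f"refine_{prefix}-1.cif")
    return files


if __name__ == "__main__":
    for name in files_of_interest("glucose_isomerase", n_runs=5):
        print(name)
```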

Aside: Evaluating bead model reconstructions


SAXS data contains very limited information, both because it is measured at
relatively low q, and because it is measured from a large number of particles in
solution oriented at random angles. The SAXS scattering profile represents the
scattering from a single particle, averaged over all possible orientations. The
practical consequence of this is that there are often several possible shapes that
could generate the same (or so similar as to be indistinguishable within experimental
noise) scattering profiles. As such, it may simply not be possible to generate a bead
model reconstruction from a dataset that accurately represents the solution shape,
regardless of the overall data quality. If the sample is flexible or otherwise exists in
multiple conformational or oligomeric states in solution the reconstruction is also
challenging or impossible. In summary, high quality SAXS data is not a
guarantee of a good bead model reconstruction. This makes it very
important to critically evaluate every reconstruction done, regardless of the
underlying data quality.

Criteria for a good DAMMIF/N reconstruction:


1. Ambiguity score < 2.5 (preferably < 1.5).
2. NSD < 1.0.
3. Few (0-2) models rejected from the average.
4. Only one cluster of models.
5. Model χ² near 1.0 for all models.
6. Model Rg and Dmax close to values from P(r) function for all models.
7. M.W. estimated from model volume close to expected M.W.

More about these criteria can be found below.

Ambiguity
As discussed in the previous section, AMBIMETER determines how many shapes
might have produced your measured scattering profile. Having an ambiguity score <
2.5 (ideally < 1.5) is important for a good reconstruction.

Normalized spatial discrepancy


DAMAVER reports a number of different results. The most useful is the normalized
spatial discrepancy (NSD). This is essentially a size normalized metric for comparing
how similar two different models are. When DAMAVER is run, it reports the average
and standard deviation of the NSD between all the reconstructions. It also reports
the average NSD for each model.

The average NSD is commonly used to evaluate the stability of the reconstruction.
Roughly speaking we evaluate reconstruction stability as:

• NSD < 0.6 - Good stability of reconstructions


• NSD between 0.6 and 1.0 - Fair stability of reconstructions
• NSD > 1.0 - Poor stability of reconstructions

Generally speaking, if your average NSD is less than 1.0, the reconstruction can
probably be trusted (if all of the other validation metrics also check out),
while if it is greater than 1.0 you should proceed with caution, or not use
the reconstructions at all.

The NSD is also used to determine which models to include in the average.


If the average NSD of a given model is more than two standard deviations above
the overall average NSD, that model is not included in the average. If more
than ~2 models are rejected (out of 15), that may be a sign of an unstable
reconstruction.
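The NSD rules of thumb above can be written down as a short check. This is a minimal sketch, not DAMAVER's implementation: the function names and labels are hypothetical, and the overall mean and standard deviation are assumed to be the values reported by DAMAVER rather than recomputed here.

```python
def classify_nsd(mean_nsd):
    """Rough stability label for the average NSD, following the ranges above."""
    if mean_nsd < 0.6:
        return "good"
    if mean_nsd <= 1.0:
        return "fair"
    return "poor"


def rejected_models(model_nsds, mean_nsd, std_nsd):
    """Return the models whose own average NSD is more than two standard
    deviations above the overall average NSD.

    model_nsds -- dict mapping a model label to that model's average NSD
    mean_nsd, std_nsd -- overall mean and standard deviation (assumed to be
                         taken from the DAMAVER results)
    """
    threshold = mean_nsd + 2.0 * std_nsd
    return [name for name, nsd in model_nsds.items() if nsd > threshold]


if __name__ == "__main__":
    # Illustrative numbers only, not results from the tutorial data.
    nsds = {"01": 0.60, "02": 0.62, "03": 0.95}
    print(classify_nsd(0.65), rejected_models(nsds, mean_nsd=0.65, std_nsd=0.10))
```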

Clusters
DAMCLUST creates clusters of models that are more similar to each other than they
are to the rest of the models. This is a way of assessing the ambiguity of the
reconstruction. If you have more than one cluster of models in your reconstructions,
you may have several distinct shapes that are being reconstructed by the DAMMIF
algorithm. This typically indicates that there are several distinct shapes in solution
that could generate the measured scattering profile, and so is another indication of a
highly ambiguous reconstruction.

The caveat to this is that with good quality data that is very low ambiguity
(ambiguity score from AMBIMETER < 0.5) and yields a set of reconstructions with a
very small average NSD (<0.5, typically) and NSD standard deviation (~0.01), I
have seen several (often >5) clusters identified with DAMCLUST. I believe that in
this case there are not actually multiple clusters, but the extremely low deviation
between the models is fooling the DAMCLUST algorithm.

Note that the different clusters should not be taken as representatives of different
distinct shapes in solution. Even if there are a finite number of distinct shapes
scattering in the solution (such as an open and closed state of a protein), the
measured scattering profile is an average of the scattering from each component,
and each individual reconstruction fits that measured scattering profile. As such,
there is no way for an individual reconstruction to fit just the scattering from one of
the components and so the different clusters cannot be representative of the
different shapes in the solution.

Model fit and parameters


Each model has the following parameters that can be used to evaluate the success of
an individual reconstruction: χ², Rg, Dmax, volume, molecular weight estimated from
volume, and the normalized residual of the model fit to the data. For a good fit to the
data, the model χ² should be close to 1 and the normalized residual between the
model fit and the data should be flat and randomly distributed about zero. However,
in my experience the normalized residual often shows some small systematic
deviations, and so this should not be too concerning. A χ² value significantly larger
than 1 (1.5-2 or larger) indicates either a poor fit to the data or that the uncertainty
for the data is underestimated. To differentiate between these two cases, look at the
normalized residual. If it is flat and randomly distributed, then the uncertainty is
most likely underestimated. If it shows significant systematic deviations then the fit
quality is poor.
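To make the relationship between chi-squared (χ²), the normalized residual, and the data uncertainty concrete, here is a minimal sketch. It assumes the usual definition of a reduced chi-squared (sum of squared normalized residuals divided by the degrees of freedom); the exact normalization DAMMIF uses may differ, and the function names are made up for this example.

```python
def normalized_residuals(i_exp, i_model, sigma):
    """(data - model) / uncertainty, point by point."""
    return [(e - m) / s for e, m, s in zip(i_exp, i_model, sigma)]


def reduced_chi_squared(i_exp, i_model, sigma, n_params=0):
    """Sum of squared normalized residuals over the degrees of freedom.

    n_params is the number of fitted parameters (an assumption of this
    sketch; set it to whatever is appropriate for your fit).
    """
    residuals = normalized_residuals(i_exp, i_model, sigma)
    dof = len(residuals) - n_params
    return sum(r * r for r in residuals) / dof


if __name__ == "__main__":
    # Toy numbers only. Halving the uncertainties quadruples chi-squared
    # without changing the shape of the residuals, which is why a flat,
    # random residual with a large chi-squared points to underestimated
    # uncertainties rather than a poor fit.
    data, model = [10.0, 8.0, 6.0], [9.0, 9.0, 6.0]
    print(reduced_chi_squared(data, model, [1.0, 1.0, 1.0]))
    print(reduced_chi_squared(data, model, [0.5, 0.5, 0.5]))
```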

The Rg and Dmax obtained from the model should be close to those calculated from
the P(r) function. If that is not the case, you should reevaluate your P(r) function
and redo the reconstruction if necessary. If the discrepancy persists, it is an
indication that your reconstruction isn't a good representation of what is in solution,
and shouldn't be trusted. While there's no hard and fast rule here on how closely Rg
and Dmax should agree, my experience is generally that for high quality data Rg
agrees to better than ~5% and Dmax to ~10%.

The volume is reported for each bead model, but it is usually easier to compare the
molecular weight calculated from that volume with the expected molecular weight. In
this case, M.W. is calculated by dividing the volume (nominally representing the
sample's excluded volume) by an empirically determined constant of 1.66 (used in
RAW, other programs may use different values). This value is approximate, and
varies between roughly 1.5 and 2.0 depending on the shape of the macromolecule.
This M.W. is less well determined than that from other SAXS methods, given the variation in
the coefficient. As such, it is mostly useful for indicating general agreement between
the overall size of the reconstruction and the expected size. If the M.W. is different
from the expected M.W. by more than 20-25% you should consider the
reconstructions to be suspect.
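The size checks described in this aside (Rg, Dmax, and M.W. from volume) can be sketched as follows. The function names are hypothetical, the tolerances are the rough guidelines given above rather than hard limits, and the units (volume in cubic angstroms giving M.W. in Da) are an assumption of this sketch.

```python
def mw_from_volume(volume, coefficient=1.66):
    """M.W. estimate from the bead model volume, using the empirical
    constant quoted above (1.66, as used in RAW). Units are assumed to be
    cubic angstroms in and Da out."""
    return volume / coefficient


def within(value, reference, tolerance):
    """True if value differs from reference by no more than the given
    fractional tolerance (0.05 means 5%)."""
    return abs(value - reference) <= tolerance * abs(reference)


def size_checks(model_rg, pr_rg, model_dmax, pr_dmax, model_volume, expected_mw):
    """Apply the rough guidelines: Rg within ~5%, Dmax within ~10%,
    and M.W. within ~25% of the expected value."""
    return {
        "rg": within(model_rg, pr_rg, 0.05),
        "dmax": within(model_dmax, pr_dmax, 0.10),
        "mw": within(mw_from_volume(model_volume), expected_mw, 0.25),
    }


if __name__ == "__main__":
    # Illustrative numbers only, not results from the tutorial data.
    print(size_checks(33.5, 33.0, 105.0, 100.0, 180000.0, 100000.0))
```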


Part 3. Compare crystal structure and bead model reconstruction (real space) - SUPCOMB
The envelopes produced by DAMMIF/N and DAMAVER are randomly oriented, as is
any high-resolution model you might have of the structure. In order to properly
evaluate the two together, you need to align the high-resolution structure with the
envelope. A common way to do this is with CIFSUP from the ATSAS package.

1. We already aligned the glucose_isomerase models that we made with a high
resolution structure as part of the reconstruction tutorial above. However, if
you don’t do it when you make the models, you can after the fact as well.
Typically you want to use a DAMMIN refinement as your final reconstruction.
Since we skipped making one in the previous part of the tutorial, you can use
the data in the reconstruction_data/gi_complete/gi_dammif folder.
2. Open the SUPCOMB window by selecting Tools->ATSAS->Align
(SUPCOMB/CIFSUP) from the menu bar.

3. In the window that opens, ‘Target’ is the model that is aligned, whereas
‘Reference’ is the model that the target is aligned to. In other words, the
Reference model stays unchanged, while the target model is moved to best
align with the Reference.
4. Use the Reference ‘Select’ button to select the 1XIB_4mer.pdb file in the
reconstruction_data/gi_complete folder.
• Tip: Only the filename will show up in either the Reference or Target
box. If you hover your mouse over the filename it will show the full
path to the file.
5. Use the Target ‘Select’ button to select the
refine_glucose_isomerase-1.pdb file in the
reconstruction_data/gi_complete/gi_dammif folder.


6. Click the start button. CIFSUP will run, and you should see the ‘Status’ update
to ‘Running alignment’ and then ‘Alignment finished’.

7. When CIFSUP is finished, in the same folder as the target file you will see a
<target_name>_aligned.pdb file, which is the target model aligned with
the reference file.
8. Advanced settings can be accessed by clicking on the ‘Advanced Settings’ text
to expand the section. These settings are described in the CIFSUP manual
(https://www.embl-hamburg.de/biosaxs/manuals/supcomb.html).


Part 4a. Compare crystal structure and bead model reconstruction (real space) - ChimeraX
There are a number of molecular graphics programs you can use to visualize
envelopes. ChimeraX is the one we will use in this workshop, because it is free for
noncommercial use, and easy to install. PyMOL is another popular one to use, and is
covered in part 4b.

1. Typically, you want to use a reconstruction averaged from 20 models in slow
mode, with refinement then done on the average. Since we didn’t do that in
the previous part, for time reasons, you can find the appropriate
reconstructions in the
Example_Data/reconstruction_data/gi_complete/gi_dammif folder.
2. Open ChimeraX.
3. Use the File->Open option to open the
refine_glucose_isomerase-1_aligned.pdb and 1XIB_4mer.pdb from the
gi_complete/gi_dammif folder.
4. You will initially see just the high resolution structure, not the bead model.
5. If necessary, open the Model Panel and the Command Line by going to
Tools->General and clicking the appropriate items (both should be open by default,
with the Model Panel on the lower right hand side and the Command Line at the bottom).
6. In the Select->Chain menu choose the bead model (chain <filename>).
7. In the Actions->Cartoon menu, choose “Hide”.
8. In the Actions->Atoms/Bonds menu, choose “Show”.
9. In the Actions->Atoms/Bonds->Atom Style menu choose “Sphere”.
10.In the Select menu, choose “Clear”.
11.In the Graphics ribbon, in the Camera section select “View selected”.


12.Now you need to set the dummy atom radius to be what DAMMIF/N used. To
find this, open the bead model .pdb file in a text editor, such as Notepad
(Windows) or TextEdit (MacOS).
13.Find the “Atomic Radius” (DAMMIF/DAMAVER) or “DAM packing radius”
(DAMMIN) number. Make a note of this radius. This is the bead size you need
to set.

14.In the command line, type “size atomradius X” (without quotes) where X is
the atomic radius you just found.
15.Your model is now the right size. You can either stop here, and just adjust the
settings of the beads, or you can make an envelope. Most typically models
are presented as an envelope, but either is fine. The next steps detail how to
make an envelope.
16.Look in the Model Panel. Identify the ID number of the bead model.
17.We now need to make a surface. In the command line type “molmap #y z”
(without quotes) where y is the ID number for the bead model and z is 3x the
bead size you found in the previous steps.
• Note: You should see an ‘envelope’ form, but it still needs some
adjustment to be useful.


• Tip: You can vary the final number, which sets the smoothness of the
envelope. I find that 3*(bead size) is reasonable, but it depends on the
size of the model beads, and how smooth you want your envelope. It
is generally a good idea in SAXS to leave various lumps, or to actually
be able to see the outline of beads, so that your audience (and you)
remembers that an envelope is NOT an electron density contour.
18.In the Model Panel, turn off the bead model by selecting it and clicking the
“Hide” button.

19.In the Volume Viewer controls that appear after the molmap command (lower
right of the window) click on the color box.

20.In the window that appears, set the Opacity to 40%. Change the color if you
want. Close the color window.
21.Click and drag to rotate your viewer and compare the envelope to the crystal
structure.
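Steps 12-17 can be partly scripted. The sketch below looks for the bead radius in the text of a bead model file and builds the two ChimeraX commands used above. It is only an illustration: the function names are hypothetical, the header search assumes the radius is the last number on a line containing “Atomic Radius” or “DAM packing radius”, and you should confirm the value by eye in the file.

```python
import re


def find_bead_radius(file_text):
    """Return the bead radius from a bead model header, or None if not found.

    Assumes the radius is the last number on the first line mentioning
    "Atomic Radius" or "DAM packing radius" (an assumption about the header
    layout, check it against your own file).
    """
    for line in file_text.splitlines():
        lowered = line.lower()
        if "atomic radius" in lowered or "dam packing radius" in lowered:
            numbers = re.findall(r"\d+(?:\.\d+)?", line)
            if numbers:
                return float(numbers[-1])
    return None


def chimerax_commands(radius, model_id, smoothing=3.0):
    """Build the commands from steps 14 and 17 as strings.

    smoothing is the multiple of the bead size passed to molmap (3x here,
    as suggested in the tutorial).
    """
    return [
        f"size atomradius {radius:g}",
        f"molmap #{model_id} {smoothing * radius:g}",
    ]


if __name__ == "__main__":
    # Hypothetical header line, for illustration only.
    radius = find_bead_radius("Atomic Radius: 3.25")
    for command in chimerax_commands(radius, model_id=2):
        print(command)
```

The commands are printed rather than run, so they can be pasted into the ChimeraX command line one at a time.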


Part 4b. Compare crystal structure and SAXS data (real space)
– PyMOL (optional)
In addition to ChimeraX (Part 4a) you can also use PyMOL to visualize bead model
reconstructions/envelopes.

1. Typically, you want to use a reconstruction averaged from 20 models in slow
mode, with refinement then done on the average. Since we didn’t do that in
the previous part, for time reasons, you can find the appropriate
reconstructions in the
Example_Data/reconstruction_data/gi_complete/gi_dammif folder.
2. Open PyMOL.
3. Use the File->Open option to open the
refine_glucose_isomerase-1_aligned.pdb and 1XIB_4mer.pdb from the
gi_complete/gi_dammif folder.
4. You will initially see the high resolution model and a mess of spaghetti on the
screen.
5. Using the model Hide menu (‘H’) hide ‘everything’ for the bead model
(refine_glucose_isomerase-1_aligned).

6. For the bead model, use the model Show menu (‘S’) to show ‘spheres’.


7. Now you need to set the dummy atom radius to be what DAMMIF/N used. To
find this, open the bead model .pdb file in a text editor, such as Notepad
(Windows) or TextEdit (MacOS).
8. Find the “Atomic Radius” (DAMMIF/DAMAVER) or “DAM packing radius”
(DAMMIN) number. Make a note of this radius. This is the bead size you need
to set.


9. In the command line, enter the command “alter <model_name>, vdw=X”
where <model_name> is the name of the model in PyMOL,
refine_glucose_isomerase-1_aligned, and X is the atomic radius you just
found.

10.Click the ‘Rebuild’ button to refresh the view of the model.


11.Your model is now displayed correctly with the beads. However, it is more
useful to set the beads as partly transparent, so you can see the high
resolution structure through them. Do this with the command “set
sphere_transparency, 0.6”
12.Instead of the sphere representation you can create an envelope that shows
the edges of the model. To do so, using the model Show menu show ‘surface’
and using the model Hide menu, hide ‘spheres’.
13.You can set the transparency of your surface with the command “set
transparency, 0.5”.
14.If you want to smooth out the surface you can adjust the probe radius using
the command “set solvent_radius, 3.0” (where you can vary the size from
3.0).
15.You can also improve the surface quality using the command “set
surface_quality, 1”
• Note that values larger than 1 may take a while to render.
16.Finally you might want to set the colors to something a bit nicer, to get a final
display of your envelope and high resolution structure.
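The PyMOL steps for the sphere display can also be collected into a short list of commands. This is a sketch under assumptions: the function name is hypothetical, and the menu actions in steps 5, 6, and 10 are replaced here by their command-line equivalents (hide, show, rebuild).

```python
def pymol_bead_commands(model_name, radius, sphere_transparency=0.6):
    """Build PyMOL commands that display a bead model as semi-transparent
    spheres of the correct radius.

    model_name -- name of the bead model object in PyMOL
    radius     -- bead radius read from the bead model file
    """
    return [
        f"hide everything, {model_name}",   # step 5, via command instead of menu
        f"show spheres, {model_name}",      # step 6, via command instead of menu
        f"alter {model_name}, vdw={radius:g}",
        "rebuild",                          # step 10, via command instead of button
        f"set sphere_transparency, {sphere_transparency:g}",
    ]


if __name__ == "__main__":
    # The radius here is a placeholder; use the value from your own file.
    for command in pymol_bead_commands("refine_glucose_isomerase-1_aligned", 3.25):
        print(command)
```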


Part 5. Compare crystal structure and SAXS data (q space) – FoXS and CRYSOL
Envelopes make for pretty pictures, but usually the best way to test models against
SAXS data is to generate theoretical scattering profiles and compare them to your
measured scattering profile. In this part of the tutorial you will do this using two
different software packages: FoXS, and CRYSOL from the ATSAS package.
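Comparing a theoretical profile to a measured one requires, at a minimum, putting both on the same q range and the same intensity scale. The sketch below shows those two operations in isolation. It assumes both profiles are sampled at the same q values and fits only an overall scale factor by weighted least squares; it is not what FoXS or CRYSOL do internally, and the function names are made up for this example.

```python
def trim_q_range(q, intensity, q_min, q_max):
    """Keep only the (q, I) points with q_min <= q <= q_max."""
    return [(qi, ii) for qi, ii in zip(q, intensity) if q_min <= qi <= q_max]


def best_scale(i_exp, i_theory, sigma):
    """Weighted least-squares scale factor c that minimizes
    sum(((i_exp - c * i_theory) / sigma) ** 2).

    Assumes the three lists are sampled at the same q values.
    """
    numerator = sum(e * t / s**2 for e, t, s in zip(i_exp, i_theory, sigma))
    denominator = sum(t * t / s**2 for t, s in zip(i_theory, sigma))
    return numerator / denominator


if __name__ == "__main__":
    # Toy numbers only, not the tutorial data.
    print(trim_q_range([0.0, 0.01, 0.1, 0.24, 0.5], [5, 4, 3, 2, 1], 0.01, 0.24))
    print(best_scale([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```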

1. In your system file browser, make a folder called polymerase_theory in the
Example_Data/reconstruction_data folder. Copy the 2POL.pdb file from
the Example_Data/reconstruction_data/polymerase_complete folder
into this polymerase_theory folder.
2. Navigate to the FoXS website: https://modbase.compbio.ucsf.edu/foxs/
3. For Input Molecule line select ‘Choose File’. Select the 2POL.pdb file.
4. Click ‘Submit Form’ to calculate the scattering profile.

5. On the results page that appears, click on the Profile file name
(2POL.pdb.dat) to download. Move that downloaded file to your
polymerase_theory folder and rename it 2POL_foxs.dat.


6. Open RAW if it is not already open. If it is open, clear all data loaded in RAW
(unless your reconstruction is still running. If it is, just remove any items in
the manipulation panel).
7. Load the polymerase.dat file in the Example_Data/reconstruction_data
folder.
8. Carry out Guinier, molecular weight, and GNOM analysis on the scattering
profile. Save the polymerase.out file in the reconstruction_data folder.
9. Load the 2POL_foxs.dat file in the polymerase_theory folder.
10.The theoretical scattering profile extends from q=0 to q=0.5, a much wider
range than the measured profile. To make comparison easy you can trim the
q range of the 2POL_foxs.dat profile to match the polymerase.dat profile.
Use the triangle to show more options for the scattering profile and adjust the
qmin until it is 0.01. Adjust qmax until it is 0.24.
11.Star the polymerase.dat file, right click on the 2POL_foxs.dat file and
select the Other Options->Superimpose. In the dialog window that pops up,
select ‘Scale’.


• Question: How well do the theoretical scattering profile and the
measured profile agree? Can you adjust the scale factor to get them in
better agreement?
• Note: Typically, experimental scattering profiles do not agree perfectly
with profiles generated from crystal structures ‘blindly’. Most programs
will fit the crystal structure profile to the experimental data by altering
the hydration layer in the modeled scattering, as well as fitting an
overall scale factor and (sometimes) a constant offset.
12.To get a better fit from FoXS we can upload a scattering profile to fit the data
to. FoXS will adjust parameters like the hydration layer and excluded volume
to improve fitting (you can turn this off in the advanced options), as well as
the overall scale. Return to the initial FoXS page and add the
polymerase.dat file as the Experimental profile. Submit the job again to
rerun the calculation.
13.You’ll see in the results that this time the residual between the data and fit
is shown, and the χ² of the fit is calculated as well. Download the
2POL_polymerase.fit file. Copy it into your polymerase_theory folder and
rename it 2POL_foxs_fit.fit.
14.Open the 2POL_foxs_fit.fit file in RAW. You’ll notice that it loads in 2
curves. The 2POL_foxs_fit.fit is the experimental data. The 2POL_foxs_fit_FIT
is the generated theoretical scattering profile fit to the data.
• Question: What differences do you observe between the blind curve
and the fit curve?
• Note: On the fit results page, FoXS reports values of c1 and c2. These
are, respectively, an overall scale factor for the excluded volume and an
estimate of the average number of water molecules for an exposed solvent
atom. These are fit to adjust the scattering profile. Significant changes in
these values between different fits can minimize differences between
the fits, for example if you’re fitting several different datasets to a
structure, and this may lead to very different conclusions. For example,
the χ² of the fit data is 1.55, whereas without fitting, using the default
values of c1 and c2, it is 4.87. This is the difference between a good fit
and a poor fit. So pay attention to these parameters. There are some
methods that attempt to treat the hydration layer more accurately,
such as the WAXSiS tool (http://waxsis.uni-goettingen.de/) which runs
a short explicit-solvent all-atom MD simulation then calculates the
scattering from that. You might try this for comparison.
15.Another common tool for generating theoretical scattering profiles is CRYSOL
from the ATSAS package. We will use CRYSOL to generate a scattering profile
from the crystal structure that is fit to the experimental data. Copy the
polymerase.dat file into the polymerase_theory folder.
16.Open a terminal/command prompt and navigate to the polymerase_theory
folder (Appendix A).
17.Type “crysol 2POL.pdb polymerase.dat” (without quotes) at the command
prompt and hit enter.

• Note: The first file is the crystal structure from which to generate a
theoretical profile. The second file is the experimental data file to fit
the theoretical profile to.
• Note: Again, this fits parameters to adjust the excluded volume and
hydration layer contrast (Ra and Dro respectively).
• Note: You can also use CRYSOL without fitting to data.
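If you would rather script step 17 than type it, the same command can be launched from Python. This is only a minimal sketch: it assumes crysol is on your PATH, it passes just the two positional arguments shown in the tutorial, and the helper names (build_crysol_command, run_crysol) are hypothetical.

```python
import shutil
import subprocess


def build_crysol_command(pdb_file, data_file):
    """Return the argument list for the command typed in step 17."""
    return ["crysol", pdb_file, data_file]


def run_crysol(pdb_file, data_file, workdir="."):
    """Run CRYSOL inside workdir.

    Returns None without running anything if crysol is not found on the PATH.
    """
    if shutil.which("crysol") is None:
        return None
    cmd = build_crysol_command(pdb_file, data_file)
    # check=True raises an error if CRYSOL exits with a non-zero status
    return subprocess.run(cmd, cwd=workdir, check=True)
```

For this tutorial you would call run_crysol("2POL.pdb", "polymerase.dat", workdir=...) with workdir set to your polymerase_theory folder.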

18.This will generate several files. The file with the scattering profile is the
2POL00.fit file. Load it into RAW.
• Note: When it loads, it will load two scattering profiles. The
2POL00.fit profile is the experimental data (identical to the
polymerase.dat file except that I(q) values of 0 have been added for
all q points below the minimum measured q). The 2POL00_FIT profile
is the theoretical profile.
19.Hide the 2POL00.fit profile.
20.Adjust the q range of the 2POL00_FIT file until qmin is 0.0097.
• Question: How does this theoretical profile compare to those from
FoXS?
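The comparison carried out in this part (trim the q range, scale the theoretical profile onto the data, then judge the agreement) can also be sketched outside RAW. The code below is an illustration only, not the method used by RAW, FoXS, or CRYSOL: it fits a single overall scale factor by weighted least squares and does not adjust the hydration layer or excluded volume. It assumes the q, I(q), and uncertainty columns have already been loaded into NumPy arrays, and all function names are hypothetical.

```python
import numpy as np


def trim_q(q, i, qmin, qmax):
    """Keep only the points with qmin <= q <= qmax."""
    mask = (q >= qmin) & (q <= qmax)
    return q[mask], i[mask]


def fit_scale(i_exp, sigma, i_model):
    """Scale factor c that minimizes sum(((i_exp - c * i_model) / sigma)**2)."""
    w = 1.0 / sigma**2
    return np.sum(w * i_exp * i_model) / np.sum(w * i_model**2)


def reduced_chi2(i_exp, sigma, i_model, n_params=1):
    """Reduced chi-squared between the data and an already scaled model."""
    resid = (i_exp - i_model) / sigma
    return np.sum(resid**2) / (len(i_exp) - n_params)


def compare(q_exp, i_exp, sigma, q_model, i_model, qmin, qmax):
    """Trim the data, interpolate the model onto the data q grid, scale, score."""
    mask = (q_exp >= qmin) & (q_exp <= qmax)
    q, i, s = q_exp[mask], i_exp[mask], sigma[mask]
    # The theoretical profile is usually on a different q grid than the data
    model_on_q = np.interp(q, q_model, i_model)
    c = fit_scale(i, s, model_on_q)
    return c, reduced_chi2(i, s, c * model_on_q)
```

With the q limits from step 10, a call would look like compare(q_exp, i_exp, sigma, q_model, i_model, 0.01, 0.24). Expect a worse χ² than the fitted FoXS or CRYSOL profiles, since only the scale is adjusted here.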

Part 6. 3D reconstruction with electron density – DENSS


Another method for doing 3D shape reconstructions in SAXS yields actual electron
density, rather than bead models. There are many potential advantages to this, but
one significant one is easy handling of systems like membrane proteins surrounded
by lipids or detergents, which have more than one electron density. Bead models
typically only have two (molecule and solvent) or three bead densities, and so
typically fail to reconstruct these complex objects. DENSS (http://denss.org) has
been fully implemented in RAW and will be used to reconstruct these electron
densities.

1. Clear all of the data in RAW. Load the polymerase.out file located in the
reconstruction_data/polymerase_complete folder.
• Note: This is the P(r) function for the polymerase data without the
truncation to 8/Rg. When using DENSS you should use the full q range
of your data.
• Note: Unlike DAMMIF/N, DENSS can also be run on BIFT P(r)
functions.
2. Right click on the polymerase.out item in the IFT control panel. Select the
“Electron Density (DENSS)” option.
3. Running DENSS generates a lot of files. Click the “Select” button for the
output directory, make a new folder in the reconstruction_data directory
called polymerase_denss and select that folder.
4. Change the number of reconstructions to 4 and the mode to Fast.
• Note: It is generally recommended that you do at least 20
reconstructions. However, for the purposes of this tutorial, 4 are
enough.
• Note: For final reconstructions for a paper, DENSS should be run in
Slow mode. For this tutorial, or for obtaining an initial quick look at
results, Fast mode is fine.
5. RAW can align the DENSS output with a PDB structure. To do so, check the
‘Align output to PDB/MRC’ box and select the 2POL.pdb file in the
reconstruction_data/polymerase_complete folder.
• Tip: If you’re not sure if you selected the correct file, hovering your
mouse over the filename will show the full path to the file.

6. Click the “Start” button.


• Note: The status panel will show you the overall status of the
reconstructions. You can look at the detailed status of each run by
clicking the appropriate tab in the log panel.
7. Note that by default the densities are aligned and averaged, including
enantiomer filtering, and a refined density is created from the average.
8. Wait for all of the DENSS runs and averaging to finish. Depending on the
speed of your computer this could take a bit.

9. Once the reconstructions are finished, the window should automatically switch
to the results tab. If it doesn’t, click on the results tab.

10.The results panel summarizes the results of the reconstruction runs. If you
are using a .out file, then at the top of the panel there is the ambimeter
evaluation of how ambiguous the reconstructions might be (see earlier
tutorial section). If averaging was run there is an estimate of the
reconstruction resolution based on the Fourier shell correlation. In the models
section there are several tabs. The summary tab shows the χ², Rg, support
volume, and RSC to the reference model. If any model was not included in
the averaging it is highlighted in red.
• Verify that the Rg is close to the expected value, and that the χ² and
support volumes are relatively consistent between models.
• Note: The χ² is much too small at the moment. The DENSS algorithm
doesn’t properly compute χ² for smoothed data. For the moment χ²
should only be used as a convergence criterion, not to evaluate the
model fit.
11.Individual model results are displayed in the numbered tabs. For each
individual model there are plots of: the original data and the model data
(scattering from density); the residual between the original data and the
model data; and χ², Rg and support volume vs. refinement step.
• Verify that the residual between the actual data and the model data is
small.
• Check that the χ², Rg, and support volume have all plateaued
(converged) by the final steps.

12.If the densities were averaged, the average tab will display the Fourier shell
correlation vs. resolution.
• Note: The reconstruction resolution is taken as the resolution in
angstroms where the correlation first crosses 0.5.
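To make the 0.5 criterion concrete, here is a small sketch of reading a resolution off a Fourier shell correlation curve. It assumes the curve is given as spatial frequency in 1/Å (increasing) against correlation, so that resolution is the reciprocal of the crossing frequency, and it interpolates linearly between the two points that bracket the crossing. Whether RAW or DENSS interpolates in exactly this way is not stated in this tutorial, so treat the function (fsc_resolution, a hypothetical name) as an illustration only.

```python
import numpy as np


def fsc_resolution(spatial_freq, fsc, threshold=0.5):
    """Resolution in angstroms where the FSC first drops below threshold.

    spatial_freq: increasing spatial frequencies in 1/angstrom.
    Returns None if the curve never drops below the threshold.
    """
    spatial_freq = np.asarray(spatial_freq, dtype=float)
    fsc = np.asarray(fsc, dtype=float)
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        return None
    k = below[0]
    if k == 0:
        # Already below the threshold at the first point
        return 1.0 / spatial_freq[0] if spatial_freq[0] > 0 else None
    # Linear interpolation between the last point at or above the threshold
    # and the first point below it
    f0, f1 = spatial_freq[k - 1], spatial_freq[k]
    c0, c1 = fsc[k - 1], fsc[k]
    f_cross = f0 + (threshold - c0) * (f1 - f0) / (c1 - c0)
    return 1.0 / f_cross
```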

13.The results summary shown in the Summary tab is automatically saved as a
<prefix>_denss_results.csv file, e.g. for this data as
polymerase_denss_results.csv. All the plots shown on the individual
model tabs are automatically saved as a multi-page pdf file with the same
name.
• Try: Open the results .csv file in another program to see the results.
14.Click the “Close” button when you are finished looking at the results and
reconstructions.
15.The results from the individual DENSS runs are saved in the selected output
folder as <prefix>_xx.mrc where xx corresponds to the run number: 01,
02, etc. For this tutorial that would be polymerase_01.mrc,
polymerase_02.mrc, etc.
16.If averaging was done, final average density is saved in the selected output
folder as <prefix>_aver.mrc. For this tutorial, that would be
polymerase_aver.mrc.
17.If refinement was done, the final refined density is saved in the selected
output folder as <prefix>_refine.mrc. For this tutorial that would be
polymerase_refine.mrc.
18.If alignment to a reference model was done, the files aligned depend on what
other processing was done.
• If refinement was done, there will be a single file named
<prefix>_refine_aligned.mrc. For this tutorial,
polymerase_refine_aligned.mrc.
• If no refinement was done, but averaging was done, then the
averaged model is aligned. The associated filenames would be
<prefix>_average_aligned.mrc. For this tutorial,
polymerase_averaged_aligned.mrc.
• If no refinement or averaging was done, then every calculated model is
aligned. The associated filenames would be
<prefix>_##_aligned.mrc where ## is the model number of a
model. For this tutorial, that is polymerase_##_aligned.mrc.
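The naming rules in steps 15 to 18 can be turned into a quick check that a run produced the files you expect. This sketch only encodes the names listed above for individual runs, the average, the refinement, and the aligned refinement; it deliberately leaves out the aligned average and aligned individual models. The function names are hypothetical.

```python
from pathlib import Path


def expected_denss_files(prefix, n_runs, averaged=True, refined=True, aligned=True):
    """List the output filenames described in steps 15-18 for one DENSS job."""
    names = [f"{prefix}_{i:02d}.mrc" for i in range(1, n_runs + 1)]
    if averaged:
        names.append(f"{prefix}_aver.mrc")
    if refined:
        names.append(f"{prefix}_refine.mrc")
    if aligned and refined:
        # With refinement, only the refined density is aligned
        names.append(f"{prefix}_refine_aligned.mrc")
    return names


def missing_files(folder, names):
    """Return the names from the list that are not present in folder."""
    folder = Path(folder)
    return [n for n in names if not (folder / n).exists()]
```

For this tutorial, missing_files("polymerase_denss", expected_denss_files("polymerase", 4)) should return an empty list once all runs, averaging, refinement, and alignment have finished (adjust the folder path to wherever you created polymerase_denss).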

Part 7. Compare crystal structure and electron density
reconstruction (real space) – DENSS Alignment
The envelopes produced by DENSS are randomly oriented, as is any high-resolution
model you might have of the structure. In order to properly evaluate the two
together, you need to align the high-resolution structure with the envelope. A
common way to do this is with the alignment tool from the DENSS program.

1. We already aligned the polymerase models that we made with a high
resolution structure as part of the reconstruction tutorial above. However, if
you don’t do it when you make the models, you can do it after the fact as well.
Typically you want to use a DENSS refinement as your final reconstruction.
You should have generated one in the previous part of the tutorial, in the
reconstruction_data/polymerase_denss folder.
2. Open the Electron Density Alignment window by selecting Tools->Electron
Density (DENSS) Alignment from the menu bar.

3. In the window that opens, ‘Target’ is the model that is aligned, whereas
‘Reference’ is the model that the target is aligned to. In other words, the
Reference model stays unchanged, while the target model is moved to best
align with the Reference.
4. Use the Reference ‘Select’ button to select the 2POL.pdb file in the
reconstruction_data/polymerase_complete folder.
• Tip: Only the filename will show up in either the Reference or Target
box. If you hover your mouse over the filename it will show the full
path to the file.
5. Use the Target ‘Select’ button to select the polymerase_refine.mrc file in
the reconstruction_data/polymerase_denss folder.

6. Click the start button. DENSS alignment will run.


• Tip: By default, DENSS centers the Reference file. This writes out a file
named <reference_name>_centered.pdb in the same folder as the
reference file, which is what should be compared to the aligned file.
You can turn this off in the Advanced Settings.

7. When alignment is finished, in the same folder as the target file you will see a
<target_name>_aligned.mrc file. Compare this to the
<reference_name>_centered.pdb file in the reference file folder. In this
case those names are polymerase_refine_aligned.mrc and
2POL_centered.pdb.
8. You can change the advanced settings by expanding the Advanced Settings
section. These advanced settings are:
• Number of cores: Number of cores to use during alignment.
• Enantiomorphs: Whether to generate enantiomorphs of the Target
before doing the alignment.
• Center reference: Whether to center the reference model at the origin.
If used, this creates a <reference_name>_centered.pdb in the
same folder as the reference file.

• PDB calc. resolution: The resolution of the density map created from
the Reference PDB model to compare with the Target model. This has
no effect if the Reference is already a density.

Part 8. Compare crystal structure and electron density
reconstruction (real space) – ChimeraX and PyMOL
There are a number of molecular graphics programs you can use to visualize electron
densities. We will use ChimeraX to create a simple visualization of the density, and
we will use PyMOL to create a fancy visualization of the density.

Note: Significant portions of this tutorial are based on this tutorial by Thomas Grant:
https://www.tdgrant.com/denss/tips/

1. Typically, you want to use a reconstruction averaged from 20 models in slow
mode, with refinement then done on the average. Since we didn’t do that in
the previous part, for time reasons, you can find the appropriate
reconstructions in the
reconstruction_data/polymerase_complete/polymerase_denss folder.
2. Open ChimeraX.
3. Load in the 2POL_centered.pdb and polymerase_refine_aligned.mrc
files in the
reconstruction_data/polymerase_complete/polymerase_denss folder.
4. If necessary, open the Model Panel and the Command Line by going to Tools-
>General Controls and clicking the appropriate items (should default to open,
the models panel in the lower right hand side, the command line at the
bottom).
5. Look in the Model Panel. Identify the ID number of the electron density.
6. In the command line, type “volume #x encloseVolume 138040” (without
quotes) where #x is the ID number you just found and 138040 is a rough
estimate of the polymerase’s volume.
• Note: In order to set a reasonable threshold for the electron density,
we will choose a threshold that creates a contour of the density that
encloses a total volume equal to the expected volume of the particle.
If you don’t have a good estimate of the total volume, for proteins a
rough estimate is that the volume in Å³ is 1.7*(mass in Da).
• Note: The polymerase’s mass is 81.2 kDa, so we estimate the volume
as 1.7*81200.
7. In the Volume Viewer window click on the Color box. Set the opacity to 40%.
• Note: You can also change the color if you want to.
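The volume estimate from step 6 is a one-line calculation, sketched below together with the ChimeraX command it feeds into. The 1.7 factor and the 81.2 kDa mass come from the notes in step 6; the model ID of 1 in the usage note is only a placeholder for whatever ID you found in the Model Panel, and the function names are hypothetical.

```python
def estimate_volume_A3(mass_da, factor=1.7):
    """Rough protein volume in cubic angstroms from the mass in Da."""
    return factor * mass_da


def chimerax_volume_command(model_id, mass_da):
    """Build the 'volume ... encloseVolume' command typed in step 6."""
    vol = estimate_volume_A3(mass_da)
    return f"volume #{model_id} encloseVolume {vol:.0f}"
```

For the polymerase, chimerax_volume_command(1, 81200) gives the command used above with #x replaced by #1.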

8. You now have a version of an envelope view for the electron density. If you
want a more advanced visualization, we can use PyMOL.
9. Close ChimeraX.
10.Open PyMOL.
11.In PyMOL, open the 2POL_centered.pdb and polymerase_refine.mrc
files.
• Note: There is a bug in the DENSS alignment program for PDBs with
multiple chains, which can cause them to not load properly into
PyMOL.
12.In older versions of PyMOL, when you open polymerase_refine.mrc you will
get the ‘Map Import’ dialog. Check the ‘volume’ representation box and click
‘Load’.
13.In newer versions of PyMOL, when you open polymerase_refine.mrc, in the
‘A’ model menu select ‘Volume’. This will create a new model in the model
menu showing the volume.
14.In the model panel, click on the ‘H’ in the 2POL_centered line, and select
waters to hide the waters in the PDB model.
15.In the polymerase_refine_volume line click on the ‘C’ and select ‘rainbow’.
This creates an initial rainbow map for the density.

16.In the polymerase_refine_volume line click on the ‘C’ and select ‘panel’.
This opens up a panel where you can adjust the colors. By dragging the
colored dot left or right you adjust the sigma threshold for the color map. By
dragging the colored dot up or down you adjust the opacity of the color.
17.You can also explicitly create a color ramp using the PyMOL command line.
Enter the following command and then hit enter: volume_ramp_new
colored_density, 2 blue 0.0 2.5 blue 0.01 5 cyan 0.01 7.5
green 0.01 10 yellow 0.01 15 red 0.01 200 red 0.03
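18.Once you have created the color ramp you can apply it to the volume object
with the following command: volume_color
polymerase_refine_aligned_volume, colored_density
• Note: The first argument must match the name of the volume object
shown in your model panel. If you loaded polymerase_refine.mrc as in
step 11, that object is named polymerase_refine_volume.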
19.You may have noticed that the map isn’t particularly accurate: it has low
density in some parts of the ring, and some larger bulges on the outside. Why
might this be? How would you improve it?
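If you want to experiment with different thresholds, it can be easier to keep the ramp from step 17 as a list of (level, color, opacity) entries and generate the command text from it. The sketch below only reproduces the single-line command exactly as written in step 17; it does not call PyMOL, and the function name build_ramp_command is hypothetical.

```python
def build_ramp_command(name, points):
    """Build the volume_ramp_new command text from (level, color, opacity) entries."""
    parts = [f"{level} {color} {alpha}" for level, color, alpha in points]
    return f"volume_ramp_new {name}, " + " ".join(parts)


# The ramp used in step 17
ramp = [
    (2, "blue", 0.0),
    (2.5, "blue", 0.01),
    (5, "cyan", 0.01),
    (7.5, "green", 0.01),
    (10, "yellow", 0.01),
    (15, "red", 0.01),
    (200, "red", 0.03),
]
command = build_ramp_command("colored_density", ramp)
```

Paste the resulting text into the PyMOL command line as in step 17, then change a level or opacity in the list and regenerate it to see how the map responds.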

Part 9. Online tools (optional)


Some of the ATSAS software is available online on the EMBL servers. There are
several advantages. One, the online version is often easier to use than the command
line. Two, you use someone else’s computational resources instead of your own.
There are also several disadvantages. First, not all of the tools are available. Second,
of the ones that are available many of the advanced options cannot be changed.
Third, because it is open to everyone, you often have to wait for other jobs to finish
running. This means the analysis can sometimes take longer than running it on your
own computer!

Depending on your needs, these can be very handy resources, and we encourage
you to check them out:
http://www.embl-hamburg.de/biosaxs/atsas-online/

The FoXS server, which you have used in a previous part of the tutorial, is another
handy online tool:
https://modbase.compbio.ucsf.edu/foxs/index.html

The WAXSiS server uses explicit-solvent all-atom MD simulations to model the
hydration layer around molecules, and is probably the most accurate (and slowest)
method to calculate a scattering profile from a structure, particularly for wide angle
data:
http://waxsis.uni-goettingen.de/

Appendix A. The command prompt


As part of this tutorial, you will need to be able to open a terminal (Mac OS X and
Linux) or command prompt (Windows) and navigate to a known directory.

Windows 7:
1. Open a command prompt by clicking on the start menu, searching for “cmd”
(no quotes) and running the cmd program.
2. Type “cd ” (no quotes)
3. Drag the folder you want to move to into the command prompt. It should
automatically put the folder path in the command prompt. For example, if you
put the Example_Data directory on your desktop, and wanted to move to it,
you should now see “cd C:\Users\<username>\Desktop\Example_Data” (no
quotes) on the command line.
4. Hit enter.
5. The command prompt should show you what directory you are in (listed to
the left of the prompt).
6. To check what files are in the current directory, type “dir” (no quotes) and hit
enter.

Windows 8:
1. Open a command prompt by clicking on the windows tile and clicking the
down arrow to show all apps. In the all apps screen select Command
Prompt in the Windows System section.
2. Steps 2-6 for Windows 7 also work for Windows 8.

Windows 10:
1. Open a command prompt by clicking on the windows/start menu, selecting All
Files, selecting Windows System, and clicking on Command Prompt.
2. Steps 2-6 for Windows 7 also work for Windows 10.

Mac OS X:
1. In the Applications/Utilities folder, open the Terminal app.
2. Type “cd ” (without quotes).
3. Drag the folder you want to move to into the terminal. It should automatically
put the folder path in the command prompt. For example, if you put the
Example_Data directory on your desktop, and wanted to move to it, you
should now see “cd /Users/<username>/Desktop/Example_Data” (no quotes)
on the command line.
4. Hit enter.
5. To check what directory you are in, type “pwd” (no quotes) and hit enter.
6. To check what files are in the current directory, type “ls” (no quotes) and hit
enter.

Linux:
It depends upon the flavor of Linux you are using. On many Linux machines:
1. Open the folder in your system file manager.

2. Right click in the folder and select “Open in Terminal”

Otherwise, you can open a terminal and use the “cd” command to change to the
proper directory.
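If you are working from a Python session rather than a terminal, the same three operations (change directory, show the current directory, list files) are available on every platform through the standard library. This is a minimal sketch; the function names are hypothetical and the folder path in the usage note is just the Desktop example used above.

```python
import os


def go_to(folder):
    """Equivalent of 'cd': change directory and return the new location."""
    os.chdir(os.path.expanduser(folder))
    # os.getcwd() is the equivalent of 'pwd'
    return os.getcwd()


def list_files(folder="."):
    """Equivalent of 'ls' or 'dir': sorted names of the entries in folder."""
    return sorted(os.listdir(folder))
```

For example, go_to("~/Desktop/Example_Data") followed by list_files() mirrors the cd and ls steps described for Mac OS X.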
