SREFLEX Chi2 and RMSD meaning ?

Nyshae
Active member
Posts: 24
Joined: 2015.11.03 14:38

SREFLEX Chi2 and RMSD meaning ?

#1 Post by Nyshae » 2017.02.06 13:30

Hi everyone!
I have what may be a very simple question: when I run SREFLEX in online mode, the output includes a summary file that lists, for each of the proposed refined models, a Chi2 value, an RMSD value and a number of clashes. What exactly do these values refer to: the fit of the new model to the old model, or the fit to the SAXS data?
Many thanks!

Alex
Active member
Posts: 692
Joined: 2007.08.09 21:10
Location: Planet Earth

Re: SREFLEX Chi2 and RMSD meaning ?

#2 Post by Alex » 2017.02.06 16:58

I would think that Chi2 is the discrepancy to the scattering data, RMSD is the deviation from the initial model, and the clash number is for the NMA-refined model.
HTH, Alex

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#3 Post by sasha » 2017.02.06 21:23

Nyshae wrote:Hi everyone!
I have what may be a very simple question: when I run SREFLEX in online mode, the output includes a summary file that lists, for each of the proposed refined models, a Chi2 value, an RMSD value and a number of clashes. What exactly do these values refer to: the fit of the new model to the old model, or the fit to the SAXS data?
Many thanks!
Dear Nyshae,

For ATSAS 2.7.2 Alex is right. For each output model, SREFLEX reports:
C-alpha RMSD against the initial input structure (indicates how much the original structure was modified; usually less is better),
Chi2 against the input SAXS data (goodness of fit; closer to 1.0 is better), and
internal clashes (number of C-alpha:C-alpha pairs closer than 2.5 angstroms; the fewer the better).
This is described in the original manuscript http://pubs.rsc.org/en/content/articlel ... c5cp04540a, but I will add a clarification to the program's output.
In ATSAS 2.8.0, the 'breaks' and 'clashes' scores are normalized by the size of the molecule; the higher the value, the worse.
Thanks for sharing your question (and to Alex for answering). I will add the explanation to the manual and to the actual output for the next ATSAS version.
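
To make the three 2.7.2 values concrete, here is a minimal Python sketch of how such numbers could be computed. It is not SREFLEX's own code: the function names are hypothetical, the RMSD assumes the two structures are already superimposed with matched C-alpha atoms, and the Chi2 normalization (division by N-1 after a least-squares scale factor) is an assumption about the convention used.

```python
import math

def ca_rmsd(ref, model):
    """RMSD between matched C-alpha coordinates (assumes prior superposition)."""
    if len(ref) != len(model) or not ref:
        raise ValueError("need two non-empty coordinate lists of equal length")
    sq = sum((a - b) ** 2 for p, q in zip(ref, model) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))

def reduced_chi2(i_exp, sigma, i_calc):
    """Chi2 of a computed curve against data, with a least-squares scale factor.

    Assumption: normalized by (N - 1); the program may use another convention.
    """
    # Scale factor c that minimizes the weighted squared residuals.
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_calc, sigma))
    c = num / den
    resid = sum(((e - c * m) / s) ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    return resid / (len(i_exp) - 1)

def count_clashes(ca, cutoff=2.5):
    """Number of C-alpha:C-alpha pairs closer than `cutoff` angstroms."""
    n = 0
    for i in range(len(ca)):
        for j in range(i + 1, len(ca)):
            if math.dist(ca[i], ca[j]) < cutoff:
                n += 1
    return n
```

Read this way, Chi2 is the only one of the three that involves the SAXS data; RMSD and clashes are computed from coordinates alone.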

Best,

sasha

Alex
Active member
Posts: 692
Joined: 2007.08.09 21:10
Location: Planet Earth

Re: SREFLEX Chi2 and RMSD meaning ?

#4 Post by Alex » 2017.02.06 22:30

Can I also ask how the clash score is calculated? Could you also say more about how protein-ligand complexes are treated (where the ligand is HETATM, such as DNA/RNA/sugars) and how (if at all) the clash score is calculated for them?
Thank you in advance, Sasha!

Nyshae
Active member
Posts: 24
Joined: 2015.11.03 14:38

Re: SREFLEX Chi2 and RMSD meaning ?

#5 Post by Nyshae » 2017.02.07 10:54

Hi,
Thank you both for your answers, much appreciated! I'm using ATSAS 2.8.0 (on Mac), by the way ;)
In my case, the PDB file is not an actual crystallographic structure but an atomic model built computationally from various inputs. Given how you define the values, is Chi2 then the only value that is really "relevant" to us, since it compares the model to the SAXS data? (The other two are comparisons to the atomic model, which in our case is an approximation.)
Many thanks!!

Nyshae
Active member
Posts: 24
Joined: 2015.11.03 14:38

Re: SREFLEX Chi2 and RMSD meaning ?

#6 Post by Nyshae » 2017.02.07 12:56

NB: I might also add that, weirdly enough, in the last run the model with the best Chi2 cannot be superimposed on the SAXS DAMMIF shape at all, whereas the one with the best RMSD can... :shock:

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#7 Post by sasha » 2017.02.15 12:34

Hi Alex,
Alex wrote:Can I also ask how the clash score is calculated?
Yes, any distance below 2.5 angstroms between centroids (C-alpha for proteins, or C1' for nucleotides) is considered a clash.
Then a score is computed as the percentage of clashing centroids in the structure divided by 10 (i.e. a clashes score of 1.0 would represent 10 percent of the centroids clashing).
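
The normalized score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's code: `clash_score` is a hypothetical name, and counting a centroid as "clashing" when it lies within the cutoff of at least one other centroid is one reading of the description.

```python
import math

def clash_score(centroids, cutoff=2.5):
    """Percentage of clashing centroids divided by 10.

    centroids: list of (x, y, z) tuples (C-alpha or C1' positions).
    Assumption: a centroid is 'clashing' if it is closer than `cutoff`
    angstroms to at least one other centroid.
    """
    n = len(centroids)
    if n == 0:
        return 0.0
    clashing = set()
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(centroids[i], centroids[j]) < cutoff:
                clashing.update((i, j))
    percent = 100.0 * len(clashing) / n
    return percent / 10.0

# 20 centroids: 19 spaced 4 angstroms apart, plus one placed 1 angstrom from the first.
chain = [(4.0 * k, 0.0, 0.0) for k in range(19)] + [(1.0, 0.0, 0.0)]
print(clash_score(chain))  # 2 of 20 centroids clash -> 10 percent -> 1.0
```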
Could you also say more about how protein-ligand complexes are treated (where the ligand is HETATM, such as DNA/RNA/sugars) and how (if at all) the clash score is calculated for them?
Thank you in advance, Sasha!
Of course. Before 2.8.0, HETATM entries were ignored.
As of 2.8.0, HETATM atoms are grouped by their own residue number/name, and then treated as side-chains* of the closest standard residue (protein or nucleotide) in space.
*This means that HETATM atoms are only considered for prediction of theoretical scattering, but not for the normal mode analysis.
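
As a rough illustration of the grouping step, the following Python sketch groups HETATM atoms by residue name and number and attaches each group to the nearest standard-residue centroid. The function name and input layout are hypothetical, chain identifiers are ignored, and measuring "closest" from the geometric centre of the group is an assumption; the actual criterion used by the program is not stated here.

```python
import math
from collections import defaultdict

def assign_het_groups(het_atoms, centroids):
    """Map each HETATM group to the closest standard-residue centroid.

    het_atoms: list of (res_name, res_num, (x, y, z)) tuples.
    centroids: dict of residue id -> (x, y, z) for C-alpha / C1' atoms.
    Assumption: distance is measured from the group's geometric centre.
    """
    groups = defaultdict(list)
    for name, num, xyz in het_atoms:
        groups[(name, num)].append(xyz)

    assignment = {}
    for key, coords in groups.items():
        centre = tuple(sum(c[k] for c in coords) / len(coords) for k in range(3))
        assignment[key] = min(
            centroids, key=lambda rid: math.dist(centre, centroids[rid])
        )
    return assignment
```

In this picture the attached groups would contribute only to the computed scattering, while the centroids alone define the model that is moved.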

Thanks for the good questions!

sasha

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#8 Post by sasha » 2017.02.15 12:40

Nyshae wrote:NB: I might also add that, weirdly enough, in the last run the model with the best Chi2 cannot be superimposed on the SAXS DAMMIF shape at all, whereas the one with the best RMSD can... :shock:
Dear Nyshae,

It is difficult to say anything without looking at the actual data and models, but your observations may be due to ambiguity in the ab initio modeling. Did you generate several DAMMIF models and superimpose them? For more information, see:
https://www.embl-hamburg.de/biosaxs/ats ... imeter.php
http://journals.iucr.org/d/issues/2015/05/00/dz5357/

What are the corresponding Chi-square and NSD values?

Alex
Active member
Posts: 692
Joined: 2007.08.09 21:10
Location: Planet Earth

Re: SREFLEX Chi2 and RMSD meaning ?

#9 Post by Alex » 2017.02.15 20:44

Yes, any distance below 2.5 angstroms between centroids (C-alpha for proteins, or C1' for nucleotides) is considered a clash.
Then a score is computed as the percentage of clashing centroids in the structure divided by 10 (i.e. a clashes score of 1.0 would represent 10 percent of the centroids clashing).
That would mean that if one starts with a structure that already has a clash, then, given that a distance is always a positive number, your clash score would not be very informative, I suppose.
*This means that HETATM atoms are only considered for prediction of theoretical scattering, but not for the normal mode analysis.
This is what I expected.
As of 2.8.0, HETATM atoms are grouped by their own residue number/name, and then treated as side-chains* of the closest standard residue (protein or nucleotide) in space.
I assume that the same C-alpha to side-chain criterion is used for the clash score? Though I am not quite sure whether you are indeed using the atomic radius for each particular HET atomic group.

Thanks, Alex

Nyshae
Active member
Posts: 24
Joined: 2015.11.03 14:38

Re: SREFLEX Chi2 and RMSD meaning ?

#10 Post by Nyshae » 2017.02.16 13:58

Hi all,
Thanks again for the answers!
Sasha, to answer your questions, I did run DAMMIF using the online version and generated 20 models. All but one were deemed good enough to be superimposed, with a mean NSD of 0.597.
I also just ran AMBIMETER as you suggested, and the output is the following:
Number of compatible shape categories .................. : 19
Ambiguity score ........................................ : 1.279
3D reconstruction is potentially unique

Alex
Active member
Posts: 692
Joined: 2007.08.09 21:10
Location: Planet Earth

Re: SREFLEX Chi2 and RMSD meaning ?

#11 Post by Alex » 2017.02.21 11:49

Sasha, just to re-phrase my questions and give a real life example:
I have a protein-DNA complex for which there are SAXS data. In this complex the protein is assumed to be in an open conformation; however, only the protein alone could be crystallized, and it is in the closed state. If one places DNA in the central groove (as is observed in close homologues and confirmed by other experiments), there are severe clashes. My question is: does it make sense to start with a model with bad clashes, hoping that it improves in terms of discrepancy to the SAXS data and that the clash score decreases?
Thanks,
Alex

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#12 Post by sasha » 2017.02.23 16:59

Nyshae wrote:Hi all,
Thanks again for the answers!
Sasha, to answer your questions, I did run DAMMIF using the online version and generated 20 models. All but one were deemed good enough to be superimposed, with a mean NSD of 0.597.
I also just ran AMBIMETER as you suggested, and the output is the following:
Number of compatible shape categories .................. : 19
Ambiguity score ........................................ : 1.279
3D reconstruction is potentially unique
That sounds good. What are the NSD values of the ab initio models against the high-resolution model, both before and after SREFLEX refinement? What are the corresponding chi-square values?

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#13 Post by sasha » 2017.02.23 17:19

Alex wrote:
Yes, any distance below 2.5 angstroms between centroids (C-alpha for proteins, or C1' for nucleotides) is considered a clash.
Then a score is computed as the percentage of clashing centroids in the structure divided by 10 (i.e. a clashes score of 1.0 would represent 10 percent of the centroids clashing).
That would mean that if one starts with a structure that already has a clash, then, given that a distance is always a positive number, your clash score would not be very informative, I suppose.
Before starting the refinement, SREFLEX checks for clashes in the input structure and aborts if more than 10% of the centroids are clashing.
What you are proposing is that a single atom could move against another until it "passes to the other side", where the clash threshold would no longer apply... If you can provide an example I would happily look at it, but usually there are more atoms around, and the clashes score will increase in an informative way when dealing with protein structures. Ultimately, a single centroid clashing into the structure is usually not a big deal; that's why the score is normalized by the total number of centroids in the structure.
Alex wrote:
As of 2.8.0, HETATM atoms are grouped by their own residue number/name, and then treated as side-chains* of the closest standard residue (protein or nucleotide) in space.
I assume that the same C-alpha to side-chain criterion is used for the clash score? Though I am not quite sure whether you are indeed using the atomic radius for each particular HET atomic group.
Only centroids (CA or C1') are considered for clashes; HET atoms are not.

sasha
Active member
Posts: 46
Joined: 2014.03.05 17:56

Re: SREFLEX Chi2 and RMSD meaning ?

#14 Post by sasha » 2017.02.23 17:29

Alex wrote:Sasha, just to re-phrase my questions and give a real life example:
I have a protein-DNA complex for which there are SAXS data. In this complex the protein is assumed to be in an open conformation; however, only the protein alone could be crystallized, and it is in the closed state. If one places DNA in the central groove (as is observed in close homologues and confirmed by other experiments), there are severe clashes. My question is: does it make sense to start with a model with bad clashes, hoping that it improves in terms of discrepancy to the SAXS data and that the clash score decreases?
Thanks,
Alex
Hi Alex, that's a more complicated scenario and I don't think SREFLEX would be the appropriate tool for it. Normal mode analysis is based on starting from an equilibrium state, which means that NMA will not produce meaningful results in the presence of severe clashes.
Maybe (coarse-grained) molecular dynamics can be used to get rid of severe clashes in this case?

Alex
Active member
Posts: 692
Joined: 2007.08.09 21:10
Location: Planet Earth

Re: SREFLEX Chi2 and RMSD meaning ?

#15 Post by Alex » 2017.02.23 17:34

Sasha, thanks for your replies!
