Bug in EOM Rsigma calculation?

jbhopkins
Active member
Posts: 17
Joined: 2014.10.22 21:21
Location: BioCAT, Advanced Photon Source

Bug in EOM Rsigma calculation?

#1 Post by jbhopkins » 2020.05.18 21:32

Hi folks,

I'm using EOM from ATSAS 3.0.1, and I think the Rsigma is being calculated incorrectly. Either that or I don't understand how the Rsigma calculation is being done (quite possible).

I've attached the results of an EOM run on some good quality data (.dat file also attached). Most of the results look reasonable, but the reported Rsigma value in the .log file is 4.57! This is huge. In the EOM 2.0 paper, Figure 3 shows an Rsigma of 2.91 for a hugely artificial bimodal distribution. I'm struggling to see how the Rsigma of my result is larger than that of the bimodal distribution shown in that Figure 3. Below is a plot of the EOM distributions.

Here's why I think this might be a bug. First, I reviewed the 2015 paper, which defines Rsigma as std_selected/std_pool for the distributions. The supplement defines the standard deviation in the usual manner. Second, I calculated the standard deviation of the pool and ensemble distributions using the standard approach for calculating standard deviation from a histogram. That is, I weighted the value for each bin (for example, each Rg value) by the frequency of the bin, and then calculated the standard deviation of the resulting data. So, for example, some Python (pseudo-)code to do this would be:

Code: Select all

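import numpy

# rg_distribution: Rg value of each bin of the distribution
# frequencies: frequency f of each bin, in the same order (illustrative name)
rg_w_list = []
for rg, f in zip(rg_distribution, frequencies):
    rg_weighted = rg*f  # weight the bin value by its frequency
    rg_w_list.append(rg_weighted)

rg_mean = numpy.mean(rg_w_list)
rg_std = numpy.std(rg_w_list)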
where f is the frequency of the rg in the distribution and rg_distribution are the Rg values of the bins of the distribution. I then calculated the ratio of the standard deviations for the selected ensemble and pool to come up with a distribution-specific Rsigma value.

I did this for each individual distribution, and then calculated a final Rsigma as the average of the Rsigma of all four distributions (Rg, Dmax, Volume, and C alpha).
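
For reference, here is a minimal sketch of the textbook frequency-weighted calculation, in which each bin's frequency acts as a weight on its value rather than a multiplier. It is not necessarily what EOM or the attached script does. The function names (hist_std, r_sigma, mean_r_sigma) and the example histogram are made up for illustration, and averaging the per-distribution ratios simply follows the approach described above.

```python
import numpy as np

def hist_std(values, freqs):
    """Frequency-weighted standard deviation of a histogram."""
    values = np.asarray(values, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    mean = np.average(values, weights=freqs)
    var = np.average((values - mean) ** 2, weights=freqs)
    return np.sqrt(var)

def r_sigma(values, pool_freqs, ensemble_freqs):
    """Ratio std_selected / std_pool for a single distribution."""
    return hist_std(values, ensemble_freqs) / hist_std(values, pool_freqs)

def mean_r_sigma(per_distribution):
    """Plain average of the per-distribution ratios (Rg, Dmax, Volume, C alpha)."""
    return float(np.mean(per_distribution))

# Made-up three-bin histogram, NOT taken from the attached files.
bins = [10.0, 20.0, 30.0]
pool = [1.0, 1.0, 1.0]      # flat pool distribution
ensemble = [1.0, 2.0, 1.0]  # narrower selected ensemble

ratio = r_sigma(bins, pool, ensemble)

# Average of the four per-distribution values reported in this post.
average = mean_r_sigma([1.31, 0.99, 0.99, 0.71])
```

With real data, values would be the first column of a distribution file and the two frequency arrays the pool and ensemble columns, assuming that is how the files are laid out.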

For the EOM result I've attached, I calculated Rsigma values of:
Rg R_sigma: 1.31
Dmax R_sigma: 0.99
Volume R_sigma: 0.99
C alpha R_sigma: 0.71
Average R_sigma: 1.0

I've attached the python script I used for the calculations so you can test it yourself, and see if I made a mistake somewhere.

I'm hoping you can let me know if either:
1) I'm completely misunderstanding how to calculate Rsigma from the distributions.
2) I've made a mistake in the actual calculation.
or 3) There's actually a bug in the EOM reported Rsigma value.


Finally, here are a few caveats:
1) I know that calculating standard deviations from histograms should use the center of each bin. I'm not sure if the results reported in the distribution files are edges or centers, so I didn't adjust for this. This could change the results slightly.

2) I know that calculating standard deviations from histograms might not be precisely the same as calculating it from the underlying data, since you are assuming everything in the bin is at the bin midpoint. However, given the size of the bins involved I don't imagine this will make a lot of difference.
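
Both caveats can be checked on synthetic data. The sketch below assumes uniform bin widths and uses a made-up normal sample, not the EOM pool. Shifting every bin value by the same half-bin offset leaves the standard deviation unchanged, so with uniform bins the edge-versus-center question affects the mean but not the standard deviation, and the binned standard deviation stays close to that of the raw samples when the bins are narrow. The helper hist_std is a hypothetical name.

```python
import numpy as np

def hist_std(values, freqs):
    """Frequency-weighted standard deviation of a histogram."""
    values = np.asarray(values, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    mean = np.average(values, weights=freqs)
    return np.sqrt(np.average((values - mean) ** 2, weights=freqs))

# Synthetic sample standing in for a set of Rg values (an assumption).
rng = np.random.default_rng(0)
samples = rng.normal(loc=30.0, scale=5.0, size=10000)

# Histogram with uniform bins; edges has one more entry than counts.
counts, edges = np.histogram(samples, bins=50)
centers = 0.5 * (edges[:-1] + edges[1:])
left_edges = edges[:-1]

std_raw = samples.std()                     # from the underlying data
std_centers = hist_std(centers, counts)     # bins placed at their centers
std_edges = hist_std(left_edges, counts)    # bins placed at their left edges

print(std_raw, std_centers, std_edges)
```

If the bins were not uniform, the edge and center results would no longer agree exactly.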

All the best.

- Jesse
Attachments
Rg_distr_001_1.txt
(2.38 KiB) Downloaded 57 times
Size_distr_001_1.txt
(2.38 KiB) Downloaded 50 times
Volume_distr_001_1.txt
(2.4 KiB) Downloaded 51 times
eom_result.png
eom_result.png (253.66 KiB) Viewed 922 times
NN8_alt.dat
(66.35 KiB) Downloaded 42 times

jbhopkins
Active member
Posts: 17
Joined: 2014.10.22 21:21
Location: BioCAT, Advanced Photon Source

Re: Bug in EOM Rsigma calculation?

#2 Post by jbhopkins » 2020.05.18 21:34

Turns out I can't attach more than 5 files to a post. Here's the rest of the EOM files. The python script is the 'eom_std_check.txt', which I can't attach as .py.
Attachments
eom_std_check.txt
(1.52 KiB) Downloaded 71 times
profiles_001_1.fit
(72.19 KiB) Downloaded 50 times
logFile_001_1.log
(2.91 KiB) Downloaded 47 times
CaCa_distr_001_1.txt
(2.38 KiB) Downloaded 46 times

jbhopkins
Active member
Posts: 17
Joined: 2014.10.22 21:21
Location: BioCAT, Advanced Photon Source

Re: Bug in EOM Rsigma calculation?

#3 Post by jbhopkins » 2020.10.12 18:27

I just wanted to ping this question. Does anyone have any thoughts? Or can anyone validate my calculation of Rsigma? Even if you haven't had time to track down the bug in EOM, it would be great to know if my calculation of Rsigma from the data is accurate, so I can use that in publications.

AL
Administrator
Posts: 727
Joined: 2007.08.03 18:55
Location: EMBL Hamburg, Germany

Re: Bug in EOM Rsigma calculation?

#4 Post by AL » 2020.10.13 09:32

Thank you for reporting this, we are looking into it.

franke
Administrator
Posts: 410
Joined: 2007.08.10 11:09

Re: Bug in EOM Rsigma calculation?

#5 Post by franke » 2020.10.14 13:52

Hi Jesse.

Apologies for the late reply, I wasn't aware of this report. To reproduce, if possible, could you please send me your corresponding ranch pool? Namely:

Code: Select all

 Intensities file name .................................. : juneom.eom
 RANCH log file name .................................... : Rancheom.log
 SIZE_list file name .................................... : Size_listeom.txt
And anything else I'd need to re-run gajoe with the same inputs you had. That would simplify things greatly.

Thanks!

Daniel

jbhopkins
Active member
Posts: 17
Joined: 2014.10.22 21:21
Location: BioCAT, Advanced Photon Source

Re: Bug in EOM Rsigma calculation?

#6 Post by jbhopkins » 2020.10.14 17:23

Hi Daniel,

Thanks for looking into it. The juneom.eom file seemed to be too large to attach here, so I sent you an email.

Jesse

franke
Administrator
Posts: 410
Joined: 2007.08.10 11:09
Contact:

Re: Bug in EOM Rsigma calculation?

#7 Post by franke » 2020.10.21 15:19

To provide an update: the issue has been identified (memory corruption) and has been fixed. The fix is available with ATSAS-3.0.3. Many thanks for the report!
