Hello,
I am having trouble understanding how SHANUM works. As an example, I ran SHANUM on the Importin alpha/beta complex (SASBDB ID: SASDAC5) with Dmax = 11 nm. The results for this example can be found in the Konarev and Svergun paper (http://journals.iucr.org/m/issues/2015/ ... index.html).
If I run SHANUM ($ shanum -d SASDAC5.dat 19), I obtain:
Datafile=SASDAC5.dat
Dmax= 19.000000000000000
Smax= 6.2370099999999997
Nsh= 37.720736921316117
Nopt= 8
Sopt= 1.3227758541430708
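As a quick consistency check on the output above: the printed Nsh and Sopt agree with the usual Shannon relations Nsh = Dmax*Smax/pi and s_M = M*pi/Dmax. The short Python sketch below reproduces both numbers from the printed Dmax, Smax and Nopt. The function names are illustrative only and are not part of SHANUM.

```python
import math

def shannon_channels(dmax, smax):
    """Number of Shannon channels covered by data up to smax: Dmax*Smax/pi."""
    return dmax * smax / math.pi

def channel_position(m, dmax):
    """Position s_M = M*pi/Dmax of the M-th Shannon channel."""
    return m * math.pi / dmax

# Values taken from the SHANUM output above (Dmax in nm, Smax in nm^-1).
nsh = shannon_channels(19.0, 6.23701)
sopt = channel_position(8, 19.0)

print(f"Nsh  = {nsh:.4f}")   # Nsh  = 37.7207
print(f"Sopt = {sopt:.4f}")  # Sopt = 1.3228
```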
My questions:
i) Why is the Nopt value suggested by SHANUM 8, even though the minimum of the f(M) function is at M = 11? (I have the same problem when I work with other noisy data.)
ii) Is the SASDAC5 data set the same one used in the Konarev & Svergun paper? How can I reproduce the f(M) function in Fig. 4a (Konarev & Svergun, 2015)?
iii) The DPR_vs_Nshannon.dat file has three columns. What do they mean? (dp/dr, (dp/dr)², alpha*Omega(pM) ...?)
Thanks!
A.
SHANUM interpretation
Re: SHANUM interpretation
Hi Albert,
Thank you for your inquiry; the answers are below.
>> i) Why is the Nopt value suggested by SHANUM 8, even though the minimum of the f(M) function is at M = 11? (I have the same problem when I work with other noisy data.)
It is correct to check the minimum of f(M); in the majority of cases it gives a good estimate of the useful angular range.
However, in some cases (especially for noisy data) f(M) has a wide, flat minimum, and the angular range where the fit quality of the Shannon approximation improves significantly actually corresponds to lower M values.
That is why, in the current implementation of SHANUM, if f(M) has a wide plateau, Nopt is estimated in a more 'conservative' way, by selecting the M value after which no significant improvement of Chi^2 (or of the correlation map value) occurs.
It is also good practice to compare the results for the data with and without error estimates. For this particular case, the correlation test yields M=7 (which corresponds to an angular range up to 1.3 nm^-1, bearing in mind that the data already become noisy at 1.0 nm^-1).
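The 'conservative' choice described above can be illustrated with a small Python sketch. This is not SHANUM's actual code: the threshold `rel_tol`, the function name and the Chi^2 values are all invented for illustration. It only shows how a plateau can lead to an Nopt below the formal minimum.

```python
def conservative_nopt(chi2_by_m, rel_tol=0.05):
    """Smallest M after which Chi^2 improves by no more than rel_tol.

    chi2_by_m maps the number of Shannon channels M to the Chi^2 of the fit.
    rel_tol is a hypothetical threshold, not a SHANUM parameter.
    """
    best = min(chi2_by_m.values())
    for m in sorted(chi2_by_m):
        # Accept the first M whose Chi^2 is already within rel_tol of the best value.
        if chi2_by_m[m] <= best * (1.0 + rel_tol):
            return m
    return max(chi2_by_m)

# Invented values with a wide plateau from M = 8 onwards.
chi2 = {5: 3.10, 6: 2.20, 7: 1.45, 8: 1.04, 9: 1.03, 10: 1.02, 11: 1.00, 12: 1.01}

formal_minimum = min(chi2, key=chi2.get)
nopt = conservative_nopt(chi2)
print(formal_minimum, nopt)  # 11 8
```

In this made-up example the formal minimum sits at M = 11, but the fit stops improving noticeably at M = 8, so the conservative rule returns 8.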
>> ii) Is the SASDAC5 data set the same one used in the Konarev & Svergun paper? How can I reproduce the f(M) function in Fig. 4a (Konarev & Svergun, 2015)?
It can be reproduced using the automated search for Dmax (which yields 17.2 nm); in this case f(M) has its minimum at M=8.
>> iii) The DPR_vs_Nshannon.dat file has three columns. What do they mean? (dp/dr, (dp/dr)², alpha*Omega(pM) ...?)
The first column is the number of Shannon channels used for the data approximation.
The second column is Omega(pM) (the integral first derivative of p(r)).
The third column is the integral second derivative of p(r), calculated in a similar way to Omega(pM)
(this last column does not influence any of SHANUM's estimates; it is stored for information only).
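To make the column meanings concrete, here is a minimal Python sketch that reads such a three-column table into named fields. It assumes the file is plain whitespace-separated text, which is not confirmed in this thread. The sample numbers are invented, and `parse_dpr` is a hypothetical helper.

```python
def parse_dpr(text):
    """Parse rows of (M, Omega(pM), integral second derivative of p(r))."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or short lines
        try:
            m = int(float(parts[0]))       # column 1: number of Shannon channels
            omega = float(parts[1])        # column 2: Omega(pM)
            second_deriv = float(parts[2]) # column 3: informational only
        except ValueError:
            continue  # skip header or comment lines, if any
        rows.append((m, omega, second_deriv))
    return rows

# Invented sample content, for illustration only.
sample = """\
2 0.5 1.25
3 0.75 2.5
4 1.5 6.0
"""
rows = parse_dpr(sample)
for m, omega, second_deriv in rows:
    print(m, omega, second_deriv)
```

For a real file, one would pass the file contents instead, e.g. `parse_dpr(open("DPR_vs_Nshannon.dat").read())`.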
Re: SHANUM interpretation
Thanks for your reply!
Please let me raise another question.
I have obtained the following Shannon results on several occasions:
According to the Chi²(M), Omega(pM) and f(M) functions, the optimal number of Shannon channels is around 12. However, if I plot the fits as a function of M over the experimental data (without errors), I see that the best fit is obtained with 5 Shannon channels (which corresponds to Smax = 2.4). This agrees with the fact that I cannot obtain good p(r) functions with GNOM beyond S = 2.9.
How can I explain this?
Thanks!
Re: SHANUM interpretation
It is difficult to see the low-angle part of the data clearly, but one can still tell that Fit_Shannon_7.dat does not fit this part,
and Fit_Shannon5 and Fit_Shannon6 very probably also deviate systematically from the data in this region.
Besides, Chi_Rfac_vs_Nshannon.dat indicates that the best fit quality corresponds to M=10-12.
It looks as if the buffer was undersubtracted (or there is some sample/buffer mismatch). One can try to force
Porod asymptotics at higher angles by subtracting a constant from the data; this may improve the p(r) fit at higher angles.
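One common way to estimate such a constant, sketched here in Python, is to assume that at high angles I(s) ≈ K/s^4 + c, so that I(s)*s^4 plotted against s^4 is a straight line with slope c. This is only an illustration on synthetic data, not necessarily the procedure meant above. The cutoff `s_min` and the function name are assumptions.

```python
def porod_constant(s, intensity, s_min):
    """Fit I(s)*s^4 = K + c*s^4 for s >= s_min; return (c, K)."""
    pts = [(si**4, ii * si**4) for si, ii in zip(s, intensity) if si >= s_min]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points above s_min")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    c = sxy / sxx            # slope: the constant background term
    k = mean_y - c * mean_x  # intercept: the Porod term K
    return c, k

# Synthetic high-angle data: K = 5.0, constant background c = 0.02.
s = [2.0 + 0.1 * i for i in range(30)]
intensity = [5.0 / si**4 + 0.02 for si in s]

c, k = porod_constant(s, intensity, s_min=2.5)
corrected = [ii - c for ii in intensity]  # data with the constant subtracted
print(f"c = {c:.4f}, K = {k:.4f}")
```

With real data the result depends on the chosen high-angle range, so it is worth checking how stable c is when `s_min` is varied.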
Re: SHANUM interpretation
Hello konarev,
This discussion is very useful for me, thanks a lot!
Is a good fit with a slightly oscillating pair distribution function acceptable?
Dmax= 11.000000000000000
Smax= 6.0127709999999999
Nsh= 21.053168979251168
Nopt= 21
Thanks,
A.
Re: SHANUM interpretation
It looks reasonable, taking into account that the Omega(M) function has similar values for M between 10 and 23.