Steve Schulze
Iceland
2 Posts
Posted - 01/31/2010 : 5:17:18 PM
Hi!
I have a little question about the error estimation in Origin. In the case of multiple data sets, not all parameters may be independent. One can tie them together either via shared parameters (NLSF->Action->Dataset->shared parameters) or via linear constraints (NLSF->Option->Constraints, Toolbar). When the data sets are fitted, how are the errors estimated in each method?
Example: I fitted two data sets with a smoothly broken power law which is defined in the following way:
F = A * ( (t/tb)^(alpha1 * n) + (t/tb)^(alpha2 * n))^(-1/n)
A is the normalization, t is the time, tb the break time, alpha1 is the decay slope before the break, alpha2 is the decay slope after the break and n the smoothness of the break.
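For clarity, here is the model as a small Python function (just a sketch of the formula above; the function name and the NumPy usage are my own, not Origin code):

```python
import numpy as np

def smoothly_broken_pl(t, A, tb, alpha1, alpha2, n=10.0):
    """Smoothly broken power law:
    F = A * ((t/tb)**(alpha1*n) + (t/tb)**(alpha2*n))**(-1/n)
    For t << tb the decay slope is -alpha1, for t >> tb it is -alpha2,
    and n controls the sharpness of the break.
    """
    x = t / tb
    return A * (x ** (alpha1 * n) + x ** (alpha2 * n)) ** (-1.0 / n)

# At t = tb both terms equal 1, so F(tb) = A * 2**(-1/n)
```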
Both data sets differ only in their normalizations. First, I fitted both data sets with the following linear constraints: alpha1 = alpha1_2, alpha2 = alpha2_2, tb = tb_2. The parameter n was fixed to 10 in both cases.
I obtained the following results:
alpha1   = 1.07 +/- 0.04    alpha2   = 2.40 +/- 0.05    tb   = 112600 +/- 2900
alpha1_2 = 1.07 +/- 0.13    alpha2_2 = 2.40 +/- 0.10    tb_2 = 112600 +/- 10300
red. chi^2 = 1.08
Fitting the same data with the option "shared parameters" switched on, I obtained these results:
alpha1 = 1.07 +/- 0.04 alpha2 = 2.40 +/- 0.04 tb = 112600 +/- 2800
red. chi^2 = 1.02
The error estimates differ. What does Origin do internally that makes the estimates so different? Is there an additional weighting by the number of data points when the 'shared parameters' option is used?
Cheers,
Steve
=========== Origin Pro 7.5G SR6, Windows XP SR3 running in VirtualBox on MacOS 10.5.8
larry_lan
China
Posted - 02/01/2010 : 10:10:57 AM
Hi Steve:
Here is the basic calculation behind the fitting parameter errors and related values. Regarding the errors, two quantities matter: the partial derivatives matrix and the residual sum of squares (RSS).
First, if you want to know more detail about the parameter errors, you should pay close attention to the partial derivatives matrix, F. When fitting multiple datasets, especially in a global fit, we combine all the data into one partial derivatives matrix and enter 0 for the parameters that are not shared. Before looking at the following example, I suppose you have read the page linked above. Now, say we want to fit the function y = a + b*x, and we have the following data:
x   y1   y2
1   2    3
2   4    5
3   5    6
4   8    8
5   9    9
The partial derivatives with respect to each parameter are dy/da = 1 and dy/db = x.
So if we fit only the first xy dataset, that's a regular fit, and the partial derivatives matrix is:
1 1
1 2
1 3
1 4
1 5
However, when doing a global fit, we need to combine the two datasets. So, in a global fit with no parameter shared, the partial derivatives matrix is (let's call it M1; columns a1, b1, a2, b2):
1 1 0 0
1 2 0 0
1 3 0 0
1 4 0 0
1 5 0 0
0 0 1 1
0 0 1 2
0 0 1 3
0 0 1 4
0 0 1 5
Then calculate the covariance matrix by the formula from our help document:

C = s^2 * (F^T F)^(-1)

The covariance matrix is what the parameter errors are based on: each parameter error is the square root of the corresponding diagonal element of C.
However, if you share one parameter, say a, then the partial derivatives matrix becomes (M2; columns a, b1, b2):
1 1 0
1 2 0
1 3 0
1 4 0
1 5 0
1 0 1
1 0 2
1 0 3
1 0 4
1 0 5
And if you share b instead, the matrix becomes (M3; columns a1, b, a2):
1 1 0
1 2 0
1 3 0
1 4 0
1 5 0
0 1 1
0 2 1
0 3 1
0 4 1
0 5 1
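As a quick sketch (my own, not Origin's internals), the three matrices M1, M2, and M3 can be built in NumPy and their shapes checked:

```python
import numpy as np

x = np.arange(1.0, 6.0)            # x = 1..5, the same for both datasets
ones, zeros = np.ones(5), np.zeros(5)

# M1: global fit, nothing shared; columns (a1, b1, a2, b2)
M1 = np.vstack([np.column_stack([ones, x, zeros, zeros]),
                np.column_stack([zeros, zeros, ones, x])])

# M2: intercept a shared; columns (a, b1, b2)
M2 = np.vstack([np.column_stack([ones, x, zeros]),
                np.column_stack([ones, zeros, x])])

# M3: slope b shared; columns (a1, b, a2)
M3 = np.vstack([np.column_stack([ones, x, zeros]),
                np.column_stack([zeros, x, ones])])

print(M1.shape, M2.shape, M3.shape)   # (10, 4) (10, 3) (10, 3)
```

Sharing a parameter merges two columns into one, which is why the covariance matrix shrinks from 4x4 to 3x3.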
Now, let's come back to your question: why is the error different between the global fit and the constrained fit? That's because of another key value, the residual sum of squares. Note that the first formula contains an s^2, which is called the mean residual variance. It is calculated by:

s^2 = RSS / (n - p)
where RSS is the residual sum of squares, n is the number of data points, and p is the number of parameters. When you use linear constraints and do not share any parameters, the covariance matrix is constructed like M1, but the RSS differs from that of the global fit because the fitted parameter values differ. In other words, both methods generate a 4x4 covariance matrix, but since the RSS is different, the values in the covariance matrix are different. However, if you do a global fit and share one parameter, you get a 3x3 covariance matrix (as with M2 or M3).
You can verify these formulas yourself. It may be somewhat difficult to do the matrix calculations in Origin; doing it manually needs some extra coding. In our help document you can find more information about the L-M iterations and the formulas for everything during fitting. The help document is for Origin 8, but the algorithm is the same.
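For example, the shared-slope case (M3) with the toy data above can be checked in a few lines of NumPy (a sketch of the formulas, not Origin's code):

```python
import numpy as np

x = np.arange(1.0, 6.0)
y = np.concatenate([[2., 4., 5., 8., 9.],     # first dataset
                    [3., 5., 6., 8., 9.]])    # second dataset

# Design matrix M3: slope b shared; columns (a1, b, a2)
ones, zeros = np.ones(5), np.zeros(5)
F = np.vstack([np.column_stack([ones, x, zeros]),
               np.column_stack([zeros, x, ones])])

beta, rss, *_ = np.linalg.lstsq(F, y, rcond=None)
n_obs, n_par = F.shape                      # 10 observations, 3 parameters

s2 = rss[0] / (n_obs - n_par)               # mean residual variance s^2 = RSS/(n - p)
cov = s2 * np.linalg.inv(F.T @ F)           # covariance matrix C = s^2 (F^T F)^(-1)
errors = np.sqrt(np.diag(cov))              # parameter standard errors

print("a1, b, a2 =", beta)                  # prints a1, b, a2 = [0.65 1.65 1.25]
print("errors    =", errors)
```

Refitting with a different design matrix (M1 with constraints, or M2) changes both the fitted values and the RSS, and therefore the errors, which is the effect discussed above.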
Thanks,
Larry
OriginLab Technical Services
Edited by - larry_lan on 02/01/2010 10:18:31 AM
Steve Schulze
Iceland
2 Posts
Posted - 02/02/2010 : 08:25:58 AM
Thanks for your help, larry_lan. It took me a while to understand, but things are getting clearer now.
These things are still very new to me. I looked for how the RSS is formed in the constrained fit and in the global fit on the pages about the theory of nonlinear fitting (NLSF), but I didn't find any hints. Could you give me some hints on how the RSS would differ in the two cases?
I also have another question. Based on your experience, is it better to use a constrained fit instead of a global one? I would think that the constrained fit is better if one needs to know the uncertainty of the shared fitting parameter for each data set. On the other hand, the global fit is restricted to the case where the parameters are identical, which is quite a limitation.
Cheers,
Steve
larry_lan
China
Posted - 02/02/2010 : 09:47:35 AM
Hi Steve:
Yes, the nonlinear fitting formulas involve a lot of matrix calculation, so they are somewhat difficult to follow. The RSS is different because the fitted values are different, too. Please see equation 3 on the page mentioned above; that equation also involves the partial derivative values. We did not expand it in the help document since it is too complicated. If you are interested in the detailed mathematics of nonlinear regression, please read these books.
In short, the fitted values from 'global fit with shared parameters' and from 'linear constraints between datasets' are different. In your case, I think your fitting model may be over-parameterized, which is why you saw the same fitted values. Can you check the Dependency values to see whether they are close to 1?
How to choose between the two methods depends on how you treat your data. A global fit implies that there may be interactions between the datasets, while the constrained fit treats the datasets independently but restricts the parameter values.
Thanks,
Larry
Edited by - larry_lan on 02/02/2010 09:59:54 AM