Attention! The same problem described earlier this year with the
non-linear curve fitter seems to apply to the linear fitting routine used in
Origin 5.0. If no weighting is used, Origin estimates the standard deviation
(i.e. the measurement error) of the yi's from the scatter of the data:

    sd^2 = 1/(n-2) * Sum(yi - a - b*xi)^2,

which is nothing but the reduced Chi^2 you obtain when setting all individual
measurement errors si to 1. The errors for the parameters a and b (intercept
and slope) are then estimated by taking the analytical formulas for the error
estimates of a and b with different weights, setting all si = 1, and
multiplying the result by the estimated standard deviation sd
(= sqrt(reduced Chi^2)). This procedure, which is also described in
Press et al., "Numerical Recipes", p. 526, is legitimate if you have no prior
knowledge of your measurement errors, although it does have the disadvantage
that you get no independent statistical measure of your goodness-of-fit,
because it ASSUMES you have a good fit. If you do have your own (perhaps more
accurate) estimate of the standard deviation, multiply the errors for a and b
by the factor ("own" sd)/("Origin" sd).
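To make the procedure concrete, here is a minimal sketch (my own code, not
Origin's; the function name is made up) of an unweighted linear fit that
estimates sd^2 from the residuals and scales the analytical error formulas
by it, as described above:

```python
import numpy as np

def unweighted_fit(x, y):
    """Linear fit y = a + b*x; parameter errors estimated from the
    scatter of the data, i.e. with no prior knowledge of the si."""
    n = len(x)
    Sx, Sy = x.sum(), y.sum()
    Sxx, Sxy = (x * x).sum(), (x * y).sum()
    D = n * Sxx - Sx ** 2
    b = (n * Sxy - Sx * Sy) / D
    a = (Sy - b * Sx) / n
    # estimate sd^2 from the residuals (the reduced Chi^2 with all si = 1)
    sd2 = ((y - a - b * x) ** 2).sum() / (n - 2)
    # analytical error formulas with si = 1, scaled by the estimated sd
    da = np.sqrt(Sxx / D * sd2)
    db = np.sqrt(n / D * sd2)
    return a, b, da, db
```

Note that for data lying exactly on a line, sd^2 (and hence da, db) is zero,
which illustrates the point above: the scatter of the data is the only source
of error information here.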
The problematic part is the weighted linear fit, i.e. when the individual
measurement errors si are NOT equal. Origin again uses the analytical
formulas for the error estimates of a and b with different weights (which
are correct), but then STILL multiplies these results, which already are
the CORRECT estimates, by the (weighted) standard deviation sd
(= sqrt(reduced Chi^2)). So the program uses the same procedure for both
cases (linear fit with equal or unequal weights), but on statistical
grounds this procedure should ONLY (if at all) be used for the
"equal weight" fit!
This can have two effects. Either your data points happen to lie very close
to the regression line (small deviations, most probable if you have only a
few (4 or 5) data points), in which case the reduced Chi^2 is smaller than 1
and your error estimates for a and b are too small; or your data displays
unusually large scatter (large deviations), in which case the reduced Chi^2
is greater than 1 and your error estimates are too large. There is an easy
way to correct for this: divide the error estimates for a and b by the
standard deviation sd!
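The weighted case and its correction can be sketched as follows (again my
own code, with made-up names): the analytical formulas below already give
the correct errors, and dividing an Origin-style error by sqrt(reduced
Chi^2) recovers them.

```python
import numpy as np

def weighted_fit(x, y, s):
    """Weighted linear fit y = a + b*x with known measurement errors s."""
    w = 1.0 / s ** 2
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    D = S * Sxx - Sx ** 2
    b = (S * Sxy - Sx * Sy) / D
    a = (Sy - b * Sx) / S
    # these are already the CORRECT error estimates; no further scaling
    da, db = np.sqrt(Sxx / D), np.sqrt(S / D)
    chi2_red = ((y - a - b * x) ** 2 * w).sum() / (len(x) - 2)
    return a, b, da, db, chi2_red

# An Origin-style error would be da * sqrt(chi2_red); the easy correction
# is simply:  da_correct = da_origin / sqrt(chi2_red)
```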
We have the same situation as with the non-linear curve fitter (see previous
posts on this subject). There, the error estimates (the square roots of the
diagonal elements of the variance matrix) are also multiplied by the sqrt of
the reduced Chi^2 when using Origin 4.0, 4.1, or the "use chi^2 for errors"
option in Origin 5.0! The (statistically) correct estimates are obtained in
Origin 5.0 by disabling the "use chi^2 for errors" button. In turn, the same
easy correction as described above can be applied when using the earlier
versions (3.7 and higher): take the errors and divide them by the sqrt of the
reduced Chi^2 (or let the program calculate the variance-covariance matrix,
an option available since version 4.0; the square roots of the diagonal
elements are the correct error estimates). Finished.
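As an aside of mine (not part of Origin): the same two conventions exist in
SciPy's curve_fit, where absolute_sigma=True takes the measurement errors at
face value (the statistically correct estimates) and absolute_sigma=False
scales the covariance by the reduced Chi^2, analogous to the "use chi^2 for
errors" option. A quick sketch with made-up example data:

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a + b * x

x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.9])
s = np.full_like(x, 0.2)          # assumed known measurement errors si

# errors taken at face value -> correct covariance
p1, cov_abs = curve_fit(line, x, y, sigma=s, absolute_sigma=True)
# covariance scaled by the reduced Chi^2 -> Origin-style behaviour
p2, cov_rel = curve_fit(line, x, y, sigma=s, absolute_sigma=False)

chi2_red = np.sum(((y - line(x, *p1)) / s) ** 2) / (len(x) - 2)
# cov_rel equals cov_abs * chi2_red, so dividing the errors by
# sqrt(chi2_red) recovers the unscaled (correct) estimates
```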
Now there are cases, probably encountered quite often by the user, where the
linear relation (or, generally speaking, the assumed model function) between
the dependent y and independent x values is the DOMINATING but not the ONLY
relation observed. Other unknown and undeterminable effects result in
systematic scatter of the data points. From the statistical point of view,
if the Chi^2 test tells you that the observed data set is very unlikely to
occur under the assumed model relationship, you would have to discard the
parameters and error estimates. But you yourself know that the (linear)
function is only an approximation, and you still want a coarse estimate of
the parameters, taking a larger error into account. In that case the
(statistically) correct error estimates are too small. It could then be
intuitively justified to enlarge your errors, the magnitude being determined
by how far the measured data deviate from the fitted function, i.e. by
multiplying your errors by sqrt(reduced Chi^2). But this is a delicate
matter: you have to be sure (from other knowledge) that your assumptions are
correct and that the perturbing effects are still small compared to the
dominating (linear) relation (and that is not established by the fitted
function and the data "looking good"; this is very dangerous, see also
Press et al. for further discussion). Otherwise, as stated before, the
parameters and errors are meaningless!
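The Chi^2 test mentioned above can be sketched in a few lines (my own
helper, with a made-up name): compute the probability Q of obtaining a
Chi^2 at least this large by chance, given that the model is correct.

```python
from scipy.stats import chi2

def goodness_of_fit(chi_sq, n_points, n_params):
    """Probability Q of a Chi^2 this large or larger under the model."""
    dof = n_points - n_params
    return chi2.sf(chi_sq, dof)   # small Q => fit is unlikely to be good

# e.g. Chi^2 = 25 with 12 points and 2 parameters (10 degrees of freedom)
# gives Q well below 0.05, so the model would be rejected
```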
So both ways of estimating errors (taking the variances directly or
multiplying them by the reduced Chi^2) can be justified, but it should be
made clear to the user when to use which. Additionally, the option to use
either estimate should also be implemented for the linear fitting tool in
Origin 5.0! Hopefully, if Origin starts distinguishing between the two
cases, it could make the scientific world more sensitive to the evaluation
of fitting results and could also make the results published in scientific
journals more comparable and honest.
I am very curious whether other users share my opinion or have a different
view on this subject. So please, write!
Clemens Woda