The Origin Forum - Using stat.lr for a weighted linear fit

basp

3 Posts

Posted - 06/27/2013 : 4:22:50 PM

I'm trying to use stat.lr() to do a least-squares fit of my data. The problem, it seems, is that stat.lr() assumes the x-data is {1,2,3...}.

What am I missing? How do I define the x dataset?

Origin Ver. and Service Release: 8.0.63.988 SR6
Operating System: Win7

basp

3 Posts

Posted - 06/28/2013 : 10:28:23 AM

It occurred to me that I might be using the wrong tool for the job, and I should maybe explain the issue a bit better:

I want to perform a weighted least-squares fit to a straight line. Simple enough, I thought. My data is held as three loose datasets: xdata, ydata, yerror.

My script repeats this for 30,000 such x, y, yerror groups and records the fit parameters. Prior experience has found that nlsf works but is impractically slow.

The lr command works much faster but does not allow for weighting, so that's out. The same goes for the fitLR or fitpoly x-functions.

That leaves me with stat.lr. Weighting is possible (stat.errBarData$). However, the stat.fitxData$ property only defines the x dataset for output...akin to nlsf.funcx$. There doesn't seem to be an equivalent to nlsf.x$.

Again, am I missing something? Is there no way to script a linear least-squares fit with weighting?

Penn

China
644 Posts

Posted - 07/01/2013 : 02:07:32 AM

Hi,

The speed issue is improved in Origin 9 (you can download a demo from this page). With the nlbegin X-Function (see the nonlinear fitting documentation), the following script runs in less than 1 second on my computer, compared with about 300 seconds in version 8.


nlbegin (dsx, dsy, dserr) line tt weight:=ins;
nlfit;
nlend 0;

Also, in Origin 9, you could try Origin C. See the fitlinear function or ocmath_linear_fit for more details. Here is a simple example. It takes the input data by reference, so no data is copied when the parameters are passed, which improves speed.


void myfitlr(vector& vx, vector& vy, vector& verr, double& a, double& b)
{
	int nsize = vx.GetSize();
	FitParameter sFitParameter[2];  // [0] = intercept, [1] = slope
	// verr is passed as the weight array (direct weighting)
	int nret = ocmath_linear_fit(vx, vy, nsize, sFitParameter, verr, nsize);
	
	if(nret != STATS_NO_ERROR)
		return;  // a and b are left unchanged on failure
	
	a = sFitParameter[0].Value;  // intercept
	b = sFitParameter[1].Value;  // slope
}

Before using this function in LabTalk, open Code Builder (View: Code Builder), create a new C file, paste the code into it, and compile it. The script below shows how to call it.


double a, b;  // for output of myfitlr, intercept and slope
myfitlr(dsx, dsy, dserr, a, b);  // dsx, dsy, and dserr are X, Y and Error data respectively, a and b are output values
a = ;  // intercept
b = ;  // slope

With the same data, the Origin C code is much faster than the nlbegin script above.

Please note that the Origin C code above uses the direct weight method. If you want the Instrumental weight method, calculate 1/(Error^2) and pass that as the weight instead.
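
The difference between the two weighting methods can be checked outside Origin. The following is a minimal Python/NumPy sketch, not Origin code; the helper name fit_line_with_weights and the sample numbers are made up for illustration. It assumes "direct" means the error column is used as the weight as-is, and "instrumental" means the weight is 1/(Error^2).

```python
import numpy as np

def fit_line_with_weights(x, y, weights):
    """Minimize sum(weights * (y - a - b*x)**2) and return (a, b).

    a is the intercept and b is the slope.
    """
    x, y, weights = (np.asarray(v, dtype=float) for v in (x, y, weights))
    root_w = np.sqrt(weights)
    # Scale each row of the design matrix and of y by sqrt(weight),
    # then solve the ordinary least-squares problem.
    design = np.column_stack([np.ones_like(x), x]) * root_w[:, None]
    (a, b), *_ = np.linalg.lstsq(design, y * root_w, rcond=None)
    return a, b

# Illustrative data only (not taken from the thread)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
err = np.array([0.1, 0.2, 0.1, 0.3, 0.2])

# Direct weighting: the error column itself is the weight
a_direct, b_direct = fit_line_with_weights(x, y, err)

# Instrumental weighting: weight = 1 / error^2
a_instr, b_instr = fit_line_with_weights(x, y, 1.0 / err**2)

print("direct:       intercept", a_direct, "slope", b_direct)
print("instrumental: intercept", a_instr, "slope", b_instr)
```

With direct weighting, points with larger values in the error column pull the line harder, which is usually the opposite of what is wanted when that column holds measurement uncertainties.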

Penn

Edited by - Penn on 07/01/2013 02:37:12 AM

basp

3 Posts

Posted - 07/01/2013 : 08:44:41 AM

Hi Penn,

Thank you for your help.

Over the weekend, I dusted off the data analysis book on my shelf, and I used it as the basis for a few lines of code:

//Datasets from the active sheet columns
dataset x, y, ye;
x=col(1);
y=col(2);
ye=col(3);

//Length of dataset for summation loop
sz=x.getsize();

//Initialize summation values
sum1=0;
sum2=0;
sum3=0;
sum4=0;
sum5=0;

//Summations
loop(j,1,sz)
{
sum1=sum1+(1/ye[j]^2);
sum2=sum2+(x[j]/ye[j]^2);
sum3=sum3+(x[j]^2/ye[j]^2);
sum4=sum4+(y[j]/ye[j]^2);
sum5=sum5+(x[j]*y[j]/ye[j]^2);
};

delt = sum1*sum3-sum2^2;
a=(sum3*sum4-sum2*sum5)/delt; //y-intercept
b=(sum1*sum5-sum2*sum4)/delt; //slope
ae=sqrt(sum3/delt); //y-intercept error
be=sqrt(sum1/delt); //slope error

Easy! However, the uncertainties I calculate (literally, by the book) are not what Origin's linear (or non-linear) fitting yields.
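
As a cross-check outside Origin, here is a minimal Python/NumPy sketch of the same summation formulas, for the textbook model y = intercept + slope*x. The function name weighted_line_fit and the sample data are invented for illustration. One possible reason for a mismatch in the uncertainties (an assumption, not something confirmed in this thread) is that fitting tools often multiply the parameter errors by the square root of the reduced chi-square, so the sketch returns both the unscaled and the scaled errors.

```python
import numpy as np

def weighted_line_fit(x, y, ye):
    """Weighted least-squares fit of y = intercept + slope * x.

    ye holds the 1-sigma uncertainty of each y value.
    """
    x, y, ye = (np.asarray(v, dtype=float) for v in (x, y, ye))
    w = 1.0 / ye**2                 # weight of each point
    s1 = w.sum()                    # sum1 in the script above
    sx = (w * x).sum()              # sum2
    sxx = (w * x * x).sum()         # sum3
    sy = (w * y).sum()              # sum4
    sxy = (w * x * y).sum()         # sum5
    delta = s1 * sxx - sx**2

    intercept = (sxx * sy - sx * sxy) / delta
    slope = (s1 * sxy - sx * sy) / delta
    intercept_err = np.sqrt(sxx / delta)   # assumes ye are true sigmas
    slope_err = np.sqrt(s1 / delta)

    # Reduced chi-square: weighted residual sum over degrees of freedom
    resid = y - (intercept + slope * x)
    red_chi2 = (w * resid**2).sum() / (x.size - 2)
    scale = np.sqrt(red_chi2)

    return {
        "intercept": intercept,
        "slope": slope,
        "intercept_err": intercept_err,
        "slope_err": slope_err,
        "red_chi2": red_chi2,
        "intercept_err_scaled": intercept_err * scale,
        "slope_err_scaled": slope_err * scale,
    }

# Illustrative data only (not taken from the thread)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.2, 4.9, 7.3, 8.8, 11.1, 12.7])
ye = np.array([0.2, 0.1, 0.3, 0.2, 0.1, 0.3])

fit = weighted_line_fit(x, y, ye)
for name, value in fit.items():
    print(name, value)
```

If the scaled errors match what a fitting tool reports and the unscaled ones do not, the difference is most likely this scaling convention rather than an error in the formulas.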

I'm glad Origin 9 is faster, but I won't be upgrading. (I'll spare you the rant behind this statement.)

greg

USA
1379 Posts

Posted - 07/01/2013 : 11:23:53 AM

You do not need to specify the X dataset when using the STAT tool to do Linear Regression with weighting. However, the Plot Designation of the X column must be set to X, and that column must be to the left of the specified Y dataset, so that X and Y become properly associated. The Plot Designation of the Error column does not have to be yErr for this to work, as this script demonstrates (Warning: the script is destructive, so save your project first):
doc -s;
doc -n;
col(1)=data(1,10);
col(2)=uniform(10);
wo -a 7;
col(3)=uniform(10);
col(4)=col(1)+uniform(10);
col(5)=col(2);
col(6)=col(3);
col(7)=col(4);
col(8)=col(2);
col(9)=col(3);
// X is just row numbers
stat.DATA$ = %(Data1,2);
stat.ERRBARDATA$ = %(Data1,3);
stat.LR();
ty \x5bLinear Fit of A-B-C\x5d;
ty Intercept $(stat.lr.a), Slope $(stat.lr.b);
// X data is different, but Plot Designation of 'D' is still Y
stat.DATA$ = %(Data1,5);
stat.ERRBARDATA$ = %(Data1,6);
stat.LR();
ty \x5bLinear Fit of D-E-F WITHOUT changing D Plot Designation\x5d;
ty Intercept $(stat.lr.a), Slope $(stat.lr.b);
// X data is different, and Plot Designation of 'G' is now X
wks.col7.type = 4; // Set Column 7 as X
stat.DATA$ = %(Data1,8);
stat.ERRBARDATA$ = %(Data1,9);
stat.LR();
ty \x5bLinear Fit of G-H-I WITH change of G Plot Designation\x5d;
ty Intercept $(stat.lr.a), Slope $(stat.lr.b);
