The Origin Forum
File Exchange
Try Origin for Free
The Origin Forum
Home | Profile | Register | Active Topics | Members | Search | FAQ | Send File to Tech support
Username:
Password:
Save Password
Forgot your Password? | Admin Options

 All Forums
 Origin Forum for Programming
 LabTalk Forum
 True correlation coefficient
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic

jonni

United Kingdom
58 Posts

Posted - 05/07/2003 :  07:08:46 AM  Show Profile  Edit Topic  Reply with Quote  View user's IP address  Delete Topic
Hello,

A question about correlation in labtalk. I have several files with 1 X column and 2 Y columns. I need to determine the correlation coefficient between the Y columns. I am trying to use the following script to automate this for all my files (a017s00X).

%a=001 002 003 004 005; //list of filename suffixes
for (ii=1;ii<=5;ii+=1)
{
work -b a017s%[%a,#ii]; //set a017s00X as active dataset
get %h_a -e lastrow; //get total number of rows for normalisation
stat.reset();
stat.ds.more=1;
stat.ds.data$=a017s%[%a,#ii]_b;
stat.ds();
b=stat.ds.mean; // mean of first Y column
bb=stat.ds.var; // variance of first Y column
stat.reset();
stat.ds.more=1;
stat.ds.data$=a017s%[%a,#ii]_c;
stat.ds();
c=stat.ds.mean; // mean of second Y column
cc=stat.ds.var; // variance of second Y column
bxc=corr(col(b),col(c),0)/lastrow ; //correlation with lag=0
xcor$(ii)=(bxc-b*c)/sqrt(bb*cc); //normalised correlation coefficient
};

If I use this script where my two Y columns are identical I should get a coefficient of 1 - in fact I get 0.967. It appears that the corr() function does something a bit strange (even in the example FFT correlation.OPJ) where the lag extends beyond the range of the data (for a data set with 200 rows it gives a lag range -256 to 255). The value in this case when the lag equals zero, when divided by the total number of rows should give the correlation but this is never the case.

Is the corr() function doing something more complicated than simply multiplying the two Y columns (one with a lag) and summing the result? If I do this process explicitly it gives me the same value suggesting rounding errors or similar.

The reasons for doing this in preference to finding a linear fit between the two Y columns and reading the 'R' coefficient is that my analysis may later require the lag offset and different normalisation; also it does not look possible to automate this other method in the same way.

Any help would be appreciated (additionally any simplification of the above script to automate a process for several files would be useful).

Thanks.

Edited by - jonni on 05/07/2003 07:24:31 AM
  Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
The Origin Forum © 2020 Originlab Corporation Go To Top Of Page
Snitz Forums 2000