The Origin Forum
File Exchange
Try Origin for Free
The Origin Forum
Home | Profile | Register | Active Topics | Members | Search | FAQ | Send File to Tech support
Username:
Password:
Save Password
Forgot your Password? | Admin Options

 All Forums
 Origin Forum for Programming
 LabTalk Forum
 automatic outlier removal in a worksheet
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic

coll@inia.

Spain
125 Posts

Posted - 04/29/2011 :  12:14:53 PM  Show Profile  Edit Topic  Reply with Quote  View user's IP address  Delete Topic
Origin Ver.8 professional
Operating System: WXP

I have a worksheet of values distributed in 2000 rows of 4 columns.

I need to make the mean and sd of each of the rows but I need to mask the outliers. I am doing it now manually.
Can I use an existing function in Origin?
Is there any script I can use?

I define an outlier as any value deviating more than 4 sd above or below from the preliminary mean.

Could anybody help?
thanks, julio

Penn

China
644 Posts

Posted - 05/02/2011 :  10:54:59 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi julio,

You can try the rowstats and wcellmask.

Penn
Go to Top of Page

coll@inia.

Spain
125 Posts

Posted - 05/03/2011 :  04:34:36 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Thank you for the help Penn.

Folowing your recomendations, I came out with this tentative script which is wrong somewhere(s) because it gives me #command error! in several of its versions:

'to select a range with the mouse
for(jj=wks.r1; jj<=wks.r2; jj++);
for(ii=wks.c1; ii<=wks.r2; ii++);
'to calculate mean and sd
rowstats irng:=col(c1):col(c2)[ii] mean:=mean1;
rowstats irng:=col(c1):col(c2)[ii] sd:=sd1;
'to define an outlier and mask it
{if(ii<mean1-sd1 or i>mean1+sd1) wcellmask irng:=col(jj)[ii]};

Could you help me?
Thanks I really would appreciate it!, julio
Go to Top of Page

Penn

China
644 Posts

Posted - 05/05/2011 :  04:14:30 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi julio,

Please refer to this example.

Penn
Go to Top of Page

coll@inia.

Spain
125 Posts

Posted - 05/05/2011 :  2:15:27 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Thanks Penn, I will try it!

I really appreciated it!

sincerely, julio
Go to Top of Page

coll@inia.

Spain
125 Posts

Posted - 05/06/2011 :  05:34:14 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
GREAT!!
with your help and a little bit of moving around , I came out with the following code which makes the trick:

nCols = wks.nCols;
nRows = wks.nRows;
type -c $(nCols);
type -c $(nRows);

//add 2 cols for mean and sd
wks.addCol(meanCtrl);
wks.addCol(sdCtrl);

//perform row stats for ctrls rows 6 to 10
rowstats -r 0 irng:=col(6)[1]:col($(nCols))[nRows] mean:=col(meanCtrl) sd:=col(sdCtrl);

//outliers
for(ii=1; ii<=nRows; ii++)
//{type -c "row: " $(ii);
for(jj=6; jj<=nCols; jj++)
//{type -c "column: " $(jj);
if(col($(jj))[ii]>col(meanCtrl)[ii]+col(sdCtrl)[ii])
{wcellmask irng:=col($(jj))[ii];};
if(col($(jj))[ii]<col(meanCtrl)[ii]-col(sdCtrl)[ii])
{wcellmask irng:=col($(jj))[ii];};
}}
//perform definitive row stats for ctrls
rowstats -r 0 irng:=col(6)[1]:col($(nCols))[nRows] mean:=col(meanCtrl) sd:=col(sdCtrl);

Thanks Penn, it really help me!, julio
Go to Top of Page

coll@inia.

Spain
125 Posts

Posted - 05/07/2011 :  07:45:43 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Upps!
I found out that some memory is preventing the recalculation of the new means and sd once the outliers are masked

I tried to delete the mean and sd columns and calculted again, but it seems that is taken the old non-masked values to recalculate!!

Any other idea to try?

Thanks, julio
Go to Top of Page

Penn

China
644 Posts

Posted - 05/08/2011 :  11:39:24 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi julio,

When you perform statistics on the row again after masking some data, the masked data will not be involved and the results will change. You can see what I get with your script below:

Before recalculate:


After recalculate:


You can see that the results change.

If you need to recalculate the results after masking data, you can set the Recalculate Mode to Auto, and then no need to call the rowstats X-Function manually. To set to Auto, please use -r 1. For example:

// this way has no recalculation
rowstats -r 0 irng:=col(6)[1]:col($(nCols))[nRows] mean:=col(meanCtrl) sd:=col(sdCtrl);
// this way sets the Recalculate Mode to Auto
rowstats -r 1 irng:=col(6)[1]:col($(nCols))[nRows] mean:=col(meanCtrl) sd:=col(sdCtrl);


And then, you don't need to execute this line any more:

//perform definitive row stats for ctrls
rowstats -r 0 irng:=col(6)[1]:col($(nCols))[nRows] mean:=col(meanCtrl) sd:=col(sdCtrl);

See the results below:


By the way, in your script, there is a little mistake, which is two left curly braces "{" are missing because of the two lines are commented out.

//{type -c "row: " $(ii);

//{type -c "column: " $(jj);


Penn

Edited by - Penn on 05/08/2011 11:44:40 PM
Go to Top of Page

coll@inia.

Spain
125 Posts

Posted - 05/09/2011 :  06:35:04 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
GREAT, the 0 to 1 "transition", works beautifully!!!. I was really wondering what for the 0 was!!.
The auto calculation is really the way to go to avoid messy results. THANKS!.

The lacking }s were first finished at the end of the script. However when I removed the type statement, I forgotten to remove also the final extra }}. Their presence were producing an error window. Thanks for confirm me the interpretation.

I think the outlier detection and calculation problem is solved. I am still impress by the way the script detects the outliers which I was doing manually, except that is a little slow.

I try to substitute the 2 If statements (> and <) by only one If(> || < ) statement, which I think should be faster. However I do not get it to work. I was not sucessful either to pass variable names into any of the If statements. Nevertheless it is working!

THANKS again Penn, julio


Go to Top of Page
  Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
The Origin Forum © 2020 Originlab Corporation Go To Top Of Page
Snitz Forums 2000