The Origin Forum
File Exchange
Try Origin for Free
The Origin Forum
Home | Profile | Register | Active Topics | Members | Search | FAQ | Send File to Tech support
Username:
Password:
Save Password
Forgot your Password? | Admin Options

 All Forums
 Origin Forum
 Origin Forum
 Removing duplicated data values
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic

dpgg94

Singapore
5 Posts

Posted - 06/14/2018 :  12:39:55 PM  Show Profile  Edit Topic  Reply with Quote  View user's IP address  Delete Topic
Hello all,
I would like to find out if there's any way to remove duplicated values from particular columns in a sheet. For example, i have 10 columns of data, but some columns may have repeated values due to multiple scans by an instrument. I would like to remove these data because they would affect the average and standard deviation if I were to include them while carrying out statistics.

Thanks for any help! Have tried googling for this but did not manage to find any info.

YimingChen

1609 Posts

Posted - 06/14/2018 :  3:11:46 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi,

Can you try Worksheet:Remove/Combine Duplicated Rows. This tool only scan one single column though. You may consider stacking your 10 columns before running the this tool.

James
Go to Top of Page

Hideo Fujii

USA
1582 Posts

Posted - 06/14/2018 :  3:37:42 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi dpgg94,

Origin provides three methods to remove the duplicated data, which are:
1) "Worksheet> Remove/Combine Duplicated Rows" menu (wdeldup), and
2) "Analysis> Data Manipulation> Reduce Duplicated X Data" menu (reducedup)
3) "Unique(<column>)" command

The differences between first two are mainly as follows:

- Wdeldup works for a whole worksheet, but not for a plot. Reducedup works on two
datasets of XY. It works for two columns of XY in a worksheet as well as for a plot in a graph.

- Wdeldup reduces the rows in the rest of columns on a reference column by taking options:
Keep 1st/Last, Average, Min/Max, etc. Reducedup does similar, but no Keep 1st/Last option.

- Wdeldup works for duplication by exact numbers, or by the text strings. Reducedup works
on numbers possibly with some tolerance.

The Unique command can be applied to only one dataset, i.e., a column in a worksheet.
There is Keep 1st/Last option, but no Tolerance.

You can choose either method of above which is suitable for your case.
If more than one columns in your data may possibly contain duplication, you may have to
apply above method multiple times.

References:
[1] Blog: http://blog.originlab.com/data-handling/delete-every-nth-row-or-column-from-worksheet-2
[2] Origin Help:
https://www.originlab.com/doc/Origin-Help/Wks-DataReduction
https://www.originlab.com/doc/Origin-Help/reduce-duplicate-rows
https://www.originlab.com/doc/LabTalk/ref/Unique-func
[3] Video Tutorial: https://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis&ss=chm&pid=1575

Hope this helps.

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/14/2018 3:41:29 PM
Go to Top of Page

dpgg94

Singapore
5 Posts

Posted - 06/14/2018 :  10:44:11 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Here's the kind of data I have, with the original form of values and manually edited to remove duplicated values within each column.
https://imgur.com/a/rCZlUu1

The other methods suggested seem to remove all the same rows for the remaining columns even if i select a specific column to work on.
As for my data, there would be random numbers of duplicated values within each row, and at varying positions.

I have no clue about scripting and writing the command line, but will be trying to learn how to use it now. If possible, could anyone help me to write a script for this purpose?

Thanks a bunch!
Go to Top of Page

Hideo Fujii

USA
1582 Posts

Posted - 06/15/2018 :  5:09:58 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi dpgg94,

Okay, so you want to remove redundant rows at each column in the worksheet, right? Then you can use
the 3rd option, "Unique(<column>)" command. Just applying this command every column in your worksheet
doesn't need to make a script; rather you can do the following:

1) Enter unique(this) in the F(X)= header row in the 1st column, followed by pressing ENTER key.
==> You get the 1st column non-redundant.
2) Select the F(X)= header row in the 1st column, and drag the black dot at the bottom corner of the cell
(cursor shape changes to the "+" sign) rightward until the last column.
==> All columns get the same operation.

Hope this helps.

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/15/2018 5:11:32 PM
Go to Top of Page

dpgg94

Singapore
5 Posts

Posted - 06/15/2018 :  9:00:09 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hello,
thank you for the reply.

What does "this" refer to when you mention unique(this)? Should i be putting the column label in it?

I've tried it with column label and it doesn't seem to do anything after pressing enter.

https://imgur.com/a/6tpfbza

Apologies if I'm kind of slow on this, had started using Origin few weeks back and am trying to learn the tricks in it.

Edited by - dpgg94 on 06/15/2018 10:22:33 PM
Go to Top of Page

Hideo Fujii

USA
1582 Posts

Posted - 06/18/2018 :  10:00:54 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi dpgg94,

What is the version of Origin you're using? I guess that in your version the Unique() function is not supported.
(Please indicate the version of your Origin in the post from the next time so that we both don't waste our time.)

If your Origin is not the latest, please download and install the evaluation version to test.
https://www.originlab.com/demodownload.aspx

Regarding "this", it means "this column" in the typical object-oriented fashion. See the details at:
https://www.originlab.com/doc/Origin-Help/Using-Formulae-to-Set-Cell-Values#Referring_to_Worksheet_Columns_using_.22This.22

Hope this helps.

--Hideo Fujii
OriginLab
Go to Top of Page

dpgg94

Singapore
5 Posts

Posted - 06/19/2018 :  08:16:50 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
I'm on OriginPro2017 version
Go to Top of Page

Hideo Fujii

USA
1582 Posts

Posted - 06/19/2018 :  09:38:31 AM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi dpgg94,

> I'm on OriginPro2017 version

I have just confirmed that Unique() function became available in Origin 2018b unfortunately.
Please consider either:
- download and try 2018b Evaluation version, or
- upgrade to 2018b (if you have maintenance, it is free), or
- make a script to substitute.

Regards,

--Hideo Fujii
OriginLab
Go to Top of Page

dpgg94

Singapore
5 Posts

Posted - 06/22/2018 :  11:54:43 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Would it be possible if you could assist me with creating a script for that function? I'm clueless about how to make a script even after going through the online tutorials.
Go to Top of Page

Hideo Fujii

USA
1582 Posts

Posted - 06/25/2018 :  12:15:38 PM  Show Profile  Edit Reply  Reply with Quote  View user's IP address  Delete Reply
Hi dpgg94,

Learning any programming language including LabTalk, is an incremental process.
So, just keep learning patiently from the basics.

First, try the following "Hello World" example:
https://www.originlab.com/doc/LabTalk/guide/Getting-Started-with-LT

Try the following Getting Started tutorial for how to run:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Get-Started

Try the following tutorial for the basic elements in LabTalk:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Language-Fundamental

To handle the data in a worksheet, Range notation is essential. Try the following
tutorial about the range:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Range-Notation

For the rest, explore Origin documents about LabTalk when you need something
to do, or to understand what the given sample is doing:
https://www.originlab.com/doc/LabTalk
and about X-functions:
https://www.originlab.com/doc/X-Function/ref

Finally, you can try to copy the following sample script, paste into the Script window,
highlight all pasted lines, and press ENTER key when Sheet1 in Book1 is active to see
what happened. Read the comments (after //) to grasp what the script roughly does,
but don't be worried about exact meaning. Don't be deterred by the details. Learning
by examples by trying by yourself is a way to go, I can say.
//////////////////////////////////////////////
double NA=0/0;                 //Missing value constant
range rA=[Book1]Sheet1!col(A); //Data range for col(A) in Sheet1 in Book1
rA=sort(rA);                   //Sort the range of column A
range rT=[??]!_tmpdata;        //Range for temporary dataset
int nirows=rA.getSize();       //Number of input rows
int norows=0;                  //Number of output rows
for(ii=1; ii<=nirows; ii=ii+1) { //Loop over with ii
  if(ii==1 || rA[ii]!=rA[ii-1]) {
    rT[ii]=rA[ii];   //If 1st row, or unequal to previous row, set the same
    norows=norows+1; //Increment number of output rows
  }
  else rT[ii]=NA;    //Otherwise set a missing value
}
rA=sort(rT);         //Put the sorted temp dataset back
rA.setSize(norows);  //Reset the size of column A
/////////////////////////////////////////////
Good luck, and enjoy!

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/25/2018 1:46:20 PM
Go to Top of Page
  Previous Topic Topic Next Topic Lock Topic Edit Topic Delete Topic New Topic Reply to Topic
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
The Origin Forum © 2020 Originlab Corporation Go To Top Of Page
Snitz Forums 2000