The Origin Forum - Removing duplicated data values

The Origin Forum

Username:	Password:
Save Password
Forgot your Password? \| Admin Options

All Forums

Origin Forum

Removing duplicated data values

New Topic

Reply to Topic

Printer Friendly

Author

Topic

dpgg94

Singapore
5 Posts

Posted - 06/14/2018 : 12:39:55 PM

Hello all,
I would like to find out if there's any way to remove duplicated values from particular columns in a sheet. For example, i have 10 columns of data, but some columns may have repeated values due to multiple scans by an instrument. I would like to remove these data because they would affect the average and standard deviation if I were to include them while carrying out statistics.

Thanks for any help! Have tried googling for this but did not manage to find any info.

YimingChen

1664 Posts

Posted - 06/14/2018 : 3:11:46 PM

Hi,

Can you try Worksheet:Remove/Combine Duplicated Rows. This tool only scan one single column though. You may consider stacking your 10 columns before running the this tool.

James

Hideo Fujii

USA
1582 Posts

Posted - 06/14/2018 : 3:37:42 PM

Hi dpgg94,

Origin provides three methods to remove the duplicated data, which are:
1) "Worksheet> Remove/Combine Duplicated Rows" menu (wdeldup), and
2) "Analysis> Data Manipulation> Reduce Duplicated X Data" menu (reducedup)
3) "Unique(<column>)" command

The differences between first two are mainly as follows:

- Wdeldup works for a whole worksheet, but not for a plot. Reducedup works on two
datasets of XY. It works for two columns of XY in a worksheet as well as for a plot in a graph.

- Wdeldup reduces the rows in the rest of columns on a reference column by taking options:
Keep 1st/Last, Average, Min/Max, etc. Reducedup does similar, but no Keep 1st/Last option.

- Wdeldup works for duplication by exact numbers, or by the text strings. Reducedup works
on numbers possibly with some tolerance.

The Unique command can be applied to only one dataset, i.e., a column in a worksheet.
There is Keep 1st/Last option, but no Tolerance.

You can choose either method of above which is suitable for your case.
If more than one columns in your data may possibly contain duplication, you may have to
apply above method multiple times.

References:
[1] Blog: http://blog.originlab.com/data-handling/delete-every-nth-row-or-column-from-worksheet-2
[2] Origin Help:
https://www.originlab.com/doc/Origin-Help/Wks-DataReduction
https://www.originlab.com/doc/Origin-Help/reduce-duplicate-rows
https://www.originlab.com/doc/LabTalk/ref/Unique-func
[3] Video Tutorial: https://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis&ss=chm&pid=1575

Hope this helps.

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/14/2018 3:41:29 PM

dpgg94

Singapore
5 Posts

Posted - 06/14/2018 : 10:44:11 PM

Here's the kind of data I have, with the original form of values and manually edited to remove duplicated values within each column.
https://imgur.com/a/rCZlUu1

The other methods suggested seem to remove all the same rows for the remaining columns even if i select a specific column to work on.
As for my data, there would be random numbers of duplicated values within each row, and at varying positions.

I have no clue about scripting and writing the command line, but will be trying to learn how to use it now. If possible, could anyone help me to write a script for this purpose?

Thanks a bunch!

Hideo Fujii

USA
1582 Posts

Posted - 06/15/2018 : 5:09:58 PM

Hi dpgg94,

Okay, so you want to remove redundant rows at each column in the worksheet, right? Then you can use
the 3rd option, "Unique(<column>)" command. Just applying this command every column in your worksheet
doesn't need to make a script; rather you can do the following:

1) Enter unique(this) in the F(X)= header row in the 1st column, followed by pressing ENTER key.
==> You get the 1st column non-redundant.
2) Select the F(X)= header row in the 1st column, and drag the black dot at the bottom corner of the cell
(cursor shape changes to the "+" sign) rightward until the last column.
==> All columns get the same operation.

Hope this helps.

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/15/2018 5:11:32 PM

dpgg94

Singapore
5 Posts

Posted - 06/15/2018 : 9:00:09 PM

Hello,
thank you for the reply.

What does "this" refer to when you mention unique(this)? Should i be putting the column label in it?

I've tried it with column label and it doesn't seem to do anything after pressing enter.

https://imgur.com/a/6tpfbza

Apologies if I'm kind of slow on this, had started using Origin few weeks back and am trying to learn the tricks in it.

Edited by - dpgg94 on 06/15/2018 10:22:33 PM

Hideo Fujii

USA
1582 Posts

Posted - 06/18/2018 : 10:00:54 AM

Hi dpgg94,

What is the version of Origin you're using? I guess that in your version the Unique() function is not supported.
(Please indicate the version of your Origin in the post from the next time so that we both don't waste our time.)

If your Origin is not the latest, please download and install the evaluation version to test.
https://www.originlab.com/demodownload.aspx

Regarding "this", it means "this column" in the typical object-oriented fashion. See the details at:
https://www.originlab.com/doc/Origin-Help/Using-Formulae-to-Set-Cell-Values#Referring_to_Worksheet_Columns_using_.22This.22

Hope this helps.

--Hideo Fujii
OriginLab

dpgg94

Singapore
5 Posts

Posted - 06/19/2018 : 08:16:50 AM

I'm on OriginPro2017 version

Hideo Fujii

USA
1582 Posts

Posted - 06/19/2018 : 09:38:31 AM

Hi dpgg94,

> I'm on OriginPro2017 version

I have just confirmed that Unique() function became available in Origin 2018b unfortunately.
Please consider either:
- download and try 2018b Evaluation version, or
- upgrade to 2018b (if you have maintenance, it is free), or
- make a script to substitute.

Regards,

--Hideo Fujii
OriginLab

dpgg94

Singapore
5 Posts

Posted - 06/22/2018 : 11:54:43 PM

Would it be possible if you could assist me with creating a script for that function? I'm clueless about how to make a script even after going through the online tutorials.

Hideo Fujii

USA
1582 Posts

Posted - 06/25/2018 : 12:15:38 PM

Hi dpgg94,

Learning any programming language including LabTalk, is an incremental process.
So, just keep learning patiently from the basics.

First, try the following "Hello World" example:
https://www.originlab.com/doc/LabTalk/guide/Getting-Started-with-LT

Try the following Getting Started tutorial for how to run:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Get-Started

Try the following tutorial for the basic elements in LabTalk:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Language-Fundamental

To handle the data in a worksheet, Range notation is essential. Try the following
tutorial about the range:
https://www.originlab.com/doc/LabTalk/Tutorials/Tutorial-Range-Notation

For the rest, explore Origin documents about LabTalk when you need something
to do, or to understand what the given sample is doing:
https://www.originlab.com/doc/LabTalk
and about X-functions:
https://www.originlab.com/doc/X-Function/ref

Finally, you can try to copy the following sample script, paste into the Script window,
highlight all pasted lines, and press ENTER key when Sheet1 in Book1 is active to see
what happened. Read the comments (after //) to grasp what the script roughly does,
but don't be worried about exact meaning. Don't be deterred by the details. Learning
by examples by trying by yourself is a way to go, I can say.

//////////////////////////////////////////////
double NA=0/0;                 //Missing value constant
range rA=[Book1]Sheet1!col(A); //Data range for col(A) in Sheet1 in Book1
rA=sort(rA);                   //Sort the range of column A
range rT=[??]!_tmpdata;        //Range for temporary dataset
int nirows=rA.getSize();       //Number of input rows
int norows=0;                  //Number of output rows
for(ii=1; ii<=nirows; ii=ii+1) { //Loop over with ii
  if(ii==1 || rA[ii]!=rA[ii-1]) {
    rT[ii]=rA[ii];   //If 1st row, or unequal to previous row, set the same
    norows=norows+1; //Increment number of output rows
  }
  else rT[ii]=NA;    //Otherwise set a missing value
}
rA=sort(rT);         //Put the sorted temp dataset back
rA.setSize(norows);  //Reset the size of column A
/////////////////////////////////////////////

Good luck, and enjoy!

--Hideo Fujii
OriginLab

Edited by - Hideo Fujii on 06/25/2018 1:46:20 PM

Topic

New Topic

Reply to Topic

Printer Friendly

Jump To:

The Origin Forum

Snitz Forums 2000