The Origin Forum
 ks2density on large data vectors

TOPIC REVIEW
mikkomaek Posted - 09/19/2017 : 05:36:38 AM
Origin Ver. and Service Release (Select Help-->About Origin): 2017 SR2
Operating System: Windows 7 on Parallels 12

Hi,

I'm trying to calculate the kernel densities of a dataset that consists of two data vectors (~1,700,000 rows). The calculation has been slow with smaller datasets, but with this one it won't go through at all. I've assigned half of my Mac's 16 GB of RAM to the virtual machine. Any ideas on how to make this work?

The formula:
ks2density(Col(1), Col(2), Col(1), Col(2), wx, wy)

Before script:
double wx, wy;
kernel2width(Col(1), Col(2), wx, wy);
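
To make the size of this job concrete, here is a minimal Python sketch (not LabTalk) of a direct-sum 2D kernel density estimate. It assumes a Gaussian product kernel and plain summation over all data rows; how Origin actually implements ks2density internally is not stated in this thread, and the names kde2d_direct and kernel_evaluations are made up for illustration.

```python
import math

def kde2d_direct(eval_x, eval_y, data_x, data_y, wx, wy):
    """Gaussian product-kernel density at each point (eval_x[i], eval_y[i]).

    Assumption: plain direct summation over every data row.
    This is an illustration, not Origin's implementation.
    """
    n = len(data_x)
    norm = 1.0 / (n * 2.0 * math.pi * wx * wy)
    out = []
    for ex, ey in zip(eval_x, eval_y):
        total = 0.0
        for dx, dy in zip(data_x, data_y):
            u = (ex - dx) / wx
            v = (ey - dy) / wy
            total += math.exp(-0.5 * (u * u + v * v))
        out.append(norm * total)
    return out

def kernel_evaluations(n_eval, n_data):
    """Number of kernel terms a direct sum has to compute."""
    return n_eval * n_data

n_rows = 1_700_000
# Every data row used as an evaluation point, as in the formula above
print(kernel_evaluations(n_rows, n_rows))
# A 25 x 25 grid of evaluation points against the same data
print(kernel_evaluations(25 * 25, n_rows))
```

Under that assumption, evaluating at every row needs about 2.9 x 10^12 kernel terms, while a 25 x 25 grid needs about 1.1 x 10^9, which is why the number of evaluation points matters so much here.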

Thanks!
Mikko
9 LATEST REPLIES (Newest First)
mikkomaek Posted - 11/06/2017 : 02:25:49 AM
Hi Hideo,

Sorry but I have to bother you once more on the gridding.

I tried it but didn't really succeed in calculating the densities for a specified grid. If I calculate symmetric grid intervals based on the XY scale and XYZ-grid them, I only get a matrix with the Z values on numbered rows and columns. I also tried to calculate the densities based on my symmetric grid, but the plot doesn't really turn out right.

You're right, calculating the densities at 1,700,000 locations is overkill. I can plot the densities based on fewer locations, as long as the plot still includes all the points.

Thanks!
Mikko
Hideo Fujii Posted - 10/18/2017 : 1:38:53 PM
Hi Mikko,

As described in our documentation for ks2density (http://www.originlab.com/doc/LabTalk/ref/ks2density-func),
your formula ks2density(Col(1), Col(2), Col(1), Col(2), wx, wy); calculates the density value at all 1,700,000
points. Unless you need the density value at every point, that sounds like overkill. Instead, you can create
grid XY points in col(3) and col(4): make a matrix of, say, 25x25 with the proper XY ranges (the cell values
do not matter), convert it to an XYZ worksheet, then paste the X and Y columns into col(3) and col(4). In col(5),
you can then run: ks2density(col(3), col(4), Col(1), Col(2), wx, wy); With this method, it took around 12 minutes.
(In my previous test, I measured the time of 2D kernel density plotting, which produced the output matrix
much faster - very efficient!)
If you do need the density values at the input data points, you can give up on getting all of them and
sample the rows to reduce the number of output data points.
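
The two options above (grid points, or a sample of the input rows) can be sketched in Python as follows. This is only an illustration of what would end up in col(3) and col(4); it is not LabTalk, the helper names grid_points and sample_rows are hypothetical, and the toy random data stands in for Col(1) and Col(2).

```python
import random

def grid_points(xs, ys, n):
    """Flattened X and Y lists for an n x n grid spanning the data range (n >= 2)."""
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    gx, gy = [], []
    for j in range(n):        # one row of the grid per Y step
        for i in range(n):    # one column of the grid per X step
            gx.append(xmin + (xmax - xmin) * i / (n - 1))
            gy.append(ymin + (ymax - ymin) * j / (n - 1))
    return gx, gy

def sample_rows(xs, ys, k, seed=0):
    """k rows chosen at random without replacement, keeping X and Y paired."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(xs)), k))
    return [xs[i] for i in idx], [ys[i] for i in idx]

# Toy data standing in for Col(1) and Col(2)
rng = random.Random(1)
col1 = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
col2 = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

gx, gy = grid_points(col1, col2, 25)      # candidates for col(3), col(4)
sx, sy = sample_rows(col1, col2, k=500)   # alternative: a subset of the input rows
print(len(gx), len(sx))
```

Either pair of lists plays the role of the first two arguments of ks2density, while the full Col(1) and Col(2) remain the data arguments.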

Hope this suggestion helps.

--Hideo Fujii
OriginLab
mikkomaek Posted - 10/17/2017 : 02:14:09 AM
Hi Hideo,

Thanks for your reply! Could you write an example how to adjust the grid size?

BR
Mikko
Hideo Fujii Posted - 09/28/2017 : 5:14:33 PM
Hi Mikko,

I have tried 1,700,000 data points of normal random numbers with various numbers of grids in each direction on my
machine (Intel Core Duo 3.16GHz, 4GB memory).

#Grids    Elapsed Time (min)
   25  =>   0.6
   32  =>   1.1
   50  =>   2.6
   75  =>   6.5
  100  =>  30.2

Based on this, the time cost seems to grow as roughly the 5th power of the number of grids, and it rapidly becomes too slow with more than 50x50 grids.
(By that trend, if #Grids=200, it may take ~1000 min.)
So, at least for now, I suggest setting wx and wy so that the number of grids is 50 or so (as well as using
a native Windows machine or Boot Camp on Mac).
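
As a quick check of the scaling, the following Python sketch computes the apparent power-law exponent between consecutive rows of the timing table and extrapolates to 200 grids. The function names are made up for illustration, and the extrapolation simply assumes that a 5th-power trend keeps holding beyond 100 grids.

```python
import math

# Timing table from the post: grids per direction, elapsed minutes
grids = [25, 32, 50, 75, 100]
minutes = [0.6, 1.1, 2.6, 6.5, 30.2]

def local_exponents(n, t):
    """Exponent p in t ~ n**p between each pair of consecutive rows."""
    return [
        math.log(t[i + 1] / t[i]) / math.log(n[i + 1] / n[i])
        for i in range(len(n) - 1)
    ]

def extrapolate(n_ref, t_ref, n_new, p):
    """Predicted time at n_new, assuming t ~ n**p through (n_ref, t_ref)."""
    return t_ref * (n_new / n_ref) ** p

exps = local_exponents(grids, minutes)
for (a, b), p in zip(zip(grids, grids[1:]), exps):
    print(f"{a:>3} -> {b:>3}: exponent {p:.2f}")

t200 = extrapolate(100, 30.2, 200, 5)
print(f"200 grids at 5th power: about {t200:.0f} min")
```

On these numbers the step from 75 to 100 grids gives an exponent a little above 5, which matches the ~1000 min estimate for 200 grids, while the earlier steps come out closer to 2. The jump at the top of the table might also reflect memory limits on the test machine; that is speculation, not something measured here.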

--Hideo Fujii
OriginLab
mikkomaek Posted - 09/28/2017 : 02:21:24 AM
Hi Origin Support,

Any reply to this question?

BR
Mikko

quote:
Originally posted by mikkomaek

Hi Aviel,

Thanks for your reply. You wrote the calculation was slow, but did it actually go through? If it did, was it on VM with OriginPro or a Windows computer?

I am ready to wait if I am able to get the plots in the end.

Thanks!
Mikko

mikkomaek Posted - 09/21/2017 : 03:47:45 AM
Hi Aviel,

Thanks for your reply. You wrote the calculation was slow, but did it actually go through? If it did, was it on VM with OriginPro or a Windows computer?

I am ready to wait if I am able to get the plots in the end.

Thanks!
Mikko
arstern Posted - 09/20/2017 : 09:39:33 AM
Hi Mikko,

We took a look at your project file and it was very slow to use. Unfortunately, it seems that the slow calculation is to be expected. Improvements have already been made to the performance of kernel density plotting, so it seems that with the combination of Parallels and a large dataset you have reached the limit for plotting a kernel density plot.

Aviel
OriginLab
Hideo Fujii Posted - 09/19/2017 : 09:43:52 AM
Hi Mikko,

> my 16 GB Mac RAM to the virtual machine

Besides the intrinsic issue of your problem, if you are using Parallels, VMware, etc., you can try Apple's
"Boot Camp" dual-boot system, if that is an option in your situation, because it should run much faster as
"native" Windows without the performance penalty (though I guess that handling your 1,700,000 rows of data
may not be helped enough this way).

http://www.originlab.com/index.aspx?go=Support/DocumentationAndHelpCenter/Installation/RunOriginonaMac

--Hideo Fujii
OriginLab
arstern Posted - 09/19/2017 : 09:26:39 AM
Hi,

Could you please e-mail your opj file to Tech Support at tech@originlab.com?

Thanks,
Aviel
OriginLab
