The Origin Forum
 Outlier detection method in PCA tool

albeviane

Italy
3 Posts

Posted - 04/06/2018 :  5:51:29 PM
Origin Ver. and Service Release (Select Help-->About Origin): Origin 2017 SR2
Operating System: Mac via Win 8-10 VM

Hi
I have been using the PCA tool to detect outliers in my data, and I would like to know whether the confidence ellipse calculated by the script (by the way, can the default 95% level be changed?) is based on:
a) robust distances computed from the Minimum Volume Ellipsoid (MVE) estimates of location and scatter, or
b) Mahalanobis distances computed from the sample mean and sample covariance matrix.

Is the Confidence Ellipse tool, available starting from Origin 2017 SR2, based on the same principle?

A reviewer of a publication currently under revision asked me this question.
Thanks for helping!

AmandaLu

439 Posts

Posted - 04/08/2018 :  05:04:36 AM
Hi,

I suppose you are talking about the Principal Component Analysis App. The calculation uses a method similar to Mahalanobis distances based on the sample mean and sample covariance matrix.

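[Equation image not preserved.]

where n is the size of a random sample, S is the sample covariance matrix, and the remaining symbols, lost with the equation image, denote the sample mean and the mean.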

The Principal Component Analysis App cannot change the confidence level. You can use the data from the PCA tool in the Confidence Ellipse App:

2D: https://www.originlab.com/fileExchange/details.aspx?fid=365
3D: https://www.originlab.com/fileExchange/details.aspx?fid=280

The PCA App is based on the Confidence Ellipse tool.
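
For readers who want to check the classical approach outside Origin, the following is a minimal Python sketch of Mahalanobis distances computed from the sample mean and sample covariance matrix of 2D scores. It is not Origin's implementation: the function names and the example scores are made up, and the chi-square cutoff is an assumption chosen for illustration, since the exact scaling used by the apps is not stated in this thread. For two dimensions the chi-square quantile has the closed form -2 ln(1 - level), so no statistics library is needed.

```python
import math

def mean_cov_2d(points):
    """Sample mean and sample covariance (n - 1 denominator) of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def chi2_quantile_2dof(level):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - level)

def outside_ellipse(points, level=0.95):
    """Points whose squared distance exceeds the assumed chi-square cutoff."""
    center, cov = mean_cov_2d(points)
    cutoff = chi2_quantile_2dof(level)
    return [p for p in points if mahalanobis_sq(p, center, cov) > cutoff]

# Hypothetical PC1/PC2 scores; the last point is deliberately far away.
scores = [(0.1, 0.2), (-0.4, 0.3), (0.5, -0.1), (-0.2, -0.5),
          (0.3, 0.4), (-0.3, -0.2), (0.2, -0.3), (3.0, 2.5)]
print(outside_ellipse(scores, level=0.95))
print(outside_ellipse(scores, level=0.99))
```

The level argument shows how a different confidence level would change the cutoff in this sketch. A small-sample correction based on the F distribution would give a somewhat larger ellipse than the chi-square cutoff assumed here.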

Thanks,
Amanda
OriginLab Technical Service

Edited by - AmandaLu on 04/08/2018 05:17:55 AM

albeviane

Italy
3 Posts

Posted - 04/10/2018 :  2:51:35 PM
Thanks very much, Amanda, for your detailed and clarifying reply. Yes, I used the term "tool" to mean "app", of course, not being sure the latter was the right way to call these very useful add-ons.
As far as I understand from the literature (Jackson and Chen, 2004, Robust principal component analysis and outlier detection with ecological data, Environmetrics, https://doi.org/10.1002/env.628 ; Van Aelst and Rousseeuw, 2009, Minimum volume ellipsoid, https://doi.org/10.1002/wics.19 ), outlier detection is much more robust when using MVE-based distances.
It may be worth considering implementing this option in the PCA app in the future.

Thanks again for your help
Alberto

AmandaLu

439 Posts

Posted - 04/13/2018 :  02:16:59 AM
Hi,

Thank you for your suggestion. We have created a JIRA ticket to implement MVE in the future:

https://originlab.jira.com/browse/APPS-539

Thanks,
Amanda
OriginLab Technical Service

albeviane

Italy
3 Posts

Posted - 04/23/2018 :  5:34:24 PM
Thank you, Amanda, for your help. Actually, I realized there is another robust estimator of multivariate location and scatter that is widely used for outlier detection: the Minimum Covariance Determinant (MCD).

https://wis.kuleuven.be/stat/robust/papers/2010/wire-mcd.pdf
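
To illustrate why MCD-based distances are considered more robust, here is a minimal Python sketch that finds the raw MCD estimate by brute force: among all subsets of h points, it keeps the one whose sample covariance matrix has the smallest determinant. This is only practical for tiny data sets; real implementations use faster approximate algorithms and add consistency and reweighting steps that are omitted here. The data, the choice h = 6, and all function names are made up for illustration.

```python
from itertools import combinations

def mean_cov_2d(points):
    """Sample mean and sample covariance (n - 1 denominator) of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def raw_mcd_2d(points, h):
    """Exhaustive raw MCD: the h-subset with the smallest covariance determinant."""
    best = None
    for subset in combinations(points, h):
        center, cov = mean_cov_2d(subset)
        det = cov[0] * cov[2] - cov[1] ** 2
        # Skip degenerate (collinear) subsets, whose covariance is singular.
        if det > 0 and (best is None or det < best[0]):
            best = (det, center, cov)
    if best is None:
        raise ValueError("all subsets are degenerate")
    return best[1], best[2]

# Hypothetical scores: six clustered points plus two gross outliers.
points = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (-0.5, -1.0),
          (-1.0, -0.5), (0.2, -0.3), (8.0, 8.0), (9.0, 7.5)]

classical_center, classical_cov = mean_cov_2d(points)
robust_center, robust_cov = raw_mcd_2d(points, h=6)

for p in points:
    d2_classical = mahalanobis_sq(p, classical_center, classical_cov)
    d2_robust = mahalanobis_sq(p, robust_center, robust_cov)
    print(p, round(d2_classical, 2), round(d2_robust, 2))
```

With the sample mean and covariance, the outliers inflate the covariance estimate and so shrink their own distances (masking), whereas the distances from the MCD fit remain large for them. Because the raw MCD covariance is not rescaled in this sketch, its distances should be compared with each other rather than against a chi-square cutoff.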

In any case, I have another question I forgot to ask: in the OriginPro PCA App there is a plotting option to show the "outliers", which I did not notice at first because I was mostly interested in the ellipse. Most of these outliers, at least for my data, lie inside the 95% ellipse, although very close to its boundary; only a few are outside. I expected, perhaps wrongly, that the ellipse itself would define the outliers. I would therefore like to know how the Mahalanobis distance threshold (presumably not a robust estimate) is set to define outliers in the PCA App.

Thanks very much in advance
Alberto

AmandaLu

439 Posts

Posted - 04/23/2018 :  11:22:47 PM
Hi,

Thank you for the information. We have added MCD to JIRA APPS-539.

Regarding the outliers, they have nothing to do with the confidence band. We examine the distribution of the XY values and consider points outside the 95% XY ranges to be outliers.
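
A range-based rule and an ellipse-based rule can disagree, which may explain flagged points lying inside the 95% ellipse. The Python sketch below assumes, purely for illustration, that "95% XY ranges" means mean ± 1.96 standard deviations on each axis separately; the thread does not say how the App actually computes these ranges. The ellipse test uses a chi-square cutoff with 2 degrees of freedom, which is also an assumption, and the function names, center, covariance, and test points are made up.

```python
import math

def outside_axis_range(point, center, sds, z=1.96):
    """True if the point leaves the assumed per-axis range on any axis."""
    return any(abs(p - c) > z * s for p, c, s in zip(point, center, sds))

def inside_ellipse(point, center, cov, level=0.95):
    """True if the squared Mahalanobis distance is within the chi-square cutoff."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= -2.0 * math.log(1.0 - level)

# Hypothetical standardized, uncorrelated scores: unit variances, zero covariance.
center = (0.0, 0.0)
cov = (1.0, 0.0, 1.0)            # (var_x, cov_xy, var_y)
sds = (1.0, 1.0)

for pt in [(2.1, 0.0), (2.0, 2.0), (1.0, 1.0)]:
    print(pt,
          "outside range:", outside_axis_range(pt, center, sds),
          "inside ellipse:", inside_ellipse(pt, center, cov))
```

Under these assumptions the per-axis limit is 1.96 standard deviations, while the 95% ellipse reaches about 2.45 standard deviations along each axis (the square root of 5.99), so a point such as (2.1, 0.0) falls outside the range but inside the ellipse.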

Thanks,
Amanda
OriginLab Technical Service
The Origin Forum © 2020 Originlab Corporation