The Origin Forum
 Outlier detection method in PCA tool

T O P I C    R E V I E W
albeviane Posted - 04/06/2018 : 5:51:29 PM
Origin Ver. and Service Release (Select Help-->About Origin): Origin 2017 SR2
Operating System: Mac via Win 8-10 VM

Hi
I have been using the PCA tool in order to detect outliers in my data, and I would like to know if the confidence ellipse calculated by the script (btw: can you change the 95% default value?) is based on:
a) the (robust) distances based on the Minimum Volume Ellipsoid (MVE) estimates of location and scatter, or
b) the Mahalanobis distances based on the sample mean and sample covariance matrix.

Is the Confidence Ellipse tool available starting from Origin 2017 SR2 based on the same principle?

The question was put to me by a reviewer of a paper currently under revision...
Thanks for helping!

5   L A T E S T    R E P L I E S    (Newest First)
AmandaLu Posted - 04/23/2018 : 11:22:47 PM
Hi,

Thank you for the information. We have added MCD to JIRA APPS-539.

Regarding the outliers, they have nothing to do with the confidence band. We examine the distribution of the XY values and treat the points outside the 95% XY ranges as outliers.
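
The reply does not say how the "95% XY ranges" are computed, so the following is only one possible reading, not the PCA App's confirmed rule. It is a minimal sketch that assumes each axis is checked separately against mean ± z·sd, with z the two-sided normal quantile for the chosen level. The function name and the sample data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def flag_outside_axis_ranges(points, level=0.95):
    """Flag points whose x or y lies outside mean +/- z * sd on that axis.

    Assumption: a per-axis normal range. The thread does not confirm
    that the PCA App computes its ranges this way.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return [abs(x - mx) > z * sx or abs(y - my) > z * sy for x, y in points]

# Hypothetical scores: a 5 x 4 grid of ordinary points plus one extreme point.
points = [((i % 5 - 2) * 0.5, (i // 5 - 2) * 0.5) for i in range(20)]
points.append((10.0, 0.0))

flags = flag_outside_axis_ranges(points)
outliers = [pt for pt, is_out in zip(points, flags) if is_out]
print(outliers)
```

A rule of this kind is rectangular rather than elliptical, which would be consistent with flagged points sitting inside a 95% ellipse, but that remains a guess about the App's behaviour.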

Thanks,
Amanda
OriginLab Technical Service
albeviane Posted - 04/23/2018 : 5:34:24 PM
Thank you Amanda for your help. Actually, I realized there is another robust estimator of multivariate location and scatter that is widely used for outlier detection: MCD (Minimum Covariance Determinant).

https://wis.kuleuven.be/stat/robust/papers/2010/wire-mcd.pdf
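
To illustrate what an MCD-based distance does, here is a minimal sketch for very small 2D data sets. It finds the raw MCD by exhaustively trying every subset of size h, which is only feasible for a handful of points; practical implementations use the FAST-MCD algorithm instead, and add a consistency factor and a reweighting step that this sketch omits. All function names and the sample data are hypothetical, and nothing here reflects how Origin does or will implement it.

```python
import math
from itertools import combinations

def mean_cov_2d(pts):
    """Sample mean and unbiased covariance (sxx, sxy, syy) of 2D points."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in pts) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def raw_mcd_2d(points, h):
    """Exhaustive raw MCD: the h-subset whose covariance determinant is smallest."""
    best = None
    for subset in combinations(points, h):
        center, (sxx, sxy, syy) = mean_cov_2d(subset)
        det = sxx * syy - sxy ** 2
        if best is None or det < best[0]:
            best = (det, center, (sxx, sxy, syy))
    return best[1], best[2]

def distances_sq(points, center, cov):
    """Squared Mahalanobis-type distances for a given center and covariance."""
    (mx, my), (sxx, sxy, syy) = center, cov
    det = sxx * syy - sxy ** 2
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]

# Hypothetical data: a 3 x 3 grid of regular points plus three clustered outliers.
points = [(float(x), float(y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
points += [(8.0, 8.0), (8.5, 8.0), (8.0, 8.5)]

n, p = len(points), 2
h = (n + p + 1) // 2                      # common choice for maximal robustness
cutoff = -2.0 * math.log(1.0 - 0.95)      # chi-square 95% quantile, 2 dof

classical_d2 = distances_sq(points, *mean_cov_2d(points))
robust_d2 = distances_sq(points, *raw_mcd_2d(points, h))

for pt, dc, dr in zip(points[-3:], classical_d2[-3:], robust_d2[-3:]):
    print(pt, "classical:", round(dc, 2), "robust:", round(dr, 2), "cutoff:", round(cutoff, 2))
```

In this example the clustered points pull the sample mean and covariance towards themselves, so their classical distances stay below the cutoff, while the MCD-based distances are far above it. Because the raw estimate is uncorrected, a chi-square cutoff applied to it is only indicative.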

In any case, I have another question I forgot to ask: in the OriginPro PCA App, there is a plotting option to show the "outliers", which I did not notice at first because I was mostly interested in the ellipse itself. Most of these outliers, at least for my data, lie inside the 95% ellipse, though very close to its boundary; only a few lie outside. I had expected, perhaps wrongly, the ellipse itself to define the outliers, so I would like to know how the threshold on the Mahalanobis distance (which I assume is the classical, non-robust estimate) is set to define outliers in the PCA App.
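
For reference, the convention I had in mind can be sketched as follows. It computes the classical squared Mahalanobis distance of each 2D score from the sample mean and flags points above the chi-square quantile with 2 degrees of freedom. This is the common textbook rule, not necessarily what the PCA App does; the function names and the data are hypothetical.

```python
import math

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2D point from the sample mean,
    using the unbiased sample covariance matrix."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Closed-form inverse of the 2 x 2 covariance matrix applied to (dx, dy).
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]

def chi2_cutoff_2dof(level=0.95):
    """Chi-square quantile for 2 degrees of freedom (CDF is 1 - exp(-x / 2))."""
    return -2.0 * math.log(1.0 - level)

# Hypothetical scores: a 5 x 4 grid of ordinary points plus one extreme point.
points = [((i % 5 - 2) * 0.5, (i // 5 - 2) * 0.5) for i in range(20)]
points.append((10.0, 0.0))

d2 = mahalanobis_sq_2d(points)
cutoff = chi2_cutoff_2dof(0.95)
outlier_idx = [i for i, d in enumerate(d2) if d > cutoff]
print(outlier_idx, round(cutoff, 3))
```

Under this rule the set of flagged points coincides exactly with the points outside the ellipse d² = cutoff, which is why I expected the ellipse and the outlier markers to agree.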

Thanks very much in advance
Alberto
AmandaLu Posted - 04/13/2018 : 02:16:59 AM
Hi,

Thank you for your suggestion. We have created a JIRA to implement MVE in the future:

https://originlab.jira.com/browse/APPS-539

Thanks,
Amanda
OriginLab Technical Service
albeviane Posted - 04/10/2018 : 2:51:35 PM
Thanks very much Amanda for your detailed and clarifying reply. Yes, I used the term "tool" to mean "app", of course, not being sure the latter was the right way to call these very useful add-ons.
As far as I understand from the literature (Jackson and Chen, 2004, Robust principal component analysis and outlier detection with ecological data, Environmetrics, https://doi.org/10.1002/env.628 ; Van Aelst and Rousseeuw, 2009, Minimum volume ellipsoid, https://doi.org/10.1002/wics.19 ), outlier detection is much more robust when using MVE-based distances.
It may be worth considering, for the future, implementing this option in the PCA app.

Thanks again for your help
Alberto
AmandaLu Posted - 04/08/2018 : 05:04:36 AM
Hi,

I suppose you are talking about the Principal Component Analysis App. The calculation uses a method similar to Mahalanobis distances based on the sample mean and sample covariance matrix:



[Equation image not preserved.]

where n is the size of the random sample, x̄ and S are the sample mean and sample covariance matrix, and μ is the mean.

The confidence level cannot be changed in the Principal Component Analysis App. To use a different level, you can use the data from the PCA tool in the Confidence Ellipse App:

2D: https://www.originlab.com/fileExchange/details.aspx?fid=365
3D: https://www.originlab.com/fileExchange/details.aspx?fid=280
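
If you want to check an ellipse outside Origin, the following is a minimal sketch of the standard textbook construction for 2D data with an adjustable confidence level. It assumes the usual Hotelling T² scaling with an F(2, n − 2) quantile and covers two variants: a region for the mean, and a prediction region for a new observation. It is not a description of what the apps compute internally, and the function names and data are hypothetical.

```python
import math

def f_quantile_2_m(level, m):
    """Quantile of F(2, m). Closed form, since the survival function of
    F(2, m) is (1 + 2x/m) ** (-m/2)."""
    alpha = 1.0 - level
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

def ellipse_2d(points, level=0.95, for_mean=False):
    """Return (center, (semi_major, semi_minor), angle_rad) of the ellipse
    (x - mean)^T S^-1 (x - mean) = c2 for the chosen level."""
    n, p = len(points), 2
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

    t2 = p * (n - 1) / (n - p) * f_quantile_2_m(level, n - p)  # T^2 critical value
    # Region for the mean divides by n; prediction region uses (n + 1) / n.
    c2 = t2 / n if for_mean else t2 * (n + 1) / n

    # Eigenvalues and orientation of the symmetric 2 x 2 covariance matrix.
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major, lam_minor = half_trace + root, half_trace - root
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.sqrt(c2 * lam_major), math.sqrt(c2 * lam_minor)), angle

# Hypothetical PC1/PC2 scores.
points = [(float(i), 0.5 * i + (-1) ** i) for i in range(12)]

for level in (0.90, 0.95, 0.99):
    center, axes, angle = ellipse_2d(points, level)
    print(level, [round(a, 3) for a in axes], round(math.degrees(angle), 1))
```

Raising the level only rescales the axes; the center and orientation stay fixed because they depend on the sample mean and covariance alone.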

The PCA App is based on the Confidence Ellipse tool.

Thanks,
Amanda
OriginLab Technical Service

The Origin Forum © 2020 Originlab Corporation