Hi All,
I am new to detecting and eliminating outliers in large datasets, hence this question.
I have a dataset that contains products and their running times on different instances. I want to eliminate the outliers using the Z-score method, but I am not sure how to proceed.
I have attached sample data to this post.
Help is much appreciated,
Thank you!
Use the hidden standardized z-score macro. It converts your numeric values into z-scores. An outlier is commonly taken to be anything more than +/- 3 standard deviations from the mean, so follow the macro with a simple Filter tool to remove records where abs(z-score) > 3.
The standardized z-score tool is pretty simple, so you could build it yourself the long way. Otherwise you'll have to insert a macro and navigate to the folder where it's stored. For me the macro/tool is stored here:
C:\Users\.....\AppData\Local\Alteryx\bin\RuntimeData\Macros\Predictive Tools\Supporting_Macros
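If it helps to see the logic outside of Alteryx, here is a rough Python sketch of the same idea: compute a z-score for each running time, then drop anything beyond 3 standard deviations. This is not how the macro is implemented. The function name remove_outliers, the (product, running_time) row layout, computing the z-score separately per product, and using the population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def remove_outliers(rows, threshold=3.0):
    """Keep rows whose running time has |z-score| <= threshold.

    rows is a list of (product, running_time) tuples (hypothetical layout).
    Z-scores are computed within each product (an assumption).
    """
    # Collect the running times for each product.
    by_product = defaultdict(list)
    for product, value in rows:
        by_product[product].append(value)

    # Mean and population standard deviation per product (assumed choice).
    stats = {p: (mean(v), pstdev(v)) for p, v in by_product.items()}

    kept = []
    for product, value in rows:
        mu, sigma = stats[product]
        # If every value is identical, sigma is 0 and nothing is an outlier.
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        if abs(z) <= threshold:
            kept.append((product, value))
    return kept
```

One thing to keep in mind: with only a handful of records per group, no value can reach a z-score of 3, so the cutoff only becomes meaningful once each group has a reasonable number of rows.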