
Alteryx Designer Desktop Discussions

SOLVED

Removing outliers from dataset

ArnabSengupta
8 - Asteroid

Hi All,

 

I have a dataset that shows the running time of a machine. I want to remove the outliers from it, where the top 25% and bottom 25% of values count as outliers.

 

I have attached a sample dataset.

 

Help is much appreciated.

3 REPLIES
atcodedog05
22 - Nova

Hi @ArnabSengupta 

 

Here is how you can do it.

Workflow:


 

1. Use the Sort tool to sort the data in ascending order.

2. Use the Record ID tool to assign a row ID.

3. Use the Summarize tool to get the maximum row ID (the row count).

4. Use the Formula tool to calculate the row IDs at 25% and 75% of the row count.

5. Use the Append Fields tool to map the 25% and 75% row IDs back onto the main data.

6. Use the Filter tool to keep only the rows between the 25% and 75% row IDs. This removes the top and bottom 25%.
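
For anyone who wants to sanity-check the logic outside Designer, here is a rough Python sketch of the same six steps. It is only an illustration, not the attached workflow: the function name, the sample values, and the exact boundary handling (strictly above the 25% row ID, up to and including the 75% row ID) are all assumptions.

```python
def trim_by_row_position(values, lower=0.25, upper=0.75):
    """Keep rows whose sorted position lies between the two cut-off row IDs."""
    ordered = sorted(values)                      # step 1: sort ascending
    numbered = list(enumerate(ordered, start=1))  # step 2: assign a 1-based row ID
    row_count = len(numbered)                     # step 3: max row ID = row count
    low_id = row_count * lower                    # step 4: row ID at 25%
    high_id = row_count * upper                   #         row ID at 75%
    # steps 5-6: compare every row against both cut-offs and keep the middle.
    # The boundary rule (low_id < row ID <= high_id) is an assumption.
    return [value for row_id, value in numbered if low_id < row_id <= high_id]


# Hypothetical sample values, not taken from the attached dataset
running_times = [12, 15, 11, 48, 14, 13, 2, 16]
print(trim_by_row_position(running_times))  # [12, 13, 14, 15]
```

With 8 rows the cut-offs are row IDs 2 and 6, so rows 3 to 6 of the sorted data are kept.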

 

Hope this helps : )

 

OllieClarke
15 - Aurora

Hi @ArnabSengupta 

 

The Summarize tool lets you calculate percentiles directly, and you can then use those values in a Filter tool. Otherwise it's very similar to @atcodedog05's solution.
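
As a rough illustration of the percentile idea outside Designer, here is a small Python sketch using only the standard library. The function name and sample values are made up for the example, and `statistics.quantiles` with `method="inclusive"` may interpolate differently from the Summarize tool's percentile, so treat the exact cut-off values as an assumption.

```python
from statistics import quantiles


def trim_by_percentile(values, lower_pct=25, upper_pct=75):
    """Keep values lying between the lower and upper percentile (inclusive)."""
    # n=100 returns 99 cut points; cuts[k - 1] is the k-th percentile
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Equivalent of a filter: value >= 25th percentile AND value <= 75th percentile
    return [value for value in values if low <= value <= high]


# Hypothetical sample values, not taken from the attached dataset
running_times = [12, 15, 11, 48, 14, 13, 2, 16]
print(trim_by_percentile(running_times))
```

Unlike the row-position approach, this needs no sort or row ID, and the rows that pass the filter keep their original order.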

 

Ollie


 

 

binay2448
11 - Bolide

Hope this will help you...
