Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Standard Deviation using Multiple Column Groupings

aralbrecht
7 - Meteor

Hello Alteryx Community!

 

I am attempting to determine if values in my data-set are outliers. Of course, to do this, I need to calculate standard deviation first. 

 

My data can have occurrences where different customers can purchases the same items, so I created an ID to join the two together (SmithA1 and Brown A1). Grouping by this CustomerID, I need to calculate if $ Error for any given week in a range of weeks is outside of 3 St. Deviations from the norm. If so, I would like to deem the week an outlier.

 

I have attached an image below, as well as an example of the file I am using. The columns with blue headers are how my data is already set up, and I would like to use Alteryx to calculate the columns with yellow headers. I have attemped to solve this using the Summarize Tool, but am running into a wall.

 

Thank you in advance!

 

AlteryxStDevHelp.png

11 REPLIES 11
bpatel
Alteryx
Alteryx

I attached a workflow that i think may help you with this. I did it two ways - grouping by weeks and another way by grouping item and CustomerItemID. I think you are looking to group by weeks but I added a different way as well. I used the summarize tool to find the standard deviation and joined it back together with the data. 

 

Hope this helps!

danilang
19 - Altair
19 - Altair

Hi @aralbrecht 

 

The solution provided by @bpatel is a good first step, however it's missing a key point.  The data values are considered outliers if they lie outside the range bounded by 3 STDevs, on either side of the mean, assuming the distribution isn't skewed.  

 

WF.png

 

I started with @bpatel's workflow and added the mean(AVG in this case) into the Summarize tool and changed the final formula to check if [Dollar Error] is outside the 3 STDev range from the mean, resulting in Results.png

 

Note: Your original data was very flat and non of the values counted as outliers.  I bulked up the middle range and added some obvious outlier data

 

Dan

 

aralbrecht
7 - Meteor

Thank you both for your fast responses! However, I apologize as I forgot to include a key detail in my data-set when originally posting, only realizing this is an issue when studying both of your workflows...

 

I also have location data that I need to summarize by week (thinking of PivotTable in Excel, include image of logic in second image below).

 

Essentially, I need to:

 

  • By CustomerItemID, summarize the "Dollar Error" for all locations into one row (thinking Pivot Table-like logic)
  • Once summarized, calculate the standard deviation for set of now-summarized week's Total Dollar error
  • If a summarized week is an outlier (outside 3 St.Devs away on their side of the mean), mark that week (assuming this would repeat for each location for that week) as an outlier

 

Essentially, I am trying to determine which weeks are outliers for each CustomerID, but am running into an issue since there are multiple locations that need summed to determine what he weekly value is.

 

I have re-attached the file with location data.

 

Thank you again in advance!

 

Aletyx St.Dev Help 2.pngAlteryxHelp3.png

danilang
19 - Altair
19 - Altair

Hi @aralbrecht 

 

I think this is what your looking for.  I summarized the weekly results before passing them to the rest of the procedure.

 

WF.png

 

It wasn't clear whether you wanted the outliers by week or for the original records, so it does both.  The top branch is by week and the bottom is all records including locations

 

Dan

aralbrecht
7 - Meteor

Thank you both for you help! I slightly modified the solution to join the results together using an ID created using Customer-Item-Week, as I needed the result from the first stream in the workflow to append to each record in the second stream. I've included and image of it below.

 

AlteryxStDevHelpResult.png

Nickvv
7 - Meteor

Danilang i have an older verison of Altryx and isnt letting me open this attached file. I think is what i need to results i need. Is there any way to save the file differently ?  Thanks Nick

danilang
19 - Altair
19 - Altair

Hi @Nickvv 

 

If you open the .yxmd file that you get after the import process with Notepad, you can change the second line from 

 

<AlteryxDocument yxmdVer="2019.2">

 

to what ever your versions is.  Save it and you be able to open it in Alteryx

 

Dan

 

 

Nickvv
7 - Meteor

This what im getting when i open in notepad:

 

clipboard_image_0.png

danilang
19 - Altair
19 - Altair

Hi @Nickvv 

 

That's the .yxzp file that you downloaded from the community.  You need to find the .yxmd file.   It'll  probably be in a subdirectory called  Outliersv2. 

 

Dan

 

 

Labels