Calculate min, max and mean values for a category
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi
I am attempting to recreate some excel analysis in Alteryx.
I have a dataset containing public accounts data data approx. 5,000 companies. Amongst the data are the compound average growth rate (CAGR) and the industrial SIC code.
I would like to identify "outlier" CAGR % (both max and min) for each SIC code, and also identify the companies with the max and min CAGR in order that these can potentially be excluded if required from final analysis.
Attached is an extract showing how the data is currently presented in Excel. The min/max/average are calculated using an Averageif/MaxIf/MinIf, and the company with the max/min CAGR looked up using an X-Lookup.
Can anyone help with a way to calculate the Min, Max and Average for each Industrial sector, and then how to identify the max/min companies? I am guessing the Summarize and then Fuzzy Match tool, but some guidance would be greatly appreciated.
David
Solved! Go to Solution.
- Labels:
- Data Investigation
- Reporting
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
You wouldn't want to use fuzzy match for this type of task. The summarize tool can provide Min, Max and Avg. Summarize the data grouped by industry SIC with min max and average on the 3 or 5 yr (or both) and after the summarize, join back to the original data with the join on industry SIC. After that, you can do additional sorting or analysis as needed to determine your outliers.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Here's a similar workflow as designed by @jdminton.
I also added a 'Z' score as handy way to help identify outliers as well as a rank for each company in each category both 3 and 5 years.
Edit: As a side note, some of the categories in the data only have 1 or 2 companies. The 'Z' scores, and values there are going to be missing or next to useless.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@jdminton and @smoosh - this is excellent. Thank you both. I really appreciate the help and advice.
David
