Hi Team,
I am having an issue re-creating an Alteryx workflow that would accurately calculate the quartile data represented in the data below (Source Data), as required by the rules and regulations of my industry. Also see below, desired results (Table 2):
Source Data:
Company Name | Data |
Company 1 | N/A |
Company 2 | N/A |
Company 3 | 19.64% |
Company 4 | 1.29% |
Company 5 | 19.79% |
Company 6 | 10.35% |
Company 7 | 4.57% |
Company 8 | 8.94% |
Company 9 | 3.15% |
Company 10 | 8.03% |
Company 11 | 14.75% |
Company 12 | 3.91% |
Company 13 | 4.40% |
Company 14 | 4.78% |
Company 15 | -2.83% |
Company 16 | 11.30% |
Table 2:
Data | |
Max | 19.8% |
Q3 | 11.3% |
Median | 6.4% |
Q1 | 3.9% |
Min | -2.8% |
Q3 =AVERAGE(LARGE(IF(ISNUMBER(B3:B18),B3:B18,""), INT((COUNT(B3:B18)-1)/4)+1), LARGE(IF(ISNUMBER(B3:B18),B3:B18,""), INT(COUNT(B3:B18)/4)+1))
Q1 =AVERAGE(LARGE(IF(ISNUMBER(B3:B18),B3:B18,""),INT((3*COUNT(B3:B18)-1)/4)+1),LARGE(IF(ISNUMBER(B3:B18),B3:B18,""), INT(3*COUNT(B3:B18)/4)+1))
All data is coming from Column B, as used in the formulas above. The Summarize tool does not achieve the desired result, nor does the profiling tool.
Any help would be greatly appreciated. Otherwise my only option would be to stop overarching the workflow, manually input the formula, and re-input the new quartile data.
Solved! Go to Solution.
Hi @MN124,
Have you checked out the basic data profile tool? It looks like it has everything you're looking for except median, but you can get that from the summarize tool. The quartile numbers a little different that what you have, so I am not sure it's using the exact same calculation, but it may be worth looking at.
Thanks @Greg_Murray,
I've done some searching on other quartile related questions on the board and it led me to the tool you've linked. The sticking point is that I need to emulate the exact quartile numbers that some out of the formulas listed, as required by industry regulations. I've pleaded my case with much simpler, very close quartile calcs via the summary/data profile tool to no avail.
Gotcha. Try this one. You'll definitely want to do some testing, but it should be a start.
Edit: Here is a simpler version too. The first version had a few redundancies.
You're fast. That is looking really good. I will try to piece those concepts into my workflow.
Quick side - in my actual data set the source data is also divided into additional sub categories, example below:
Group | Company Name | Data |
A | Company 1 | N/A |
A | Company 2 | N/A |
A | Company 3 | 19.64% |
A | Company 4 | 1.29% |
A | Company 5 | 19.79% |
A | Company 6 | 10.35% |
B | Company 7 | 4.57% |
B | Company 8 | 8.94% |
B | Company 9 | 3.15% |
B | Company 10 | 8.03% |
B | Company 11 | 14.75% |
C | Company 12 | 3.91% |
C | Company 13 | 4.40% |
C | Company 14 | 4.78% |
C | Company 15 | -2.83% |
C | Company 16 | 11.30% |
If I wanted individual interquartile calculations for each independent group. Is that just an addition the summary tools via 'Group By - Group' on top of the Value (min,Q1,med,Q3,Max) calculation. An ideal output resulting like:
Group | Min | Q1 | Median | Q3 | Max |
A | ### | ### | ### | ### | ### |
B | ### | ### | ### | ### | ### |
C | ### | ### | ### | ### | ### |
Sorry, I used an incorrect starting point. This is part of a much larger workflow.
No worries. You have the right idea. The summarize tools would all need to group on the the [group] field, but there are a few other changes what would be needed as well. see attached.
Got it up and running! Thanks!!