I am attempting to group customer sales by category using percentile to get a normal distribution. I cannot use a std dev because of the nature of the data the cluster toward the low end and the high variance would have -1 std dev from the median below 0. So I settled on running it through a percentile summarize and taking the 2.5, 13.5, 34, 34, 13.5, 2.5 as the thresholds for my groups to get an even distribution. But when i perform the percentile function it returns the raw value of sales, not a percentile ranking. has anyone ran into this. The percentile macro does not work for me because i believe it is std dev based and in some uncommon categories my lowest value will be 34 and highest will be 96. i attached a sample data set of what my expected/desired result is vs what the tool is spitting out.I just made up some data to keep it simple and for privacy reasons
Student | Class | rawscore | Percentile_rawscore | expected result | |
azmodius | history | 150 | 150 | 50 | 50th |
brian | history | 85 | 85 | 0 | 0th |
britney | history | 184 | 184 | 80 | 80th |
casey | history | 155 | 155 | 60 | 60th |
helga | history | 86 | 86 | 10 | 10th |
jaime | history | 96 | 96 | 20 | 20th |
james | history | 125 | 125 | 30 | 30th |
jenny | history | 130 | 130 | 40 | 40th |
kelly | history | 165 | 165 | 70 | 70th |
kristen | history | 188 | 188 | 90 | 90th |
azmodius | math | 421 | 421 | 90 | 90th |
brian | math | 158 | 158 | 10 | 10th |
britney | math | 255 | 255 | 60 | 60th |
casey | math | 285 | 285 | 70 | 70th |
helga | math | 248 | 248 | 40 | 40th |
jaime | math | 235 | 235 | 30 | 30th |
james | math | 254 | 254 | 50 | 50th |
jenny | math | 288 | 288 | 80 | 80th |
kelly | math | 145 | 145 | 0 | 0th |
kristen | math | 210 | 210 | 20 | 20th |
azmodius | pe | 145 | 145 | 0 | 0th |
brian | pe | 258 | 258 | 40 | 40th |
britney | pe | 231 | 231 | 10 | 10th |
casey | pe | 247 | 247 | 20 | 20th |
helga | pe | 955 | 955 | 90 | 90th |
jaime | pe | 752 | 752 | 80 | 80th |
james | pe | 251 | 251 | 30 | 30th |
jenny | pe | 265 | 265 | 50 | 50th |
kelly | pe | 596 | 596 | 70 | 70th |
kristen | pe | 456 | 456 | 60 | 60th |
Solved! Go to Solution.
@BFritter How about the attached.
I used the sort tool to order the records by score. Then the multi-row formula ranks them within each class. Then the summarize gets the max rank per class. The find replace then adds the Max rank back to each class. The formula tool gets the percentile and multiplies it by 100.
Hopefully that's the direction you were looking to go!
Thanks, this gets me where i wanted to go. I don't get why the percentile function doesn't do this though.