I have a dataset of hundreds of columns with thousands of rows and need to find the percentile of each value within each set. Normally to find the percentile of each value in a data set I would just sort the data, create a rank column, and divide the rank by the count of unique values. However, I do not want to create a workflow that does this for each of the hundreds of columns. Is there a way to loop the process or a tool that will do this?
Example Dataset Input:
| ID | Col1 | Col2 | Col3 | Col4 |
11111 | 1 | 15 | 99 | 56 |
| 22222 | 6 | 33 | 55 | 105 |
| 33333 | 3 | 6 | 66 | 12 |
| 44444 | 8 | 99 | 4 | 156 |
| 55555 | 9 | 55 | 15 | 56 |
| 66666 | 7 | 3 | 88 | 8 |
| 77777 | 4 | 1 | 5 | 60 |
Desired Output:
| ID | Col1 | Col2 | Col3 | Col4 |
| 11111 | 0% | 50% | 100% | 33% |
| 22222 | 50% | 67% | 50% | 83% |
| 33333 | 17% | 33% | 67% | 17% |
| 44444 | 83% | 100% | 0% | 100% |
| 55555 | 100% | 83% | 33% | 33% |
| 66666 | 67% | 17% | 83% | 0% |
| 77777 | 33% | 0% | 17% | 67% |
How can I do this for hundreds of columns? The above example is the inclusive percentile, but I am fine with finding the exclusive percentile as well.