Hi everyone, I have a weird question. I have a dataset that looks like this:
id | amount |
101 | 7 |
101 | -3 |
101 | -4 |
101 | 8 |
101 | -4 |
101 | -4 |
How do I group so that amounts 7, -3, -4 get the same group #? The logic is that 7 + -3 + -4 = 0, so I want to group them together. Similarly, 8 + -4 + -4 = 0.
For example, I want:
id | amount | group # |
101 | 7 | 1 |
101 | -3 | 1 |
101 | -4 | 1 |
101 | 8 | 2 |
101 | -4 | 2 |
101 | -4 | 2 |
I tried using summarize, but I can't just sum by ID number because most IDs have more than one group. The original dataset also has a Date column, which really doesn't follow a pattern, and a description column, which is just an explanation and is sometimes unique to each transaction.
Hi @nivnat ,
In my example, I'm assuming that every group always sums to 0 and that the rows are already sorted so that each group's rows sit together.
With those assumptions, I'm using a running total to compute a cumulative sum and then a multi-row formula to identify the different groups.
Take a look at the example and let me know if that works for you.
Best,
Fernando Vizcaino
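For readers without the attached workflow, the running-total idea can be sketched in plain Python. This is only a minimal sketch under the same assumptions (rows pre-sorted, every group sums to zero), not the workflow itself. The function name `group_by_zero_sum` and the tolerance `tol` are made up for illustration; the tolerance is there because decimal amounts rarely sum to exactly 0 in floating point.

```python
def group_by_zero_sum(rows, tol=1e-9):
    """Assign group numbers to (id, amount) rows.

    Assumes rows are already sorted so that each group's rows are adjacent.
    A group ends when the running total for the current id returns to zero.
    Returns a list of (id, amount, group) tuples.
    """
    result = []
    group = 0
    running = 0.0
    prev_id = None
    start_new = True
    for rec_id, amount in rows:
        if rec_id != prev_id:
            # A new id always starts a new group and resets the running total.
            start_new = True
            running = 0.0
        if start_new:
            group += 1
            start_new = False
        running += amount
        result.append((rec_id, amount, group))
        if abs(running) < tol:
            # Running total is back to zero: the next row opens a new group.
            start_new = True
            running = 0.0
        prev_id = rec_id
    return result


rows = [(101, 7), (101, -3), (101, -4), (101, 8), (101, -4), (101, -4)]
for row in group_by_zero_sum(rows):
    print(row)
# (101, 7, 1)
# (101, -3, 1)
# (101, -4, 1)
# (101, 8, 2)
# (101, -4, 2)
# (101, -4, 2)
```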
Many thanks! That works for most of my data. But I forgot to mention that there are a couple of groups where the sum is not zero, and they still have to be classified as separate groups. The problem is that it works until it hits a group whose sum never reaches zero; from there on, the rest of the id is put into the same group when it shouldn't be.
I've tweaked the data to include this condition too. What I get is:
id | amount | group |
76 | 6.67 | 1 |
76 | -6 | 1 |
76 | -0.67 | 1 |
93 | 9.33 | 2 |
93 | 3.67 | 2 |
93 | -3.67 | 2 |
What I need is:
id | amount | group |
76 | 6.67 | 1 |
76 | -6 | 1 |
76 | -0.67 | 1 |
93 | 9.33 | 2 |
93 | 3.67 | 3 |
93 | -3.67 | 3 |
Any help is much appreciated.
Hi @nivnat ,
Maybe I am oversimplifying your problem, but take a look at the example attached and let me know if that works for you.
Best,
Fernando Vizcaino
Yes! That solves the problem. To make it understandable: if we think of the positives as credit issued and the negatives as credit used, I only needed the ones where the credit was completely used. To use up your credit, the sum has to be zero, so the first part solves that. For the rest, I just grouped by the positives, because those are credits that are still unused.
Thanks!
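The second workflow is not reproduced in the thread, so the following is only one possible reading of the "group by the positives" idea, sketched in plain Python: open a new group whenever the id changes or a positive amount (a new credit) appears. The function name `group_by_positive` is hypothetical, and the sketch assumes each group begins with a single positive amount followed by the negatives that use it; two credits in a row would end up in separate groups.

```python
def group_by_positive(rows):
    """Assign group numbers to (id, amount) rows.

    Assumes rows are sorted so that each credit (positive amount) is
    followed directly by the negative amounts that use it.
    A new group starts at every id change and at every positive amount.
    Returns a list of (id, amount, group) tuples.
    """
    result = []
    group = 0
    prev_id = None
    for rec_id, amount in rows:
        if rec_id != prev_id or amount > 0:
            group += 1
        result.append((rec_id, amount, group))
        prev_id = rec_id
    return result


rows = [(76, 6.67), (76, -6), (76, -0.67), (93, 9.33), (93, 3.67), (93, -3.67)]
for row in group_by_positive(rows):
    print(row)
# (76, 6.67, 1)
# (76, -6, 1)
# (76, -0.67, 1)
# (93, 9.33, 2)
# (93, 3.67, 3)
# (93, -3.67, 3)
```

Under that assumption, the unused credit 9.33 gets its own group without ever needing the running total to reach zero.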