Flag the items the value of which sums up to zero in two files
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I have a dataset from two files A and B. I want to flag the ID with values that add up to 0 in the two files.
Here is an example of the dataset.
Card | ID | Amount | Time | File |
202301 | 2023-01 | 5000 | 12:00 | A |
202302 | 2023-02 | 200 | 12:50 | A |
2023-01 | 2023-01 | -5000 | 12:00 | B |
2023-02 | 2023-02 | 150 | 12:50 | B |
2023-01 | 2023-01 | 250 | 16:00 | B |
The desired output would be:
Card | ID | Amount | Time | File | Flag |
202301 | 2023-01 | 5000 | 12:00 | A | Yes |
202302 | 2023-02 | 200 | 12:50 | A | No |
2023-01 | 2023-01 | -5000 | 12:00 | B | Yes |
2023-02 | 2023-02 | 150 | 12:50 | B | No |
2023-01 | 2023-01 | 250 | 16:00 | B | Yes |
Note that ID 2023-01 is flagged Yes because 5000 from file A plus -5000 in file B equals to zero. The additional record in file B with a value of of 250 is flagged Yes as well. ID 2023-02 is flagged No because there's no value in the two files follow this rule.
Solved! Go to Solution.
- Labels:
- Workflow
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Yoshiro_Fujimori Thanks for your help! After checking the data, I see cases of an ID having both values that add up to 0 and not to 0. In this case, I would like to flag these items as Yes even though it doesn't completely follow the rule.
Below is an updated example. Could you help me modify the workflow so that it achieves this new desired outcome?
ID | Amount | File |
2023-01 | 5000 | A |
2023-02 | 200 | A |
2023-01 | -5000 | B |
2023-02 | 150 | B |
2023-01 | 150 | B |
Output:
ID | Amount | File | Flag |
2023-01 | 5000 | A | Yes |
2023-02 | 200 | A | No |
2023-01 | -5000 | B | Yes |
2023-02 | 150 | B | No |
2023-01 | 250 | B | Yes |
Notice that ID 2023-01 now has values that add up to 0 in files A and B (5000 in A + -5000 in B = 0) and an additional record in file B with the value of 250 which doesn't result in 0 (5000 + 250 = 5250). In this case, I would like to flag this item as Yes as well because in the previous case of ID 2023-01, it works.
Thanks~
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Yoshiro_Fujimori Thanks for your effort! It's odd that I can't find my reply to you after refreshing the page, so I'm typing again here. After checking the data, I found that there're cases in which an ID could have values that add up to or not to 0. In this case, as long as there're values that add up to 0, I would like to flag the ID as "Yes".
Below is my updated example:
Card | ID | Amount | Time | File |
202301 | 2023-01 | 5000 | 12:00 | A |
202302 | 2023-02 | 200 | 12:50 | A |
2023-01 | 2023-01 | -5000 | 12:00 | B |
2023-02 | 2023-02 | 150 | 12:50 | B |
2023-01 | 2023-01 | 250 | 16:00 | B |
The new desired output would be:
Card | ID | Amount | Time | File | Flag |
202301 | 2023-01 | 5000 | 12:00 | A | Yes |
202302 | 2023-02 | 200 | 12:50 | A | No |
2023-01 | 2023-01 | -5000 | 12:00 | B | Yes |
2023-02 | 2023-02 | 150 | 12:50 | B | No |
2023-01 | 2023-01 | 250 | 16:00 | B | Yes |
Note that ID 2023-01 is flagged Yes because 5000 from file A plus -5000 in file B equals to zero. The additional record in file B with a value of 250 is flagged Yes as well.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I am not sure what you mean by
"an ID could have values that add up to or not to 0.
"In this case, as long as there're values that add up to 0, I would like to flag the ID as "Yes"
Once the running total adds up to 0, the ID should be flagged as "Yes" for all the rows?
If that is the case, here is one way of doing this.
Workflow
Output
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Yoshiro_Fujimori Thanks for your update. Running total is a tool that I've not used before. This is an interesting solution to the challenge. Your interpretation of the two quoted lines is correct. Unfortunately, after checking the data, I found out this workflow would miss flagging some items. I believe it's due to the sorting of time. Some + and - records in the two files after sorting are still not adjacent to each other; and therefore, failed to result in 0.
As can be seen from the given example, there could be different numbers of the same variable 2023-01 in files A and B, meaning that there could be just one 2023-01 in file A but two or three in file B. I still can't find a way to properly flag those that cancel out after adding together. My goal remains to find out the variables in file A that has a positive number which after summing up with the negative number of the same variable in file B would result in 0.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator