I have data consisting of multiple groups shown below. Within each group, I need to identify specific records to keep, and specific ones to remove. I have manually done this in the column Keep or Remove but I want to recreate the last column using a formula.
I want to start from the latest date in each group (last row with date in ascending order) and Keep all the rows (starting from the last row in group and moving up) that sum up to the value in column A from bottom up until the entire value of column A is used up. All other remaining rows would be marked as Remove.
What I'm looking for is something where it checks the value of column B (in the last row with the latest date) against the value in column A in the same row. If the value of B is less than or equal to value of column A , mark row as Keep if not mark as Remove. Now we move up one row with the next latest date, if value of column B plus the sum of all the previous Keep column B's is less than or equal to the value in column A, mark as Keep, if not mark as Remove. Moving on to the next row up, if the value in column B plus the sum of all the previous Keep column B's is less than or equal to the value in column A, mark as Keep, if not mark as Remove.
Essentially what I am looking for is a formula for the last column that can do what I am describing above for the below data set.
See below for an example table:
Group | A | B | Date | Keep or Remove? |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 1 | 5/4/2022 | Keep |
1 | 3 | 2 | 5/11/2022 | Remove |
1 | 3 | 2 | 5/17/2022 | Remove |
1 | 3 | 2 | 6/1/2022 | Remove |
1 | 3 | 2 | 6/7/2022 | Remove |
1 | 3 | 2 | 6/13/2022 | Remove |
1 | 3 | 2 | 6/20/2022 | Remove |
1 | 3 | 2 | 6/24/2022 | Keep |
2 | 4 | 1 | 5/17/2022 | Remove |
2 | 4 | 1 | 6/1/2022 | Keep |
2 | 4 | 1 | 6/7/2022 | Keep |
2 | 4 | 4 | 6/13/2022 | Remove |
2 | 4 | 1 | 6/20/2022 | Keep |
2 | 4 | 1 | 6/24/2022 | Keep |
3 | 2 | 1 | 6/1/2022 | Keep |
3 | 2 | 1 | 6/7/2022 | Keep |
3 | 2 | 4 | 6/13/2022 | Remove |
Solved! Go to Solution.
@Althelp trying to understand your logic for keeping records and removing records. Can you explain the logic for keeping the 6/24 and 5/4 records from the group 1?
@binuacs if you look at column B, those 2 rows sum up to 3 (which is the value in column A) when you start from the bottom (last date) and go up. As the value in column A is then used up, you would mark the remaining rows (going up) as remove.
@DataNath @grazitti_sapna When I applied this to a larger dataset, I realized I haven't accounted for every situation so it's right sometimes but also wrong sometimes. I have to keep the full value in "A" no matter what, so if nothing adds up to it exactly, then i would still need to keep the row or rows starting from the last date in the group that contain the value of "A" even if over the actual value. See below for 3 more groups, the Alteryx result, and what the correct answer should have been.
Group | A | B | Date | Alteryx Result | Correct Answer |
4 | 4 | 4 | 1/24/2022 | Remove | Remove |
4 | 4 | 4 | 2/28/2022 | Remove | Keep |
4 | 4 | 1 | 6/22/2022 | Keep | Keep |
5 | 1 | 4 | 1/31/2022 | Remove | Remove |
5 | 1 | 2 | 12/6/2021 | Remove | Keep |
6 | 50 | 55 | 6/16/2022 | Remove | Keep |
@DataNath @grazitti_sapna Actually, I realized the 3rd row in date 6/22/2022 should be Remove not Keep because the row above it 2/28/2022 is Keeping the full quantity of 4 in A. So use this table instead.
Group | A | B | Date | Alteryx Result | Correct Answer |
4 | 4 | 4 | 1/24/2022 | Remove | Remove |
4 | 4 | 4 | 2/28/2022 | Remove | Keep |
4 | 4 | 1 | 6/22/2022 | Keep | Remove |
5 | 1 | 4 | 1/31/2022 | Remove | Remove |
5 | 1 | 2 | 12/6/2021 | Remove | Keep |
6 | 50 | 55 | 6/16/2022 | Remove | Keep |
@DataNath @grazitti_sapna Would you happen to be able to help with an alternate simpler scenario as well? In this case, I want to start from the last row in the group and keep the rows that where Column B equal or exceed the quantity in Column A. All other rows can be marked as Remove once Column A has been reached or exceeded.
Group | A | B | Date | Keep or Remove |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 1 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/11/2022 | Remove |
1 | 3 | 2 | 5/17/2022 | Remove |
1 | 3 | 2 | 6/1/2022 | Remove |
1 | 3 | 2 | 6/7/2022 | Remove |
1 | 3 | 2 | 6/13/2022 | Remove |
1 | 3 | 2 | 6/20/2022 | Keep |
1 | 3 | 2 | 6/24/2022 | Keep |
2 | 4 | 1 | 5/17/2022 | Remove |
2 | 4 | 1 | 6/1/2022 | Remove |
2 | 4 | 1 | 6/7/2022 | Remove |
2 | 4 | 4 | 6/13/2022 | Keep |
2 | 4 | 1 | 6/20/2022 | Keep |
2 | 4 | 1 | 6/24/2022 | Keep |
3 | 2 | 1 | 6/1/2022 | Remove |
3 | 2 | 1 | 6/7/2022 | Remove |
3 | 2 | 4 | 6/13/2022 | Keep |
4 | 4 | 4 | 1/24/2022 | Remove |
4 | 4 | 4 | 2/28/2022 | Keep |
4 | 4 | 1 | 6/22/2022 | Keep |
5 | 1 | 4 | 1/31/2022 | Remove |
5 | 1 | 2 | 12/6/2021 | Keep |
6 | 50 | 55 | 6/16/2022 | Keep |
@DataNath @grazitti_sapna Would you be able to help with a simpler second scenario? In this case, I simply want to start from the bottom up of each group and "Keep" rows until the sum of B from bottom up equals or exceeds the value of A with all other rows being marked as "Remove". See below.
Group | A | B | Date | Correct Answer |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 1 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/11/2022 | Remove |
1 | 3 | 2 | 5/17/2022 | Remove |
1 | 3 | 2 | 6/1/2022 | Remove |
1 | 3 | 2 | 6/7/2022 | Remove |
1 | 3 | 2 | 6/13/2022 | Remove |
1 | 3 | 2 | 6/20/2022 | Keep |
1 | 3 | 2 | 6/24/2022 | Keep |
2 | 4 | 1 | 5/17/2022 | Remove |
2 | 4 | 1 | 6/1/2022 | Remove |
2 | 4 | 1 | 6/7/2022 | Remove |
2 | 4 | 4 | 6/13/2022 | Keep |
2 | 4 | 1 | 6/20/2022 | Keep |
2 | 4 | 1 | 6/24/2022 | Keep |
3 | 2 | 1 | 6/1/2022 | Remove |
3 | 2 | 1 | 6/7/2022 | Remove |
3 | 2 | 4 | 6/13/2022 | Keep |
4 | 4 | 4 | 1/24/2022 | Remove |
4 | 4 | 4 | 2/28/2022 | Keep |
4 | 4 | 1 | 6/22/2022 | Keep |
5 | 1 | 4 | 1/31/2022 | Remove |
5 | 1 | 2 | 12/6/2021 | Keep |
6 | 50 | 55 | 6/16/2022 | Keep |
Would you also be able to help with a secondary simpler scenario? In this case I simply want to start from the bottom of each group and "Keep" all the rows until the sum of B is either equal to or exceeds the value of A. Once this is reached, the remaining rows going up would be marked as "Remove". See Below -
Group | A | B | Date | Correct Answer |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 2 | 5/2/2022 | Remove |
1 | 3 | 1 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/4/2022 | Remove |
1 | 3 | 2 | 5/11/2022 | Remove |
1 | 3 | 2 | 5/17/2022 | Remove |
1 | 3 | 2 | 6/1/2022 | Remove |
1 | 3 | 2 | 6/7/2022 | Remove |
1 | 3 | 2 | 6/13/2022 | Remove |
1 | 3 | 2 | 6/20/2022 | Keep |
1 | 3 | 2 | 6/24/2022 | Keep |
2 | 4 | 1 | 5/17/2022 | Remove |
2 | 4 | 1 | 6/1/2022 | Remove |
2 | 4 | 1 | 6/7/2022 | Remove |
2 | 4 | 4 | 6/13/2022 | Keep |
2 | 4 | 1 | 6/20/2022 | Keep |
2 | 4 | 1 | 6/24/2022 | Keep |
3 | 2 | 1 | 6/1/2022 | Remove |
3 | 2 | 1 | 6/7/2022 | Remove |
3 | 2 | 4 | 6/13/2022 | Keep |
4 | 4 | 4 | 1/24/2022 | Remove |
4 | 4 | 4 | 2/28/2022 | Keep |
4 | 4 | 1 | 6/22/2022 | Keep |
5 | 1 | 4 | 1/31/2022 | Remove |
5 | 1 | 2 | 12/6/2021 | Keep |
6 | 50 | 55 | 6/16/2022 | Keep |