I have a dataset that runs daily through a workflow that gives me an output that we will call Output-BOM-TIMESTAMP. I need to check and compare data amongst both the files where the new timestamp file will be considered the latest.
1) I would like to compare both the files considering "Output-BOM-5-26-2022" to be the latest. We will have to remove any row or record that is not present in the new file considering P-B-S column to always be unique.
Output-BOM-5-26-2022
PBS | Prefix | Base | Suffix | Cost |
P1-B1-S1 | P1 | B1 | S1 | 150 |
P2-B2-S2 | P2 | B2 | S2 | 250 |
P3-B3-S3 | P3 | B3 | S3 | 350 |
P4-B4-S4 | P4 | B4 | S4 | 550 |
P6-B6-S6 | P6 | B6 | S6 | 650 |
Output-BOM-5-25-2022
PBS | Prefix | Base | Suffix | Cost |
P1-B1-S1 | P1 | B1 | S1 | 100 |
P2-B2-S2 | P2 | B2 | S2 | 200 |
P3-B3-S3 | P3 | B3 | S3 | 300 |
P5-B5-S5 | P5 | B5 | S5 | 400 |
P6-B6-S6 | P6 | B6 | S6 | 650 |
2) I would like to now process through the data and create two new columns in the Output file. "Action" and "Change". In Action, we have 3 categories. "Add", "Revised", "No Change" depending on the change in Cost values from one dataset to another.
In Change, I would like to calculate the difference in Cost from one dataset to another.
I am adding the expected output down below and if anyone could help me with this workflow. It would be great.
PBS | Prefix | Base | Suffix | Action | Change |
P1-B1-S1 | P1 | B1 | S1 | Revised | 50 |
P2-B2-S2 | P2 | B2 | S2 | Revised | 50 |
P3-B3-S3 | P3 | B3 | S3 | Revised | 50 |
P4-B4-S4 | P4 | B4 | S4 | Add | 550 |
P6-B6-S6 | P6 | B6 | S6 | No Change | 0 |
Solved! Go to Solution.
Hi @ssripat3, how does this look? Works for your example but let me know if there are any issues you come across when testing!
@DataNath Thank you for helping me out. It works for other sample data I tested on. Thank you
I just need one more advice. If I want to compare more than one column with values from another column. What would be the best way of doing it?
PBS | Prefix | Base | Suffix | Cost | Indicator | Date |
P1-B1-S1 | P1 | B1 | S1 | 150 | Y | 08-02-2021 |
P2-B2-S2 | P2 | B2 | S2 | 250 | N | 09-02-2021 |
P3-B3-S3 | P3 | B3 | S3 | 350 | N | 10-02-2021 |
P4-B4-S4 | P4 | B4 | S4 | 550 | Y | 11-02-2021 |
P6-B6-S6 | P6 | B6 | S6 | 650 | Y | 12-02-2021 |
PBS | Prefix | Base | Suffix | Cost | Indicator | Date |
P1-B1-S1 | P1 | B1 | S1 | 100 | Y | 08-02-2021 |
P2-B2-S2 | P2 | B2 | S2 | 200 | Y | 09-02-2021 |
P3-B3-S3 | P3 | B3 | S3 | 300 | N | 10-03-2021 |
P5-B5-S5 | P5 | B5 | S5 | 400 | Y | 11-02-2021 |
P6-B6-S6 | P6 | B6 | S6 | 650 | N | 12-03-2021 |
I would like to now compare the values "Cost", "Indicator" and "Date" across the two tables. Let us say if any field changes and differs from the first table. Then we say the Action is "Revised". But in this instance we have to change it to "Revised" even when one single column doesn't match the data from the other file.
I think Multi-Row formula is the best way to approach this, but I am lost of writing a formula using that tool.
Please help me if you can
How does this look @ssripat3? To start with, P5-B5-S5 is missing because I joined on PBS for the comparison - is this right or should I just join on position, regardless of whether the PBS matches? To me it just looked like this was the only/obvious unique identifier. Also, I've only left the new table in the view to finish with. You can show the original table by its side by selecting them again in the join configuration:
@DataNath Thank you for the help and pardon my dumb brain. I was overthinking it and literally forgot about OR condition and was thinking is a bit complex way of using Multi-Row formula
No problem @ssripat3 - happy to help! Don’t beat yourself up, happens to everyone from time to time!