I want to group data based on specific fields and then filter records based on that grouped data.
Here is my sample data
Cookies | Fruits | Vegetables | Cakes |
Chocolate Chip | Apple | Onion | Chocolate unsweetened |
Chocolate Chip | Apple | Onion | Vanilla |
Snickerdoodle | Strawberry | Kale | Tiramisu |
Peanut Butter | Blueberry | Spinach | Tres Leches |
Chocolate Chip | Apple | Onion | Chantilly |
I want to look for for records that are duplicates (match) for the Cookies, Fruits, and Vegetables fields. For the records that are duplicates on those fields, I want to filter based on the Cakes field.
For rows 1, 2, and 5 that match on the Cookies, Fruits, and Vegetables fields, I want only one of the records to continue through the workflow. I want to prioritize the record that contains the string 'unsweetened' in the Cake field.
If the record has no duplicates based on the specified fields (Cookies, Fruits, Vegetables), I want it to pass through as normal.
In the example above, I would want the following fields to continue to the next step of the workflow:
Cookies | Fruits | Vegetables | Cakes |
Chocolate Chip | Apple | Onion | Chocolate unsweetened |
Snickerdoodle | Strawberry | Kale | Tiramisu |
Peanut Butter | Blueberry | Spinach | Tres Leches |
Please let me know if this does not make sense.
Solved! Go to Solution.
If we consider your sample data as is, only using the 'Unique tool will give you the result you want.
However, it's possible that the first option in the list does not contain 'unsweetened' and gives the wrong output.
so, you can add the 'Formula' tool creating a new column as Sort ID with the condition 'IF Contains([Cakes], "unsweetened") THEN 1 ELSE 2 ENDIF'
Next, you sort ascending on the Sort ID column >> use the 'Unique' tool.
if you do not want the Sort ID column to be carried forward, use the select tool to get rid of it.
Cheers,
Anand
Very clever of you to notice that the unique tool takes the first record it comes across as the unique one and adds all the rest to the duplicate output. Sorting to give the "unsweetened" records first ensure that they are in the Unique output. I'd "Like" this answer more than once if I could
Dan
Thank you so much. I have additional information I would like to share.
Thanks for your kind words Dan. appreciate your feedback. This is my first ever post on the Alteryx community.
I don't know how to add your name while replying. sorry for that.
Thank you for your help. I have a follow up question with a more complicated example.
Let's say I have a system in which a single payment ID may be associated with multiple documents. If the payment ID is only associated with one document, I just want it to flow through with the associated status. If the payment ID is associated with multiple documents, I want to use the following logic:
If a payment ID has multiple documents:
Here is the new sample data:
Payment ID | Document ID | Status | Employee Name |
123 | 7410 | Paid | Lael Odonnell |
123 | 14523 | Rejected | Lael Odonnell |
789 | 45593 | In Transit | Tanek Cardenas |
676 | 10390 | Expired | Deborah Lester |
456 | 36213 | Duplicate | Melyssa Frost |
456 | 62231 | Rejected | Melyssa Frost |
In the example above, I would want the following fields to continue to the next step of the workflow:
Payment ID | Document ID | Status | Employee Name |
123 | 7410 | Paid | Lael Odonnell |
789 | 45593 | In Transit | Tanek Cardenas |
676 | 10390 | Expired | Deborah Lester |
456 | 62231 | Rejected | Melyssa Frost |
Following on from the excellent answer from @anandshingi, for your second query I would create a lookup table prioritising the statuses then sort and use the unique tool again. I've had to assume a prioritisation order for statuses other than rejected and duplicate, but you can correct them if necessary!