Alteryx Designer Desktop Discussions

iudogie · ‎07-31-2021

I want to group data based on specific fields and then filter records based on that grouped data.

Here is my sample data

Cookies	Fruits	Vegetables	Cakes
Chocolate Chip	Apple	Onion	Chocolate unsweetened
Chocolate Chip	Apple	Onion	Vanilla
Snickerdoodle	Strawberry	Kale	Tiramisu
Peanut Butter	Blueberry	Spinach	Tres Leches
Chocolate Chip	Apple	Onion	Chantilly

I want to look for for records that are duplicates (match) for the Cookies, Fruits, and Vegetables fields. For the records that are duplicates on those fields, I want to filter based on the Cakes field.

For rows 1, 2, and 5 that match on the Cookies, Fruits, and Vegetables fields, I want only one of the records to continue through the workflow. I want to prioritize the record that contains the string 'unsweetened' in the Cake field.

If the record has no duplicates based on the specified fields (Cookies, Fruits, Vegetables), I want it to pass through as normal.

In the example above, I would want the following fields to continue to the next step of the workflow:

Cookies	Fruits	Vegetables	Cakes
Chocolate Chip	Apple	Onion	Chocolate unsweetened
Snickerdoodle	Strawberry	Kale	Tiramisu
Peanut Butter	Blueberry	Spinach	Tres Leches

Please let me know if this does not make sense.

anandshingi · ‎07-31-2021

If we consider your sample data as is, only using the 'Unique tool will give you the result you want.

However, it's possible that the first option in the list does not contain 'unsweetened' and gives the wrong output.

so, you can add the 'Formula' tool creating a new column as Sort ID with the condition 'IF Contains([Cakes], "unsweetened") THEN 1 ELSE 2 ENDIF'

Next, you sort ascending on the Sort ID column >> use the 'Unique' tool.

if you do not want the Sort ID column to be carried forward, use the select tool to get rid of it.

Cheers,

Anand

danilang · ‎08-01-2021

@anandshingi

Very clever of you to notice that the unique tool takes the first record it comes across as the unique one and adds all the rest to the duplicate output. Sorting to give the "unsweetened" records first ensure that they are in the Unique output. I'd "Like" this answer more than once if I could

Dan

iudogie · ‎08-01-2021

Thank you so much. I have additional information I would like to share.

anandshingi · ‎08-01-2021

Thanks for your kind words Dan. appreciate your feedback. This is my first ever post on the Alteryx community.

I don't know how to add your name while replying. sorry for that.

iudogie · ‎08-01-2021

Thank you for your help. I have a follow up question with a more complicated example.

Let's say I have a system in which a single payment ID may be associated with multiple documents. If the payment ID is only associated with one document, I just want it to flow through with the associated status. If the payment ID is associated with multiple documents, I want to use the following logic:

If a payment ID has multiple documents:

Check the statuses of all the documents

If any of the statuses are not REJECTED or DUPLICATE, use that status
If the statuses are all REJECTED, use REJECTED
If the statuses are all DUPLICATE, use DUPLICATE
If the statuses are a mixture of REJECTED or DUPLICATE, use REJECTED status

Here is the new sample data:

Payment ID	Document ID	Status	Employee Name
123	7410	Paid	Lael Odonnell
123	14523	Rejected	Lael Odonnell
789	45593	In Transit	Tanek Cardenas
676	10390	Expired	Deborah Lester
456	36213	Duplicate	Melyssa Frost
456	62231	Rejected	Melyssa Frost

In the example above, I would want the following fields to continue to the next step of the workflow:

Payment ID	Document ID	Status	Employee Name
123	7410	Paid	Lael Odonnell
789	45593	In Transit	Tanek Cardenas
676	10390	Expired	Deborah Lester
456	62231	Rejected	Melyssa Frost

Christina_H · ‎08-02-2021

Following on from the excellent answer from @anandshingi, for your second query I would create a lookup table prioritising the statuses then sort and use the unique tool again. I've had to assume a prioritisation order for statuses other than rejected and duplicate, but you can correct them if necessary!

Alteryx Designer Desktop Discussions

Grouping data based on fields and filtering based on that grouping

Re: Row creation

Re: How to select columns dynamically using number...

Re: Batch macro to read 1000+ .xlsx files with var...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...