Alteryx Designer Desktop Discussions

BMartinCOE · ‎07-18-2023

Hello!

I am working with a dataset that contains duplicates. It looks something like this:

Building ID	Field 1	Field 2
1	1234 Maple St	7:00pm 6-6-2022
2	123 Yellow St	6:00pm 4-5-2022
1	123 Blue St	5:33pm 2-5-2022
2	123 Yellow St	3:00pm 6-6-2022
3	123 Green Ave	3:00pm 4-5-2022
1	1234 Maple St	7:10pm 6-6-2022

I am trying to remove duplicates based on multiple columns. Some users submit data with the same building ID and the same address, but at a later time. I want to remove the earlier submission, since I assume the more recent one is their intended submission (e.g., I would want to remove the older 123 Yellow St row). This becomes more complicated because some users submit data with the same building ID for different addresses (e.g., building ID 1 has two different addresses but three submissions). In that case the result should be two submissions, with the older duplicate of the repeated address removed.

Does anyone have any suggestions on how I can clean this up?

Thanks

*Edit: it is okay for there to be duplicate building IDs; it is not okay for there to be duplicates in Field 1.

geraldo · ‎07-19-2023

@BMartinCOE

An example workflow
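
For anyone reading along without Designer open, here is a rough sketch of the logic the question describes, written in Python rather than as an Alteryx workflow. It keeps only the most recent row for each Building ID + Field 1 pair. Assumptions on my part: Field 2 is read as time followed by month-day-year, the duplicate key is the pair (Building ID, Field 1), and the names `parse_submitted` and `keep_latest` are made up for illustration. In Designer the equivalent would typically be to parse Field 2 to a DateTime, Sort descending on it, then use a Unique tool on Building ID and Field 1 (Unique keeps the first record it sees per key).

```python
from datetime import datetime

# Sample rows from the question: (Building ID, Field 1, Field 2)
rows = [
    (1, "1234 Maple St", "7:00pm 6-6-2022"),
    (2, "123 Yellow St", "6:00pm 4-5-2022"),
    (1, "123 Blue St", "5:33pm 2-5-2022"),
    (2, "123 Yellow St", "3:00pm 6-6-2022"),
    (3, "123 Green Ave", "3:00pm 4-5-2022"),
    (1, "1234 Maple St", "7:10pm 6-6-2022"),
]


def parse_submitted(text):
    # Assumption: 12-hour time first, then month-day-year.
    return datetime.strptime(text, "%I:%M%p %m-%d-%Y")


def keep_latest(rows):
    """Keep the most recent submission per (Building ID, Field 1) pair."""
    latest = {}
    for building_id, address, submitted in rows:
        key = (building_id, address)
        current = latest.get(key)
        # Replace the stored row only if this one was submitted later.
        if current is None or parse_submitted(submitted) > parse_submitted(current[2]):
            latest[key] = (building_id, address, submitted)
    return list(latest.values())


deduped = keep_latest(rows)
for row in deduped:
    print(row)
```

On the sample data this leaves four rows: the 7:10pm Maple St row, the 6-6-2022 Yellow St row, and the single Blue St and Green Ave rows. If Field 1 alone must be unique regardless of building ID, the key would be just `address` instead of the pair.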
