Alteryx Community,
I have 50 different files that I receive on a monthly basis, each containing thousands of names and addresses.
I use a piece of software that identifies all the duplicate records across the 50 files based on name and address. It creates a field called [DupeGroup] and assigns a new number to each set of duplicates it finds. For example, if Joe Smith, who lives in Washington DC, appears in 3 of the files, it puts a "1" in the [DupeGroup] field in all 3 files. The next set of duplicates gets "2", and so on.
Which leaves me with something like this:
FILE# | DUPE GROUP |
FILE 1 | 1 |
FILE 2 | 1 |
FILE 3 | 1 |
FILE 4 | 1 |
FILE 5 | 1 |
FILE 2 | 2 |
FILE 3 | 2 |
FILE 4 | 2 |
FILE 1 | 3 |
FILE 3 | 3 |
FILE 5 | 3 |
I'm trying to create a workflow that shows how many times each file matched every other file, based on the [DupeGroup] field.
So based on the above table the end result would be this:
FILE # | FILE 1 | FILE 2 | FILE 3 | FILE 4 | FILE 5 |
FILE 1 | 0 | 1 | 2 | 1 | 2 |
FILE 2 | 1 | 0 | 2 | 2 | 1 |
FILE 3 | 2 | 2 | 0 | 2 | 2 |
FILE 4 | 1 | 2 | 2 | 0 | 1 |
FILE 5 | 2 | 1 | 2 | 1 | 0 |
Using File 1 and File 5 as an example, you can see they matched each other 2 times: once in [DupeGroup]=1 and again in [DupeGroup]=3.
Any recommendations on what tool to even begin to use to get this done? I've been racking my brain over this for the last couple of days and just can't seem to find a place to start.
Javier
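For anyone who wants to check the logic outside of Alteryx, here is a minimal Python sketch of the counting described above, not a definitive implementation. It groups the files by [DupeGroup], then adds 1 for every pair of files that share a group. The name `overlap_matrix` and the `rows` list are made up for illustration, and it assumes a file counts only once per dupe group even if it holds several records in that group.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows copied from the table in the question: (file, dupe_group)
rows = [
    ("FILE 1", 1), ("FILE 2", 1), ("FILE 3", 1), ("FILE 4", 1), ("FILE 5", 1),
    ("FILE 2", 2), ("FILE 3", 2), ("FILE 4", 2),
    ("FILE 1", 3), ("FILE 3", 3), ("FILE 5", 3),
]

def overlap_matrix(rows):
    """Count, for every pair of files, how many dupe groups they share."""
    # Collect the distinct files seen in each dupe group.
    # Assumption: a file counts once per group, so a set is used.
    files_by_group = defaultdict(set)
    for file_name, group in rows:
        files_by_group[group].add(file_name)

    files = sorted({file_name for file_name, _ in rows})
    counts = {a: {b: 0 for b in files} for a in files}

    # Every pair of files inside the same group is one "hit" in both directions.
    for members in files_by_group.values():
        for a, b in combinations(sorted(members), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return files, counts

files, counts = overlap_matrix(rows)

# Print in the same layout as the desired result table.
print("FILE # | " + " | ".join(files) + " |")
for a in files:
    print(a + " | " + " | ".join(str(counts[a][b]) for b in files) + " |")
```

The diagonal stays at 0 because a file is never paired with itself. In Alteryx terms the same idea would probably be a self-join on [DupeGroup], a filter to drop rows where both sides are the same file, a count per file pair, and a pivot into columns, but treat that mapping as an assumption rather than a tested workflow.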
I don't know when, but one day when I meet you, I'm buying you dinner! Thank you so much!
Javier
Wow, next time, let's try to answer your query. 😁