Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Split paired records to the same file

agriese
7 - Meteor

Hey world!

 

I am working on a massive data de-duplication project, working with a lot of legacy data on multiple contact records. I've written this process, which pairs records by email address, marks the "master" record as the one more recently edited, and overwrites any last name shorter than two characters with the other last name in the pair. (IE Cheshire C. becomes Cheshire Cat).

 

I'd like to split this into two outputs. Output 1 should be any record where neither record nor its pair have had their name overwritten. Output 2 should be any record or its pair have had a name change.

 

How would I determine this, and how would I put both halves of a pair into the same output record?

 

Current:

EmailLevelFirstLast
alice@wonderland.comMasterAliceCarroll
alice@wonderland.comVictimAliceCarroll
ccat@wonderland.comVictimCheshireCat
ccat@wonderland.comMasterCheshireC

 

Goal:

EmailLevelFirstLastSaveTo
alice@wonderland.comMasterAliceCarrollFile1
alice@wonderland.comVictimAliceCarrollFile1
ccat@wonderland.comVictimCheshireCatFile2
ccat@wonderland.comMasterCheshireCFile2
4 REPLIES 4
CharlieS
17 - Castor
17 - Castor

Hello @agriese 

 

In the attached example, I used a Tile tool to determine email records that had more than one First/Last Name combo. Once I joined that information back to all records of each email, I filtered them into the two output files. Let me know if you have any questions.

agriese
7 - Meteor

Hey @CharlieS, thanks for that example! I believe it works, but I'm not understanding the thought process that led you here. Would you mind explaining how it works, so I can troubleshoot on my own data?

CharlieS
17 - Castor
17 - Castor

Sure thing. 

 

- The most important tool here is the Tile tool. I've set it to find unique values of [Email], [First], and [Last] and set it to group by the [Email] field. For each email address, this will determine the number of unique combinations of first and last names for that email. In the version that I send you, selecting [Email] as a unique field is redundant since it is set as a group by field. 

- The Summarize tool will group by [Email] and returns the number of unique First/Last name combos found for that email.

- The number of unique name records for each email is then rejoined and filtered so emails that all have the same name go to File1 and emails with different name combos go to File2.

agriese
7 - Meteor

Thanks! I'll have to mess around with this Tile tool more going forward.

Labels
Top Solution Authors