Alteryx Designer Desktop Discussions

EvolveKev · ‎01-11-2019

Hi guys,

I've got a simple problem that I'm having a hard time getting my head around.

I have two sets of data (2 files). File 1 is my master file of a couple of hundred thousand addresses. File 2 is a file of addresses of houses which have specific characteristics.

Quite simply, I want to add a field to file 1 indicating if its address can be found in file 2. This is giving me a headache!

I can easily union the data, and then output the records that match. I can then create a new field using the formula tool which indicates the match. Obviously, 100% of the output is a match. My problem is I want to see not just the matches, but the whole dataset with an additional field indicating if it's a match.

I know this is easy...I'm just at a wall! Any ideas?

mmenth · ‎01-11-2019

Hi there,

I would recommend first joining the two files on address, with file 1 as your left input and file 2 as your right input. The records that join (the ones that come out of the J output node) are the ones you want to flag for file 1. If you click on the join tool, you can uncheck all the fields from the file 2 input. Then you can add a formula tool after the J node with a field named something like 'address found in file 2', and in the formula section simply put a 1. You can make this an int or a byte. Finally, union the L and J nodes of your join tool, and you will have your original dataset for file 1, plus a field that indicates if the address exists in file 2. This new field will be 1 if it exists, Null if it doesn't exist. You can easily replace these nulls with zeros with another formula tool or a data cleansing tool if you wish.
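For anyone who finds the logic easier to follow as code, here is a rough Python sketch of the same join, flag, and union steps. It is not Alteryx code. The sample addresses, the `feature` column, the `flag_matches` helper, and the `address_found_in_file_2` field name are all made up for illustration, and it assumes addresses match exactly as text (same case and spacing).

```python
# Hypothetical sample data standing in for the two files.
file1 = [
    {"address": "1 High St"},
    {"address": "2 Low Rd"},
    {"address": "3 Mill Ln"},
]
file2 = [
    {"address": "2 Low Rd", "feature": "garage"},
    {"address": "9 Far Away", "feature": "pool"},
]

# Illustrative name for the new flag field.
FLAG = "address_found_in_file_2"


def flag_matches(left, right, key="address"):
    """Return every record of `left` with a 1/0 flag for a match in `right`."""
    # Addresses present in file 2; only the key is kept, which mirrors
    # unchecking all file 2 fields in the join.
    right_keys = {row[key] for row in right}

    # "J" output: file 1 records that found a match, flagged with 1.
    joined = [{**row, FLAG: 1} for row in left if row[key] in right_keys]

    # "L" output: file 1 records with no match; the flag is missing (None).
    left_only = [{**row, FLAG: None} for row in left if row[key] not in right_keys]

    # Union of J and L gives back the whole of file 1.
    combined = joined + left_only

    # Replace the nulls with zeros.
    for row in combined:
        if row[FLAG] is None:
            row[FLAG] = 0
    return combined


result = flag_matches(file1, file2)
for row in result:
    print(row)
```

Every record from file 1 comes back exactly once, with matched records listed first because of the union order. Because the lookup uses a set, repeated addresses in file 2 do not duplicate rows of file 1, which a plain join on a non-unique key would do.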

Best,

mmenth

EvolveKev · ‎01-11-2019

Hey @mmenth,

Thank you so much - this worked perfectly. Perfectly!

You're awesome.

Flagging Matches from 2 different sources
