ALTERYX INSPIRE | Join us this May for for a multi-day virtual analytics + data science experience like no other! Register Now
The Alteryx Community will be temporarily unavailable for a few hours due to scheduled maintenance starting on Thursday, April 22nd at 5pm MST. Please plan accordingly.

Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Join Tool Issue

andrewc22
7 - Meteor

I have an Alteryx application that is checking for duplicates and similarities between two txt files and outputting to excel in multiple different sheets based on similarities/differences. 

 

I am having an issue with the join tool at the end of my workflow. Incoming from the left i have 369 Records and from the right 364. 


The output from the Join tool only has 7 records from the left, 2 records from the right, and 362 from the inner join.

 

Is there a way to have all records that are duplicate output from the inner join? I have to do further filtration to the inner join data on fields I did not join the 2 instances with, so I need all 700+ records.

 

I have attached screenshots below of the workflow.

 

Thank you for any assistance!

 

 

skeen503979
8 - Asteroid

@andrewc22 

Yes, the Unique tool will help you with that!  You'll need to select which column you believe has duplicates.  The "U" output will display all unique records while the "D" output will display all duplicates.  That is where it sounds like you'll want to focus.

 

skeen503979_2-1614278285084.png

 

andrewc22
7 - Meteor

I've tried that, but no luck. The issue isn't identifying the unique/duplicate records right now, I am trying to figure out where the 300+ records went because I need to be able to work with those for my next steps. Does that make sense?

skeen503979
8 - Asteroid

@andrewc22  

Yes, I think I understand you correctly.  That sounds very odd that the # of records output from L, J, and R do not add up to the number of records input from L and R.  Have you tried deleting the Join tool and trying again fresh?  That tool should not be affecting the number of records, only rearranging them depending on the column joined.  Can you share the workflow or a more clear screenshot showing the count of records per input/output?

mceleavey
14 - Magnetar

Not really. Can you provide a mock up of what you're trying to achieve?

 

If you want to see the duplicates from the inner join, you need to determine what you mean by a duplicate. Is it where all fields are exactly the same? The key fields are the same?

 

You can start by using a sum tool and grouping by the fields you want to define as a duplicate if they are the same, including a count to show how many records are in that group. Then filtering out the records where that count =1. You're left with the "Duplicates".

 

M.

andrewc22
7 - Meteor

I tried deleting and refreshing, but same result.  @skeen503979 

 

Attached is each input and output with record #

andrewc22
7 - Meteor

@mceleavey  I am joining them based on key columns (which does not include dates) which will output to an excel sheet. Next based on key columns, take the duplicates and on another sheet output which records have the same key columns but different dates. 

 

Essentially I have these 5 excel sheets:

Unique to instance 1

Unique to instance 2

Key columns match but differ on dates

All columns match

All records from instance 1 and 2

 

Labels