Alert: There is a planned Community maintenance outage October 16th from approximately 10 - 11 PM PST. During this time the Alteryx Community will be inaccessible. Thank you for your understanding!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Join w/o Unique Records in Both Tables

knnwndlm
8 - Asteroid

Hi SME,

 

What's the best approach to resolving joining two tables that have crossed referenced fields but do not have unique records in both tables in which case using Join will produce a multiplying number of records.  I suppose I could use the Find & Replace tool but I would run into the issue of matching multiple fields at the same time.  

 

Any suggestions on how to use the Join tool for tables that do not have unique records along with other tools to preserve the integrity of one table when looking up info from the other?

 

Thanks,

Konn

9 REPLIES 9
ChrisTX
16 - Nebula

Can you attach sample input data and expected output?

 

And describe, step by step, how you would perform the task(s) if you were doing them manually.

 

Alteryx, or any other automation tool, is simply translating a manual task into code/workflow steps.

 

Chris

knnwndlm
8 - Asteroid

Hi @ChrisTX,

 

Think I'm good as long as I have one table with unique records.  If I were to have duplicates in the common field, then the Join will only pick up the first one encountered - kind of like vlookup.  On a side note, what is the best approach to test two datasets in Alteryx to see if they have the same values (both with the same data structure).  I want to make sure that they are identical.

 

Thanks,

K

AdwaitTarudkar
8 - Asteroid

Hey @knnwndlm 

 

If two tables have the same schema (and same data structure) and you want to identify the same records, you can join both by using all fields as "join by specific fields"

 

It would be great if you could share a sample data table to better understand the problem.

 

Thanks

knnwndlm
8 - Asteroid

Thank you!

ChrisTX
16 - Nebula

@knnwndlm I'm not sure this statement is accurate:  If I were to have duplicates in the common field, then the Join will only pick up the first one encountered - kind of like vlookup.

 

an accurate statement is:  A join will pick up all joined records from the Right input table, even if the common field is duplicated in 2 or more records.

 

For your question about ... best approach to test two datasets in Alteryx to see if they have the same values 

 

See these posts:

 

How To: Compare Data from Two Data sets

Community > Designer > Browse Knowledge > How To: Compare Data from Two Data sets

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/How-To-Compare-Data-from-Two-Data-sets/ta-p/...

 

comparison macro

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/String-Comparison/m-p/388279#M71158

 

CReW Delta macro

This macro allows a user to input two files that are expected to be equal.

https://gallery.alteryx.com/#!app/CReW-Delta--Alpha-/5b41141b826fd3116445fee8

 

Comparing Data from Two Sets of Data, Calling out Differences

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Comparing-Data-from-Two-Sets-of-Data-C...

 

Compare 2 Data Sets

https://community.alteryx.com/t5/Engine-Works-Blog/Compare-2-Data-Sets/ba-p/88853

 

The purpose of the macro is to pull in two different sheets and help identify the differences between the two

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/How-To-Update-Fields-Selected-With-Con...

 

Community > Getting Started > Alteryx Use Cases > Data Check Between Two Sources

https://community.alteryx.com/t5/Alteryx-Use-Cases/Data-Check-Between-Two-Sources/ta-p/270082

 

Comparing data between two different servers

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Comparing-data-between-two-different-s...

 

Compare every value in every field from 2 different inputs

Community > Knowledge > Designer > How To: Compare Data from Two Data sets

How To: Compare Data from Two Data sets

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/How-To-Compare-Data-from-Two-Data-s...

 

Compare field values Left vs Right

https://community.alteryx.com/t5/Inspire-2019-Buzz/Nested-Macros-Session-Field-Comparison-macro/m-p/...

 

Iterative Data Comparison

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Iterative-Data-Comparison/td-p/841398/...

 

 

Chris

 

 

knnwndlm
8 - Asteroid

Thank you Chris!  

knnwndlm
8 - Asteroid

Hi @ChrisTX,

 

I'm still not certain about a case that I'm working with.  I have the following situation:

 

Table A:

  Field A

  Field B

  CITY

  STATE

  ZIPCODE

  Field C

  Field D

 

TABLE B:

  Field E

  CITY

  STATE

  ZIPCODE

  Field F

  Field G

 

Both tables have no unique records except Field A in Table A which has no connection to Table B.  If I were to use the Join tool with CITY/STATE/ZIPCODE, as long as I rely on the J anchor, my results should be fine, right?  Or will I have a situation where the Join will pick up the first find?  How do I go about testing this? 

 

I should mention that Table A is the main table and Table B serves as the lookup table.  What I want is to populate all the values in Table A based on what I found in Table B and use the J anchor to pick up records that are found based on CITY/STATE/ZIPCODE.

 

Thanks,

K

 

ChrisTX
16 - Nebula

For your question... Or will I have a situation where the Join will pick up the first find?

 

The J output anchor will *never* pick up only the first record.  It is not like an Excel VLookup.  The Join tool will pick up every joined record from both input anchors.

 

suggestion: inside Alteryx, left-click on the Join tool, click the link for Open Example, run the Example, and study the outputs.

 

Beyond just the Join tool, for any analytic you will write, the first step is to truly know the input data.  Use lots of investigative tools like Unique, Join, Summarize, etc to understand where duplicates exist and how data sets relate to each other.

 

If you join from table A to table B on City, State, and Zipcode, you will get duplicate records from table A if there are duplicates in Table B for the fields [City, State, Zip].

 

So attach a Unique tool to table B, and find out if there are duplicates.  If you identify a duplicate, join from A to B, and look at the output for the duplicate you found in B.  If the output isn't what you want, you'll have to figure out how to un-duplicate data in Table B, (but of course, only if the data in Table B can reliably be un-duplicated and give meaningful results).

 

Any analytic workflow, or any code in any programming language, is just telling the machine how you would perform a process manually.  You have to understand the input data, from all tables, know how it's structured, understand the cardinality.  

 

Chris

knnwndlm
8 - Asteroid

Thank you Chris!

Labels