Hi SME,
What's the best approach to resolving joining two tables that have crossed referenced fields but do not have unique records in both tables in which case using Join will produce a multiplying number of records. I suppose I could use the Find & Replace tool but I would run into the issue of matching multiple fields at the same time.
Any suggestions on how to use the Join tool for tables that do not have unique records along with other tools to preserve the integrity of one table when looking up info from the other?
Thanks,
Konn
Solved! Go to Solution.
Can you attach sample input data and expected output?
And describe, step by step, how you would perform the task(s) if you were doing them manually.
Alteryx, or any other automation tool, is simply translating a manual task into code/workflow steps.
Chris
Hi @ChrisTX,
Think I'm good as long as I have one table with unique records. If I were to have duplicates in the common field, then the Join will only pick up the first one encountered - kind of like vlookup. On a side note, what is the best approach to test two datasets in Alteryx to see if they have the same values (both with the same data structure). I want to make sure that they are identical.
Thanks,
K
Hey @knnwndlm
If two tables have the same schema (and same data structure) and you want to identify the same records, you can join both by using all fields as "join by specific fields"
It would be great if you could share a sample data table to better understand the problem.
Thanks
Thank you!
@knnwndlm I'm not sure this statement is accurate: If I were to have duplicates in the common field, then the Join will only pick up the first one encountered - kind of like vlookup.
an accurate statement is: A join will pick up all joined records from the Right input table, even if the common field is duplicated in 2 or more records.
For your question about ... best approach to test two datasets in Alteryx to see if they have the same values
See these posts:
How To: Compare Data from Two Data sets
Community > Designer > Browse Knowledge > How To: Compare Data from Two Data sets
comparison macro
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/String-Comparison/m-p/388279#M71158
CReW Delta macro
This macro allows a user to input two files that are expected to be equal.
https://gallery.alteryx.com/#!app/CReW-Delta--Alpha-/5b41141b826fd3116445fee8
Comparing Data from Two Sets of Data, Calling out Differences
Compare 2 Data Sets
https://community.alteryx.com/t5/Engine-Works-Blog/Compare-2-Data-Sets/ba-p/88853
The purpose of the macro is to pull in two different sheets and help identify the differences between the two
Community > Getting Started > Alteryx Use Cases > Data Check Between Two Sources
https://community.alteryx.com/t5/Alteryx-Use-Cases/Data-Check-Between-Two-Sources/ta-p/270082
Comparing data between two different servers
Compare every value in every field from 2 different inputs
Community > Knowledge > Designer > How To: Compare Data from Two Data sets
How To: Compare Data from Two Data sets
Compare field values Left vs Right
Iterative Data Comparison
Chris
Thank you Chris!
Hi @ChrisTX,
I'm still not certain about a case that I'm working with. I have the following situation:
Table A:
Field A
Field B
CITY
STATE
ZIPCODE
Field C
Field D
TABLE B:
Field E
CITY
STATE
ZIPCODE
Field F
Field G
Both tables have no unique records except Field A in Table A which has no connection to Table B. If I were to use the Join tool with CITY/STATE/ZIPCODE, as long as I rely on the J anchor, my results should be fine, right? Or will I have a situation where the Join will pick up the first find? How do I go about testing this?
I should mention that Table A is the main table and Table B serves as the lookup table. What I want is to populate all the values in Table A based on what I found in Table B and use the J anchor to pick up records that are found based on CITY/STATE/ZIPCODE.
Thanks,
K
For your question... Or will I have a situation where the Join will pick up the first find?
The J output anchor will *never* pick up only the first record. It is not like an Excel VLookup. The Join tool will pick up every joined record from both input anchors.
suggestion: inside Alteryx, left-click on the Join tool, click the link for Open Example, run the Example, and study the outputs.
Beyond just the Join tool, for any analytic you will write, the first step is to truly know the input data. Use lots of investigative tools like Unique, Join, Summarize, etc to understand where duplicates exist and how data sets relate to each other.
If you join from table A to table B on City, State, and Zipcode, you will get duplicate records from table A if there are duplicates in Table B for the fields [City, State, Zip].
So attach a Unique tool to table B, and find out if there are duplicates. If you identify a duplicate, join from A to B, and look at the output for the duplicate you found in B. If the output isn't what you want, you'll have to figure out how to un-duplicate data in Table B, (but of course, only if the data in Table B can reliably be un-duplicated and give meaningful results).
Any analytic workflow, or any code in any programming language, is just telling the machine how you would perform a process manually. You have to understand the input data, from all tables, know how it's structured, understand the cardinality.
Chris
Thank you Chris!