Alteryx Designer Desktop Discussions

rr112 · ‎12-01-2020

Hello all! I'm new to the Alteryx community! There’s a lot of useful information in the discussions but I'm running into issues finding a unique workflow for checking duplicates in a table.

Essentially, I have an excel file I need to check against the table "ORDERS" in a database.

Excel file:

ID_NUMBER	YEAR	CODE	STATUS	RESPONSE	SOURCE	MONTH	SENT DATE
0000123456	2020	ABC20	A	1	DD	12	12/1/2020
0000548759	2020	ABC20	A	1	DD	12	12/1/2020
0000842169	2020	ABC20	A	1	DD	12	12/1/2020
0000254856	2020	ABC20	A	1	DD	12	12/1/2020
0000205872	2020	ABC20	A	1	DD	12	12/1/2020
0000284567	2020	ABC20	A	1	DD	12	12/1/2020
0000102489	2020	ABC20	A	1	DD	12	12/1/2020
0000854753	2020	ABC20	A	1	DD	12	12/1/2020
0000248648	2020	ABC20	A	1	DD	12	12/1/2020

I would like to view the results where

ID_NUMBER

YEAR

CODE

STATUS

RESPONSE

SOURCE

MONTH

SENT DATE

are identical in addition to the total number of records. I will exclude all duplicates before adding the new data to the ORDERS table.This is a sample set - the original file has over 150,000 rows.

So far, I have the following workflow but this data already exists in the Orders table; I’m not sure why all rows populate in the results view.

Any help that you may have would be greatly appreciated. Thanks everyone!

Greg_Murray · ‎12-01-2020

Hi @rr112,

I am not sure I completely understand what you are trying to accomplish, but this workflow should help you identify the duplicates. I had to alter your sample to get this to give results.

Hope that helps,

Greg

rr112 · ‎12-01-2020

Thanks Greg_Murray!

I'm sorry it wasn't clear, still learning the ropes. I want to check the text input against a table that already exists in the database to confirm if duplicate rows exist. The output should return duplicate rows only meaning all field names are identical. Thanks so much for your response!

tsturge · ‎12-01-2020

I am not sure if I am interpreting your challenge correctly either but here is an idea that might help based on what I understand.

rr112 · ‎12-01-2020

Hi tsturge,

Thanks so much for your response. This workflow is helpful; the challenge I'm facing is checking the excel file against data that already exists in the table. I'd like to create a new table that shows all records that already exist in the Orders table using the excel file. For example, load Database > View Orders table > check excel file for duplicate records in Order table > browse duplicates (meaning show data where rows already exist in orders table - duplicates in this case are all identical fields ). I'm sorry if this still doesn't make sense. Thank you so much for your help, I'm very grateful to have such a helpful and responsive community of experts!

tsturge · ‎12-01-2020

If you are looking to compare your Excel file to see which records already exist in your SQL table then the join will achieve that purpose. I assume you don't have a single unique ID field that exist on both the excel AND the SQL, assuming no then the join between the two tables would solve this.

For example, if you have 5 fields of data on your excel file and those same 5 fields in SQL table then set up the join tool using all 5 fields. Only records where all 5 fields are a 100% match will join, these are your 'duplicates', these are your records that exist in both sources of data, these records will output from the J in the join tool

The Excel records that don't join (aka outer left) are records that don't have a like record in SQL, these are you new records.

Alteryx Designer Desktop Discussions

Checking duplicates in table before loading new data in SQL Server