Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Checking duplicates in table before loading new data in SQL Server

rr112
5 - Atom

Hello all! I'm new to the Alteryx community! There’s a lot of useful information in the discussions but I'm running into issues finding a unique workflow for checking duplicates in a table.

 

Essentially, I have an excel file I need to check against the table "ORDERS" in a database.

 

 

Excel file:

ID_NUMBER

YEAR

CODE

STATUS

RESPONSE

SOURCE

MONTH

SENT DATE

0000123456

2020

ABC20

A

1

DD

12

12/1/2020

0000548759

2020

ABC20

A

1

DD

12

12/1/2020

0000842169

2020

ABC20

A

1

DD

12

12/1/2020

0000254856

2020

ABC20

A

1

DD

12

12/1/2020

0000205872

2020

ABC20

A

1

DD

12

12/1/2020

0000284567

2020

ABC20

A

1

DD

12

12/1/2020

0000102489

2020

ABC20

A

1

DD

12

12/1/2020

0000854753

2020

ABC20

A

1

DD

12

12/1/2020

0000248648

2020

ABC20

A

1

DD

12

12/1/2020

 

 

I would like to view the results where

ID_NUMBER     

YEAR     

CODE    

STATUS

RESPONSE          

SOURCE              

MONTH               

SENT DATE

 

are identical in addition to the total number of records. I will exclude all duplicates before adding the new data to the ORDERS table.This is a sample set - the original file has over 150,000 rows.

 

So far, I have the following workflow but this data already exists in the Orders table; I’m not sure why all rows populate in the results view.  

 

table.pngAny help that you may have would be greatly appreciated. Thanks everyone!

5 REPLIES 5
Greg_Murray
12 - Quasar

Hi @rr112,

 

I am not sure I completely understand what you are trying to accomplish, but this workflow should help you identify the duplicates. I had to alter your sample to get this to give results.

 

Greg_Murray_0-1606840089703.png

 

Hope that helps, 

Greg

 

rr112
5 - Atom

Thanks

 

tsturge
7 - Meteor

I am not sure if I am interpreting your challenge correctly either but here is an idea that might help based on what I understand.

 

tsturge_0-1606842876722.png

 

rr112
5 - Atom

 

tsturge
7 - Meteor

If you are looking to compare your Excel file to see which records already exist in your SQL table then the join will achieve that purpose.  I assume you don't have a single unique ID field that exist on both the excel AND the SQL, assuming no then the join between the two tables would solve this.

 

For example, if you have 5 fields of data on your excel file and those same 5 fields in SQL table then set up the join tool using all 5 fields.  Only records where all 5 fields are a 100% match will join, these are your 'duplicates', these are your records that exist in both sources of data, these records will output from the J in the join tool

 

The Excel records that don't join (aka outer left) are records that don't have a like record in SQL, these are you new records.

 

 

Labels