Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Checking duplicates in table before loading new data in SQL Server

rr112
5 - Atom

Hello all! I'm new to the Alteryx community! There’s a lot of useful information in the discussions but I'm running into issues finding a unique workflow for checking duplicates in a table.

 

Essentially, I have an excel file I need to check against the table "ORDERS" in a database.

 

 

Excel file:

ID_NUMBER

YEAR

CODE

STATUS

RESPONSE

SOURCE

MONTH

SENT DATE

0000123456

2020

ABC20

A

1

DD

12

12/1/2020

0000548759

2020

ABC20

A

1

DD

12

12/1/2020

0000842169

2020

ABC20

A

1

DD

12

12/1/2020

0000254856

2020

ABC20

A

1

DD

12

12/1/2020

0000205872

2020

ABC20

A

1

DD

12

12/1/2020

0000284567

2020

ABC20

A

1

DD

12

12/1/2020

0000102489

2020

ABC20

A

1

DD

12

12/1/2020

0000854753

2020

ABC20

A

1

DD

12

12/1/2020

0000248648

2020

ABC20

A

1

DD

12

12/1/2020

 

 

I would like to view the results where

ID_NUMBER     

YEAR     

CODE    

STATUS

RESPONSE          

SOURCE              

MONTH               

SENT DATE

 

are identical in addition to the total number of records. I will exclude all duplicates before adding the new data to the ORDERS table.This is a sample set - the original file has over 150,000 rows.

 

So far, I have the following workflow but this data already exists in the Orders table; I’m not sure why all rows populate in the results view.  

 

table.pngAny help that you may have would be greatly appreciated. Thanks everyone!

5 REPLIES 5
Greg_Murray
12 - Quasar

Hi @rr112,

 

I am not sure I completely understand what you are trying to accomplish, but this workflow should help you identify the duplicates. I had to alter your sample to get this to give results.

 

Greg_Murray_0-1606840089703.png

 

Hope that helps, 

Greg

 

rr112
5 - Atom

Thanks

 

tsturge
7 - Meteor

I am not sure if I am interpreting your challenge correctly either but here is an idea that might help based on what I understand.

 

tsturge_0-1606842876722.png

 

rr112
5 - Atom

 

tsturge
7 - Meteor

If you are looking to compare your Excel file to see which records already exist in your SQL table then the join will achieve that purpose.  I assume you don't have a single unique ID field that exist on both the excel AND the SQL, assuming no then the join between the two tables would solve this.

 

For example, if you have 5 fields of data on your excel file and those same 5 fields in SQL table then set up the join tool using all 5 fields.  Only records where all 5 fields are a 100% match will join, these are your 'duplicates', these are your records that exist in both sources of data, these records will output from the J in the join tool

 

The Excel records that don't join (aka outer left) are records that don't have a like record in SQL, these are you new records.

 

 

Labels