Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Identify and remove duplicates using In-database

bmcrhino
5 - Atom

Hi all,

 

Note: using In-database functions

 

I used a left outer join ("Join In-DB"), which resulted in more rows than I had in my left-side table.  I suspect that it could be because my right-side table has duplicate values.  But I don't know how to look into this. 

 

Can someone help me identify and remove any duplicate values from a table using the In-database functions?

 

Note: I'm relatively new to Alteryx and don't know SQL that well --> I can easily google some SQL approaches to solving this problem but am not sure where to execute those within Alteryx and integrate them into my workflow.

 

Thanks!

2 REPLIES 2
danrh
13 - Pulsar

I'd look at the Summarize In-DB tool.  Group by the fields you're joining on, and add a Count to see where the duplicates are - anything over 1 is a dup.  There's a chance the Summarize will be enough to remove the duplicates as well (with the non-joined fields, can you take a max?  a min?  an average? try to aggregate before the join so you end up with a unique join).

 

Post some sample data if you need specific help.  Good luck!

bmcrhino
5 - Atom

Thanks.  That ended up being helpful in identifying the duplicates.

 

Then I used some SQL code to remove the duplicates during the "Connect In-DB" tool portion.

 

 

Labels