Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

removing duplicates in columns

utsarunner981
8 - Asteroid

hello just looking to see if there is an option in deleting duplicates from a column. i have attached a shot of the columns i am wanting to condense. the red box is where there are duplicates and they are the same store but the data loads in duplicates (state, state-city, state-city-store, state-city-address), i only need to identify one of the records for example (California-Los Angeles-Shopping Drive Road). 

 

thanks for the help!

7 REPLIES 7
clmc9601
13 - Pulsar
13 - Pulsar

Hi @utsarunner981,

 

It depends on how consistent your data is. The example data doesn't exactly match the pattern you specified (totally understandable if you typed it, but more important if the pattern in the real data isn't consistent either). I adapted it to match the pattern specified, but please let me know if this isn't valid.

 

I saw two potential solutions: take the fourth record in each group, or parse the Location column into multiple fields and take the last one.

 

I hope one of these helps! If it does, please consider marking this a solution so others may find it.

 

clmc9601_0-1628193200778.png

 

 

Luke_C
17 - Castor
17 - Castor

Hi @utsarunner981 

 

In addition to @clmc9601 's options, since it sounds like you don't have a huge preference for the record that stays, you could use the sample tool to select the 1st record (or last) for each employee. 

 

Luke_C_0-1628195028521.png

 

phottovy
13 - Pulsar
13 - Pulsar

I also recommend the "Unique" tool. That should easily allow you to remove duplicates in one or many columns.

Luke_C
17 - Castor
17 - Castor

@phottovy Totally agree, that's probably the simplest option here! Worth noting it takes the first unique instance in case that matters. 

utsarunner981
8 - Asteroid

Hello everyone! Thanks for your help! I have been testing your solutions but to no avail; I'm sure this is partly my fault in explaining my dilemma. I uploaded the dummy data and the result I am hoping to find. I have duplicates on column E, which is the column I am wanting to condense. The data is in no particular order and there are many records in column E that repeat but need a specific one, for example the address of the store. Is there a method to only capturing the address of the store, which is random due to numbers and streets? 

Luke_C
17 - Castor
17 - Castor

Hi @utsarunner981 

 

Amending my earlier solution a bit. I would use a formula to get the length of the location, sort by length (high to low) and then sample the first one for each Employee.

 

The assumption is the longest location value contains the address you want. 

 

Luke_C_0-1628623880557.png

 

 

utsarunner981
8 - Asteroid

Thank you @Luke_C your solution worked out for me! 

Labels