Alteryx Designer Desktop Discussions


Help Removing Files with Duplicate Information Under Different File Name

kkkim
8 - Asteroid

Hi, Community:

 

I am inputting multiple files with the same schema into my workflow. I need a way to remove duplicate files where the file names differ but the data in the files is the same. Is there a way to do this in Alteryx?

 

So even though the file names are different, the content of the two files is exactly the same. I can't use a Unique tool, because not all columns/rows are unique, meaning data can be repeated throughout a file.

 

Any ideas? Thank you so much for your help!

 

Rgds,

Kate

4 REPLIES
Jean-Balteryx
16 - Nebula

Hi @kkkim,

 

Do you have sample files or a sample workflow?

kkkim
8 - Asteroid

Thanks for your response. I'm afraid I can't share the actual data, but basically each file contains numeric data that can repeat itself. The issue is that the source file itself is sometimes duplicated (so the same records arrive twice) under a different file name.

 

Thank you for your help!

Jean-Balteryx
16 - Nebula

So the duplicated files would have the same number of rows, the same columns, and the same content in the cells?
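
If the answer is yes and the duplicates are byte-for-byte copies, one option is to compare a hash of each file's content instead of its name. The snippet below is only a minimal Python sketch of that idea, not a built-in Alteryx feature. The function names `file_digest` and `unique_files` are made up for illustration, and it assumes the copies are truly identical on disk (same delimiter, line endings, and encoding).

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def unique_files(paths):
    """Keep the first path (in sorted order) for each distinct content digest."""
    seen = {}
    for p in sorted(Path(x) for x in paths):
        # setdefault only stores the path if this digest has not been seen yet.
        seen.setdefault(file_digest(p), p)
    return list(seen.values())
```

The list returned by `unique_files` could then be used to decide which files to read. Files that hold the same data but differ in formatting or row order would not be caught this way.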

Elias_Nordlinder
11 - Bolide

Hello @kkkim 


I understand that you cannot use Union and Unique because the data repeats itself?
(Solution 2 in the attached workflow).

 


But what about using a join between the two file inputs and keeping the output from the joined files?
(See Solution 1 below).

 

If the two files have the same schema and content, the result should only be data from one of the input files.
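
Since rows can repeat inside a file, a repeated row may match more than once in a join, so it can help to compare how many times each row occurs. The snippet below is a rough Python sketch of that check, separate from the attached workflow. It assumes the files are CSVs, and the names `row_counts` and `same_content` are made up for illustration.

```python
import csv
from collections import Counter


def row_counts(path):
    """Count how often each row (as a tuple of strings) occurs in a CSV file."""
    with open(path, newline="") as f:
        return Counter(tuple(row) for row in csv.reader(f))


def same_content(path_a, path_b):
    """True when both files hold the same rows the same number of times, in any order."""
    return row_counts(path_a) == row_counts(path_b)
```

Two files count as duplicates here only if every row, including the header, appears equally often in both, so a file with one extra repeated row is not treated as a copy.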

I attached the workflow below as well.

 

Example:

 

//Elias

 

[Image: Elias_Nordlinder_0-1626976180691.png]

 
