Help Removing Files with Duplicate Information Under Different File Name

kkkim
8 - Asteroid

Hi, Community:

 

I am reading multiple files with the same schema into my workflow. I need a way to remove duplicate files whose file names differ but whose data is identical. Is there a way to do this in Alteryx?

 

So even though the file names differ, the content of the two files is exactly the same. I can't use a Unique tool because not all columns/rows are unique, meaning the same data can legitimately repeat within a file.

 

Any ideas? Thank you so much for your help!

 

Rgds,

Kate

Jean-Balteryx
16 - Nebula

Hi @kkkim,

 

Do you have sample files or a sample workflow?

kkkim
8 - Asteroid

Thanks for your response. I'm afraid I can't share the actual data, but each file contains numeric data that can repeat within the file. The issue is that the source file itself is sometimes duplicated (so the same records arrive twice) under a different file name.

 

Thank you for your help!

Jean-Balteryx
16 - Nebula

So the duplicate files would have the same number of rows, the same columns, and the same content in every cell?
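To make that definition of "duplicate" concrete, here is a minimal Python sketch of the check, written outside Alteryx and not taken from the thread. It assumes the files are CSV text with a header row; the names `read_table`, `same_content`, and `ignore_row_order` are hypothetical. Repeated rows inside a file are preserved, so a file is only treated as a duplicate when every row matches, including how often it occurs.

```python
import csv
import io
from collections import Counter


def read_table(text):
    """Parse CSV text into (header, list of data rows as tuples)."""
    rows = [tuple(row) for row in csv.reader(io.StringIO(text))]
    if not rows:
        return (), []
    return rows[0], rows[1:]


def same_content(text_a, text_b, ignore_row_order=False):
    """Return True when both tables have the same columns and the same rows.

    Rows that repeat inside a file are counted, not collapsed, so two files
    only match when each row appears the same number of times in both.
    """
    header_a, rows_a = read_table(text_a)
    header_b, rows_b = read_table(text_b)
    if header_a != header_b or len(rows_a) != len(rows_b):
        return False
    if ignore_row_order:
        # Compare as multisets: same rows, same counts, any order.
        return Counter(rows_a) == Counter(rows_b)
    return rows_a == rows_b
```

With `ignore_row_order=False` the comparison is strict (same rows in the same order), which matches the case where a source file is simply delivered twice under a different name.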

Elias_Nordlinder
11 - Bolide

Hello @kkkim 


I understand that you cannot use Union and Unique because the data repeats within each file?
(Solution 2 in the attached workflow).

 


But what about using a Join between the two file inputs and keeping only the joined output?
(See Solution 1 below).

 

If the two files have the same schema and content, the result should contain only the data from one of the input files.
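When there are many input files rather than just two, a different technique from the Join above is to fingerprint each file and keep one file per fingerprint. The following is a minimal Python sketch of that idea, not part of the attached workflow. It assumes duplicates are byte-for-byte identical copies, and the helper names `file_digest` and `drop_duplicate_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_duplicate_files(paths):
    """Keep the first file (in sorted path order) for each distinct content hash."""
    seen = set()
    kept = []
    for path in sorted(Path(p) for p in paths):
        key = file_digest(path)
        if key not in seen:
            seen.add(key)
            kept.append(path)
    return kept
```

Because the hash covers the whole file, rows that repeat inside a file do not matter: only files that are identical as a whole are dropped. Files that hold the same data but differ in formatting (line endings, column order) would not be caught by this byte-level check.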

I attached the workflow below as well.

 

Example:

 

//Elias

 

Elias_Nordlinder_0-1626976180691.png

 
