Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Iterative Macro that stops when there are no duplicates.

Meteoroid

Hi

 

I am new to Alteryx and am currently working on a report-comparison workflow. I am not sure whether this is doable in Alteryx.

I have very little experience with macros in Alteryx and do not have a coding background.

 

The basic idea is that there will be two reports with identical headers but different values.

Each report can contain duplicate records as well as records unique to it, and the two reports can differ in the number of records.

I am trying to map all the matching records, so that I am left with only the union of the unjoined records from each report.

 

My idea is to separate the unique records from the duplicates with the Unique tool, join the unique records from the two files, and write the unjoined ones to the output. The duplicates from each file are then passed into a macro that does the same thing: separate unique and duplicate records, keep appending the unjoined unique records to the initial output, and repeat until there are no records left in either report.
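In rough Python terms, the loop I have in mind would look something like this (a minimal sketch only; I am assuming each record is a tuple of all column values, and the helper names are made up, not Alteryx tools):

```python
def match_reports(file1_rows, file2_rows):
    """Sketch of the proposed loop: join one copy per distinct record per
    pass, then feed the remaining duplicates back in until both run out."""
    results = []
    left, right = list(file1_rows), list(file2_rows)
    while left or right:
        uniq_l, dup_l = split_unique(left)
        uniq_r, dup_r = split_unique(right)
        joined = set(uniq_l) & set(uniq_r)          # Join on all fields
        for row in uniq_l:
            status = "Mapped" if row in joined else "Not mapped"
            results.append((row, row if row in joined else None, status))
        for row in uniq_r:
            if row not in joined:
                results.append((None, row, "Not mapped"))
        left, right = dup_l, dup_r                  # loop the duplicates
    return results

def split_unique(rows):
    """First copy of each distinct row -> unique; the rest -> duplicates
    (a stand-in for the Unique tool's U and D outputs)."""
    seen, unique, dups = set(), [], []
    for row in rows:
        (dups if row in seen else unique).append(row)
        seen.add(row)
    return unique, dups
```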

 

Thanks

Nidhin

Alteryx Certified Partner

What if you input both files, Union all records, then Unique on all fields? That would leave only the records that are unique within each input and between the two inputs.
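A rough Python stand-in for that flow (illustrative only; rows are assumed to be tuples of all fields, and this only mimics the Union and Unique tools):

```python
def union_then_unique(file1_rows, file2_rows):
    """Union both inputs, then split first copies from repeats,
    mimicking the Unique tool's U and D outputs on all fields."""
    seen, unique, duplicates = set(), [], []
    for row in file1_rows + file2_rows:   # Union of both inputs
        if row in seen:
            duplicates.append(row)        # D output: repeated records
        else:
            seen.add(row)
            unique.append(row)            # U output: first copy of each
    return unique, duplicates
```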

Meteoroid

Just to add: this is what I am trying to achieve.

 

Step 1: Read the files

File 1              File 2
#   Value           #   Value
1   Value 1         1   Value 1
2   Value 2         2   Value 2
3   Value 3         3   Value 3
4   Value 4         4   Value 4
5   Value 5         5   Value 5
6   Value 6         6   Value 6
7   Value 7         7   Value 7
8   Value 8         8   Value 8
1   Value 1         1   Value 1
2   Value 2         2   Value 2
3   Value 6         3   Value 3
4   Value 7         4   Value 4
5   Value 5         5   Value 5
6   Value 6         6   Value 6
7   Value 7         7   Value 7
8   Value 8         8   Value 8
5   Value 5         5   Value 5
6   Value 6         6   Value 6
7   Value 7         7   Value 7
8   Value 8         8   Value 8

Step 2: Find the unique and duplicate values

File 1              File 2              Final output
#   Value           #   Value
1   Value 1         1   Value 1         Mapped
2   Value 2         2   Value 2         Mapped
3   Value 3         3   Value 3         Mapped
4   Value 4         4   Value 4         Mapped
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped
3   Value 6                             Not mapped

Duplicates (initial set from the first Unique output)

File 1              File 2
#   Value           #   Value
1   Value 1         1   Value 1
2   Value 2         2   Value 2
4   Value 7         4   Value 4
5   Value 5         5   Value 5
6   Value 6         6   Value 6
7   Value 7         7   Value 7
8   Value 8         8   Value 8
5   Value 5         3   Value 3
6   Value 6         5   Value 5
7   Value 7         6   Value 6
8   Value 8         7   Value 7
                    8   Value 8

From here on, I want to loop the duplicate outputs and map them to each other on every loop until there are no records left in either file.

Unique, Loop 1

File 1              File 2              Final output
#   Value           #   Value
1   Value 1         1   Value 1         Mapped
2   Value 2         2   Value 2         Mapped
4   Value 7                             Not mapped
                    4   Value 4         Not mapped
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped
                    3   Value 3         Not mapped

Duplicates, Loop 2

File 1              File 2              Final output
#   Value           #   Value
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped

This process continues until all possible records are mapped and one of the inputs runs out of records.
I will write all the Mapped and Not mapped values to the output file with the appropriate status.

The final output will look like this:

File 1              File 2              Status
#   Value           #   Value
1   Value 1         1   Value 1         Mapped
2   Value 2         2   Value 2         Mapped
3   Value 3         3   Value 3         Mapped
4   Value 4         4   Value 4         Mapped
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped
1   Value 1         1   Value 1         Mapped
2   Value 2         2   Value 2         Mapped
3   Value 6                             Not mapped
4   Value 7                             Not mapped
                    4   Value 4         Not mapped
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped
5   Value 5         5   Value 5         Mapped
6   Value 6         6   Value 6         Mapped
7   Value 7         7   Value 7         Mapped
8   Value 8         8   Value 8         Mapped
                    3   Value 3         Not mapped
Meteoroid

Thanks, Charlie S. But the files I am using will have multiple duplicates in both. A small design example is included in my earlier reply.

 

When I get duplicates from the files, I need to use those as the input and again run a Unique on each file (on the records left after removing the unique records the first time), and keep doing this until no more records are available in either file. If one file still has records while the other does not, I will take a union of the Left and Right outputs of the Join and mark those records as unmapped.

 

Hope my reply makes sense.

 

 

Alteryx Certified Partner

I think the most reliable way to build an iterative macro is to design it so that the schema is consistent from beginning to end. Here is what that means in a scenario like this:

 

Union all records from both files and add two columns: Source (file1, file2, etc.) and Match. Match should start out as 0/null, and each iteration updates the Match column with whatever value is relevant. At the end of the iterative macro, append the count of unmatched/unmapped records to all records and filter: if the count is greater than 0, send the records to the loop output; otherwise send them to the final output. Then add a Select tool before each output to remove the count field so that each output has a field schema matching the original input. All records are looped every time with a consistent field schema, which keeps the macro iterating while there is work to do and outputs the results when there is not.
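To make the control flow concrete, here is a minimal Python sketch of that pattern (purely illustrative: the Source/Value/Match field names and single-value records are assumptions, and the real macro would use Join, Union, Filter, and Select tools instead):

```python
from collections import defaultdict

def run_iterative_macro(records, max_iterations=1000):
    """Loop control mirroring the count-then-filter pattern: keep
    iterating while any record's Match is still empty, then output."""
    for _ in range(max_iterations):
        records = match_one_pass(records)
        unmatched = sum(1 for r in records if r["Match"] is None)
        if unmatched == 0:              # nothing left to do: final output
            break
    return records

def match_one_pass(records):
    """One iteration of the macro body: for each distinct value, resolve
    one pending copy per side. A copy present on both sides becomes
    'Mapped'; a copy with no counterpart becomes 'Not mapped'; deeper
    duplicates stay pending (Match is None) for the next iteration."""
    pending = defaultdict(lambda: {"file1": [], "file2": []})
    for r in records:
        if r["Match"] is None:
            pending[r["Value"]][r["Source"]].append(r)
    for sides in pending.values():
        a = sides["file1"][0] if sides["file1"] else None
        b = sides["file2"][0] if sides["file2"] else None
        if a and b:
            a["Match"] = b["Match"] = "Mapped"
        else:
            (a or b)["Match"] = "Not mapped"
    return records
```

Feeding it records shaped like {"Source": "file1", "Value": "Value 5", "Match": None} should reproduce the Mapped/Not mapped statuses from the example walkthrough earlier in the thread.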

Meteoroid

Hi Charlie S,

 

If I union both files (with the Source and Match columns to differentiate them), won't it match the duplicates as well? E.g., say there are 4 duplicates in File 1 and 3 in File 2: won't it update all 4 records as matched? I want the result to say that 3 of the records in File 1 have matching records in File 2, with the 4th record in File 1 marked as "Matching records not found".

I am still having difficulty setting up the macro, so I couldn't run it to verify the results.

 

Thanks

Nidhin

Alteryx Certified Partner

Ok, now that I've seen your example data, I've got a different approach that doesn't require a macro.

 

The attached solution uses the Tile tool to identify not only the unique values but also how many times each unique value occurs in each file. Unique values are then joined on both fields, so the total record counts remain consistent.
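For anyone following along, the same idea in a short Python sketch (the occurrence numbering below is a stand-in for what the Tile tool provides; function names are made up):

```python
from collections import defaultdict

def add_occurrence_number(rows):
    """Number each occurrence of a record within its file (1st copy,
    2nd copy, ...), the way a tile/sequence column would."""
    counts, numbered = defaultdict(int), []
    for row in rows:                 # rows are tuples of all fields
        counts[row] += 1
        numbered.append((row, counts[row]))
    return numbered

def compare(file1_rows, file2_rows):
    """Join on (record, occurrence number): the nth copy in File 1 pairs
    with the nth copy in File 2; extra copies on either side stay
    unjoined, so total record counts remain consistent."""
    left = set(add_occurrence_number(file1_rows))
    right = set(add_occurrence_number(file2_rows))
    return left & right, left - right, right - left  # mapped, L-only, R-only
```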

Meteoroid

Hi Charlie,

 

Sorry for the delay in responding. I tried another approach.

 

First, I group the data by all of the columns and take a count (for both files).

Then I join the two files on all of the columns. This way I get the unjoined records in the Left and Right outputs of the Join, and for the records that do join, I use a Formula tool to compute Left count minus Right count.

If that count is != 0, I output those records along with the unjoined ones (the Left and Right outputs from the Join). For those records I also set which source file contains the additional records, using the Formula tool.
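A small Python sketch of that comparison (illustrative only; rows are assumed to be tuples of all columns, and the message strings are made up):

```python
from collections import Counter

def compare_by_count(file1_rows, file2_rows):
    """Group each file by all columns with a count, 'join' on the row,
    and diff the counts; a nonzero difference means extra copies of
    that record exist in one of the files."""
    left, right = Counter(file1_rows), Counter(file2_rows)
    report = []
    for row in left.keys() | right.keys():
        diff = left[row] - right[row]          # Left count - Right count
        if diff > 0:
            report.append((row, diff, "Additional records in File 1"))
        elif diff < 0:
            report.append((row, -diff, "Additional records in File 2"))
    return report
```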

I am attaching a screenshot of the workflow.

 

Thanks

Nidhin
