Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Multi select and match List in Macro (Like Join tool)

SharkeyNZ
7 - Meteor

I regularly get data sets that I need to join (on 1 or more fields) and the first thing I want to know is how much of the data is common and will join, and how much will fall out of the process. 
The fall out may be bad data that needs cleaning or is legitimately dissimilar.  For Example, take two data sets to compare:

 

Data Set One    Data Set 2   
ROLELEVELTEAMClassification TypeCategoryExperienceScrum
AnalystSeniorFull StackBig Data Java SECloud NativeJuniorDigital
AnalystJuniorWaterfallBlockchain Java SEJava, BA

Junior

Niche
DEVMidStandard

Design Systems Designer

 PMBig DataIntermediateStandard
DEVSeniorFull StackDesign Systems Engineer TesterCloudSeniorStandard
JavaJuniorWaterfallCloud Native Data Scientist Azure+5 YearsDigital
JavaMidStandardCloud Native Data ModellerBig DataSeniorWaterfall

Project Manager

SeniorStandard.Net Developer

Design Systems Designer

+5 YearsNiche

 

I want to make a macro where I can take the two data sets, select the Fields that I want to compare, so for the data above, I want a selection similar to a Join tool, where I select fields:

  • Data Set One - ROLE  <-> Data Set 2 - Type
  • Date Set One - LEVEL  <-> Date Set 2 - Experience
  • Date Set One - TEAM <-> Date Set 2 - Scrum
  • Date Set One - Classification <-> Date Set 2 - Category

Each field value comparison will be independent.  I've seen examples of how to create a single list of the field names, but not one that allows me to select and match up both sets of field names.  Has any one got a solution here?

4 REPLIES 4
caltang
17 - Castor
17 - Castor

Not very sure I follow you, but correct me if I am wrong:

 

Basically you will ingress two files into Alteryx, and you want to see what data can join and what can't join based on selected fields?

 

In this case, I have a few points to make:

  1. If you have a many-to-many join, it will give you many more records than needs be. You'll need to use a Tile Tool to get them a specific / unique number per field name. It's easy if it's just 1 field, but having many fields will be a bit tricky.
  2. Why a Macro specifically? It can be a workflow in itself, and you can package it as a Macro later no?
  3. Can you provide an expected output or even a mock-up design that you ideally want?

The above will help.

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
OllieClarke
15 - Aurora
15 - Aurora

Hey @SharkeyNZ I can think of a way to build this macro, but it would involve typing in fields in the order of joining into 2 text boxes, and then updating the join tool in a nested batch macro. You could also maybe have as many drop downs as there could be join clauses (with some left blank) to set join clause 1, join clause 2 etc.
Neither one of these 'solutions' would be user friendly, both would involve nested batch macros. 

Another 'solution' (attached) is to transpose all the data and append it. You can then see where there are matching values. This approach will be slow with large datasets though.

 

image.png

SharkeyNZ
7 - Meteor

Thanks @caltang.
In terms of output, I'd like for example:

Evaluating Fields names (Selection 1 - Selection 2)Matching values in both setsValue from selection 1Value from selection 2
Level - ExperienceJuniorJuniorJunior
Level - ExperienceSeniorSeniorSenior
Level - ExperienceNot MatchedMid 
Level - ExperienceNot Matched Intermediate
Level - ExperienceNot Matched +5 Years
Team - ScrumWaterfallWaterfallWaterfall
Team - ScrumStandardStandardStandard
Team - ScrumNot MatchedFull Stack 
Team - ScrumNot Matched Digital
Team - ScrumNot Matched Niche

 

Part of the goal here is I want to see what's different between multiple groups of two fields, that are "supposed" to have the same data set of unique values, and if possible to create a mapping file so that I can join other elements of data sets.

 

I want a macro, as this problem comes up often with the data I get, and the field names change constantly.  So while I can do this in a follow, I'd need to update it each time.

SharkeyNZ
7 - Meteor

Thanks @OllieClarke.  As you stated, the approach is slow with large datasets if not cut down to unique values.  I will try out your other suggestion.

Labels