Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Comparing Two Large HR data files

aehrenwo
11 - Bolide

I have been tasked with reviewing two similar but different HR source data files.  Some analytics needed:

 

The unique values available in certain fields... % of rows that match between the sources, % Null  

 

The only field that links these two data sets is a UniqueID.  The fields for each are similar but not identical.  There are over 100 fields for each dataset and each file is around 200k rows (all users in the company) 

 

Would it help if I could manually map the fields first?  How could I iterate through every combination of that mapping to do the analysis for the data? 

 

Appreciate any guidance on how to proceed.  I can provide some demo data if that would help. 

 

Adam

 

 

1 REPLY 1
JordanB
Alteryx
Alteryx

Hi @aehrenwo

 

Here are some god resources that can be used to answer a few of your scenarios below.

 

Field Summary tool will give you the unique values available in certain fields and % Null

 

Join Tool on Unique ID will give you those records which match and un-match. You can then use the count records tool on the J (Joined records) and take that value away from the total for all records to get the % match on unique ID.

 

To compare everything you could do a similar process to above, but before the Join you could look at transposing the data and then joining. An example can be found in this article.

 

Hope this provides more clarity as to how you can go about this.

 

Best,

 

Jordan Barker

Solutions Consultant

Labels