Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Comparing 2 data sets to find missing and extraneous information

IMensink
5 - Atom

I have been trying to find an efficient way to compare 2 data sets to find missing or extra information, and was wondering if anyone would be able to help me out.

 

I have the below example. Dataset 1 is sort of a "key", where I have a set of values in column B that should be associated with a set value in column A. In dataset 2, I can have multiple records that contain the set value in column B, and then the values I want to compare are in column C. So I want to use the relationships of column A-B on Dataset 1 to compare to Column B-C on dataset 2. I have included my ideal output as well.

 

Example.PNG

5 REPLIES 5
T_Willins
14 - Magnetar
14 - Magnetar

Hi @IMensink,

 

The Join tool is your friend for this.  By Joining data and using the left and right unjoined rows you can find where each data set doesn't match the other.  Additional data from Dataset 2 (Column A) must be added to Dataset 1 (Summarize and Join) to allow for all fields to be compared in the 2nd Join.  The remaining tools rename, change field type, summarize, concatenate, and Join/Union the data together in the format you are looking for.

 

Compare 2 data sets.png

 

 

IMensink
5 - Atom

Thank you @T_Willins, very helpful!

Meena
8 - Asteroid

Hi @T_Willins 

 

Please could you help me if this solution will work for my requirement which is similar to this.

 

I have to find the missing Type Group (rows) with the Lookup.

Lookup Table  columns Seq Code, Type Code matched with the sample Data .. and any missing Type Group must be reported for each Seq# for the Seq Code, Type Code combination.

 

Sheet Unit_Test contain the example which is expected to be in output.

 

Thanks

Meena

T_Willins
14 - Magnetar
14 - Magnetar

Hi @Meena,

 

From your description and your data it looks like you need to validate the data from the lookup rules is included in the Input_Entire Volume for each Seq #.  The attached workflow produces the results on the Macro Output_Entire Volume tab, with what looks like additional missing records.  For example, your Excel file indicates Seq Code:  300; Seq #:  9450645163; Type:  4 is missing Child, but the Lookup_Rules indicates this Seq Code/Type Code combination should have Type Groups Child, Father, and Mother.  The input data only has Type Group Father, so the Output should have both Child and Mother as missing.

 

Compare 2 data sets - @Meena.png

 

Meena
8 - Asteroid

Thank you @T_Willins.

This solution worked.

Labels