Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Comparison Summary to Find Conflicting Columns Between Two Datasets

jmohr
5 - Atom

I have two datasets of sales data from different companies. I need to union them together but the field names are different and there are multiple instances where fields are named the same name but different types (i.e. one company has month stored as a numeric "1" and another has it as character "January"). I suppose the usual recommendation is to manually go through these but this is not practical because there are about 30-50 fields in each dataset and a different number of fields between datasets. I also have a lot of datasets. I need to include these differences in our documentation as well so when this analysis is repeated or this data is used for other analyses, it can be easily referenced so I will need an output.

 

I am looking to get an way to print a list of the field names and types for both datasets so I can easily see if are named differently, differ in type, or possibly missing entirely. I tried pulling the field names and joining them name but it was not easy to understand nor provided all of the info I needed. I have attached two dummy datasets to work with and the desired output as an example.

 

Here is a brief table that explains each field for the two datasets

FIELD NAMEDUMMY DATA 1DUMMY DATA 2
STORE_IDCharacter field of IDs for storesNumeric field of IDs for stores
REVENUENumeric field of Gross RevenueNumeric field of Gross Revenue

EARNINGS

Numeric field of Revenue after costsNumeric field of Revenue after costs
MONTHNumeric field of the current month i.e. January is 1Not present. Match is titled "Period"
PERIODNot present. Match is titled "Month"Character field of the current month 
CustomersA numeric field of total customersA numeric field of total customers
BREAK_EVENLogical field of whether the store broke even for that sales periodMissing
AVG_SALE_REVMissingNumeric field of the Average Revenue per customer

 

Please note here that the data itself won't make much sense here. I am only concerned with the metadata. Also, while these conflicts are specific examples, they are not all encompassing so I am not worried about actually fixing them. With the list, the specific conflict will be identified and resolved.

1 REPLY 1
kamal03
9 - Comet

Hi @jmohr 

 

If I am not wrong you need to find the data types which are coming from different files into one file and compare them with each other.

If my understanding is correct, you can use the below workflow to get the desired output.

 

Please let me know if you have any further requirements on it.

jmohr.png

 

 

Labels