Hi,
I am trying to compare and validate my data following a new ETL process. I have 2 csv-based input files where one is derived from Prod and one from a new ETL process but should contain the same mirrored data. I used join tool to match my columns as the column headers are the same and I got pretty close matches. However, I want to do comparison for each column so I know where I can identify where there is a mismatch and the specific columns where data doesn't validate.
One way to do it is after the join, create a Boolean formula to test the fields from Prod vs Dev. Then filter on any of the False records. Those will show you where the Prod Field doesnt match the Dev field.
[Field2]=[Right_Field2]
Thanks for the quick response back, is there a faster way to do it because I am working with around 50 columns in each dataset.
Check out this example from my "What the XML?" Inspire presentation called "Table style rules". I use a batch macro to edit the XML of a Table tool to highlight fields in two tables that differ for a nice visual comparison.
Of course this workflow can be modified to join on a field rather than position, or output to Excel if you prefer (might be better with larger datasets). Let us know if you have any questions!
User | Count |
---|---|
107 | |
85 | |
76 | |
54 | |
40 |