We are using Alteryx to compare two very large datasets, with file sizes around 20-30GB and approximately 150-200 columns. Is it possible to shorten the execution time or reduce memory usage by changing the comparison method (for example, by converting the data using a hash function first)? Any suggestions or ideas would be greatly appreciated!
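As you mentioned, you can compute an MD5 hash over the columns that need to be compared, then compare the hashes instead of comparing every column individually.
The string functions page linked here documents three MD5 functions (MD5_ASCII, MD5_UNICODE, and MD5_UTF8); search the page for "MD5".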
https://help.alteryx.com/current/en/designer/functions/string-functions.html#example-6846024-12
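
To illustrate the idea outside of Alteryx, here is a minimal Python sketch of hash-based row comparison. It is not Alteryx code, and the names `row_hash`, `hash_index`, `compare`, and the `id` key column are hypothetical. It assumes each dataset has a unique key column and that every compared value can be converted to a string.

```python
import hashlib

# Separator placed between column values so that ("ab", "c") and ("a", "bc")
# do not produce the same concatenated string. Assumes the data itself
# never contains this control character.
SEP = "\x1f"


def row_hash(row, columns):
    """Return the MD5 hex digest of the selected columns of one row."""
    # Note: None and the empty string are treated as equal here.
    joined = SEP.join("" if row.get(c) is None else str(row[c]) for c in columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def hash_index(rows, key, columns):
    """Map each row's key to the hash of its compared columns."""
    return {row[key]: row_hash(row, columns) for row in rows}


def compare(left_rows, right_rows, key, columns):
    """Compare two datasets by key and per-row hash."""
    left = hash_index(left_rows, key, columns)
    right = hash_index(right_rows, key, columns)
    shared = left.keys() & right.keys()
    return {
        "only_left": sorted(left.keys() - right.keys()),
        "only_right": sorted(right.keys() - left.keys()),
        "changed": sorted(k for k in shared if left[k] != right[k]),
    }


if __name__ == "__main__":
    old = [{"id": 1, "a": "x", "b": "y"}, {"id": 2, "a": "p", "b": "q"}]
    new = [{"id": 1, "a": "x", "b": "y"}, {"id": 2, "a": "p", "b": "z"}]
    print(compare(old, new, "id", ["a", "b"]))
```

In a workflow, the equivalent would presumably be a Formula tool that concatenates the compared fields with a delimiter and applies one of the MD5 functions, followed by a join on the key and a comparison of the single hash field. Keep in mind that hashing only shrinks what has to be joined and compared; both files still have to be read in full to compute the hashes.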