Fastest way to store data

Question

Within Alteryx we have the yxdb format, which is lightning fast 😀

But I wonder which approach would be faster in the situation below.

Situation:

csv file: 6M+ rows with around 30 columns.

csv file: 20M+ rows with around 15 columns.

What would be faster: loading the csv files into Alteryx, joining them (these files need to be joined), and then saving the result as a yxdb file,

or converting the csv files to yxdb first in a separate workflow and then using those as the inputs?

In my opinion the second option should be faster, but I'm not sure, because converting to yxdb before running the join workflow is an extra step (extra time).

Also, could the AMP Engine help improve speed?

MarqueeCrew · Accepted Answer

I'd turn on AMP and write to YXDB after the join. Splitting the work into multiple workflows adds the I/O cost of an extra write, and whatever time that saves wouldn't justify it. I would expect the single workflow to still run fast. After the input, you can use a SELECT to change the field types from the default V_String(254) to something tighter, which makes the join more efficient.
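To make the I/O argument concrete, here is a rough sketch in Python (not Alteryx) that just lists the steps in each option and counts the reads and writes. The step labels are made up for illustration and are not Alteryx tool names, and it assumes the converted yxdb files are only used once, for this join.

```python
# Hypothetical step lists for the two options in the question.
# The labels are illustrative only; they are not Alteryx tool names.
ONE_WORKFLOW = [
    "read csv A", "read csv B", "join", "write joined yxdb",
]
TWO_WORKFLOWS = [
    "read csv A", "write yxdb A",      # conversion workflow
    "read csv B", "write yxdb B",
    "read yxdb A", "read yxdb B",      # join workflow
    "join", "write joined yxdb",
]


def io_steps(steps):
    """Return only the steps that touch disk (reads and writes)."""
    return [s for s in steps if s.startswith(("read", "write"))]


def extra_steps(baseline, candidate):
    """Return the steps in candidate that baseline does not already do."""
    remaining = list(baseline)
    extra = []
    for step in candidate:
        if step in remaining:
            remaining.remove(step)
        else:
            extra.append(step)
    return extra


print(len(io_steps(ONE_WORKFLOW)), len(io_steps(TWO_WORKFLOWS)))
print(extra_steps(ONE_WORKFLOW, TWO_WORKFLOWS))
```

The two-workflow option still has to read both csv files once, so everything it adds (two yxdb writes and two yxdb reads) is on top of the single-workflow cost. It would only start to pay off if the converted files were reused by later runs, which the question doesn't mention.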

cheers,

mark