Within Alteryx we have the YXDB format, which is lightning fast 😀
But I wonder what would be faster in the situation below?
Situation:
CSV file 1: 6M+ rows with around 30 columns.
CSV file 2: 20M+ rows with around 15 columns.
What would be faster: loading the CSV files into Alteryx, joining them (these files need to be joined), and then saving the result as a YXDB file,
or converting the CSV files to YXDB first in a separate workflow and then using those as inputs?
In my opinion the second option should be faster, but I'm not sure, because converting to YXDB before running the join workflow is an extra step (extra time).
Also, could the AMP Engine help improve speed?
I'd suggest converting each CSV to YXDB. You'll inevitably end up running your workflow with the joins more than once, so you'll save time by converting (you could cache the inputs, but the cache is lost if you close Alteryx).
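Outside Alteryx, the same trade-off applies to any tool that can stage CSVs into a binary format. Below is a minimal pandas sketch of the "convert once, join many times" pattern, with Parquet standing in for YXDB; it assumes pyarrow is installed, and the file names and join key are hypothetical.

```python
# Analogy only: pandas + Parquet standing in for Alteryx + YXDB.
# File names and the join key below are hypothetical.
import pandas as pd

# Workflow 1 (run once): pay the slow CSV parse a single time.
for name in ("big_20m_rows", "wide_6m_rows"):
    pd.read_csv(f"{name}.csv").to_parquet(f"{name}.parquet")

# Workflow 2 (run many times): read the fast binary files and join.
left = pd.read_parquet("big_20m_rows.parquet")
right = pd.read_parquet("wide_6m_rows.parquet")
joined = left.merge(right, on="key", how="inner")  # hypothetical key column
joined.to_parquet("joined.parquet")
```

The conversion cost is paid once up front, so every later run of the join workflow skips the slow CSV parse, which is exactly why converting first wins when the workflow runs repeatedly.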
I'd use AMP and write to YXDB after the join. The I/O cost of the extra write and the overhead of creating multiple workflows wouldn't be justified by any time saved. I would expect this to still run fast. You can add a Select tool after each Input and update the default 254-size V_String field types to make the join more efficient.
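For readers outside Alteryx, the same field-narrowing idea looks roughly like this in pandas. A minimal sketch only; the column names are made up.

```python
# Analogy only: shrinking oversized default column types before a join,
# much like a Select tool trimming the 254-size V_String defaults.
# Column names are hypothetical.
import pandas as pd

df = pd.read_csv("wide_6m_rows.csv")  # text columns arrive as generic object dtype

# Cast low-cardinality text to category and downcast numerics; the
# smaller in-memory footprint makes the subsequent join cheaper.
df["status"] = df["status"].astype("category")
df["key"] = pd.to_numeric(df["key"], downcast="integer")
```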
cheers,
mark