
SOLVED

fastest way to store data

MartWClimber
9 - Comet

Within Alteryx we have the yxdb format, which is lightning fast 😀

But I wonder which approach would be faster in the situation below.

Situation:

csv file: 6M+ rows with around 30 columns.

csv file: 20M+ rows with around 15 columns.

 

What would be faster: loading the CSV files into Alteryx, joining them (the files need to be joined), and then saving the result as a yxdb file?

Or should I first convert the CSV files to yxdb in a separate workflow and then use those files as inputs?

 

In my opinion the last option should be faster, but I'm not sure, because converting to yxdb before running the join workflow is an extra step (extra time).

 

Also, could the AMP Engine help improve speed?

2 REPLIES
Luke_C
17 - Castor

Hi @MartWClimber 

 

I'd suggest converting each CSV to yxdb. You'll almost certainly end up running the workflow with the joins more than once, so converting first saves time on every later run. (You could cache the inputs instead, but the cache is lost if you close Alteryx.)
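To put rough numbers on the "more than once" argument, here is a minimal Python sketch of the break-even point. The timings are made-up placeholders, not measurements from Alteryx, and the function names are hypothetical; time your own inputs and substitute the values.

```python
import math

def total_seconds(runs, csv_read, yxdb_read, convert_once):
    """Return (direct, converted) total input time in seconds over `runs` runs.

    direct:    every run parses the CSV files again.
    converted: pay the one-time conversion, then every run reads yxdb.
    """
    direct = runs * csv_read
    converted = convert_once + runs * yxdb_read
    return direct, converted

def break_even_runs(csv_read, yxdb_read, convert_once):
    """Smallest number of runs at which converting first is no slower."""
    saving_per_run = csv_read - yxdb_read
    if saving_per_run <= 0:
        return None  # yxdb is not faster to read, so converting never pays off
    return math.ceil(convert_once / saving_per_run)

# Hypothetical timings (assumptions, in seconds): 60 to read the CSVs,
# 10 to read the yxdb files, 75 for the one-time conversion workflow.
print(total_seconds(1, 60, 10, 75))
print(break_even_runs(60, 10, 75))
```

With these placeholder values a single run is cheaper straight from CSV, and converting first wins from the second run onward. Only the input time is modelled; the join itself costs the same either way.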

MarqueeCrew
20 - Arcturus

I'd turn on AMP and write to YXDB after the join. The time saved by converting first wouldn't justify the I/O cost of the extra write and the overhead of creating multiple workflows. I would expect this to still run fast. After the input, you can use a Select tool to change the field types from the default V_String 254, which makes the join more efficient.

cheers,

 

 mark

Alteryx ACE & Top Community Contributor
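If it helps when picking field sizes for the Select suggestion above, this is a small Python sketch that reports the longest value seen in each CSV column. It runs outside Alteryx, the helper name `max_field_lengths` is hypothetical, and the inline sample data stands in for a real file.

```python
import csv
import io

def max_field_lengths(rows):
    """Return {column name: length of the longest value} for dict rows."""
    longest = {}
    for row in rows:
        for name, value in row.items():
            # Missing trailing values come back as None; count them as empty.
            longest[name] = max(longest.get(name, 0), len(value or ""))
    return longest

# Inline sample in place of a real file (assumption for illustration).
sample = io.StringIO("id,city\n1,Amsterdam\n22,Oslo\n")
widths = max_field_lengths(csv.DictReader(sample))
print(widths)
```

For a real file, pass `csv.DictReader(open(path, newline=""))` instead of the sample. The observed maximum is only a lower bound for the size you set, so leave some headroom if future files may contain longer values.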
