
SOLVED

Save ODBC Input Data in workflow

cswanson21x
7 - Meteor

I feel like this is a newbie question, but I don't typically work with input data as large as what I'll be using in my current project...

 

I am importing a large dataset through an ODBC connection and don't want the data to be pulled in each time I run the flow as I build it out (time and server strain).

 

Do I just click the Cache Data button when I first run the workflow? And if I save and close the file, will I need to run and cache the input again each time I re-open the flow?

 

Thanks!  

4 REPLIES
jferrone
8 - Asteroid

This is the approach I am taking for a similar issue. It may or may not help with yours.

 

I need approximately 90 days of data for this workflow, but I don't want to pull all 90 days each time the query runs. What I am doing is saving the output to an Alteryx Database (YXDB) file. I then run the query for just the last 3 days, union that data to the YXDB, de-duplicate it, and run the rest of the workflow. The YXDB is updated in the process, and older days are removed as needed to maintain a rolling 90 days (see the sketch below). In this instance I am only saving summary-level detail, not the entire data set; the file may be large if you need to save all of the details.
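For reference, here is a minimal sketch of that rolling-window pattern in Python/pandas rather than Alteryx tools. Everything named is a hypothetical stand-in: the DSN, the sales table and event_date column, the SQL Server-style date functions, and the local Parquet file (in Designer the store would be a .yxdb written by an Output Data tool, which pandas cannot write natively).

```python
# Rolling 90-day refresh, sketched in Python/pandas instead of Alteryx tools.
# All names are hypothetical stand-ins: the DSN, the "sales" table, the
# "event_date" column, and the Parquet store (a .yxdb in Designer).
import pandas as pd
import pyodbc
from pathlib import Path

STORE = Path("store.parquet")  # local history file; a YXDB in Designer

# 1. Pull only the last 3 days from the source.
conn = pyodbc.connect("DSN=my_dsn")  # hypothetical DSN
recent = pd.read_sql(
    "SELECT * FROM sales WHERE event_date >= DATEADD(day, -3, GETDATE())",
    conn,
)

# 2. Union the fresh pull with the previously saved history, if any.
history = pd.read_parquet(STORE) if STORE.exists() else recent.iloc[0:0]
combined = pd.concat([history, recent], ignore_index=True)

# 3. De-duplicate, then trim back to a rolling 90-day window.
combined = combined.drop_duplicates()
cutoff = pd.Timestamp.now() - pd.Timedelta(days=90)
combined = combined[pd.to_datetime(combined["event_date"]) >= cutoff]

# 4. Write the refreshed history back for the next run.
combined.to_parquet(STORE, index=False)
```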

 

The cache only persists while the workflow is open.

 

 

echuong1
Alteryx Alumni (Retired)

You can "cache and run" on the input the first time you run it. This will not re-run the database pull and will use the data downloaded into temp space moving forward for the remainder of your session. The temp data will be cleared when you close the workflow, so depending on your use case this may not be the best route.

 

If you want to keep using the data without having to re-query the database each time you open the workflow, you can start by inputting the data and then writing it to a .yxdb file with an Output tool. This stores the data in an Alteryx-optimized format, and that file can be used as your input for subsequent runs instead of the actual database. Conceptually, the pattern looks like the sketch below.
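As a rough illustration of that "query once, read locally afterwards" pattern outside Designer, here is a hedged Python sketch. The DSN and table name are hypothetical, and a Parquet file stands in for the .yxdb, since the YXDB format is Alteryx-specific.

```python
# "Query once, read locally afterwards": materialize the database pull the
# first time, then feed later runs from the local file. DSN and table are
# hypothetical; Parquet stands in for the Alteryx-specific .yxdb format.
import pandas as pd
import pyodbc
from pathlib import Path

SNAPSHOT = Path("snapshot.parquet")  # local stand-in for the .yxdb

if SNAPSHOT.exists():
    # Subsequent runs: read the local snapshot instead of re-querying.
    df = pd.read_parquet(SNAPSHOT)
else:
    # First run: hit the database once and save the result locally.
    conn = pyodbc.connect("DSN=my_dsn")  # hypothetical DSN
    df = pd.read_sql("SELECT * FROM big_table", conn)  # hypothetical table
    df.to_parquet(SNAPSHOT, index=False)
```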

 

Hope this helps!

JagdeeshN
12 - Quasar

Hi @cswanson21x ,

 

If you are looking to run the workflow with the complete dataset every time while developing, the caching method that @echuong1 mentioned would be viable.

 

But if you are looking to restrict the amount of data that comes in during development (i.e., every run during development does not need the whole data set, which speeds up execution), you can use this feature:

 

 

Within the Input Data tool in Alteryx, one of the options is Record Limit. You can specify a number of rows here, limiting the amount of data coming in during development.

 

[Screenshot: the Record Limit option in the Input Data tool configuration]

 

 

Doing it this way ensures you restrict the amount of data while still being able to develop and test against an active query. The sketch below shows the same idea expressed query-side.
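For comparison, here is what the Record Limit idea looks like as a plain row cap on the query itself, to be removed for production runs. The DSN and table are hypothetical, and the TOP syntax assumes a SQL Server-style backend (other databases use LIMIT).

```python
# Development-time row cap, the query-side equivalent of the Record Limit
# option. DSN and table are hypothetical; TOP assumes a SQL Server-style
# backend (other databases use LIMIT instead).
import pandas as pd
import pyodbc

DEV_ROW_LIMIT = 1000  # small sample while building out the workflow

conn = pyodbc.connect("DSN=my_dsn")  # hypothetical DSN
sample = pd.read_sql(
    f"SELECT TOP {DEV_ROW_LIMIT} * FROM sales",  # remove the cap for full runs
    conn,
)
```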

cswanson21x
7 - Meteor

Thank you! 

This is a perfect solution.


I should have known from my Core cert studying that the YXDB is the ultimate file format to save that data as!

Thanks again,
