Hi,
For example, let's say I have a file of about 400 GB, while my PC has only about 20 GB of memory available.
To handle this, I've uploaded the file to Google BigQuery.
My question is: will Alteryx Designer be able to process this file through a workflow?
I'm not an IT expert, but my intuitive guess is that Alteryx Designer won't have enough temporary memory to save data for each tool.
It would be best to try running the workflow with a real example,
but I want to know this in advance before talking to my client.
P.S. If I use Alteryx Server, will the problem above be solved by making the operation cloud-based?
(We're probably deploying Alteryx Server next year, and it would be informative to know in advance)
Hi @JunePark ,
Yes, Alteryx can definitely process files that big with no problems. You only need to test it to see how long it takes, which of course depends on the complexity of the workflow.
If you run it locally in Alteryx Designer on an average notebook, you can use the AMP engine, a multi-threaded processing engine released in version 2020.2, which runs a lot faster than the regular engine. Find out more here: https://help.alteryx.com/current/designer/alteryx-engine-and-amp-main-differences
Regarding memory usage, the AMP engine uses 25% of the notebook's available memory to process a workflow; once that is consumed, it starts writing temp files to disk to process the rest.
Also, after processing the data for a specific tool, Alteryx moves on and keeps only a sample of the data for the Results tab (1 MB of data in memory by default).
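To put rough numbers on the memory point, here is a minimal back-of-the-envelope sketch in Python. It simply applies the 25% figure mentioned above to the 20 GB / 400 GB case from the question. The function names are hypothetical (they are not part of Alteryx), and the spill figure is a worst-case assumption where a tool would need the whole dataset at once; real workflows stream data and usually need much less.

```python
def memory_budget_gb(available_gb: float, fraction: float = 0.25) -> float:
    """Memory the engine may use before spilling to temp files.

    Assumes the 25%-of-available-memory figure quoted in this thread.
    """
    return available_gb * fraction


def worst_case_spill_gb(dataset_gb: float, available_gb: float,
                        fraction: float = 0.25) -> float:
    """Upper bound on temp-file space if a tool had to hold all the data.

    This is a pessimistic assumption, not a measurement of Alteryx behavior.
    """
    return max(0.0, dataset_gb - memory_budget_gb(available_gb, fraction))


if __name__ == "__main__":
    budget = memory_budget_gb(20)          # 20 GB of free memory
    spill = worst_case_spill_gb(400, 20)   # 400 GB dataset
    print(f"In-memory budget: {budget} GB")
    print(f"Worst-case temp files: {spill} GB")
```

In other words, the limiting factor in this scenario is likely to be free disk space for temp files and run time, rather than RAM itself.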
Tips:
Another option is to use the In-Database tools, which let you use your database's performance to run your workflows. This can improve your processing time a lot, from minutes to seconds (we are currently using Redshift in some projects and it is breathtaking).
https://help.alteryx.com/current/designer/database-overview
Lastly, if you move to Alteryx Server with more RAM, it will improve your processing time. Currently (2020.3), however, you can't choose which workflows run with the AMP engine and which do not: there is only a global engine setting, and it is only suggested for some cases with Alteryx Server. (Keep in mind that this is the current state and may change in future versions.)
Hope this clarifies a bit.
Best,
Fernando Vizcaino
As a non-IT professional, I have a lot to learn.
Thanks for your detailed advice, I'll go through it thoroughly.
I hope Alteryx supports In-Database tools for BigQuery as well in the future.
Hope you have a wonderful day!