This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Hi all,
I feel I'm missing something simple. I am using 2020.2 creating a workflow where I pull data from a sqlserver database. Using SISS the query pulls about 12k in about 5 secs. However using an in database query the query is fairly speedy but the profiler shows data stream out takes up 99% of time (usually about 3/4 of hour) to get the data to the designer side to do next step write to excel file. Not sure how I optimize the workflow so it takes seconds rather than an hour. One thing I looked at was to minimize # of fields. I have about 10 fields that are not large so the who file is about 12 Mb. Seems I'm missing something with Alteryx.
Thoughts?
Thanks,
Jeff
Solved! Go to Solution.
Hi @jeffv ,
A few common reasons for slow In-DB data movement is that your machine might not have the necessary technical requirements for Alteryx or you might be using the wrong driver. I've linked our technical specifications here for your review and would recommend making sure you have a 64bit driver installed and set up.
Another thing to consider is that narrowing down your query will always help speed up the Data Stream Out tool, so doing as much filtering in the IN-DB tools before the data stream out could help. In most cases a unfiltered Data Stream Out will take the same amount of time as a standard Input Tool hooked up to a database.
Thanks for reply... I believe I have the technical requirements for running Alteryx. I have an engineering work station running 64Bit Windows 10 with 32Gb Mem and 1Gb SSD. I'm running 1 Gb network card and a processor with 8 cores. The link you sent I can't really decipher how it describes the driver (though describes many things). I believe I have the right driver though as some queries seem reasonable, but on occasion with various queries it runs exceeding slow (IMHO). I have narrowed down query with a single integer being data streamed out and it still takes a long time (except when I remove data stream out tool).
Alteryx processes queries in a unique way as a sub-select. I decided to use Python to execute the query and bring the info back. Shaazam!!! The query data came back in a few seconds. I guess at this point how do I optimize sql queries for a sqlserver database for Alteryx? Can someone give me pointers to a blog or white paper?
Is python executing a SQL script and then writing to a new server from which the data is originally queried from?
I am dealing with a scenario where the stream in and stream out from Ex: Server 1 to Server 2 is taking many hours and causing us problems. I am currently only using IN DB tools but appear to have hit a roadblock.
My suggestion is to use python (or R but I'm not versed in R)... in my efforts Python is much faster than the way Alteryx generates sql queries.
Could you please share an example using python to load a table?