Alteryx Designer Desktop Discussions

SOLVED

Speed Up IN-DB Data Stream Out

Barry_Cooper
5 - Atom

I have a large workflow with several Data Stream Out tools fed from the same Connect In-DB tool, which connects to Athena through the 64-bit ODBC driver. From the Connect In-DB tool there are multiple branches, each with different filters applied, and each branch ends with a summarize tool and a Data Stream Out. The workflow takes over 3 hours to run. Is there something I could be missing that would speed this up? Would it be quicker to stream out the raw data just once and then branch and filter from that? Any advice, tips, or tricks would be much appreciated.

2 REPLIES
ChrisTX
15 - Aurora

Have you looked at the ODBC config Advanced Options, like "Rows to Fetch Per Block"?

 

The Inspire 2016 Tips and Tricks document has a few tips for in-DB connections:

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Inspire-2016-Tips-amp-Tricks/ta-p/2...

 

Not sure how dated the document is, or if things like "Cache Data" and "Do Not Show % Complete" are applicable to an Athena connection.

 

Is it possible to reduce the number of records queried by using the Data Stream In tool, and using that data in a where clause with the Dynamic Input In-DB tool?
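
To make the where-clause idea concrete, here is a minimal Python sketch of turning a small list of key values into an `IN (...)` predicate. The table name `sales`, the column `region`, and the helper `build_in_clause` are all made up for illustration; inside Alteryx the same text would be built by the workflow itself, not by Python.

```python
def build_in_clause(column, values):
    """Return a SQL predicate such as: col IN ('a', 'b')."""
    if not values:
        raise ValueError("need at least one key value")
    # Double any single quotes so each value stays a valid SQL string literal.
    quoted = ["'" + str(v).replace("'", "''") + "'" for v in values]
    return f"{column} IN ({', '.join(quoted)})"


# Hypothetical keys, standing in for the records sent through Data Stream In.
region_keys = ["EMEA", "APAC", "O'Brien"]
where_clause = build_in_clause("region", region_keys)

# Hypothetical table name; the filter now runs in the database.
query = f"SELECT * FROM sales WHERE {where_clause}"
print(query)
```

The point is only that the database receives the filter, so fewer records come back over ODBC.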

 


Chris

apathetichell
18 - Pollux

I can answer one part of this: it's faster to run your summarize tools in-DB and then stream out. 3 hours seems like a good chunk. A few quick questions: can you confirm how much RAM your system has? When Alteryx uses Data Stream Out it reads everything into memory, and if it hits the cap of available memory it will freeze. Use Resource Monitor to see whether Alteryx stalls, and compare how much memory you have with how much is required.
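
For a rough sanity check before opening Resource Monitor, a back-of-the-envelope estimate can be sketched in Python. The function names, the 200-byte average row size, and the 80% headroom factor are assumptions for illustration only, not Alteryx figures; the real footprint depends on field types and overhead.

```python
def estimate_stream_out_gb(row_count, avg_row_bytes):
    """Rough size in GiB of a result set: rows times average bytes per row."""
    return row_count * avg_row_bytes / 1024 ** 3


def fits_in_memory(row_count, avg_row_bytes, available_gb, headroom=0.8):
    """True if the estimate stays under an assumed fraction of available RAM."""
    return estimate_stream_out_gb(row_count, avg_row_bytes) <= available_gb * headroom


# Assumed inputs: 100 million rows at ~200 bytes each on a 16 GB machine.
size_gb = estimate_stream_out_gb(100_000_000, 200)
print(f"estimated size: {size_gb:.1f} GiB")
print("fits in 16 GB:", fits_in_memory(100_000_000, 200, 16))
```

If the estimate is well above the RAM you have, summarizing in-DB first should shrink what has to be streamed out.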

 

Also, @ChrisTX has pointed out that the Dynamic Input In-DB tool with an extensive where clause is awesome.

 

If you don't see this as a memory issue in Resource Monitor, perhaps your ODBC driver is configured incorrectly. Try limiting your query to a hundred million rows and then see if it works.
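
One way to cap the rows for a test run is to wrap the existing query in an outer `LIMIT`. This is a minimal sketch in Python; the helper `limit_query`, the alias `limited`, and the sample query are hypothetical, and the same wrapped SQL could just be typed into the Connect In-DB query box.

```python
def limit_query(query, max_rows):
    """Wrap a SELECT statement so it returns at most max_rows rows."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # Drop a trailing semicolon so the query can sit inside a subquery.
    inner = query.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) AS limited LIMIT {int(max_rows)}"


# Hypothetical query, capped at the hundred million rows suggested above.
test_query = limit_query("SELECT region, amount FROM sales;", 100_000_000)
print(test_query)
```

If the capped run finishes quickly and memory stays flat, the slowdown is more likely data volume than driver configuration.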
