Speed Up IN-DB Data Stream Out
I have a large workflow with several Data Stream Out tools fed from the same Connect In-DB tool, which connects to Athena through the 64-bit ODBC driver. From that tool there are multiple branches, each with different filters applied. Each branch ends with a Summarize tool and a Data Stream Out. The workflow takes over 3 hours to run. Is there something I could be missing to speed this up? Would it be quicker to stream out the raw data just once and then branch off and filter from that? Any advice, tips, or tricks would be much appreciated.
Have you looked at the Advanced Options in the ODBC configuration, such as "Rows to Fetch Per Block"?
The Inspire 2016 Tips and Tricks document has a few tips for In-DB connections.
I'm not sure how dated the document is, or whether things like "Cache Data" and "Do Not Show % Complete" apply to an Athena connection.
Is it possible to reduce the number of records queried by using the Data Stream In tool and then using that data in a WHERE clause with the Dynamic Input In-DB tool?
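To illustrate the WHERE-clause idea outside Alteryx, here is a minimal Python sketch. It uses the standard-library `sqlite3` module as a stand-in for Athena, and the table `events`, the column `customer_id`, and the helper `build_in_clause_query` are all made up for the example. It is not how the Dynamic Input In-DB tool is implemented; it only shows how a small list of keys can shrink the number of rows the database has to return.

```python
import sqlite3

def build_in_clause_query(table, column, keys):
    """Hypothetical helper: build a parameterized SELECT restricted to the given keys."""
    if not keys:
        raise ValueError("keys must not be empty")
    placeholders = ", ".join("?" for _ in keys)
    sql = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return sql, list(keys)

# In-memory SQLite database standing in for the real Athena source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

# The small key list plays the role of the data brought in with Data Stream In.
wanted_ids = [3, 17, 42]
sql, params = build_in_clause_query("events", "customer_id", wanted_ids)

rows = conn.execute(sql, params).fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(f"fetched {len(rows)} of {total_rows} rows")
```

With this toy data, only the rows for the three requested IDs leave the database; the rest never cross the connection.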
Chris
I can answer one part of this: it's faster to run your Summarize tools In-DB and then stream out. 3 hours seems like a long time. A quick question: can you confirm how much RAM your system has? When Alteryx uses Data Stream Out, it reads everything into memory, and if it hits the limit of available memory it will freeze. Use Resource Monitor to see whether Alteryx stalls, how much memory you have, and how much is required.
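As a rough illustration of why summarizing before streaming out helps, here is a small Python sketch. It uses the standard-library `sqlite3` module as a stand-in for Athena, and the `sales` table and its columns are invented for the example. Both options produce the same totals; the difference is how many rows have to travel from the database to the client.

```python
import sqlite3

# In-memory SQLite database standing in for the real Athena source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
regions = ["north", "south", "east", "west"]
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(regions[i % 4], i) for i in range(20_000)],
)

# Option A: filter in the database, stream out every matching row,
# then summarize on the client.
raw_rows = conn.execute(
    "SELECT region, amount FROM sales WHERE amount >= 10000"
).fetchall()
client_totals = {}
for region, amount in raw_rows:
    client_totals[region] = client_totals.get(region, 0) + amount

# Option B: filter and summarize in the database, stream out only the summary.
summary_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales WHERE amount >= 10000 GROUP BY region"
).fetchall()
db_totals = dict(summary_rows)

print(f"rows transferred, option A: {len(raw_rows)}")
print(f"rows transferred, option B: {len(summary_rows)}")
print("same totals:", client_totals == db_totals)
```

In this toy case option A moves 10,000 rows and option B moves 4. The real ratio depends on your data and your grouping fields, but the smaller the streamed result, the less memory the stream-out step needs.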
Also, as @ChrisTX pointed out, Dynamic Input In-DB with an extensive WHERE clause is awesome.
If this doesn't show up as a memory issue in Resource Monitor, perhaps your ODBC driver is configured incorrectly. Try limiting your query to a hundred million rows and see whether it works.
