
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

In-Database Data Stream out 50 - 85% of Workflow Run Time

Alteryx Partner

Hello Community, 

 

Has anyone experienced workflows where the In-DB Data Stream Out tools take up the majority of the run time?

 

I have an application that retrieves data from MS SQL Server based on user input. Initially, I thought the slow response came from queries hitting large tables in the database, but performance profiling showed the Data Stream Out tools taking up the majority of the run time. There are 4 Data Stream Out tools, and their cumulative share is anywhere from 50 to 80 percent.

 

*Edit* The workflow is only streaming 20 records or fewer through each Data Stream Out tool, which is the main reason I was curious about the run time.

 

I figure the delay may be more on the SQL Server side, and I am curious whether anyone has experience troubleshooting this issue.

 

Thanks,

 

Jack 

  

Alteryx Certified Partner

I/O costs a lot of time when you're selecting the data from your tables. You are piping your data into a single faucet and then reading that data into the workflow.

 

Can you do more processing IN-DB and bring fewer records together on your desktop?

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Partner

@MarqueeCrew My bad, I forgot to mention that the workflow is only streaming 20 records or fewer through each tool from the database back to Alteryx.

Did you ever figure this one out? I’ve started experiencing the same thing and profiling puts my 2 out streams at 70-80%. I’m dealing with large data joined with large data, but the IN-DB functions drastically reduce the rows I’m trying to stream into the workflow.

For what it’s worth, I have a number of workflows that perform the same operations but are filtered on different subsets of data. One workflow filters down to 33M rows, does some work, then streams out ~3,000 rows, and the streaming portion takes around 22 minutes. Another workflow references the same initial DB table, filters to 70M rows, does the same calculations, and streams out ~1,000 rows, but the stream takes 40 minutes. So the first case streams out 3x as many rows yet takes about half the time. That makes me think that either the profiler doesn’t properly count the time specifically allocated to the Data Stream Out tool, or the stream out is somehow impacted by the size of the data filtered into the In-DB calculation, which doesn’t feel right.
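
As a rough check on the comparison above, here is a minimal Python sketch that normalizes the quoted stream times two ways: per row streamed out, and per million rows filtered in-database. The figures are the ones quoted in this post; the names `runs` and `stream_rates` are purely illustrative. If the time tracks the filtered row count more closely than the streamed row count, that would be consistent with the second hypothesis, though two data points cannot prove it.

```python
# Figures quoted in the post; all names here are illustrative only.
runs = {
    "workflow_a": {"filtered_rows": 33_000_000, "streamed_rows": 3_000, "stream_minutes": 22},
    "workflow_b": {"filtered_rows": 70_000_000, "streamed_rows": 1_000, "stream_minutes": 40},
}


def stream_rates(run):
    """Normalize the profiled stream time by output rows and by filtered rows."""
    seconds = run["stream_minutes"] * 60
    return {
        "sec_per_streamed_row": seconds / run["streamed_rows"],
        "sec_per_million_filtered": seconds / (run["filtered_rows"] / 1_000_000),
    }


rates = {name: stream_rates(run) for name, run in runs.items()}

for name, r in rates.items():
    print(
        f"{name}: {r['sec_per_streamed_row']:.2f} s per streamed row, "
        f"{r['sec_per_million_filtered']:.1f} s per million filtered rows"
    )
```

With these numbers, the time per streamed row differs by more than 5x between the two workflows, while the time per million filtered rows differs by less than 20%.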

Anyway, thanks for any suggestions in troubleshooting Data Stream Out time issues.