
Alteryx Designer Ideas


Ability to execute tools in parallel within the same workflow

Tools within a workflow need to be able to run in parallel wherever applicable.

 

For example: extracting 10 million rows from one source and 12 million rows from a different source in order to blend them.

Currently, the order of execution is the order in which tools are dragged onto the canvas. Hence Source1 runs first, Source2 second, and then the JOIN.

 

Here Source1 and Source2 are completely independent and can therefore run in parallel, which would reduce the workflow's execution time.
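To illustrate the idea outside of Alteryx, here is a minimal Python sketch of two independent reads running concurrently before a join. The functions `read_source1`, `read_source2`, and `inner_join` are hypothetical stand-ins with simulated delays, not anything Alteryx provides.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for two independent extracts; the sleep simulates
# the time a remote source would take to return its rows.
def read_source1():
    time.sleep(0.2)
    return [(1, "a"), (2, "b"), (3, "c")]


def read_source2():
    time.sleep(0.2)
    return [(2, "x"), (3, "y"), (4, "z")]


def inner_join(left, right):
    """Join two lists of (key, value) rows on the key."""
    lookup = {}
    for key, value in right:
        lookup.setdefault(key, []).append(value)
    return [(key, lval, rval) for key, lval in left for rval in lookup.get(key, [])]


def run_parallel():
    # Both reads are submitted before either result is awaited,
    # so their waiting time overlaps instead of adding up.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future1 = pool.submit(read_source1)
        future2 = pool.submit(read_source2)
        return inner_join(future1.result(), future2.result())


if __name__ == "__main__":
    start = time.perf_counter()
    rows = run_parallel()
    print(rows, f"{time.perf_counter() - start:.2f}s")
```

The join still has to wait for both inputs, so the saving comes only from overlapping the two reads.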

 

Execution time is crucial when you have a tight data loading window.

 

Hopefully Alteryx considers this for the next release!

40 Comments
Fireball

I second this! 

 

Even if the parallelization were only for input tools and nothing else, it would greatly enhance the efficiency of workflows which source data from multiple databases.

Meteor

I need this feature very often too! Usually I simply create several workflows and run them in parallel. With a little external code, this parallelization can be automated fairly easily. In my opinion, however, this is such a clear-cut future feature of Alteryx that it doesn't make sense not to implement it. Adding it would make the ETL part of the tool more complete.
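A rough sketch of that kind of external automation, in Python: start each workflow as its own process and wait for all of them. The commands below are placeholders (small Python one-liners); in practice you would substitute whatever command line your installation uses to run a workflow, which is not shown here.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Placeholder commands standing in for "run workflow A" and
    # "run workflow B"; replace them with your own workflow runner.
    demo_commands = [
        [sys.executable, "-c", "import time; time.sleep(0.2)"],
        [sys.executable, "-c", "import time; time.sleep(0.2)"],
    ]
    print(run_all(demo_commands))
```

A non-zero exit code in the returned list tells you which workflow failed, so a follow-up step can be skipped or retried.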

 

It is, however, my opinion that the feature should not stand alone but be part of a larger set of features for controlling the execution order of nodes and tool containers. Alteryx would become truly awesome if this were implemented.

Alteryx Partner

I'd like this as well.  Thanks.

Hi Alex,

 

Thanks for your reply!

 

I'm very surprised that enterprise clients with tight data loading or analytical processing windows for fetching data from source systems have not requested this parallel processing feature!

 

Regards,

Sandeep.

Alteryx

I think this would be valuable as well. In my mind, the request should be related to scaling processing via multi-threading.

 

I have a related feature request to allow batch macros to be multi-threaded. Since batch macros know all the possible inputs before the first iteration is run, they theoretically could be processing iterations on multiple threads / cores. This would be an extremely powerful feature, and if implemented only in the context of batch macros, could (maybe?) limit the implementation complexity.
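As a generic illustration (not how Alteryx batch macros are implemented), here is a Python sketch of the pattern: because every control value is known before the first iteration starts, the iterations can be handed to a worker pool. `run_iteration` is a hypothetical stand-in for the body of one iteration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_iteration(control_value):
    # Hypothetical body of one batch iteration; a real one would
    # filter, read, or transform data for this control value.
    return control_value, control_value ** 2


def run_batch(control_values, max_workers=4):
    """Run one iteration per control value on a pool of worker threads.

    pool.map returns results in input order, so the combined output
    looks the same as if the iterations had run one after another.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_iteration, control_values))


if __name__ == "__main__":
    print(run_batch([1, 2, 3]))
```

In CPython, threads mainly help when iterations wait on I/O; for CPU-heavy iterations, `ProcessPoolExecutor` from the same module would be the closer analogue to using multiple cores.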

I think this would be a great addition.  Why would this slow people down?  I'm not sure I get the reasoning behind not planning it.  

Alteryx
Status changed to: Under Review

Hi all,

 

Thanks for the continued feedback. Though we're still not planning to fully implement parallel processing for the entire workflow, we are starting to look at how we can extract data from multiple input sources at the same time.

 

Best,

Alex 

Meteor

Parallel input data extraction would be a very good starting point and, for my organization, perhaps the most important node type to support parallel processing, because:

 

  1. We often use a wide range of independent, server-hosted databases on different platforms in a single workflow, so letting multiple db servers work at once has no adverse performance effects.
  2. We often use complex SQL selects in input nodes, meaning that our db servers often need some time to generate the resulting recordsets. So even the extractions of data from the dbs into Alteryx, which do require local resources, will not necessarily happen simultaneously, which is one more argument for parallel processing of input nodes.

Hope to see this feature soon!

Bolide

Any updates on this? There are many situations ripe for parallel processing: reading inputs, splitting a file into multiple streams to sort it in different ways or process the same data in different contexts, executing multiple summaries from the same source, etc.

The modern PC has multiple cores and multi-threading to support this. If I write two simple workflows to read two different inputs, I can run them concurrently; that is essentially parallel processing. But if I put both in the same workflow, the reads are serialized, doubling my read time.

I suspect your code is a sort of load-and-go design, where it is not precompiled but is loaded and interpreted at runtime, so everything is forced into a single instance of the Alteryx process. I wonder if you could take advantage of piping to build a new type of connector or I/O tool that reads from or writes to another concurrent Alteryx process. Then we could write workflows that are essentially macros, each performing a unit of work (reading, writing, or processing), all of which can run concurrently, passing data through pipes as quickly as the buffers allow.

I've used pipes to connect separate programs, one processing and writing and the other reading and doing further processing. This significantly reduces file management overhead and I/O, and thus wall-clock time.
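For anyone unfamiliar with the technique, here is a small Python sketch of two separate processes connected by an OS pipe, with no intermediate file. The producer and consumer are hypothetical one-line programs chosen only to keep the example self-contained; nothing here is an Alteryx API.

```python
import subprocess
import sys

# Hypothetical producer: writes one number per line to its stdout.
PRODUCER = "for i in range(5): print(i)"
# Hypothetical consumer: reads lines from its stdin and prints their sum.
CONSUMER = "import sys\nprint(sum(int(line) for line in sys.stdin))"


def run_pipeline():
    """Run producer and consumer concurrently, connected by a pipe."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER], stdout=subprocess.PIPE
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe's read end so the producer is notified
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.strip()


if __name__ == "__main__":
    print(run_pipeline())
```

Both processes run at the same time: the consumer starts working as soon as the first lines arrive, and the pipe's buffer throttles the producer if the consumer falls behind.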

Rather than reworking internal code in what might be a major way, maybe this approach would be less disruptive and easier to build in.

Alteryx Certified Partner

Piggybacking on the above comments. This is a barrier to entry for some of my enterprise clients in terms of using Alteryx. Please implement parallel inputs, especially with in-DB tools.