Hi Community,
How can we measure the complexity of an Alteryx workflow? Can anyone offer guidance on this?
Thanks in advance.
There's no standard measure of workflow complexity. A very rough estimate might be the number of tools on the canvas.
I would weight this by tool category, e.g. Preparation = 1x, Transform and Join = 2x, Developer = 3x.
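A minimal sketch of the weighted count, written in Python rather than in Alteryx. The weights follow the 1x/2x/3x suggestion above; the tool-to-category mapping and the default weight of 1 for unmapped tools are assumptions for illustration only.

```python
# Weights per tool category, following the 1x/2x/3x suggestion above.
CATEGORY_WEIGHTS = {"Preparation": 1, "Transform": 2, "Join": 2, "Developer": 3}

# Hypothetical, incomplete mapping from tool name to palette category.
TOOL_CATEGORIES = {
    "Formula": "Preparation",
    "Filter": "Preparation",
    "Select": "Preparation",
    "Sort": "Preparation",
    "Summarize": "Transform",
    "AppendFields": "Join",
    "Join": "Join",
    "DynamicRename": "Developer",
}


def weighted_tool_count(tool_counts, default_weight=1):
    """Sum tool counts, each multiplied by the weight of its category.

    Tools missing from TOOL_CATEGORIES fall back to default_weight
    (an assumption, not part of the original suggestion).
    """
    total = 0
    for tool, count in tool_counts.items():
        category = TOOL_CATEGORIES.get(tool)
        weight = CATEGORY_WEIGHTS.get(category, default_weight)
        total += weight * count
    return total


# Hypothetical workflow: 3 Formula tools, 1 Join, 1 DynamicRename.
score = weighted_tool_count({"Formula": 3, "Join": 1, "DynamicRename": 1})
print(score)
```

The score is only comparable between workflows that are scored with the same mapping and weights.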
Hi @Gursharan,
This is a good topic.
The number of tools is one indicator, as proposed by @PhilipMannering.
The "shape" of the workflow can be another indicator.
I tried to capture the shape with the Network Analysis tool, as shown below.
I am sharing this workflow only as a basis for discussion, as it still raises errors in some cases.
Once the errors are resolved, it can be converted to a batch macro so that it can process all workflows under a directory.
Workflow
Output 1 - Network Visualization
Output 2 - Tool Count
| ToolName | Count |
| --- | --- |
| Formula | 3 |
| AppendFields | 1 |
| Directory | 1 |
| Filter | 1 |
| MacroOutput | 1 |
| Select | 1 |
| Sort | 1 |
| Summarize | 1 |
Output 3 - Network Statistics
| Name | Value |
| --- | --- |
| Diameter | 8 |
| Clustering | 0.25 |
| Path Length | 3.22 |
| Density | 0.22 |
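For readers who want to see what some of these statistics mean, here is a rough Python sketch that computes diameter, average path length, and density for a small undirected graph. The definitions used are the common textbook ones and are an assumption; the Network Analysis tool may calculate them differently, and the clustering coefficient is left out. The edge list is a hypothetical example, not the workflow above.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (origin, destination) pairs.

    Assumes there are no self-loops; duplicate edges are ignored.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, start):
    """Shortest hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def network_stats(edges):
    adj = build_adjacency(edges)
    n = len(adj)
    edge_count = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Density: actual edges divided by the maximum possible number of edges.
    density = 2 * edge_count / (n * (n - 1)) if n > 1 else 0.0
    # Collect shortest-path lengths between all connected pairs of nodes.
    lengths = []
    for node in adj:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                lengths.append(d)
    diameter = max(lengths) if lengths else 0
    path_length = sum(lengths) / len(lengths) if lengths else 0.0
    return {"diameter": diameter, "path_length": path_length, "density": density}


# Hypothetical five-tool workflow with one branch at tool 2.
stats = network_stats([(1, 2), (2, 3), (3, 4), (2, 5)])
print(stats)
```

A long, unbranched chain of tools gives a large diameter and low density, while a heavily cross-connected workflow gives the opposite.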
Hi @Gursharan
The complexity of the underlying algorithm that you're replicating in Alteryx will determine the minimum complexity of the workflow. How would you measure the complexity of an arbitrary process or algorithm? If you can provide detailed answers to this question, we can probably help you build the equivalent in Alteryx.
In response to the answers from @PhilipMannering and @Yoshiro_Fujimori regarding tool count: complexity and maintainability are often in conflict. Reducing the number of tools in a workflow can make it look less "complex" while making it harder to maintain. In the extreme, you can probably replace an arbitrarily complex workflow with a single Python or R tool. You've reduced the tool count to 1 at the expense of requiring detailed Python knowledge to debug and maintain the algorithm.
On the other hand, increasing the number of tools can often make the workflow easier to maintain without increasing its complexity. Using a single Formula tool to convert multiple date fields of different formats (something I'm very guilty of) is more compact than using a chained set of DateTime tools. Logically, the Formula tool and the set of DateTime tools perform the same operations, so the complexity of the operation hasn't changed even though the tool count is lower with the Formula tool.
@Yoshiro_Fujimori has a very interesting take on the problem using the Network Analysis tool. In general, algorithms increase in complexity with the number of branches that must be taken, so this is a very useful workflow to have.
If you feel the need to create less complex workflows, keep this quote attributed to Albert Einstein in mind: “Make things as simple as possible, but no simpler.”
Dan
I liked your Network Analysis workflow so much that I made a few modifications to it. I started by increasing the Input line length to 100000, since any complex workflow will easily exceed 256 characters on one line. The next change has to do with containers. Because the node XML of any tool inside a container is nested inside the XML of the container node, your method of using an XML Parse tool missed the embedded tools. You could build an iterative macro to recursively parse all the nodes, but that sounded like too much work. I got around this problem by treating the lines individually and ultimately using a Multi-Row Formula tool to come up with the tool name.
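To illustrate the container nesting issue outside Alteryx, here is a small Python sketch that walks every node at any depth. It is not the method used in either workflow. The XML layout is an assumption based on a simplified workflow file: `Node` elements with a `ToolID` attribute, a `GuiSettings` child with a `Plugin` attribute, container children nested under `ChildNodes`, and `Connection` elements with `Origin` and `Destination` children. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified workflow XML with one tool inside a container.
SAMPLE = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput.DbFileInput" />
    </Node>
    <Node ToolID="2">
      <GuiSettings Plugin="AlteryxGuiToolkit.ToolContainer.ToolContainer" />
      <ChildNodes>
        <Node ToolID="3">
          <GuiSettings Plugin="AlteryxBasePluginsGui.Formula.Formula" />
        </Node>
      </ChildNodes>
    </Node>
  </Nodes>
  <Connections>
    <Connection>
      <Origin ToolID="1" Connection="Output" />
      <Destination ToolID="3" Connection="Input" />
    </Connection>
  </Connections>
</AlteryxDocument>"""


def parse_workflow(xml_text):
    """Return ({tool_id: tool_name}, [(origin_id, destination_id), ...])."""
    root = ET.fromstring(xml_text)
    tools = {}
    # iter() visits descendants at every depth, so nodes nested inside
    # containers are found without writing explicit recursion.
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        plugin = gui.get("Plugin", "") if gui is not None else ""
        # Use the last dotted segment of the plugin string as the tool name.
        tools[node.get("ToolID")] = plugin.rsplit(".", 1)[-1] or "Unknown"
    edges = []
    for conn in root.iter("Connection"):
        origin = conn.find("Origin")
        dest = conn.find("Destination")
        if origin is not None and dest is not None:
            edges.append((origin.get("ToolID"), dest.get("ToolID")))
    return tools, edges


tools, edges = parse_workflow(SAMPLE)
tool_counts = Counter(tools.values())
print(tool_counts, edges)
```

Nodes without a `Plugin` attribute are labelled "Unknown" here, which is a placeholder choice rather than how Alteryx names them.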
Here is the network of your initial workflow when analyzed by itself. Notice that the tools inside the container are missing completely.
Here's the same workflow network as parsed by the modified version.
You can actually trace the data path through the workflow from the input at the top left to the various Browse tools at the bottom and on the left.
As a test, I analyzed one of our more "interesting" workflows and got this as the output.
This workflow only works for versions 2022.3 and earlier, since I didn't have time to add handling for the new XML that deals with Control Containers.
Dan