How to Dynamically Analyse Workflow/File Dependencies/Lineage across multiple Workflows?

kenyap
5 - Atom

Hi all, 


For review purposes, I am trying to build an App that can be run over an Alteryx project folder and return:

 

a) A list of all workflows, with the inputs and outputs used by each workflow. I understand this should be straightforward by parsing the workflow XML.

b) A list of all of the dependencies for each file. In the example below, for any particular output I would want to see all of the workflows that must be run to produce it, and all of the input files those workflows require. The exact sequencing of the workflows is not a big deal, although in theory one should be able to work out the required run order for any output of any workflow from the dependencies.
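
For part a), a minimal Python sketch of the XML parsing idea might look like the following. The tag layout (`Node` / `GuiSettings` / `Properties/Configuration/File`) and the two `Plugin` names are assumptions about how Designer saves the standard Input Data and Output Data tools, so check them against one of your own .yxmd files. The `extract_files` helper and the `SAMPLE` document are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed Plugin attribute values for the standard Input Data / Output Data tools.
INPUT_PLUGIN = "AlteryxBasePluginsGui.DbFileInput.DbFileInput"
OUTPUT_PLUGIN = "AlteryxBasePluginsGui.DbFileOutput.DbFileOutput"


def extract_files(xml_text):
    """Return (inputs, outputs): file paths found in one workflow's XML."""
    root = ET.fromstring(xml_text)
    inputs, outputs = [], []
    # iter() also reaches nodes nested inside other nodes (e.g. containers).
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        file_el = node.find("Properties/Configuration/File")
        if gui is None or file_el is None or not file_el.text:
            continue
        plugin = gui.get("Plugin", "")
        if plugin == INPUT_PLUGIN:
            inputs.append(file_el.text.strip())
        elif plugin == OUTPUT_PLUGIN:
            outputs.append(file_el.text.strip())
    return inputs, outputs


# Hypothetical, heavily trimmed workflow document used only to exercise the parser.
SAMPLE = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput.DbFileInput" />
      <Properties><Configuration><File>Input1.txt</File></Configuration></Properties>
    </Node>
    <Node ToolID="2">
      <GuiSettings Plugin="AlteryxBasePluginsGui.DbFileOutput.DbFileOutput" />
      <Properties><Configuration><File>Output1.txt</File></Configuration></Properties>
    </Node>
  </Nodes>
</AlteryxDocument>"""

if __name__ == "__main__":
    print(extract_files(SAMPLE))
```

To cover a whole project folder, the same function could be applied to every file returned by `pathlib.Path(folder).rglob("*.yxmd")`. Real paths may be relative or carry format-specific suffixes, so some normalisation would likely be needed before matching one workflow's outputs against another's inputs.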

 

E.g., based on the example below:

 

Output4.txt requires the following workflows:

Workflow 1

Workflow 2

Workflow 3

 

Output4.txt requires the following inputs:

Input1.txt

Input2.txt

Output1.txt

Output3.txt

 

I am a bit confuzzled by b), as it seems quite hard to be able to iteratively check which workflows and inputs are required to obtain the final output. Conceptually, I think if this example can be solved, it can be easily extrapolated to large, complex projects.
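
One way to picture part b) outside Alteryx is as a backwards walk over (workflow, input, output) rows, starting from the output of interest. The sketch below is in Python and is only an illustration: the rows in `EXAMPLE_ROWS` are a guess at the example's wiring, chosen to be consistent with the requirements listed above for Output4.txt, and the `requirements` function is a hypothetical helper.

```python
from collections import deque

# Assumed wiring of the example: one row per (workflow, input file, output file).
EXAMPLE_ROWS = [
    ("Workflow 1", "Input1.txt", "Output1.txt"),
    ("Workflow 2", "Output1.txt", "Output3.txt"),
    ("Workflow 2", "Input2.txt", "Output3.txt"),
    ("Workflow 3", "Output3.txt", "Output4.txt"),
]


def requirements(target, rows):
    """Return (workflows, files) needed, directly or indirectly, to produce target."""
    # Index the rows by output so we can ask "what produces this file?".
    producers = {}
    for workflow, inp, out in rows:
        producers.setdefault(out, []).append((workflow, inp))

    workflows, files = set(), set()
    seen = {target}
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for workflow, inp in producers.get(current, []):
            workflows.add(workflow)
            files.add(inp)
            # Visit each file once, so the walk also terminates on circular references.
            if inp not in seen:
                seen.add(inp)
                queue.append(inp)
    return workflows, files


if __name__ == "__main__":
    needed_workflows, needed_files = requirements("Output4.txt", EXAMPLE_ROWS)
    print(sorted(needed_workflows))
    print(sorted(needed_files))
```

Under the assumed wiring, this returns Workflows 1 to 3 and the four files listed above for Output4.txt. The same function can be called for any other output in the row set.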

 

[Attached image: Test.JPG]

 

Let me know if anyone has any ideas!

 

Cheers,


Ken.

danilang
19 - Altair

Hi @kenyap 

 

Here is one possible solution.

 

[Image: danilang_0-1620211526635.png]

 

 

[Image: danilang_1-1620211569564.png]

 

It uses an iterative macro, which is the only way to approach a tree-walking problem where you don't know how many levels deep you need to go.  The main workflow builds up a set of unique workflow-input-output triplets, finds the final outputs, and marks these with level 0.  The macro has two containers.  The first, Current Chain, picks out the rows that have a level and sends them to the data output.  The second container sets up the inputs for the next iteration: it finds the rows whose outputs match the inputs of the current chain and marks them with the next level to process.  At every iteration the set of rows shrinks, and the macro stops when there are no more rows to process.
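
For readers who want to see the level-by-level idea outside Alteryx, here is a rough Python analogue. It is not the macro itself: the triplets are an assumed wiring of the example from the question, `walk_levels` is a hypothetical name, and each row is emitted once, at the first level that reaches it, which is a simplification.

```python
# Assumed (workflow, input, output) triplets for the example in the question.
TRIPLETS = [
    ("Workflow 1", "Input1.txt", "Output1.txt"),
    ("Workflow 2", "Output1.txt", "Output3.txt"),
    ("Workflow 2", "Input2.txt", "Output3.txt"),
    ("Workflow 3", "Output3.txt", "Output4.txt"),
]


def walk_levels(triplets):
    """Return (level, workflow, input, output) rows, level 0 being the final outputs."""
    all_inputs = {inp for _, inp, _ in triplets}
    remaining = list(triplets)
    # Level 0: rows whose output is never consumed as an input anywhere.
    current = [row for row in remaining if row[2] not in all_inputs]
    level = 0
    result = []
    while current:
        # "Current chain": emit the rows marked with this level.
        result.extend((level, *row) for row in current)
        # Drop the emitted rows, so the remaining set shrinks on every pass.
        remaining = [row for row in remaining if row not in current]
        # Next level: rows whose output feeds an input of the current chain.
        needed = {inp for _, inp, _ in current}
        current = [row for row in remaining if row[2] in needed]
        level += 1
    return result


if __name__ == "__main__":
    for row in walk_levels(TRIPLETS):
        print(row)
```

With the assumed triplets, Workflow 3 lands on level 0, the two Workflow 2 rows on level 1, and Workflow 1 on level 2.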

 

The output looks like this, showing the output chain at each step of the process.  The final chains are the ones with the largest level.

[Image: danilang_2-1620212259195.png]

 

Note that this workflow will only work if there are no loops in the process, e.g. the output of workflow A feeds into B, whose output feeds back into A.  In that case, the macro will hit the iteration limit, since there will always be rows left after each iteration.
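
If loops are a concern, one option is to check for them before running the dependency walk. The Python sketch below is only an illustration under assumptions: it takes the same kind of (workflow, input, output) triplets, `blocked_workflows` is a hypothetical helper, and it reports every workflow that sits in a loop or depends on one, without identifying the loop itself.

```python
def blocked_workflows(triplets):
    """Return workflows that are part of, or downstream of, a dependency loop."""
    # Which workflows write each file?
    produced_by = {}
    for workflow, _, out in triplets:
        produced_by.setdefault(out, set()).add(workflow)

    # deps[w] = workflows whose outputs w reads.
    deps = {workflow: set() for workflow, _, _ in triplets}
    for workflow, inp, _ in triplets:
        deps[workflow] |= produced_by.get(inp, set())

    # Repeatedly resolve workflows whose dependencies are all resolved already.
    resolved = set()
    progress = True
    while progress:
        progress = False
        for workflow, needed in deps.items():
            if workflow not in resolved and needed <= resolved:
                resolved.add(workflow)
                progress = True

    # Anything left over can never be scheduled, so a loop is involved.
    return set(deps) - resolved


if __name__ == "__main__":
    looped = [
        ("A", "X.txt", "Y.txt"),
        ("B", "Y.txt", "X.txt"),
    ]
    print(blocked_workflows(looped))
```

An empty result means a valid run order exists for every workflow in the set.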

 

Dan

 

 

 

 
