Hello - I have a use case to execute an Alteryx workflow in parallel. I have a workflow parameterized using one element, which gets data from Postgres and Teradata, performs a comparison, and creates an output file. This workflow is highly optimized, but it takes 1 hour to process the comparison, as the average data volume is 50M. Now we want to execute the workflow in parallel, say 5 parallel processes, by passing different parameters, so that we can perform 5 comparisons in 1 hour rather than spending 5 hours. Could you please let me know how to do this in Alteryx?
Have you tried to transform it into a macro with application tools inside? Then you can call the macro as many times as you want, and you can change the parameters as well.
You can look at the Data Cleansing tool, as it's a macro, to get an idea :)
Hope this helps,
Regards
Hi @Dkundu,
I would not agree with @messi007: when you set up a macro or any tool in a workflow, it might not parallelize, and it depends a lot on the content of the macro (the tools inside...). Most of the time, the longest part of the process is extracting data from the server (in your case Teradata and Postgres) to your environment. In your case, there are a few different possibilities that come to mind if the data extracted is the same:
If your data is different every time, I don't really see another way, since the most time-consuming step will be the extraction of the data.
Generally, I would advise you to run your workflow with the option Enable Performance Profiling (Runtime settings) and see which action in your workflow is the most time-consuming. Once this is found, you may be able to refactor it with ease!
Thanks for your reply. For each run my data is different, so we are making the DB call well ahead of time and placing the data in a file on the Alteryx server. So our requirement is just to call the same workflow multiple times, maybe 5, once for each data set or input. Please note we are getting the input in a spreadsheet. Do you think your macro will work for my use case?
I am thinking that if we save the job 5 times under different names and link each copy to a different spreadsheet, it can run in parallel. As we are keeping the data in a file, our goal is to use AMP on top of it.
Hi again @Dkundu,
Here is a scheme of how I would do it. The first model is for when you have multiple source files for your tests, meaning multiple workflows to process these tests. The second model is a bit more straightforward: you have only 1 data source for all the tests, you output it as yxdb to improve performance, and then you run one workflow containing all the tests you might need.
My problem is a little different. Let's say I have 10 run IDs like R1, R2 … R10. I have one Alteryx flow. If I push R1 to my Alteryx flow, it takes 2 hours to process. After that I have to push R2 and wait 2 hours for it to complete, so if I have 10 run IDs I have to wait 20 hours. All I want is to complete the process in 2 hours by running the same workflow 10 times with different run IDs as parallel streams at the same time.
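One way to picture the "10 run IDs at the same time" idea is a small launcher script outside the workflow that starts one process per run ID and waits for all of them. The Python sketch below is only an illustration under assumptions: build_command is a hypothetical placeholder that just echoes the run ID, and you would need to replace it with whatever command actually runs your workflow for one run ID (for example a command-line engine call, if your licence and installation provide one). Whether 10 simultaneous runs really finish in about 2 hours also depends on the machine having enough CPU, memory and disk for all of them.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(run_id):
    # Hypothetical placeholder command: it only prints the run ID.
    # Replace it with the real command that runs the workflow for one run ID.
    return [sys.executable, "-c", f"print('processed {run_id}')"]


def run_one(run_id, command_builder=build_command):
    # Start one external process for this run ID and wait for it to finish.
    result = subprocess.run(command_builder(run_id), capture_output=True, text=True)
    return run_id, result.returncode, result.stdout.strip()


def run_all(run_ids, max_workers=5, command_builder=build_command):
    # Each worker thread blocks on its own subprocess, so up to
    # max_workers runs are in flight at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as run_ids.
        return list(pool.map(lambda r: run_one(r, command_builder), run_ids))


if __name__ == "__main__":
    run_ids = [f"R{i}" for i in range(1, 11)]  # R1 .. R10
    for run_id, code, output in run_all(run_ids, max_workers=10):
        print(run_id, code, output)
```

A non-zero return code in the results marks a run that failed, so it can be retried on its own without repeating the others. Lowering max_workers is a simple way to throttle the load if the server cannot handle all runs at once.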