
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

File potentially corrupt?

jbichachi003
9 - Comet

I have an Alteryx workflow that is an absolute mess. The tools are thrown on the canvas in no logical order, few connections are wireless, and there are so many tools (thousands) that it's hard to make heads or tails of where the data is coming from and going to. That said, the workflow works just fine.

 

I've been tasked with cleaning it up. I made a copy of the original file and started cleaning it up by grouping tools together, making connections wireless, deleting unnecessary tools (for example, sometimes the workflow flows into a Filter tool with no outgoing connection), etc. With the exception of a few formatting changes in expression editors (for example, writing each parameter of an IF statement on a new line), no edits to the configurations of the tools themselves have been made.

 

Unfortunately for me, the original file was updated when I was already in the middle of the cleanup. I therefore used the compare workflow feature, added in the new tools that were added (this was easy to do as the default Tool ID number assigned to each tool was incremental), tried to configure them as best as possible, and made sure they were connected (inbound and out) to the correct tools.

 

Even worse is that whenever I try to run the workflow, a lot of the data doesn't flow through. Early in the process, for example, there's an Input Data tool which connects to a Union tool. Also connected to that Union tool is a Text Input tool with a single field and a single row of data. The data flows up until the Union, and then the Results pane says something along the lines of "Memory limit reached".

 

This problem doesn't exist in the original workflow. I can't add Browse tools to the new workflow, because if further updates are made to the original workflow, that will throw off the Tool ID numbers which I use to reconcile the workflows. I've never seen this before, and I'm wondering if that means the workflow is corrupt?

 

Under the User Settings, I've chosen to also override the system settings to max out the memory limit. I've also unchecked the box to run Designer x64 at a lower priority. When that didn't work, I also went under the Advanced option and checked the box for "Tool Results Settings" to max out the memory limit per Anchor (I think this latter configuration made it worse).

 

Below are photos of the workflow before the cleanup and of some of the progress I've made (I've since made more progress). I'd hate to have to start from scratch, so any solutions would be greatly appreciated!

apathetichell
19 - Altair

A few things. 1) Did you turn off AMP? You should. 2) Is it one row/one field, or one field/no rows? One row/one field is a basic test for whether a record exists. It would be used with an Append tool to make sure that the tool doesn't error out for lack of data. One field/no rows into a Union makes sure that a field header exists downstream. 3) Can you turn some of it into macros? My hunch is that a) the user requirements changed while building, and b) the effort was on "get the workflow built", not "make a workflow which can be edited by another person".

 

It sounds like you had a visual disaster of an original workflow... that works - but trying to make it neater breaks it...  Are you changing this workflow purely for cosmetic reasons or did you need to make a reroute/logic change?

ChrisTX
16 - Nebula

Have you tried turning off the AMP engine?

 

Also, I did have luck copying all tools and pasting them into a completely blank workflow.

 

Step 1: find the original workflow author and share best practices for workflow development.  Try to avoid the words you really want to say.

 

Guess I don't understand why you want to make many connections wireless.  For me, it's an extra step to click on a tool to see what it's linked to.

I do understand occasionally using wireless, but not often.

 

Chris

jbichachi003
9 - Comet

Thank you for both your replies.

 

Turning off the AMP engine seemed to help out. Do you know why turning that off potentially fixed the problem?

 

To answer some of your other questions:

  • The original workflow author is aware of best practices for workflow development. That said, the workflow, for a time, was being updated very regularly and with incredible speed. Likewise, the effort was, as @apathetichell put it, "get the workflow built" instead of "make a workflow which can be edited by another person". It's now up to me to convert the workflow to the latter. The changes are strictly cosmetic, but the changes will make it much easier to edit going forward, and review by others. I'm not upset with the original builder; this workflow has taken years to build and has been edited many times.
  • @ChrisTX I'm trying to make some tools wireless so that it is visually easier to review. I included two photos on the original post of what it looked like before I made edits and what it looks like after I've made some updates. The former is a disaster with what looks like spaghetti thrown all over the canvas instead of helpful lines showing the data flow. Once I finish the clean up and this is the final version going forward, I can always include comment boxes to help the user determine where the data flows through. Plus, there's always the navigation feature on the Configuration pane. Also, once things are rearranged, it might make sense to convert some of the already existing wireless connections to wired ones (I've done that a few times).
  • There are already macros within the workflow, and the workflow itself is a macro. The workflow is eventually copied, the copy is encrypted, and the encrypted copy is shared on a shared drive. The macro is encrypted as it contains sensitive IP, so we wouldn't want to break it up into smaller encrypted macros as that would be extra difficult to maintain whenever an update needs to be made.
  • The main template that feeds into this workflow has been updated many times. That said, we wanted the workflow to be backwards compatible. Because Alteryx will error if a particular column doesn't exist (specifically in an earlier version of the template), we add it in using the Text Input with a single column (the new column in later versions of the template) with a random value (such as "___DeleteME123!___") which eventually gets filtered out. That way, no rows are added, but the column structure remains compatible with the downstream processes of the workflow.
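The placeholder-column trick in the last bullet can be sketched outside Alteryx. This is a minimal plain-Python illustration, not the actual workflow: the `union_by_name` helper and the column names (`Account`, `Amount`, `Region`) are hypothetical, and only the placeholder value comes from the post. It shows a one-field, one-row patch table being unioned by column name with data from an older template, then the placeholder row being filtered out so that only the extra column survives.

```python
PLACEHOLDER = "___DeleteME123!___"  # sentinel value quoted in the post


def union_by_name(*tables):
    """Stack tables (lists of dicts), aligning columns by name.

    Columns missing from a row are filled with None, similar in spirit
    to a union that auto-configures by field name.
    """
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]


# Data from an older template that lacks the "Region" column (hypothetical).
old_template_rows = [
    {"Account": "A1", "Amount": 10},
    {"Account": "A2", "Amount": 25},
]

# Stand-in for the Text Input tool: one field, one placeholder row.
schema_patch = [{"Region": PLACEHOLDER}]

unioned = union_by_name(old_template_rows, schema_patch)

# Stand-in for the downstream Filter that drops the placeholder row.
cleaned = [row for row in unioned if row["Region"] != PLACEHOLDER]

for row in cleaned:
    print(row)
```

After the filter, every remaining row carries a `Region` key (empty for old-template data), so downstream steps that reference that column do not fail on a missing field.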

Again, thank you both for your help! I'd really appreciate a response to why the AMP engine would make a difference. Once that is responded to, I'll mark you both as the acceptable solution.

apathetichell
19 - Altair

Hey - I was actually curious if I was part of the team that built this. I will say that I have built workflows that look similar to this. At its core, I think the more you do with Alteryx and the more dynamic a workflow is, the more likely it will end up looking like this - especially if there is really convoluted/complicated business logic baked in and you are building it for an impatient client who is 'undecided' on renewing the contract. I would posit the same is true with almost everything - it just becomes easier to do with Alteryx. That you don't see Airflow DAGs that look like this shows an oversimplification of what's going on in the transforms - not that there's some specifically Alteryx problem with abstract expressionist automation designs.

 

I'd also recommend making a copy of the file and opening it in Notepad while you edit. This should help you find the specific fields you are looking for and see the connection logic.
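Since the workflow file opens as text, the same inspection can be scripted. The sketch below is a rough illustration and rests on assumptions: it treats the file as XML with `Node` elements carrying a `ToolID` attribute and `Connection` elements holding `Origin` and `Destination` children, which is how these files commonly appear when opened in a text editor. Check the element and attribute names against your own file first. The embedded `SAMPLE` is a made-up stand-in; for a real file you would load a copy with `ET.parse(path).getroot()`.

```python
import xml.etree.ElementTree as ET

# Made-up miniature of a workflow file, for illustration only.
SAMPLE = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1"><GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput.DbFileInput"/></Node>
    <Node ToolID="2"><GuiSettings Plugin="AlteryxBasePluginsGui.TextInput.TextInput"/></Node>
    <Node ToolID="3"><GuiSettings Plugin="AlteryxBasePluginsGui.Union.Union"/></Node>
    <Node ToolID="4"><GuiSettings Plugin="AlteryxBasePluginsGui.Filter.Filter"/></Node>
  </Nodes>
  <Connections>
    <Connection><Origin ToolID="1" Connection="Output"/><Destination ToolID="3" Connection="Input"/></Connection>
    <Connection Wireless="True"><Origin ToolID="2" Connection="Output"/><Destination ToolID="3" Connection="Input"/></Connection>
    <Connection><Origin ToolID="3" Connection="Output"/><Destination ToolID="4" Connection="Input"/></Connection>
  </Connections>
</AlteryxDocument>"""


def summarize(root):
    """Return ({tool_id: tool_name}, [(origin_id, destination_id, is_wireless)])."""
    tools = {}
    for node in root.iter("Node"):  # iter() also reaches nodes nested in containers
        gui = node.find("GuiSettings")
        plugin = gui.get("Plugin", "") if gui is not None else ""
        tools[node.get("ToolID")] = plugin.rsplit(".", 1)[-1]

    edges = []
    for conn in root.iter("Connection"):
        origin, dest = conn.find("Origin"), conn.find("Destination")
        if origin is None or dest is None:
            continue  # skip anything that is not an origin/destination pair
        is_wireless = conn.get("Wireless") == "True"
        edges.append((origin.get("ToolID"), dest.get("ToolID"), is_wireless))
    return tools, edges


root = ET.fromstring(SAMPLE)
tools, edges = summarize(root)

# Tools that never appear as an origin have no outgoing connection.
sources = {origin for origin, _, _ in edges}
dead_ends = sorted(set(tools) - sources)

for origin, dest, is_wireless in edges:
    kind = "wireless" if is_wireless else "wired"
    print(f"{origin} ({tools[origin]}) -> {dest} ({tools[dest]}) [{kind}]")
print("No outgoing connection:", dead_ends)
```

Tools with no outgoing connection are not necessarily dead weight (Browse and output tools end a stream by design), so the list is a starting point for review rather than a delete list.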

 

AMP is multi-threaded, so it divides up your base memory (memory / number of nodes), which can drastically reduce the memory available to each. Depending on the Alteryx version, it also does not handle bumps, outages, and many other things gracefully. Often, when something in one part of the workflow throws an error, AMP misdiagnoses where the error actually occurs, because it traces the error upstream from a downstream failure. The error may be real, but AMP will tell you that a totally different data stream, one that only connects to the failing stream 20 tools downstream, is the source of your error. It also does not sync field metadata as well.

 

There are also certain tools (Block Until Done/In-DB tools/Python) which AMP can have significant issues with.

jbichachi003
9 - Comet

Thanks for the clarification! Would you happen to know why I didn't/don't have issues running the original workflow with AMP, but I can only seemingly run this semi-cleaned up version of the workflow without AMP?

apathetichell
19 - Altair

That's an interesting question - my assumption was that the original version was set to AMP off, and that when you made some changes it did one of those "Alteryx being Alteryx" AMP resets where it turned it back on.

 

If you didn't add any Uniques/Block Until Dones or anything major, it's weird that it would have started to trigger the issues. It also could just be a memory issue - so if you have 16 GB of RAM and you run a 4-core AMP, you'll start hitting issues much, much earlier from memory impact than you will if you run it single-threaded.
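The arithmetic behind that example, as a small sketch. It only models the even split described in this thread (total memory divided by the number of threads) and is not a statement of how Alteryx actually allocates memory; the function name is hypothetical.

```python
def memory_per_thread_gb(total_gb, threads):
    """Even split of total memory across threads (simplifying assumption)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return total_gb / threads


single_threaded = memory_per_thread_gb(16, 1)  # one thread gets the full budget
amp_four_cores = memory_per_thread_gb(16, 4)   # each of four threads gets a quarter

print(f"Single-threaded: {single_threaded} GB per thread")
print(f"4-core AMP:      {amp_four_cores} GB per thread")
```

Under this assumption, a step that needs more than 4 GB would fit when run single-threaded but could hit a limit under a 4-core AMP run.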
