
Catching and responding to errors within a workflow

7 - Meteor

We have a system where an external web application calls the Alteryx Server API to run workflows stored on the Server. Each call passes certain parameters to the workflow being invoked - that is, every run is unique. The workflows are very complicated. For example, one of them takes a set of tables, transforms them into a star schema, and builds an SSAS cube; a pointer is returned to the caller via a webhook.
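For context, the call the web application makes looks roughly like this. This is a simplified Python sketch; the endpoint path, payload shape, and OAuth 1.0a signing follow the v1 Gallery API, so treat the specifics as illustrative rather than exactly what we run:

```python
# Illustrative sketch of queueing a workflow on Alteryx Server with parameters.
# Endpoint path, payload shape, and OAuth signing are assumptions based on the
# v1 Gallery API -- check them against your Server version.
import requests
from requests_oauthlib import OAuth1

GALLERY = "https://alteryx.example.com/gallery/api/v1"   # assumed base URL
APP_ID = "5f0c9abc1234567890abcdef"                      # assumed workflow (app) id

# API key/secret come from the user's Gallery settings; query-string signing assumed.
auth = OAuth1("my_api_key", "my_api_secret", signature_type="QUERY")

payload = {
    "questions": [   # one entry per workflow parameter (Interface question)
        {"name": "SourceTables", "value": "dimCustomer,factSales"},
        {"name": "WebhookUrl", "value": "https://caller.example.com/hooks/job-done"},
    ],
    "priority": "Low",
}

resp = requests.post(f"{GALLERY}/workflows/{APP_ID}/jobs/", json=payload, auth=auth)
resp.raise_for_status()
job_id = resp.json()["id"]   # the job id can be polled later for status
print("queued job", job_id)
```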

 

After the workflow runs, we always want it to call the webhook to notify the caller that it has completed, whether on success or failure. Therein lies the problem: the webhook is not called if the workflow errors out.

 

Since these workflows are very complex, there are many points where they can fail. For example, some of the workflows use macros nested 10 levels deep, and a failure at any of those macros results in the overall failure of the workflow.

 

The problem is that I have no reliable way to catch these errors, to ensure that the parent workflow ALWAYS calls the webhook. If the workflow fails, it just fails. There is no "Try/Catch" ability. 

 

The best we've been able to come up with is to invoke an external PowerShell script upon error (using Events), passing the entire error log (%OutputLog%) to it. The PowerShell script then attempts to parse the error log and find the information needed to complete the webhook call. This is unfortunately not reliable, because

(a) sometimes %OutputLog% does not yet contain enough information to allow us to call the webhook meaningfully,
AND

(b) the %OutputLog% variable can be very large (1 MB or more), which prevents the PowerShell script from running at all (it exceeds PowerShell's parameter buffer limit).
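To make the current approach concrete, here is the rough shape of that error-handling step, sketched in Python instead of PowerShell. The log markers, the webhook convention, and the payload are illustrative, not our exact script; it also takes the path to the log file rather than the log text itself, which sidesteps the parameter-size problem:

```python
# Sketch of the error-handling step: read the engine log, recover the webhook
# target, and notify the caller. Marker strings and payload are assumptions.
import re
import sys
import json
import urllib.request

def notify(webhook_url: str, status: str, detail: str) -> None:
    body = json.dumps({"status": status, "detail": detail[:2000]}).encode("utf-8")
    req = urllib.request.Request(webhook_url, data=body,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=30)

def main(log_path: str) -> None:
    with open(log_path, "r", errors="replace") as f:
        log = f.read()

    # Assumed convention: the workflow writes its callback URL into the log
    # early on (e.g. via a Message tool), so a failed run can still be reported.
    m = re.search(r"WEBHOOK_URL=(\S+)", log)
    if not m:
        sys.exit("No webhook URL found in log; cannot notify caller")

    errors = [line for line in log.splitlines() if "Error" in line]
    status = "failed" if errors else "succeeded"
    notify(m.group(1), status, "\n".join(errors))

if __name__ == "__main__":
    main(sys.argv[1])   # called with the log file path, not the log contents
```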

 

What is a better way to do this?

 

 

15 - Aurora
Hi Joseph,

I think scraping the logs is a good idea, and one thing you could try is using Alteryx for that too: it may handle large files better than PowerShell, and it may be better at parsing as well. Since log files might still be being written, try to assess the worst case for that (seconds? minutes?). Once you know it, grab all the logs in a workflow using the Directory tool, process only those older than the worst case, then use Alteryx to quickly parse out the needed info and make the webhook call exactly as you would have in a "catch" or "finally".
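Something like this, just to make the timing rule concrete (sketched in Python; in Alteryx it would be a Directory tool feeding a Filter on the file's last-write time, and the path and five-minute window here are placeholders):

```python
# Only process logs older than the assumed "still being written" window.
import time
from pathlib import Path

LOG_DIR = Path(r"C:\ProgramData\Alteryx\ErrorLogs")   # assumed log location
WORST_CASE_SECONDS = 5 * 60                           # assumed worst-case write window

cutoff = time.time() - WORST_CASE_SECONDS
settled_logs = [p for p in LOG_DIR.glob("*.log")
                if p.stat().st_mtime < cutoff]        # older than the window, so safe to read

for log_file in settled_logs:
    text = log_file.read_text(errors="replace")
    # ...parse out the needed info and make the webhook call, as a "catch"/"finally" would...
```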

Just a thought; kinda high level but hope that helps!

PS, for try/catch, that seems like a really great idea for a product enhancement request.
11 - Bolide

I think this might be a good use case for the CReW macro pack's Runner macro. It is a tool that lets you run a specific workflow, and it has conditional logic that routes the resulting log file to either a success or a failure output. You can ignore the log results if the run succeeds (which lets your natural webhook run and return the value you need). But when the workflow fails, you can use the failure output to parse the resulting log and execute a new webhook call that returns something meaningful from the log of the failed workflow.

 

I've found that the Runner macro can add a bit of processing overhead, but in cases like this, where you absolutely need to handle success or failure without watching the workflow run, it's not a bad deal. You may need to write your input criteria into a configuration file that your workflow grabs on each run (as sketched below), as I'm not sure exactly how you handle configuring the workflows for each run currently.
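If you do go the configuration-file route, it could be as simple as something like this (hypothetical field names and a made-up shared path, just to show the idea):

```python
# Hypothetical per-run configuration hand-off: the caller drops a small JSON
# file keyed by run id into a shared location, and the workflow's first step
# reads it back. Path, file naming, and fields are all assumptions.
import json
from pathlib import Path

SHARED = Path(r"\\fileserver\alteryx\run_configs")   # assumed shared location

def write_run_config(run_id: str, params: dict) -> Path:
    path = SHARED / f"{run_id}.json"
    path.write_text(json.dumps(params, indent=2))
    return path

def read_run_config(run_id: str) -> dict:
    return json.loads((SHARED / f"{run_id}.json").read_text())

# Caller side, before kicking off the Runner macro:
write_run_config("run-20201201-001",
                 {"SourceTables": "dimCustomer,factSales",
                  "WebhookUrl": "https://caller.example.com/hooks/job-done"})
```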

 

Shoutout to @AdamR and the team for building it.  You can find the macro pack here: https://community.alteryx.com/t5/Engine-Works-Blog/Crew-Macro-Pack-2016-Q2-Release/ba-p/26482

 

The old-fashioned option is to build your requirements out into a test macro that makes sure any data passed in will absolutely work. That's probably a fairly difficult solution in your case (though a noble exercise for the purist), but it would ensure that the workflows don't fail. I've done this a handful of times, and I still come back to those workflows every once in a while when someone finds a new way to break them that I didn't anticipate. All part of the game 😉

7 - Meteor

We did consider the CReW macros (which are very good), but needing to write parameters to a shared location was the deal breaker, since there is no way to pass parameters to the workflows with the CReW Runner. Some of the macros we run have both control inputs and data inputs; in that case one would need another intermediate macro to pipe the control inputs into the actual macro. That's two additional macros just to run the workflow.

 

Catching all possible failures is impossible in our case - there will be inputs we cannot anticipate.

 

Our current working approach is a background workflow that checks for completion of the foreground workflow. This is tricky since it requires state to be maintained, but at least there is no overhead on the performance of the foreground workflow, and we can guarantee that a response to the webhook will always be sent.

 

Native Try/Catch is still the best way to go, if only it existed. 

 

11 - Bolide

Yeah, I hear you on the try/catch.

 

You could try writing a random ID (a session key) with a timestamp into an "initialized jobs" table, and then, as an additional final step once the heavy lifting in your workflow is done, write that same random ID and a new timestamp into a "completed" table. Run a scheduled workflow that checks for initialized jobs left open for more than X minutes, and have it send you an email and record an "under review" status in the completed table.
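Roughly this pattern, sketched with SQLite just to show the moving parts (the table names, the 30-minute threshold, and the "under review" status are placeholders):

```python
# Sketch of the initialized/completed bookkeeping and the stale-job sweep.
import sqlite3
import uuid
from datetime import datetime, timedelta

conn = sqlite3.connect("jobs.db")
conn.execute("CREATE TABLE IF NOT EXISTS initialized (session_key TEXT, started_at TEXT)")
conn.execute("CREATE TABLE IF NOT EXISTS completed   (session_key TEXT, finished_at TEXT, status TEXT)")

# Step 1: at the start of the workflow, record a session key with a timestamp.
session_key = str(uuid.uuid4())
conn.execute("INSERT INTO initialized VALUES (?, ?)",
             (session_key, datetime.utcnow().isoformat()))

# Step 2: as the last step after the heavy lifting, mark the job done.
conn.execute("INSERT INTO completed VALUES (?, ?, ?)",
             (session_key, datetime.utcnow().isoformat(), "success"))

# Step 3: a separate scheduled job sweeps for sessions left open too long.
cutoff = (datetime.utcnow() - timedelta(minutes=30)).isoformat()
stale = conn.execute("""
    SELECT i.session_key FROM initialized i
    LEFT JOIN completed c ON c.session_key = i.session_key
    WHERE c.session_key IS NULL AND i.started_at < ?
""", (cutoff,)).fetchall()
for (key,) in stale:
    conn.execute("INSERT INTO completed VALUES (?, ?, ?)",
                 (key, datetime.utcnow().isoformat(), "under review"))
    # ...send the alert email / call the webhook for this session here...
conn.commit()
```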

 

Again, inelegant in comparison with a traditional programming language, but there are some things I can do in five minutes within Alteryx that would take me five hours to do with code/syntax.

11 - Bolide

I posted info on capturing the output log; I think you will find it useful:

https://community.alteryx.com/t5/Setup-Configuration/Log-File-Location-and-Variables/m-p/80884/highl...

 

There are a number of ways to capture the log, and once captured you can easily write a flow to analyze the log for errors or success. I wrote my own scheduler because I needed more complex schedule criteria than what the Alteryx Scheduler provides. If you have the API license, you can run jobs from the command line and therefore run them from batch files. In essence, that is what I do: I keep logs for each flow/job and for each batch file that runs it, so I can analyze both the flow messages and the command-line messages, determine whether jobs succeeded or failed (and to a degree why they failed), and resubmit under certain conditions. The analysis of logs and the scheduling logic are separate workflows written in Alteryx, and another flow dynamically builds the batch files to run whichever flows are ready to run based on the defined criteria: time, dependencies on data sources and on other flows completing successfully, and other business rules.
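As a rough illustration of the batch-file side of that (sketched in Python; the engine executable path and its arguments are assumptions, so check the command-line/Engine API documentation for your installed version):

```python
# Launch a workflow from the command line, keep its console output as a
# per-run log, and record whether the run looked successful.
import subprocess
from datetime import datetime
from pathlib import Path

ENGINE = r"C:\Program Files\Alteryx\bin\AlteryxEngineCmd.exe"   # assumed path and CLI
WORKFLOW = r"C:\Workflows\BuildCube.yxmd"
LOG_DIR = Path(r"C:\Logs\ScheduledRuns")
LOG_DIR.mkdir(parents=True, exist_ok=True)

stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
log_path = LOG_DIR / f"BuildCube_{stamp}.log"

result = subprocess.run([ENGINE, WORKFLOW], capture_output=True, text=True)
log_path.write_text(result.stdout + result.stderr)

# Crude success test: a non-zero exit code or any "Error" message means failure.
if result.returncode != 0 or "Error" in result.stdout:
    # ...resubmit under certain conditions, or hand off to the failure handler...
    print("run failed, see", log_path)
else:
    print("run succeeded")
```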

 

From the context above, you could use one of the methods in the link to capture the OutputLog info and then write a flow that runs every 10 minutes or so to determine whether each job succeeded. For those where it failed, trigger the webhook and communicate the failure. I am not familiar with webhooks, so I don't know whether the call is something you can derive on the fly or whether you need data that was present in the flow that ran. If it's the latter, you may also need to write a log of jobs, times, and webhook info that can be tied back to the logs you analyze, but I would expect the easiest method to be using the Message tool to write the webhook info directly into the OutputLog itself, which is then analyzed as above.

 

The flow that analyzes the logs and triggers webhooks for failed jobs should be fairly straightforward, and while I said to schedule it every 10 minutes, you can schedule it at whatever frequency makes sense.

 

Other posts provide some insight into custom schedulers that use the API license to run flows from the command line (and thus from batch files), if that also interests you.
