Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

Logging of Workflow performance/run details

To support audit-trail logging, it would be valuable to have two new capabilities:

 

a) Environment variables that expose the workflow name, file path, version, run start date and time, etc. For any workflows we build, we need a solid audit trail to be SOX compliant, so having this detail available as data fields to write and manipulate is essential.

b) A logging component. What would be great is a component you can drop on a workflow, not connected to anything, that traps the start, end, runtime, version, etc. of a workflow and commits this to any output data format (CSV, ODBC, etc.). This logging tool would need to capture the full runtime, so it would have to be the last thing that runs (which means it may need to exist in parallel to the main workflow in some way). This is not currently possible with a complex workflow with outputs, because there is no way to identify when the entire workflow ended, or the runtime, since output tools have no onward connector to pass flow-of-control to catch the final end time.
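Until such a component exists, the audit record it would emit can be sketched as a post-run script outside Alteryx. The field names and CSV layout below are illustrative assumptions, not an Alteryx API:

```python
import csv
import os

# Illustrative audit-record layout -- an assumption, not an Alteryx API.
AUDIT_FIELDS = ["workflow_name", "workflow_path", "version",
                "run_start", "run_end", "runtime_seconds", "status"]

def write_audit_record(log_path, workflow_name, workflow_path,
                       version, run_start, run_end, status):
    """Append one audit row to a CSV log, writing the header on first use."""
    new_file = not os.path.exists(log_path)
    with open(log_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=AUDIT_FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "workflow_name": workflow_name,
            "workflow_path": workflow_path,
            "version": version,
            "run_start": run_start.isoformat(),
            "run_end": run_end.isoformat(),
            # full runtime, including output tools, captured after the fact
            "runtime_seconds": round((run_end - run_start).total_seconds(), 3),
            "status": status,
        })
```

Because the record is written after the run finishes, it sidesteps the flow-of-control problem above, at the cost of living outside the workflow itself.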

 

Again, both of these are necessary to meet audit requirements for workflows and production-quality ETLs for BI data warehouses.

4 Comments
Quasar

I assume your questions relate to Alteryx Designer, since with Server you'd be able to see what was running and when.

 

Given that, I'd check out the 'Runner Macro' in the CREW macro pack, available here: http://www.chaosreignswithin.com/2014/09/blog-macro-pack-2014-q3-release.html. You could easily build a workflow that uses it to run your core workflow and then writes results to a log file.

 

I'd also suggest you dig into your requirements to understand what's actually needed. Especially in larger organizations, IT depts are often operating under assumptions based on how computer systems worked 10-15 years ago and have trouble wrapping their heads around modern analytics tools. 'Rules' that were put in place a decade ago for legacy ETL tools (ex. Informatica) may not make sense for a modern analytics shop and the spirit of the true requirement could possibly be met in some new, more efficient way.

Aurora

Thank you Jason - several useful threads to follow up on here - especially around server capacity to generate logging.

 

On the CREW Macro - I agree this is a possible solution; however, it is not included in the core Alteryx product (and I believe the Macro Runner relies on a specific .exe which is not part of the core product), which makes it difficult or impossible to use in a controlled corporate environment. It would be great if the CREW macro pack capabilities could be included in the next release of the Alteryx suite.

 

On the issue of requirements - I do take your point that requirements have moved as tools have evolved. However, when reports and analytics have to be 100% correct every time (e.g. regulatory submissions), there is a need to provide an audit trail of how data was transformed; to generate alerts when specific quality checks fail during the transformation (usually to an enterprise production support team); and to proactively validate that all jobs have completed successfully (without relying on the service itself to throw a failure alert, since this does not act as a sufficient control in the case of a complete catastrophic failure). These capabilities may be provided out of the box by working in the server environment (apologies, I don't know the server environment well enough to understand the responsibility split for job monitoring, logging, audit-trail management, etc. between the Alteryx package itself and Alteryx Server).
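The proactive completeness check described above amounts to reconciling an expected job list against audit records, rather than waiting for a failure alert. A minimal sketch, assuming hypothetical `workflow_name`/`status` field names from whatever audit log is maintained:

```python
def find_missing_runs(expected_jobs, audit_rows):
    """Return expected jobs with no successful audit record.

    A positive completeness control: verify every scheduled job actually
    ran, instead of relying on the scheduler to report its own failure.
    The 'workflow_name'/'status' keys are assumed field names from a
    hypothetical audit log.
    """
    succeeded = {row["workflow_name"]
                 for row in audit_rows
                 if row.get("status") == "success"}
    return [job for job in expected_jobs if job not in succeeded]
```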

 

Thanks for your note Jason

 

I must be missing something here. Why wouldn't you enable performance logging in the workflow settings and/or set the logging directory to a network drive? My similar use case was to understand when the most network traffic was occurring so that I could pick better times to schedule workflows, and then also to understand which tools had the most problems if a failure was related to a network issue or a machine issue (I'm running scheduled workflows every two or three minutes from a server-grade desktop). I set the logging directory to a network path, then created a workflow in Alteryx which: uses the directory input to select and pull in all logs generated in the most recent X hours, then interprets them as a delimited text file with each row as a new row of data, then uses RegEx to pick out workflow and tool names and separate performance information into separate columns.
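The RegEx step in that workflow can be sketched outside Alteryx as well. The log line layout below is a simplified assumption for illustration; real Alteryx log formats vary by version and logging settings:

```python
import re

# Simplified engine-log line layout -- an assumption for illustration;
# real Alteryx log formats vary by version and logging settings.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"Tool Id (?P<tool_id>\d+):\s+(?P<message>.*)$"
)

def parse_log(lines):
    """Split recognisable log lines into timestamp/tool/message columns."""
    rows = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            rows.append(match.groupdict())
    return rows
```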

I second both of Sean Adams' ideas.

i) Workflow name would be so useful to have as an out-of-the-box variable (instead of hardcoding it in a Text Input tool / workflow-level constant every time).

 

ii) CReW runner macros - especially the Runner and Conditional Runner macros that control the order of execution of workflows - need to be part of the Alteryx offering, as this sounds like a very common requirement.

I know there are options available out of the box, such as Block Until Done and workflow-level post-command events, but these are easy to use only in small analytics projects (2-5, at most 10 workflows). It becomes increasingly difficult to manage if you have tens to hundreds of workflows, where the subsequent workflow to run is silently embedded/hardcoded in a different workflow as part of a chain of workflows.
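The explicit run-order control described here (run a chain in sequence, stop at the first failure, as the Conditional Runner macro does) can be sketched as follows. The engine command path is an assumption (command-line execution requires a Desktop Automation licence), so the runner takes an injectable `run` hook:

```python
import subprocess

# Assumed engine command path; command-line execution requires a
# Desktop Automation licence -- adjust for your install.
ENGINE_CMD = r"C:\Program Files\Alteryx\bin\AlteryxEngineCmd.exe"

def run_chain(workflows, run=None):
    """Run workflows in an explicit order, stopping at the first failure.

    Returns (completed_workflows, failed_workflow_or_None). The 'run'
    hook defaults to calling the engine command line but can be swapped
    out for testing or for a different execution mechanism.
    """
    if run is None:
        run = lambda wf: subprocess.run([ENGINE_CMD, wf]).returncode
    completed = []
    for wf in workflows:
        if run(wf) != 0:
            return completed, wf  # stop the chain: downstream jobs not run
        completed.append(wf)
    return completed, None
```

Keeping the chain in one explicit list, rather than hardcoding each successor inside the previous workflow, is what makes tens or hundreds of workflows manageable.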

 

Thanks,

Sandeep.