Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
clmc9601
13 - Pulsar
13 - Pulsar

Background

 

You are parsing dozens, hundreds, or hundred-thousands of workflow logs. For one reason or another, you need to extract information from this never-ending stream of Alteryx workflow messages. Thankfully, you already know better than to click the files one-by-one (😅), so you are contentedly loading in the file paths and reading them with Dynamic Input.

 

clmc9601_0-1654039089072.png

 

Suddenly, to your dismay, Dynamic Input breaks. You get the dreaded error message: “Error opening file: The process cannot access the file because it is being used by another process.”

 

clmc9601_1-1654039089530.png

 

Problem

 

Dynamic Input can only read unlocked files. When an Alteryx workflow or Server job is running and the result messages are actively being written, the log file is locked. This log file is stored in the same location as the completed logs, and the filenames alone make it impossible (for me) to differentiate. If you can tell the difference between the filenames, please let me know! This highlighted log below was my problem file.

 

clmc9601_2-1654039089915.png

 

Solution

 

Enter the Filter Locked Files macro. By adding the Filter Locked Files macro, you can easily pre-remove file paths that will cause a Dynamic Input error.

 

The configuration is simple. You can either give the tool an incoming anchor containing file paths as I did here or insert a directory path directly into the configuration. I only needed the file path field from my incoming data, so I deselected the checkbox which keeps all other fields as well.

 

clmc9601_3-1654039090033.png

 

It will output the readable file paths to the U (unlocked) anchor…

 

clmc9601_4-1654039090348.png

 

… and the locked file paths to the L (locked) anchor.

 

clmc9601_5-1654039090517.png

 

Details

 

Python has smooth exception handling functionality. This locked file situation, with a predictable error message and a clear action upon error, is easy to direct within Python. The Filter Locked Files macro uses Python exception handling to remove file paths that result in a permission error.

 

A small caveat with the current version of the Python tool: there will be a metadata error until you run the workflow. This error is harmless and will disappear after you run the workflow. Hopefully, my caveat will soon be irrelevant in a future version of Alteryx Python!

 

clmc9601_6-1654039090620.png

 

Here is the Python script inside Filter Locked Files. Thanks, @mbarone, for the code comments!

 

 

 

# List all non-standard packages to be imported by your 
# script here (only missing packages will be installed)
Package.installPackages(['pandas','numpy'])

####### PRE-REQS #######
from ayx import Alteryx # allows python & alteryx to talk to each other
from ayx import Package # allows python packages to be imported into Alteryx
import pandas as pd # allows python dataframes to be created

####### MOVE DATA FROM ALTERYX TO PYTHON #######
fileDf = Alteryx.read("#1")[['FullPath']] # read in (as a dataframe) the list of files to be evaluated

####### LOOP THROUGH EACH FILE AND SEE IF IT'S LOCKED ############
fileDf['Success'] = 0 # take what you just read in (a one-column dataframe; column is 'Fullpath'), add a 'Success' column to it, and set 'Success' to 0 for each row
for i in range(0,len(fileDf['FullPath'])): # iterate through the rows
    try: # see next line of code for what will be tested with each iteration
        with open(fileDf['FullPath'][i]) as testFile: # try to open the file
            fileDf.at[i,'Success'] = 1 # for each row that didn't error on open, update that row's 'Success' field to 1
    except PermissionError: # for each row that errors on open (i.e., file is locked), do nothing (will leave the 'Success' field as it originally initated - a value of 0)
        continue # keep iterating through remaining rows

####### MOVE DATA FROM PYTHON TO ALTERYX #######
Alteryx.write(fileDf, 1) # output to alteryx via the python tool outgoing anchor '1'

 

 

 

My fellow ACE Mike (@mbarone) built an excellent app for extracting server logs from a specific workflow during a specific timeframe. (Remember how logs are indistinguishable by filenames? His app is super useful for finding logs from certain workflows!) It uses Filter Locked Files on the inside. You can read about his app here.

 

For more information about log files, check out the Alteryx Documentation. Once again, here’s the link to the Filter Locked Files macro. Happy solving!