Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Storing of output files from a workflow

Rohit_Bajaj
9 - Comet

Hi All,

 

My requirement is to store some sort of business rule fallout records in file.

It can be anykind of file, preference would be flat file, csv, xls, xlsx etc.

 

Additional requirements are as follows -

 

1) File should work on creation mode on 1st run.

2) File should work on append mode on next run onwards.

 

The above should happen without any code changes during/post run.

 

Also wanted to understand any best practices to store semi structured data i.e. different files will have different columns but are stored in a single file.

 

Thanks,

Rohit Bajaj

4 REPLIES 4
pcatterson
11 - Bolide

The only file type I've output to that can be created and then appended to is a sqlite file type. 

Rohit_Bajaj
9 - Comet

Thanks for your reply.

 

I tried looking into sqlite, indeed it has both the options.

 

But I was not able to understand that configuring only once (either create or append) how both the functionality can be achieved.

 

My requirement was that when I use my code for actual go-live load I do not want to keep changing configuration from Create to Append, rather it should work automatically as in - for initial run it should create and for subsequent ones it should append.

 

Not sure if you are refering to the 'Pre Create SQL Statement' functionality and have some create table code over there.

 

If you a sample with sqlite achieving the above functionality it would be much useful.

 

Thanks,

Rohit Bajaj

david_fetters
11 - Bolide

Edit: Rohit, you might consider not updating an individual file on each run, but outputting a single file from each run with a timestamp or some other method of preventing file name collisions, and then subsequently using the Input tool with the filename set to "*.yxdb" or whatever your type is.  That will union all of the results that match (in this case, anything that ends in the format .yxdb") into a single stream.  You can then output all the results as a single file if you need to, or do whatever processing is necessary.  This would be much easier than anything I suggest in my wall of text below!

 

Rohit,

 

This is one of those areas where you'll probably find a database table is the easiest way to hold this data because you don't have to worry about initializing the file (e.g. creating the blank file to add things to) and you can append records to it easily without needing to worry about locks.  Even a local non-production database that you can run reports from would probably be less hassle than trying to do this from flat files.

 

But, as Alteryx can do anything, there are two ways I can think of to do what you want to do.  The key problem you'll have is that Alteryx doesn't seem to natively have a way to check for existence of things without generating an error AFAIK.  To be able to do what you want to do entirely within Alteryx, we need to work around that limitation.

 

The hard way (which I won't demo here) would be to use the Run Command node to execute a command against cmd.exe that checks if a file exists, or runs a batch file that checks if the file exists.  You can then join a "DoesExist" flag field to the part of the stream that initializes your log file only when "DoesExist" = 0.  Use the block until done to control the order of execution to do the existence check prior to writing your data.  In this way, we're separating out the part that makes the file from the part that writes to the file.  By splitting the process, we know that we can write output no matter what happens.  But, we still have to figure out how to open the file (after initialization in this case) so that we can grab it's contents, union them with our new data, and write it back to the file so that our new row(s) are appended to the existing data.

 

The second way would be to precreate those log files using a master stream that anticipates what file names you'll need. The way you describe the problem hints that these files might get named according to user or date or some other criteria.  If you can determine that universe of names and create the files ahead of time, then the appending records problem becomes somewhat trivial.  Below is a section of workflow that shows how you would go about doing this if the files exist at the first run:

 

Capture_logging.PNG

 

In this workflow, our raw data moves into the workflow on the left hand side of the screen, where it is split into a portion for logging and a portion for handling whatever else you need to do with the data.  The portion for logging (which would likely need to go through a summary node so you can get summary rows and don't write your whole datastream to the log) is unioned to the file "Log_9_27_16.yxdb" which contains logging data for previous runs of this workflow.  After the union, we need the Block Until Done tool to stop Alteryx from attempting to write output to "Log_9_27_16.yxdb" before it's done reading that file in.  Without the Block Until Done tool, there's no guarantee that you'll be able to write to the file.

 

Hope this points you in a direction you can follow!

Rohit_Bajaj
9 - Comet

Thanks David.

 

I will follow the first part of your comment, i.e. to create the file name as some pre-determined name suffixed by DataTime.

Labels