
Engine Works Blog

Under the hood of Alteryx: tips, tricks and how-to's.
Alteryx

You can download the Cache Dataset V2 macro from the Alteryx Analytics Gallery (in the Macro District) here.

 

Why the update?

 

Sometime last year, I made a tool that allowed me to create "save points" in my workflows, and avoid wasting time running the entire workflow after making incremental changes. A lot of people responded to a blog post I wrote about it, and some great suggestions came out of that discussion.

 

One user even made his own updated version of the macro, because he wanted to be able to specify where files were saved by the macro. That was really awesome, and inspired me to implement some of the other suggestions users had and share the updated Cache Dataset macro with you guys here.

 

Thanks to @ErikB for sharing his update with me! Hopefully this becomes a trend -- if you feel like some functionality is missing, don't be afraid to start poking around the guts of a macro and making your own changes and sharing them. This is good for everyone because it means we can make better tools together through collaboration, but it’s also a great opportunity to learn new patterns and tricks in Alteryx.

 

 

What's different in this version?

 

This update is really all about managing the storage and cleanup of cached files. In the previous version of the tool, all cached files would be saved to wherever the macro itself lived. But this wasn't always desirable, and if you use it a lot, you might end up with a big folder of files like this:

 

cached datasets.png

 

So let's take a look at what's new in the tool's configuration to deal with this problem:

 

config2.png

 

For the most part, the first tab should look pretty familiar (and if not, go back to the original post to see how it works), but there are three new options under "write mode" that allow you to specify how long you want the cached files to live on. If you select the first option ("Delete after shutting down Alteryx") then the cached files will be written to (and read from) the Alteryx temp directory, which gets automatically cleared out upon closing Alteryx.

 

However, if you select the second option, then the files will be deleted after a specified number of days. This is triggered by running the macro, so if you use it once and then never again, then that one cached file will not be removed. But if you use the tool regularly, then it will clear out the old files.

 

(To accomplish this, the macro maintains an index of locations where cached files have been created. Each time it is used in a workflow, it looks in those directories for cached files whose expiration date has passed and deletes them. In a future version, I may look into automating this with schtasks.)
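
For the curious, here is a rough Python sketch of that index-and-purge idea. It is not the macro's actual implementation (the macro is built from Alteryx tools). The JSON index file, the function names register_cache_dir and purge_expired, and the use of a file's modification time to decide whether it has expired are all assumptions made purely for illustration.

```python
import json
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def register_cache_dir(index_path: Path, cache_dir: Path) -> None:
    """Add cache_dir to a JSON index of directories known to hold cached files."""
    known = set(json.loads(index_path.read_text())) if index_path.exists() else set()
    known.add(str(cache_dir))
    index_path.write_text(json.dumps(sorted(known)))


def purge_expired(index_path: Path, max_age_days: float,
                  now: Optional[float] = None) -> List[Path]:
    """Delete cached .yxdb files older than max_age_days; return the deleted paths."""
    if not index_path.exists():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    deleted = []
    for directory in json.loads(index_path.read_text()):
        folder = Path(directory)
        if not folder.is_dir():
            # The folder may have been moved or removed since it was indexed.
            continue
        for path in folder.glob("*.yxdb"):
            # Assumption: a file's age is judged by its modification time.
            if path.stat().st_mtime < cutoff:
                path.unlink()
                deleted.append(path)
    return deleted
```

The point of the index is that cleanup only happens when something calls purge_expired, which mirrors the behavior described above: a cached file written once and never revisited stays on disk until the next run.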

 

Finally, you have the option to never delete a cached file. (This is how the tool used to work.)

 

The Advanced Options tab

 

Here, you can tell the macro where to save the cached files. The first option ("default") will tell the macro to save to a dedicated cached_datasets folder in the user-specific APPDATA directory. However, if the "delete after shutting down Alteryx" option is selected on the previous tab, the file will be written to (and read from) the temp directory instead. The second option ("workflow directory") tells the macro to save the files to the same directory as the workflow in which you've inserted the Cache Dataset V2 macro.
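
To make the interaction between the two tabs concrete, here is a small Python sketch of how the save location could be resolved. Treat it as an illustration only: the option strings, the function name resolve_cache_dir, and the use of the system temp directory as a stand-in for the Alteryx temp directory are hypothetical. It also assumes the "delete after shutting down" setting wins over the "workflow directory" choice, which the description above only states explicitly for the default option.

```python
import os
import tempfile
from pathlib import Path


def resolve_cache_dir(write_mode: str, location: str, workflow_dir: Path) -> Path:
    """Pick the folder for cached files from the write mode and location settings."""
    if write_mode == "delete_on_shutdown":
        # Session-only caches go to the temp directory regardless of location
        # (assumed here to apply to every location choice).
        return Path(tempfile.gettempdir())
    if location == "workflow_directory":
        # Save next to the workflow that contains the macro.
        return workflow_dir
    if location == "default":
        # Dedicated cached_datasets folder under the user's APPDATA directory;
        # fall back to the home directory where APPDATA is not set.
        appdata = os.environ.get("APPDATA", str(Path.home()))
        return Path(appdata) / "cached_datasets"
    raise ValueError(f"unknown location option: {location!r}")
```

For example, resolve_cache_dir("never_delete", "workflow_directory", Path("my_project")) simply returns the workflow folder, while any call with "delete_on_shutdown" returns the temp directory.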

 

Finally, if you don't want the macro to check for expired cached files and delete them, check the last option on this tab.

 

That's it. Enjoy!

 

And keep the feedback coming (and of course, your own variations)!

 

Teaser

 

I'm excited to say that caching functionality is going to be officially integrated into the product in a future release in a very elegant and streamlined way! (Some of you may have picked up on this in the product suggestions thread here.) I've had the pleasure of being involved in some of the initial design discussions around the new functionality, and I genuinely feel that it is going to be totally game-changing to how we develop workflows in Alteryx. (That's all I can share at this point, but I will do my best to provide updates as they become available!)

Em Roach
Lead Research Scientist

Em loves creating new tools that help people do incredible things with their data and expand the capabilities of what they previously thought possible. She's been doing that since her first job as an actuary, and then as a consultant in fraud detection analytics, as a Content Engineer at Alteryx, and currently as Lead Research Scientist on the Analytics Products team at Alteryx. When she's not coding, you can find Em snowboarding in Colorado, skydiving in Wisconsin, or making music in Minnesota.


Comments
Alteryx Certified Partner

@MacRo,

 

Are you teasing us?  I tried to download the macro and was denied access.  I'm sure that this will be quickly remedied and that either I'll be granted access or someone will post that it was a user problem (mine of course).  Thanks again for the macro and for the enhancements.

 

Mark

Alteryx

Whoops, try again!

Alteryx Certified Partner

We're in business!!!

Atom

Love this new feature! Thanks!

Alteryx Alumni (Retired)

Nice new features!

Bolide
This is great stuff!
Bolide

This is a great macro, but I have been trying to find a way (in read mode) to have it read my yxdb so it picks up field names prior to running. I cannot find an easy solution.

 

I have built other macros where the yxdb file is fixed, so when I connect to the macro it immediately picks up the fields and I can modify the selection etc. if needed prior to running.

 

To avoid putting everything upstream in a container and disabling it, I just cut and paste the macro and downstream tools and open a temp file to keep working, but you have to run it before you can modify any downstream tools, since they cannot 'see' the data/fields in the macro.

 

Any simple workaround you can think of, short of me just writing to a temp yxdb and then using an Input tool?

 

Thanks,

Anthony

 

 

When I use this, my workflows get stuck at 25% on the very first join. What could the problem be? I put a Browse after the cache tool and see the data. Without the cache in the workflow it runs super slow but never gets stuck at 25% on the first join.

Alteryx

@shaynap84 -- interesting, are you able to post an example workflow (or send me a private message if that would be better)? I'd be happy to take a look.

Alteryx

Hey all, just wanted to let you know I've made a few fairly substantial changes to the tool. The newest version -- 2.02 -- contains the following updates:

- The issue mentioned by @anthony has been fixed -- now the field metadata will flow through the tool properly in both read and write mode!

- An "advanced" option has been added that allows you to suppress the "read mode" error. I personally find that error message to be helpful, but understand that an error message is not always desirable (for example, with the CReW "runner" macros) and so you now have the option to disable it!

- There was a pretty major bug with the "delete after session" option -- this has been fixed. My deepest apologies for any frustration this may have caused!

Bolide

Mac, thanks for fixing the metadata flow-through. It cut my processing time by more than half, and now I have too much free time!

 

 

Meteor

Even with version 2.02 I'm having trouble getting metadata passthrough to work. In the screen capture below, #1 uses the V2 cache tool, while #2 uses the V1 cache tool. You can see that #1 does not show the metadata when the input tool is disabled, while #2 does include the metadata. The test workflow is available here.

 

 

Metadata Passthrough

Alteryx

Yep you're absolutely right @Grahambo! I noticed this as well and have been going crazy because I could've sworn that this had been fixed! Well either I was just flat out wrong or I lost some changes at some point. Either way, I haven't gotten a solution working yet, but will be sure to update when I do!

Atom

Hi Mac!

 

I was wondering if the metadata issue above has been solved yet?

 

Thanks for this macro, I use it all the time! 

Asteroid

Same here, the metadata issue in Read Mode is a major issue. It kills the functionality, as I cannot join on anything (the Join tool doesn't see any valid fields)!

 

This needs to be fixed! 


Meteoroid

+1

This bug is driving me crazy... following this thread.

I hate to ask a daft question, but I'm not sure I'm using this right. (I'm VERY new to Alteryx.) I dragged in an Input Data tool, summarized the data, and then cached it. I reran the analytic workflow, but it still appeared to read the data from the first step (the Alteryx data file).

 

Can you tell me how to setup the workflow so that it starts from the Cached Dataset?

 

cached_workflow.PNG

Meteor

If you put the two tools prior to the cache tool into a container and disable the container, then the workflow will start with the cache tool.  Be sure to set the cache tool to read mode as well.

Meteoroid

@bdanielatl

 

There are two high-level steps:

 

1) When you first run the workflow, the data gets stored in the 'cache macro'

2) When you want to read from it, you need to do 2 things:

       a) Put everything that is the input to the cache in a container and disable the container so that the input nodes don't run again

       b) Click on the cache macro and change the configuration to 'read'. That way Alteryx knows that you will be reading from the cache and essentially ignores the input.

 

Hope that helps.

Asteroid

@MacRo 

 

Any updates on the metadata bug? Even with that issue this saves me hours each week, but if the metadata issue gets resolved I would start proselytizing!

An awesome feature to be added. Thank you so much for sharing this!

Hello,

I'm new to Alteryx and I'm trying to install the Cache Dataset V2 macro on my computer.

My problem is that, once I've uploaded the file and run it with Alteryx, I get this message:

"Select "Install" or "Uninstall""

But I can't find anywhere to choose Install or Uninstall...

Has anyone else had this problem and can help me with it?

Many thanks in advance!

Asteroid

@Mathias123 I believe if you select install it will install the macro into the proper folder within Alteryx (it's been a while but I'm pretty sure that's what I did).

Meteor

 I just posted a comment tagged to Macros regarding the same issue mentioned above.  I just installed the macro today and have found that when attempting to run a process using the cached data (after disabling the preceding containers), the data is being read as though it's in a single-field text file, so joins are failing.

 

I'm using Designer 2018.1.  Is there a step I am neglecting to perform, or is this a programming bug?

@pliskers I've put a comment over here that would be relevant to your situation. I believe that if you change the macro and check the option about output fields changing, that should fix your issue. See that post for more details and pictures.

Meteor

Thank you very, very much! I will check out the changes first thing tomorrow.


Question - how does one delete the incoming connection? Do you mean to say I should (in version 2) simply uncheck the box that says "Optional Incoming Connection"?

Alteryx Partner

What am I doing wrong?

 

- I've downloaded v2.02 from the Gallery

- Installed it (no errors) on Designer 2018.1.3

- Created a simple workflow

  - some inputs that are joined

  - dropped in the macro

  - containerised everything to the left of the macro

  - ran the workflow once (takes a few mins to run)

  - disabled the container

  - changed the macro to "read mode"

  - ran the workflow again (takes a few seconds to get the same result)

 

 

So it's working fine, BUT .... it's generating an error as well that just says "read mode".

 

What am I doing wrong, such that the workflow works but an error is generated as well?

Alteryx Partner

RTFM is the answer!

 

2018-08-08 12_29_47-lcaalx01 - Remote Desktop Connection.png

Meteoroid

I know I'm late to the game on this one, but this is a very nice tool. I know the new version has added "Cache and run workflow" as a native option, but this macro feels more "deliberate" and helps you think about your workflow in chunks. +1
