05-30-2017 11:25 AM - edited 08-03-2021 11:43 AM
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Directory Tool on our way to mastering the Alteryx Designer:
The Directory Tool gives you a data-stream input that contains information about the files and folders (file name, file date, last modified, etc.) for the location of your choice, which you can then use for more complex interactions with the file system. In other words, the Directory Tool could finally help me track down my keys: not just where in the house I put them, but also how long they've been there and when they were last moved.
There are two common uses for this functionality:
If you want to import all CSV files in a folder, you can do this with a wildcard on an Input Data Tool. However, if you're only looking for a subset of the files (you want dog.csv, cat.csv and bird.csv, but you don't want fish.csv), or you only want the most recently modified file (which may be deep in sub-folders), then you'll need to enlist the help of the Directory Tool!
In the example below, we have treatment records from a vet clinic in sub-folders by animal type. We want to bring in the treatment record for the most recently treated animal (by looking at the file modified date) - however, we don't want the birds or fish. Each treatment record is named after the animal, so there's not much consistency in file naming.
The initial directory scan brings back all the files stored in the various folders, along with important information about the size and modified date:
You can see that it brings back files in each of the treatment folders, and cousin Freddy seems to be the most recently updated (but he’s a frog…)
Because we have the file information, we can then sort and filter, and then we’re left with the final file name, which is then imported using the Dynamic Input Tool:
… to bring back the treatment details for Felix the Cat:
So the Directory Tool gives us a way to look inside all our folders, and filter and sort file information just like any other data, so that we can get to exactly the file we want.
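For readers who think in code, the same scan, filter, sort and pick logic can be sketched in Python. This is only an illustration of the idea, not how the Directory Tool is implemented; the function name latest_file, the "*.csv" pattern and the excluded folder names are assumptions made for the example.

```python
from pathlib import Path


def latest_file(root, excluded_folders=(), pattern="*.csv"):
    """Return the most recently modified file under root, or None if nothing matches.

    Hypothetical helper: folder names in excluded_folders are compared
    case-insensitively against every sub-folder on the way to the file.
    """
    root = Path(root)
    excluded = {name.lower() for name in excluded_folders}
    candidates = []
    for path in root.rglob(pattern):  # recursive scan, like "include subdirectories"
        if not path.is_file():
            continue
        # Folders between root and the file, e.g. ("Cats",) for root/Cats/felix.csv
        folders = path.relative_to(root).parts[:-1]
        if any(part.lower() in excluded for part in folders):
            continue  # the "filter" step: drop unwanted animal types
        candidates.append(path)
    if not candidates:
        return None
    # Sorting by modified date and keeping the first row is the same as max()
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling latest_file(folder, excluded_folders=("Birds", "Fish")) on a folder laid out like the vet clinic example would return the path of the newest remaining treatment file, which is the value the workflow passes to the Dynamic Input Tool.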
The second use is analysis of the file system itself: here you don't need the Directory Tool to find files so that you can import their data, you want to analyze the file information it returns.
For example, you get that dreaded e-mail from your system administrator saying "your network folder is at 95% capacity - please clean some files up to create extra space." If you are anything like me, you probably have working files in there going back for years, and it's tough to quickly spot the one or two big files that you could easily archive or delete.
This is relatively easy to do using Alteryx:
Use a Directory Tool to bring back all files in the relevant folder (including subfolders).
Then filter for files not modified or accessed in the last month/year, sort by size, and the result should be the biggest files that are not being used frequently:
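The same clean-up logic can be sketched in Python for comparison. This is a minimal sketch under assumptions: the function name large_stale_files and its parameters are made up for the example, and "not used" is taken to mean that neither the modified nor the accessed timestamp falls inside the chosen window. Note that some file systems are configured not to update access times, so the accessed timestamp may be less reliable than the modified one.

```python
import os
import time


def large_stale_files(root, min_age_days=365, top_n=20, now=None):
    """Return up to top_n (size_in_bytes, path) pairs, biggest first.

    Hypothetical helper: a file counts as stale when it was neither
    modified nor accessed within the last min_age_days days.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    stale = []
    for folder, _subfolders, filenames in os.walk(root):  # includes subfolders
        for name in filenames:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            last_used = max(info.st_mtime, info.st_atime)
            if last_used < cutoff:
                stale.append((info.st_size, path))
    stale.sort(reverse=True)  # largest files first
    return stale[:top_n]
```

The first few entries of the result are the best candidates to archive or delete.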
By now, you should have expert-level proficiency with the Directory Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every Tool Tuesday by following @alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
Interesting post, thank you!
What I'm wondering, though, is whether the Directory Tool can provide a list of folders or subfolders in any given directory, rather than files (plain file folders, with no extensions)?
Hey @joannatg
You can use the Directory Tool, and push the "directory" field through a Unique tool, and that gives you a list of all the subfolders of a given directory.
There's no switch to say "only return directories, not files", but on my machine it's fast enough that I can let it bring everything back, and just ignore the filename pieces.
Thanks! Works very well indeed. I wonder, however, whether this works for empty folders/subfolders. Does it, or should it? Say you have empty folders/subfolders ready to be filled with files and you just want the list of those folders/subfolders (as directory paths, of course, so the hierarchy is preserved). Can the Directory Tool generate that, or is it not intended to work this way?
I don't believe the tool works for empty folders/subfolders. I'm assuming it's not meant to work that way but there are certainly use cases for knowing what folders are empty. Hopefully that can change.
Hey @aquinta4,
It would be worth submitting your idea of directory tool optionally picking up empty directories too, as a product idea here: https://community.alteryx.com/t5/Alteryx-Designer-Ideas/idb-p/product-ideas
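In the meantime, one possible workaround for empty folders is a few lines of Python (for example in the Python tool). The sketch below uses only the standard library; the function name list_folders is made up for the example, and "empty" here means the folder contains neither files nor subfolders.

```python
import os


def list_folders(root):
    """Return sorted (relative_folder_path, is_empty) pairs for every folder below root."""
    walker = os.walk(root)
    next(walker, None)  # the first item is root itself, which we do not report
    folders = []
    for folder, subfolders, filenames in walker:
        is_empty = not subfolders and not filenames
        # relpath keeps the hierarchy visible, e.g. "2021/Q1"
        folders.append((os.path.relpath(folder, root), is_empty))
    return sorted(folders)
```

Because os.walk visits folders whether or not they contain files, empty folders appear in the result alongside populated ones.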
Hello Sean,
The email use case has triggered a question.
Could the Directory Tool be used to query Lotus Notes email files?
Hi @Chris6 - to be honest, I don't know.
How are Lotus Notes files stored? If they are stored in a way that the Windows file system can treat as a folder, then there's a possibility that this could work.
However - if these are stored like they are in Microsoft Exchange server - then you need to go through an API to hit the content.
Give it a try, if you can get access to the right folder?
Hello @SeanAdams, thanks so much for your contributions to the community. Would you know how I can access the Tags I have created in the Properties section of my Excel files? That Properties information is not being pulled by the Directory Tool. Thanks, Sourav
Hey @SD3 - you can't do this with the Directory Tool - but you can automate the Excel object from PowerShell to get the properties.
Just google "PS Script for Excel meta data", and you should find some ideas.
Alternatively, you can experiment with doing the same in the Python tool.
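As a starting point for the Python route, here is a minimal sketch that uses only the standard library. It rests on two assumptions: the workbook is in the Open XML format (.xlsx or .xlsm, which are zip archives), and the Tags shown in the Windows Properties dialog are stored as the "keywords" core property in docProps/core.xml. The function name read_xlsx_tags is made up for the example, and legacy .xls files would need a different approach.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Open Packaging Conventions core properties part
CORE_NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
}


def read_xlsx_tags(path):
    """Return the keywords text of an .xlsx/.xlsm file, or '' if none is stored."""
    with zipfile.ZipFile(path) as archive:
        try:
            raw = archive.read("docProps/core.xml")
        except KeyError:
            return ""  # the workbook has no core properties part
    root = ET.fromstring(raw)
    node = root.find("cp:keywords", CORE_NS)
    if node is None or node.text is None:
        return ""
    return node.text
```

Applied to each full path returned by the Directory Tool, this would give one tags string per workbook.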
Awesome, appreciate the help. Thanks Sean, SD3
The "Directory" Tool and the "Dynamic Input" Tool are excellent! My one need, however, is to pull the directory name in with the file data. I'm pulling together over 400 CSV files, which all have the same name, and to distinguish them from each other, I need to grab the date that's in the sub-directory name. There does not appear to be a way to pull in the directory name with the "Dynamic Input" tool.
Is it possible to include a directory (a folder) referenced by the Directory Tool (and all of its contents) within a packaged workflow using the Options>>Export Workflow feature?
Specifically, is it possible to do this if the directory is specified as a relative path?
Even though the relative path is listed as a dependency in the Options>>Advanced Options>>Workflow Dependencies window, this doesn't seem to carry over to the "Export Workflow" window.
Also, in the Asset Management window of the Directory Tool it doesn't seem like a user can add a directory (a path to a folder) under "User added assets"; only files seem to be able to be added.
Hey @HelloWorld1
There is currently no way to add an entire directory as a dependency - if you need to create directories as part of your workflow, you can use a little Python to create these dynamically.
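A minimal sketch of what "a little Python" could look like, using only the standard library. The function name ensure_folders and the example folder names are made up; the paths are resolved relative to whatever base folder you pass in.

```python
from pathlib import Path


def ensure_folders(base, relative_folders):
    """Create each folder under base if it does not exist yet and return the paths."""
    created = []
    for relative in relative_folders:
        target = Path(base) / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For example, ensure_folders(".", ["input/raw", "output"]) would create both folder trees next to the running workflow, and running it again would change nothing.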
Okay, thank you Sean.
@55dhr, were you able to get an answer to your question? I have the same need. --Dean
Hey @MeanLeanDean,
If the Directory Tool doesn't give you what you need - the next step up is to use the Python tool. There is a series of directory and file functions in the os package that should do the trick.
If you continue to have a gap to close - log a question under the discussion board for designer - that way people will spot this as an open item.
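For the case raised above (many CSV files with the same name, distinguished only by the sub-directory they sit in), a Python sketch could look like the following. It is a rough illustration, not a tested Alteryx recipe: the function name read_with_folder_name and the added column name SourceFolder are assumptions, and the files are assumed to be UTF-8 CSVs with a header row.

```python
import csv
from pathlib import Path


def read_with_folder_name(root, filename):
    """Yield every row of every matching CSV under root, tagged with its parent folder name."""
    for path in sorted(Path(root).rglob(filename)):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Hypothetical extra column holding the sub-directory name, e.g. a date
                row["SourceFolder"] = path.parent.name
                yield row
```

Each yielded row is a dictionary of the original columns plus SourceFolder, so a date embedded in the folder name travels with the data.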