community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Knowledge Base

Definitive answers from Designer experts.

How to dynamically run the most recent file in a file folder

Alteryx
Alteryx
Created on

How to dynamically run the most recent file in a file folder

 

Sometimes you may have daily, weekly, monthly or yearly data dumps where you want to only run the most recent file. Within Alteryx you can make this process dynamic and seamless through the use of a few tools.

 

Step 1: Directory Tool

The Directory Tool will allow you to browse to a folder and return all the metadata related to the files which exist within that folder. The field of interest in the metadata is the 'Creation Time'. 

 

Step 2: Sort Tool

Using the field called ‘Creation Time’ we can use the Sort Tool to sort the date and time values into descending order to get the most recent file at the top of the dataset.

 

Step 3: Sample Tool

After sorting the 'Creation Time' field I now have my most recent file in record 1. Yet, I still have rows of data for the other files within that folder that I need to remove. I can now use the Sample Tool to take the ‘First 1 Record’ and this will result in the latest file information being left.

 

Pic 1.png

 

Step 4: Dynamic Input

Currently the field I have in my dataset only shows metadata avaliable for that file such as Full path, Creation Time etc. I now need to read this file and pull in the data by using the Dynamic Input Tool. In the ‘edit’ section select a placeholder file.

 

Then in the Read a list of data sources ‘Field’ dropdown this will need to be the ‘Full Path’ field coming from the directory Tool. In the Action dropdown this will need to be set to ‘Change Entire File Path’.

 

Pic 3.png

 

Step 5: Run the workflow

You can now run the workflow and it will dynamically always pick the latest file from that folder and read the data into Alteryx.

 

Please find an example workflow attached to this article.

Attachments
Comments
Meteoroid

I have created a process which gives me the latest csv file name but i am unable to read that file. I tried using Dynamic Input but not surer how to enter file which i am getting from my workflow. Below is my process.......

 

Untitled.jpg

Meteor

Thanks  for the solution can we add logic to get from most recent directory and most recent file?

Alteryx
Alteryx
Are there any naming conventions in the new file folders that we can use to sort on?

Best,

Jordan

Get Outlook for iOS<>
Asteroid

Hey @JordanB Hello again! 

 

Similar to the use case above, I wish to reach the entire contents for a folder but the catch here is I wanted to read only the files that are inside the MOST RECENT FOLDER created in a target location. The system is in such a way for every data request the external system creates new folder every time in a designated location. 

 

I don't see a timestamp column for sub directory creation time or a way to list the sub-directories only. Any thoughts? 

 

Thanks. 

Alteryx
Alteryx

Hi @Raghu_s

 

Do the new file folders have any name convention we could use?

 

Best,

 

Jordan Barker

Asteroid

The other system is programmed to create a folder with cuz text always and appended the date and timestamp of creation while the files are generated. Ex: Downloads_2017-12-19_12-24-45 

Alteryx
Alteryx

Hi @Raghu_s

 

You can parse this date time stamp in the file path and format it correctly to sort within Alteryx. This will then allow you to plug this in to the workflow above.

 

To parse it you can use the regex tool and this function .*_(.+)

 

Best,

 

Jordan

Asteroid

This is exactly what i did before posting for the query. Was keen to understand if there are any direct tools or methods to pull directories inside a folder. 

Alteryx
Alteryx

Hi @Raghu_s

 

Have you looked at turning on sub directories within the directory tool? This should pick up multiple layers of files.

 

Best,

 

Jordan

Asteroid

Happy Christmas! 

 

Yes, that option helps you to read files inside a sub directory but the catch here is I will not be able to read only the latest folder created in a directory. 

 

Ex: An FTP location has a historical folders which might run into hundreds( inside there might again a hundred of files) So the workaround right now solves the problem but I feel it consumes more time reading all files inside all folders. If we have a feasibility of adding a parent folder as a column in your metadata along with creating/modified time field it would be much easier. 

Meteoroid

Tip:

If you're working with excel file the above solution needs to be tweaked slightly by adding excel sheet name. Create a new field using formula tool after the directory tool e.g -


image.png

Meteoroid

Thanks for posting this.  Extremely useful! 

Atom

Thanks @JordanB, in step 4 what do you mean by a "placeholder file" is this just a dummy file whats it purpose?

 

@ishtiaq thanks for added step for excel files, I'm having a syntax issue for some reason. I've copied your formula but I cant get it to find the sheet. My sheet is called RAW so I've tried the following:

 

[FullPath]+'/RAW$'

[FullPath]+"/RAW$"

[FullPath]+'/RAW

[FullPath]+"/RAW 

[FullPath]+RAW

 

Any idea what I'm doing wrong? 

 

Thanks,

 

Phill

Alteryx
Alteryx

Hi @Phill

 

The dummy files purpose is needed to let Alteryx know the file type that is being read in.

 

Regarding the sheet syntax you will need:

 

[FullPath]+'|RAW'

 

Best,

 

Jordan

Meteor

At the beginning of new month, my data files are saved in a new folder (that particular month/year folder). Can Alteryx import the data from Most Recently Created Folder AND Most Recently Created File within that folder? 

 

Example: All the data files for Jan 2019 are saved in a folder called "Jan 2019" , however on Feb 1st, a new folder will be created called "Feb 2019". Is there a way that on Feb 1st, Alteryx picks up the data file from "Feb 1st" folder rather than looking for the most recently created file "Jan 2019" folder? 

 

I wouldn't want to change the folder path on Alteryx every month since the idea is to fully automate this whole process. Any suggestions will be highly appreciated.

Alteryx
Alteryx

Hi @Muhammad

 

In the directory tool you can select the sub directories check box which will bring through all the folders.

 

Once you have all folders and files listed you can then parse out 'Jan 2019' or 'Feb 2019' from the file path and format into the correct date format using the datetime tool. You would use the 'custom' option and type Mon YYYY to convert correctly into 2019-01-01

 

You can then use this to sort (descending) on this field and sample to select the latest folder and file in the directory.

 

Best,

 

Jordan

 

 

Meteor

Thanks @JordanB.. It worked!

Atom

Does this same dynamic file input procedure work for gallery WF's?

Atom

What if i need to 'pick' a few files based on certain conditions. For example, read the file that contains last day of the month when the files are stored in weekly batch. The goal is to only pull in relevant files, not the enter repository (too large to handle).

 

What I have done so far is have a separate 'list' of dates that i want to read. But i don't know how to pass the dates into file name and directory. 

Alteryx
Alteryx

@ktshort2 if I understand your question correctly yes it could do.

 

@yelin Can you screenshot the list of dates you have and the structure of your file name.