This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
on 06-23-2016 02:53 PM - edited on 07-27-2021 11:39 PM by APIUserOpsDM
How to dynamically run the most recent file in a file folder
Sometimes you may have daily, weekly, monthly or yearly data dumps where you want to only run the most recent file. Within Alteryx you can make this process dynamic and seamless through the use of a few tools.
Step 1: Directory Tool
The Directory Tool will allow you to browse to a folder and return all the metadata related to the files which exist within that folder. The field of interest in the metadata is the 'Creation Time'.
Step 2: Sort Tool
Using the field called ‘Creation Time’ we can use the Sort Tool to sort the date and time values into descending order to get the most recent file at the top of the dataset.
Step 3: Sample Tool
After sorting the 'Creation Time' field I now have my most recent file in record 1. Yet, I still have rows of data for the other files within that folder that I need to remove. I can now use the Sample Tool to take the ‘First 1 Record’ and this will result in the latest file information being left.
Step 4: Dynamic Input
Currently the field I have in my dataset only shows metadata available for that file such as Full path, Creation Time etc. I now need to read this file and pull in the data by using the Dynamic Input Tool. In the ‘edit’ section select a placeholder file.
Then in the Read a list of data sources ‘Field’ dropdown this will need to be the ‘Full Path’ field coming from the directory Tool. In the Action dropdown this will need to be set to ‘Change Entire File Path’.
Step 5: Run the workflow
You can now run the workflow and it will dynamically always pick the latest file from that folder and read the data into Alteryx.
Please find an example workflow attached to this article.
I have created a process which gives me the latest csv file name but i am unable to read that file. I tried using Dynamic Input but not surer how to enter file which i am getting from my workflow. Below is my process.......
Thanks for the solution can we add logic to get from most recent directory and most recent file?
Hey @JordanB Hello again!
Similar to the use case above, I wish to reach the entire contents for a folder but the catch here is I wanted to read only the files that are inside the MOST RECENT FOLDER created in a target location. The system is in such a way for every data request the external system creates new folder every time in a designated location.
I don't see a timestamp column for sub directory creation time or a way to list the sub-directories only. Any thoughts?
Thanks.
The other system is programmed to create a folder with cuz text always and appended the date and timestamp of creation while the files are generated. Ex: Downloads_2017-12-19_12-24-45
Hi @Raghu_s
You can parse this date time stamp in the file path and format it correctly to sort within Alteryx. This will then allow you to plug this in to the workflow above.
To parse it you can use the regex tool and this function .*_(.+)
Best,
Jordan
This is exactly what i did before posting for the query. Was keen to understand if there are any direct tools or methods to pull directories inside a folder.
Hi @Raghu_s
Have you looked at turning on sub directories within the directory tool? This should pick up multiple layers of files.
Best,
Jordan
Happy Christmas!
Yes, that option helps you to read files inside a sub directory but the catch here is I will not be able to read only the latest folder created in a directory.
Ex: An FTP location has a historical folders which might run into hundreds( inside there might again a hundred of files) So the workaround right now solves the problem but I feel it consumes more time reading all files inside all folders. If we have a feasibility of adding a parent folder as a column in your metadata along with creating/modified time field it would be much easier.
Tip:
If you're working with excel file the above solution needs to be tweaked slightly by adding excel sheet name. Create a new field using formula tool after the directory tool e.g -
Thanks for posting this. Extremely useful!
Thanks @JordanB, in step 4 what do you mean by a "placeholder file" is this just a dummy file whats it purpose?
@ishtiaq thanks for added step for excel files, I'm having a syntax issue for some reason. I've copied your formula but I cant get it to find the sheet. My sheet is called RAW so I've tried the following:
[FullPath]+'/RAW$'
[FullPath]+"/RAW$"
[FullPath]+'/RAW
[FullPath]+"/RAW
[FullPath]+RAW
Any idea what I'm doing wrong?
Thanks,
Phill
At the beginning of new month, my data files are saved in a new folder (that particular month/year folder). Can Alteryx import the data from Most Recently Created Folder AND Most Recently Created File within that folder?
Example: All the data files for Jan 2019 are saved in a folder called "Jan 2019" , however on Feb 1st, a new folder will be created called "Feb 2019". Is there a way that on Feb 1st, Alteryx picks up the data file from "Feb 1st" folder rather than looking for the most recently created file "Jan 2019" folder?
I wouldn't want to change the folder path on Alteryx every month since the idea is to fully automate this whole process. Any suggestions will be highly appreciated.
Hi @Muhammad
In the directory tool you can select the sub directories check box which will bring through all the folders.
Once you have all folders and files listed you can then parse out 'Jan 2019' or 'Feb 2019' from the file path and format into the correct date format using the datetime tool. You would use the 'custom' option and type Mon YYYY to convert correctly into 2019-01-01
You can then use this to sort (descending) on this field and sample to select the latest folder and file in the directory.
Best,
Jordan
Thanks @JordanB.. It worked!
Does this same dynamic file input procedure work for gallery WF's?
What if i need to 'pick' a few files based on certain conditions. For example, read the file that contains last day of the month when the files are stored in weekly batch. The goal is to only pull in relevant files, not the enter repository (too large to handle).
What I have done so far is have a separate 'list' of dates that i want to read. But i don't know how to pass the dates into file name and directory.
Hi,
I have multiple files in the folder and I want to pull the latest file first but also pull other files in the descending order. Is there a solution for that?
Regards,
Pramil
I'm trying to choose the latest excel price file in a folder where these are saved daily and then use this as an input file to build a workflow from but I keep getting the below error with the dynamic input tool. What am I doing wrong here?
Hi @JordanB , thanks for the solution.
But when I am trying it at my end - I am getting an error "Dynamic Input (1): No sheet specified, you must specify a sheet". I am following the exact steps as you suggested. The only difference between your and my workflow is that I am sorting my directory contents by "CreationTime" in Descending order and then picking up the first file. I am using Alteryx 2019.3 version.
It would be great if you could please help, thank you!!
Hi @JordanB
I am trying to use the workflow above, but I am getting an error when I get to the Dynamic Input node. It is saying that there is "No sheet specified, you must specify a sheet', but I did specify the sheet. Am I missing a step somewhere?
I have a requirement to run on the most recent version of the same file that Alteryx processed a little while ago.
When i first run on the file, my Alteryx workflow makes some changes and I want to run another workflow on the changes made by the 1st workflow. But, the dynamic input tool is not picking up the changes done by the 1st workflow. Any help here would be greatly appreciated.
I have tried following this flow but my file is a xlsx file so I got error where I need to specify the sheet name in the dynamic input tool.
so I use formula tool and input the sheet name, ie. [FullPath]+"|sheet1$" before the dynamic input tool. owever, I still failed and get the file not found error. I have makesure the file is there. Can anyone help?
You need to keep 3 pipe symbols ( | ) so it will be - [FullPath]+"|||sheet1$". I hope the FullPath also has the File name in it. Else, FullPath + (Filename) + "|||Sheet1$".
https://community.alteryx.com/t5/user/viewprofilepage/user-id/2690
Thanks very much! Great solution really helped me out!
I am also facing error.
Please advise what is the issue?
Appreciate your support !
@Paresh1pandya - The issue is not with the Directory tool that you are showing, it is with Dynamic Input tool. Did you select any Table, Database, excel file, etc., as template for Dynamic Tool to start with. Please share details of Dynamic Input tool configuration to identify the problem.
Hi Bharat,
Thanks for looking into the issue,
I am trying to connect .xls format file from the folder, Please find below screenshot of Dynamic Input tool.
Please ignore the above screenshot, please check this.
Note: I am using .xls format file not .xlsx
Thank you in advance !
@Paresh1pandya - did you mention sheet name in your formula tool, see below.