This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
How to dynamically run the most recent file in a file folder
Sometimes you may have daily, weekly, monthly or yearly data dumps where you want to only run the most recent file. Within Alteryx you can make this process dynamic and seamless through the use of a few tools.
Step 1: Directory Tool
The Directory Tool will allow you to browse to a folder and return all the metadata related to the files which exist within that folder. The field of interest in the metadata is the 'Creation Time'.
Step 2: Sort Tool
Using the field called ‘Creation Time’ we can use the Sort Tool to sort the date and time values into descending order to get the most recent file at the top of the dataset.
Step 3: Sample Tool
After sorting the 'Creation Time' field I now have my most recent file in record 1. Yet, I still have rows of data for the other files within that folder that I need to remove. I can now use the Sample Tool to take the ‘First 1 Record’ and this will result in the latest file information being left.
Step 4: Dynamic Input
Currently the field I have in my dataset only shows metadata avaliable for that file such as Full path, Creation Time etc. I now need to read this file and pull in the data by using the Dynamic Input Tool. In the ‘edit’ section select a placeholder file.
Then in the Read a list of data sources ‘Field’ dropdown this will need to be the ‘Full Path’ field coming from the directory Tool. In the Action dropdown this will need to be set to ‘Change Entire File Path’.
Step 5: Run the workflow
You can now run the workflow and it will dynamically always pick the latest file from that folder and read the data into Alteryx.
Please find an example module attached to this article (Built in Alteryx Designer 10.5)
Suppose that your spreadsheet has multiple sheets with the same structure and you would like to read several sheets into your module at once. In this case, the preferred alternative is to use the Dynamic Input tool.
This error is typically thrown when trying to read a file that does not match the selected File Format, but also in rare cases, from a corrupted file.
Error: FileID does not match in the FileHeader
Common causes/ resolutions:
Inputting a yxdb file? Make sure your file type is .yxdb rather than .csv, .kml, etc.
Trying to access a yxdb while it's updating? If you try to access the yxdb before the module is done running, or if the module crashes/errors during the creation of the yxdb, you'll see this error. Use a Block Until Done tool immediately before your Output tool.
Using a Directory tool/ Dynamic Input? Specify the file type. If your Dynamic Input is pointing to a yxdb template, use the *.yxdb wildcard in your Directory tool, rather than the all-extensions wildcard *.*
Running an app in the Gallery? Check 'Replace a specific string' in the Action connected to the Input tool where you specified the file type.
As always, if this does not resolve your error, please open a Support ticket through the Case Portal.
How do you import multiple Excel ranges in a single Input tool, in order to automatically Union the data into one large dataset?
A user asked me this question in the Solution Center at Inspire this past week. The short answer is, with an Input tool, you cannot specify multiple ranges within an Excel sheet to import at once.
The longer, and more fun answer is that it is possible to achieve this within Alteryx using a slightly different process!
The tool you want to use to accomplish this task is the Dynamic Input tool. The process is quite simple. Simply enter your Excel ranges into a Text Input Tool, complete with Sheet name and tick marks (Shift + ~), with one range on each line:
Connect your Dynamic Input to the Text Input and select the file you want to pull the ranges from. Set up the input as you would any standard Input tool to read a range. When doing so, be sure to check the option indicating that the First Row Contains Data (we'll rename the headers later). Using this option avoids rows being skipped due to a mismatch in the header fields:
For more information on the specifics of doing this, take a look at this article: http://community.alteryx.com/t5/Alteryx-Knowledge-Base/Input-an-Excel-range/ta-p/1248
Once this is done, continue configuring the Dynamic Input tool. Choose the option to Modify the SQL Query and click the 'Add' drop down. Select the option to "replace a specific string", and remove all text except for the sheet name and range:
Once you are finished here, bring down a Dynamic Rename tool, and select the rename option that says "Take Field Names from Frist Row of Data". This option will move the first row of data into your field headers, reversing the effect of choosing the option for First Row Contains Data:
Finally, run the workflow and enjoy your newly unioned file!
The attached sample demonstrates the process using dummy data, and was written in Alteryx v. 10.5. You'll just need to repoint the Dynamic Input tool to the sample data spreadsheet.
One of the great features of the output tool is the option to take the file or table name (or part of it) from a field. It allows you to append a suffix, prepend a prefix, change the entire file name, or the entire file path. It also gives you the option whether to keep the field on output.
These methods should work in most versions of Alteryx.
One of the most common issues we see from clients trying to read in a .csv file is that they are receiving an error starting “Too many fields in row x”, and because of this Alteryx can’t read in the file.
There are a couple of different ways you can resolve this error.
First, in the Input Data tool, you can tell Alteryx to treat read errors as warnings to allow the file to be read in.
This will change the Error to a Warning so that Alteryx can read in the file so you can investigate.
The other method is to read the file in with no delimiter. You can do this by changing the delimiter in the Input Data tool to \0.
Once you have your data parsed back out into its fields, you can use the Dynamic Rename tool to correct your field names, a select tool to remove the original field, and a simple Trim() function to remove the extra delimiter from your data.
This process is illustrated in the attached workflow, created in version 10.1.
When you’re frequently writing and rewriting data to Excel spreadsheets that you use for Excel graphs and charts, it can quickly become a hassle to make and remake your reporting objects to keep them up-to-date so you’re visualizing the most recent data. A best practice to keep the hassle out of the process exists, though! If you keep your plots isolated to their own spreadsheet, referencing cell values in another sheet used to capture your data, you can simply overwrite the source data sheet and your plots will update automatically upon launching Excel. In the example below (attached in the v10.6 workflow Dynamically Update Reporting from Excel Spreadsheets.yxzp) we’ve included the workaround to make your Excel outputs seamless.
The Directory Tool gives you a data-stream input that contains information about the files and folders (file name; file date; last modified, etc.) for the location of your choice, which you can then use for more complex interactions with the file system. Basically, the Directory Tool could also finally help me track down my keys - not just where I put the keys in the house, but also how long they've been there, and when they were last moved.
For most tools that already have “dynamic” in the name, it would be redundant to call them one of the most dynamic tools in the Designer. That’s not the case for Dynamic Input. With basic configuration, the Dynamic Input Tool allows you to specify a template (this can be a file or database table) and input any number of tables that match that template format (shape/schema) by reading in a list of other sources or modifying SQL queries. This is especially useful for periodic data sets, but the use of the tool goes far beyond its basic configuration. To aid in your data blending, we’ve gone ahead and cataloged a handful of uses that make the Dynamic Input Tool so versatile:
If you have a file that doesn't have the field names in the header row but rather in the first row of data, the Dynamic Rename Tool has an option to take field names from first row of data, which allows you to rename your headers easily and dynamically.
But what if you have many files in a directory, all with the same problem?
You want to combine these two files and turn them into one clean output:
But you don't want to have to map each and every one into Alteryx and use a Dynamic Rename Tool for each one of them....
The solution is to wrap a batch macro around the Input Tool and Dynamic Rename Tool. This will let you read in each file individually, fix their header row and then combine them into one dataset!
It's a fairly simple macro:
It accepts full file paths as the input, which are used to update the file mapped into the Input tool inside the macro. The Dynamic Rename Tool is set to accept "Dynamic or Unknown Fields" so it will pick up on any column from an input file. It then replaces the current header with the values found in the first row of data.
The macro "holds" the data until all iterations have run (all files have been read in) and then unions the data together based on field names because we selected "Auto Configue by Name" as the Output Mode:
To use the macro in a workflow, point a Directory Tool to the folder containing the files to be read in and then use the FullPath to feed into the macro.
See the attached workflow for an example.
The key component of any batch macro , the Control Parameter Tool is the gear that keeps things moving. Using the input , the control parameter accepts a field of values that will be used within the batch macro to reconfigure and rerun the macro for each of the standard input records - unless using the GroupBy feature that matches certain control parameters to buckets of records to be batched through the macro together. Adding this interface tool to any macro will upgrade it to a batch macro and will give you the ability to loop through macro configurations for added customizability. While one of the more sophisticated solutions you can build into your workflows, there are few problems you can’t solve with a batch macro:
In this article we’ll take a look at how to leverage a batch macro to read in multiple sheets from multiple Excel files using Designer version 10.0. The two Excel files I’ll be using in the example are called Fruits and Vegetables with sheets named Apples, Oranges, Broccoli and Spinach, respectively.
The first step will be to read in the list of sheet names from one of the Excel files using an Input tool. This feature is new to 10.0 and is a great addition when used in conjunction with the Dynamic Input tool.
We can then connect our Input tool from the previous step to a Dynamic Input tool. In our configuration, use the same Excel file as our Input Data Source Template. We can use the Sheet Names field from our Input tool as our List of Data Sources and our action is going to be to Change File/Table Name.
Now we’re ready to add our Interface tools to the canvas. Bring in a Control Parameter tool, 2 Action tools and a Macro Output. Connect the Control Parameter to the 2 Action tools, and then connect on to the Input and Dynamic Input. The Macro output will be connected to the output of the Dynamic Input.
In our Action tool configuration our action type will be Update Value. We want to update the File – value that is being passed on to the Input and Dynamic Input. We will, however, only need to update the portion of the string without the sheet name, which is why we’ll also select Replace a specific string: This will allow us to batch our process with multiple files.
Lastly, if our Excel files have different schemas, in the interface designer we can set the macro to Auto Configure by Name or Position so that our workflow does not error out. Note: sheets within the same file will have to be the same schema.
We’re now ready to save the macro and put it to use!
Now, on a new canvas, we can place a Directory tool (If all your Excel files are in one directory), or, we can use a Text Input to manually enter each individual file path location. After the Directory or Text Input tool, insert your macro (right click on the canvas->insert->macro), then choose the full file path for your question in the macro configuration. Add a browse, hit run and look at the results!
We now are able to read in multiple Excel sheets from multiple Excel files!
The attached workflow package includes a workflow, batch macro and excel files created in Alteryx v.10.0
Client Service Representative
One of the best things about Alteryx is the ability to read in multiple files very easily and automatically combine them into a single dataset. This becomes a bit trickier when dealing with files that have different schemas or Excel files with multiple tabs. Adding both multiple excel files with multiple tabs, and having the schema change within each tab takes it to another level.
Dynamically adding data into your workflows is one of the really great things about Alteryx. Having the ability to be able to adjust what you are bringing into the tool can create tremendous efficiencies. So how do we go about doing this? One example is to use the Dynamic Input Tool. This tool allows you to make adjustments to your SQL queries, Stored Procedures or bring in files with the same schema.
Configuration of the Dynamic Input Tool-
Input Data Source Template - Connect to your data and choose the tables and fields or files you would like to bring into the workflow. This is a “Template”, so you are able to change this template with the configuration of the tool, but this “Template” is necessary to bring in the data.
Note: if your data has different schemas then you want to either fix the schema or create a batch macro that will ignore schema and either load data in by name or position
Read a List of Data Sources - This section is used for bringing in a list of files, changing table names, or adding Suffix and Prefixes to your Table Names. If you are bringing in Excel files, the sheet name will need to be added to the Full Path using a Formula Tool.
Change File/Table Name: Takes the specified database table name and changes it based on the values in the specified field.
Change Entire File Path: Takes the entire specified database input path and changes it based on the values in the specified field.
Append Suffix to File/Table Name: Takes the specified database table name and appends a suffix to it. The suffix is the value in the specified field.
Prepend Prefix to File/Table Name: Takes the specified database table name and prepends a prefix to it. The prefix is the value in the specified field.
Modify SQL Query:
Pass field to the Output: This will allow you to pass fields that you may want to use downstream that will append to your data.
Replace a Specific String: This modification is great if you want to pull different data from your tables or fields. You can also use this to update the sheet name from an Excel spreadsheet if you are querying a specific range of cells. To use this modification you will want an input connection that has a field that you want to replace the specific String with. In the replacement window you will remove any text that you do not want to be replaced, then choose which field will replace the string.
SQL: Update Where Clause: This works similar to the replace string, the difference is that the Where clause will be the only portion that you will be able to update. The Where clause will appear in the Update window. You will be able to specify which portion you would like to update and from what incoming field. Please note that the Where portion of the statement will only show. If you have a statement that has Where (the clause) AND (another clause), the Where (the clause) will only appear in the window.
SQL: Spatial Filter: Does you database have latitude and longitude coordinates for stores, business, or customers? Have you built polygons for trade areas or customer profiling and you want to see who in your database fits into these polygons? This is the filter you will want to use. The filter will determine whether or not your data falls within the bounding rectangle of the polygon. Allowing you to look at the data contained within the polygon only.
SQL: Updated Store Procedure: Stored Procedures in SQL are great, but sometimes you need to make small adjustments to them. This Update will allow you to choose the Parameter to update and use a field from your data to replace a specific part of the Stored Procedure.
Sometimes, data will come with multiple header lines, for example year and quarter might be broken into two rows, like this:
That is not very helpful when you are trying to analyze your data.
You could use a Select Tool and type in new names OR you could do a few transformation and use the Dynamic Rename Tool to merge the rows dynamically, so you are prepared when 2017 rolls around the corner and they start adding new columns to your file!
To fix data that looks like the above, we start with the Sample Tool to split the data into two data streams - one with just the values and one with the header information. Then, we transpose the data to let us fill in the missing years and use the Summarize Tool to concatenate the resulting field values:
Now we can use the header lookup we created as a right input into a Dynamic Rename Tool using the "Take Field Names from Right Input of Rows" option and selecting "Name" as the old field name and Concat Value as the new combined field name.
See the attached workflow for details of tool configurations.
Between the RegEx , Text To Columns , and XML Parse Tools , the Alteryx data artisan already has an exceptionally robust selection of tools to help parse uniquely delimited data. However, there are still some data sets so entangled in formatting that it’s labor intensive to parse even for them. Enter the Find and Replace Tool , which captures the ability to find your nightmarish parsing workflows and replace them with sweet color by number pictures. Just kidding, it finds bad jokes and replaces them with good ones. Seriously, though, you could do both if you wanted to because this tool has the capability to look up a table of any number of specified targets to find in your data and will replace them with a table of specified sources. With the help of a few quick configuration steps, this tool can simplify some parsing use cases significantly.
Upon creating a BINGO game, I came across a technique that I thought could be useful in "real world" scenarios for users who are attempting to iterate a process and then replenishing the data after a certain amount of time.
Have you ever wanted to restrict the fields that flow through your workflow based on their data type? What about only allowing fields that begin with the same word or are in a specific position? The Dynamic Select tool is just what you’re looking for!
The Multi-Row Formula Tool functions much like the normal Formula Tool but adds the ability to reference multiple rows of data within one expression . Say, for example, someone was on the ground floor of a house and had a Formula Tool. They would only be able to talk to the people also on the ground floor. If they had a Multi-Row Formula Tool, though, they would also be able to talk to the people upstairs, in the attic, and in the basement as well.
The Multi-Field Formula Tool offers the same functionality as the Formula Tool, but offers the added benefit of applying a function across multiple fields of data all at once. Gone are the days of writing the same function for multiple fields.
Say there are four fields with dollar signs ($) that need to be removed. It could be done with a Formula Tool and a function written for each field: