This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
How to check for encoding or formatting issues with Excel worksheets
When Excel worksheets are used in Alteryx Designer, sometimes Designer has difficulty reading the data. Viewing the worksheets in XML format is a good way to check for encoding or formatting issues as all of the encoding and formatting is shown.
Product - Alteryx Designer, Gallery, or Server and an Excel file used in Alteryx workflows or applications
To view in XML, open the Excel worksheets in 7-zip or another application used for .zip files. Use the format C:\folder_path_of_the_workbook\Excel_file_name.xlsx\xl\worksheets. Next, click on the sheet name and the XML for the worksheet is shown.
Notice the type of encoding is listed at the top. Followed by the formatting schemas, and then formatting for the worksheet. If needed you can change the extension of the worksheet from .xml to .txt and edit the encoding or formatting of a worksheet to match a working example, and then change the extension back to .xml.
To edit, right click on one of the sheets, select Rename, and change the file extension to .txt. Edit as a text file, and then Rename again to change the extension back to .xml.
If the Excel workbook is created automatically, such as being created by a script, information needed for Designer to open the worksheets may be missing. Excel can sometimes add this information when opening the worksheet, but Designer is not able to add default encoding and formatting if it is missing. If you come across an issue, it is a good idea to try adding example data directly in Excel and saving as Excel should automatically add the needed encoding and formatting. Afterward, you can compare this with the broken file, and update it to match the working example.
There are other possible reasons why Designer may not be able to open an Excel spreadsheet. However, checking for encoding and formatting issues in XML view will catch many of the problems with opening Excel data in Designer, so it is a good place to start.
Character encoding Microsoft Office XML formats Structure of a spreadsheet
You may already know how to use the MIN() and MAX() functions to find the smallest and largest values in a list. But what if you needed the second smallest number or 3rd largest number in the list?
Excel has a function for this. Using the =SMALL function, you would specify the data range followed by 'x' smallest number you want to find. In the example below, we find the 2nd smallest value in a list:
Similarly, if you want the find the 'x' largest value in a list, you would use the =LARGE function. Here we use the =LARGE function to find the 3rd largest value in our list:
Let's look at how we do the same thing using Alteryx. We'll start with the same array of numbers using a Text Input Tool:
We want to find the 2nd smallest value in the list. We'll start by sorting the list in ascending order. Then we assign a record id to each row of data. Filter to select record id = 2 and use a Select Tool to drop the record id field (we don't need it in our final result) and that will leave us with our answer, '6'.
To find the 3rd largest value in the list, we simply change the sort order to descending and filter to select record id = 3:
Our result is '13', so everything checks out.
We replicated in Alteryx the Excel functions =SMALL and =LARGE. But let's add a couple of bells and whistles to our workflow and make it an app. This gives a user the ability to decide if they want to select the smallest or largest value from our list as well was what the value of 'x' will be. Begin by bringing a Drop Down Tool to the canvas.
Enter the text or question to be displayed ('Return smallest/largest value:). Under 'List Values' we'll choose 'Manually set values' and under 'Properties' enter:
Connect the Drop Down to the lightning bolt on top of the Sort Tool. Automatically an Action Tool will be inserted between the Drop Down and Sort Tools.
In the Action configuration widow, select @order - value = "Ascending".
Bring another Drop Down Tool to the canvas and enter the test or question to be displayed ('Enter Nth smallest/largest value:'):
Connect the Drop Down Tool to the lightning bolt on top of the Filter tool. An Action Tool will automatically be inserted between the Drop Down and Filter Tools:
Notice the expression in the Filter Tool is set up to send record id = 1 to the 'T' (true) output side of the tool. In the Action configuration window, select 'Expression - value = "[RecordID] = 1" and enter '1' under 'Replace a specific string:' located at the bottom of the configuration window.
Optional: Add an Output Data Tool to the end of the workflow so the results can be displayed. In this example, we will be exporting results to a temp html file:
So the complete workflow/app looks like this:
Let's run the app. Under the main menu and to the right of the run icon, click on the wand:
A window will pop up displaying the drop down menus you've setup with the Drop Down Tools. Let's find the 2nd smallest number in our column of numbers:
Click 'Finish' and a 'App Results' window pops up. Click on 'OK':
Our temp html files returns the value of 6 which is correct.
You now have an app other users can use to easily and quickly select the 'x' smallest or largest value in a list of numbers. To learn more about apps and interface tools in general, see here.
In this posting, we'll take a look at Excel functions that return today's date and current time. Then we'll see how to use Alteryx to do the same thing. We'll take this a step further and show how Alteryx can be used to return a large number of date-related information for any date using macros and apps.
To get today's date in Excel, you use the =TODAY() function.
And the =NOW() function will return today's date and time.
You can format date and time the way you'd like (eg. Nov-10 or November 10, 2016 instead of 11/10/2016).
Alteryx also has a couple of ways to get today's date and time. The first is macro available in the In/Out toolset called 'Data Time Now'. The tool's configuration provides many options for how you'd like to see the data, including date as well as date and time.
The other method is to use a tool (such as the Formula Tool) where an expression can be used with the function 'DateTimeNow()':
Use the same method if you want just the date or time.
There is similar function called 'DateTimeToday()' which will return the current data as of midnight (so the time comes back as 00:00:00).
What if you want information about a date other than today, however? I've written about calendar and date aggregation before and have made a calendar macro available for anyone to use. If you have a date in yyyy-mm-dd format, you join it to the Date field in the macro which returns the following fields:
Date: yyyy-mm-dd format; includes every day beginning 2000-01-01 through 2099-12-31.
Year: yyyy format.
Quarter: numeric representation of quarter (1, 2, 3 or 4 rather than Q1, Q2, Q3 or Q4).
Month: numeric representation of month; NO leading zeros.
MonthName: January, February, March, April, May, June, July, August, September, October, November and December; completely spelled out rather than an abbreviation.
WeekNumber: numeric representation of week; generally values range from 1–52, but occasionally a year will have a week 53; weeks 1 and 52 (or 53) may be partial weeks (i.e. less than seven days).
Day: corresponds to the calendar date with values from 1-31 (for months with 31 days); NO leading zeros.
DayName: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday; spelled out rather than abbreviations; Sunday is the beginning of the week.
DayYear: day of year; values range from 1-365 except for leap years which have a day 366.
DayQuarter: day of quarter; values range from 1-92.
DayWeek: numeric representation of week where 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday and 7=Saturday.
Week StartDate: date in yyyy-mm-dd format and data type = Date; week begins on Sunday with the possible exception of week 1.
Week EndDate: date in yyyy-mm-dd format and data type = Date; week ends on Saturday with the possible exception of week 52 (or 53).
Month StartDate: date in yyyy-mm-dd format and data type = Date.
Month EndDate: date in yyyy-mm-dd format and data type = Date.
Quarter StartDate: date in yyyy-mm-dd format and data type = Date.
Quarter EndDate: date in yyyy-mm-dd format and data type = Date.
I've taken this a step further and created an app with the calendar macro embedded in it which allows a user to select a date and the fields they want returned at run time.
I've made a couple of version of the Calendar macro; one where the week begins on a Sunday and the other where the week begins on Monday. In the attached app, the macro where the week begins on a Sunday is used but can be easily replaced by the one beginning on Monday.
You probably already know that you can output results to multiple sheets of an Excel file. If not, you should check out our resource that explains how to do that very thing. But what if you run that workflow every day, and you want to keep the outputs from days past?
To find the full path and filename of a saved file in Excel, you use the =CELL function.
In Alteryx, you use a Field Info Tool to get this information:
The Field Info Tool allows you to see in tabular form the name of fields in a file as well as the field order, field type, and field size.
Name: field names within the file
Type: type of data field
Size: length of a data field
Scale: with respect to fixed decimal data types, scale refers to the digits of precision
Source: contains the full path and filename
Description: may or may not contain information; you can add a description via the Select tool
We're only interested in the Source field and this information will be the same for each field.
Using a Sample Tool, we select just the first record:
Notice the data in Source begins with 'File:'. We don't want that in the final output so we'll use a SUBSTRING function in the expression of a Formula tool to clean it up. Complete the workflow with a Select tool so we only get the Source field:
I'll mention here you can use the Directory Tool to find the full path and filenames in a directory.
Select the directory you want to search. File Specification has wildcard characters so you can limit your search to files containing specific character patterns or file types. In the example below, let's set up the File Specification to only return files with the '.xlsx' file extension:
We're only interested in the field 'FullPath' (first column) so we'll use a Select tool to drop the remaining fields.
While the Directory tool returns multiple filenames, it will not contain a worksheet name if the file is an Excel file. To get that information, you'll need to use the Field Info Tool as we did above.
In this article, we demonstrate how to import multiple worksheets from one Excel file, how to import multiple worksheets from multiple Excel files, and how to import a specific range of cells from an Excel sheet.
How To: Import multiple worksheets from one Excel file
In the case where you have one Excel workbook with several sheets, you can use a Dynamic Input tool to import the sheets instead of an individual Input tool for each sheet.
Note: for this method to work, the sheets of the Excel file need to have the same schema.
Use an Input Tool to select Sheet Names.
Connect a Dynamic Input tool to the Input tool.
Configure the Dynamic Input tool to read a list of Data Sources from the Sheet Names field provided by the Input tool.
The workflow is now configured to import the Excel sheets in a single input stream (you can see the Regions South and West from the two different Excel sheets).
How To: Import multiple worksheets from multiple Excel files
When you need to import multiple sheets from different excel files, you can modify the above method to work by turning it into a Batch Macro.
Start with the same workflow from the previous example.
Add a Control Parameter tool and a Macro Output tool to the canvas.
Connect the Control Parameter tool to the top of the Dynamic Output and Input tools. You should see two Action tools being automatically added to the canvas between the interface tools and the standard tools.
Connect the Macro Output tool to the output stream of the Dynamic Input tool. Your canvas should now look like this:
Now, we need to configure the Action Tools. The action type should be set to "Update Value". We need only to update the file name without changing the sheet name. Therefore, for both Action Tools "Replace a specific string" should be enabled. Please note that this string should contain the path to the file to input without the extension for the sheet.
If the Excel files have different schemas, in the interface designer we can set the macro to Auto Configure by Name or Position so that our workflow does not error out. Note: sheets within the same file will have to be the same schema.
By default, the Interface Designer window will be displayed in the left-hand side of the Designer window.
Click on the Cog icon in the left bar to access the Properties tab.
Select the Output fields change based on macro's configuration or data input option.
You now created a Macro. In order to use it, you will need to add it to a workflow.
Save the Macro as a Designer Macro (*.yxmc). Do not close the window with the Macro yet.
Right-click the new workflow canvas and select Insert > Macro.
Any Macro that is currently open in Designer can be added this way to the workflow. Note that it is also possible to create a Macro repository. This will enable to save Macros in one designated place and easily use them in workflows. See this Help Doc article .
Finally, add a Directory tool, and connect it to the Macro's input.
With this configuration, you can import the Excel sheets of multiple XLSX files in the given input directory (you can see Regions North and South from the two Excel files).
The file is being used by another process Unable to open file for read: FILEPATH. The process cannot access the file because it is being used by another process. To resolve, close any other applications (typically Microsoft Excel) that are accessing the file Some of the columns are not imported, but in Excel the file looks fine Alteryx uses the drivers that come with Microsoft Excel to import the files. Sometimes 3rd party software does not write data in the correct Excel format. To resolve, re-save the file in Microsoft Excel. It should now be correctly imported in Designer.
How To: Input a Specific Excel Range
Another functionality of Alteryx Designer is the ability to input a data subset of an Excel file. This comes handy when working with large data.
The procedure to input an Excel range would depend on the Designer version.
Prior to 2018.1
In this example, our data starts in row 5 and column B and ends in row 7 and column D.
Bring the Input tool into your module, then browse to the particular sheet in your Excel file that you wish to pull data from. It will look like the following image. Notice that Option #3, Table or Query , points to ‘Sheet1$’ , we will modify this to point to our data range.
To edit the Table or Query , click on the button with three dots ( … ) on the right side of Option #3.
Click the SQL Editor button and change the range to SELECT * FROM 'Sheet1$B5:D7'
Click OK. Now click on the Update Sample link In the Input Tool properties window to see the new range.
2018.1 and later
In versions more recent than 2018.1, the button with three dots (…) on the right side of Option #3 cannot be used anymore. Instead, click into the file path field, and edit it.
Use the following syntax filepath.xlsx|||'SheetName$RangeCell1:RangeCell2'. For example: \sample.xlsx|||'Sheet1$B5:D7'
imports the Excel file sample.xlsx from Sheet1, and the range B5 : D7 . Note that the value of 3. Table or Query has changed.
The Dynamic Input Tool Mastery article contains valuable insights on practical use cases of this powerful tool.
A more general take on inputting data can be found in this insightful Knowledge Base article.
Bananas grow on plants that are officially considered a herb since the stem does not contain woody tissue.
With an Input tool alone, you cannot specify multiple ranges within an Excel sheet to import at once, however, i t is possible to achieve this within Alteryx using a slightly different process!
The tool you want to use to accomplish this task is the Dynamic Input tool. The process is quite simple. Simply enter your Excel ranges into a Text Input Tool, complete with Sheet name and tick marks (Shift + ~), with one range on each line:
Connect your Dynamic Input to the Text Input and select the file you want to pull the ranges from. Set up the input as you would any standard Input tool to read a range. When doing so, be sure to check the option indicating that the First Row Contains Data (we'll rename the headers later). Using this option avoids rows being skipped due to a mismatch in the header fields:
For more information on the specifics of doing this, take a look at this article.
Once this is done, continue configuring the Dynamic Input tool. Choose the option to Modify the SQL Query and click the 'Add' drop down. Select the option to "replace a specific string", and remove all text except for the sheet name and range:
Once you are finished here, bring down a Dynamic Rename tool, and select the rename option that says "Take Field Names from Frist Row of Data". This option will move the first row of data into your field headers, reversing the effect of choosing the option for First Row Contains Data:
Finally, run the workflow and enjoy your newly unioned file!
The attached sample demonstrates the process using dummy data. You'll just need to repoint the Dynamic Input tool to the sample data spreadsheet.
We get quite a few requests asking how to add the current date to Excel spreadsheet file names. The difference in adding information to your file name in Excel and other output configurations is that you have to use the Reporting tools for Excel to accept the new file name.
Why would I have to use the Reporting tools, you ask? Simply put, Excel thinks that when you are adding a prefix or suffix to the file name that you are stating that additional information is the “Sheet Name” and not appending the file name from the Output Tool. The Render tool in the Reporting tab on your palette basically tricks Excel into thinking that you are creating a separate report and is able to append the date.
In order to add the date to the file name you will connect these tools to your workflow in the order, you see below.
In the Formula Tool, create a new field that is a String type. Go to the DateTime functions and add the DateTimeToday() expression. No configuration of the expression is necessary.
The Table tool will allow you to Group By this new field.
In the GroupBy Configuration, choose the Date field you created in the Formula.
In the Per Column Configuration, uncheck the new field. This will allow you to group by the new field name in the Render Tool and still remove it from the data so that your new field does not appear in your report.
Use the Render Tool to output your Excel spreadsheet with today’s date in the file name.
Output Mode - Choose a Specific Output File.
Output File - Specify the Excel 2007 Spreadsheet (xlsx) and point to where you would like to save the documents. The file name you specify will be replaced with the date in step 5 below.
You will then check the box that says Group Data into Separate Reports.
Field to Group on - You will choose your new field that you created.
Modify Filename By – Replacing Filename With Group.
Report Data – You will choose Table and can leave the rest as defaults.
When creating reports within Alteryx, you can name Excel tabs by specifying a column to use in the Layout tool. This is done in a 3 step process.
Choose your grouping column; below, the Table tool is being used and [DMA_Name] is being used for the Group By field. This would also apply with the Charting and Map tools.
In the Layout tool, change the Layout Mode to Each Group Of Records. Next, choose the column you would like to Group By; for the example we will use [DMA_Name]. Next, change the Orientation to Vertical with Section Breaks. By doing this, the Section Name option will appear at the bottom of the tool configuration and allow you to choose the column you would like to use to name the Excel tabs.
Update the Render tool to create an Excel spreadsheet. In our example, each of the Excel tabs was named for the DMAs contained in the data.
Keep on reporting!
Here in Customer Support, we like to make sure our workflows run as smoothly as possible. Sometimes you may ask yourself “Why can’t my excel file just open right after I output it?”
Where there is a will, there’s a way.
Ladies and Gentleman, I introduce to you the beautiful functionality of the Run Command Tool.
With this trick, you never have to go into the directory to manually open the saved file again.
Note: There are multiple ways of doing this with the run command. This is just a simple example.
For this trick, you’ll only need 5 additional tools.
Before your output tool, you’ll need a Block Until Done Tool. This is for quality control as Excel doesn’t like to open when you are writing into it at the same time.
This prevents Alteryx from writing and opening at the same time.
Next, you’ll need a Formula Tool, which will just add another field with your command. When making this dynamic, you’ll want to make sure you update both the path of your output and formula tool.
Whether you do that in a macro or app, it’s entirely up to you. After the formula tool, you’ll want a sample tool so that way we only select tool to deselect all the fields except for your command. Now just make sure your run command is set up properly. In this example, it’s set to write out the command as a bat file and be read back in as a command.
To learn more about how to configure the run command, I would read through this helpful article.
After you have this set, watch as your workflow has made your life even more automatic.
Finding the future date after 30 from today is easy in Excel. Use the =TODAY function and add 30.
There's an Alteryx function - DateTimeAdd - that does the same thing:
What if we wanted to know the date 30 working/business days from today? Excel has a function called 'WORKDAY' to do that.
And it isn't limited to just today's date. You can use any date.
Finding a future date 'x' working days from today - or any day - in Alteryx is a little trickier. At least it was. Attached is an app you can use to pick a date and the number of working days in the future.
From the snapshot above, you'll see the Calendar macro is used. There are two versions of this macro. One where the week begins on Sunday and the other where the week begins on Monday. In the attached app uses the macro where the week begins Sunday but the macro starting on Monday can easily replace it.
Can I read in an Excel file located in a zipped archive file from Amazon S3?
Unfortunately, this is not an option within the Amazon S3 Download Tool, as it only allows you to choose between CSV, DBF and YXDB files. However, this is possible within Alteryx with the use of a simple workflow utilizing a three line batch file, the Run Command Tool (master it here), and the AWS Command Line Interface (CLI).
In order to use the CLI, you must first download it and configure its settings. Please visit this page for information on how to do that. Once that is setup, you simply need to setup the batch file and configure the Run Command Tool.
In the first step, you will use a Text Input Tool to write the batch file code. This code will use the CLI to copy the ZIP file from the S3 bucket to a locally accessible drive. Configure the Text Input Tool as follows:
Make sure that line 2 points to where your CLI is installed.
In line 3, replace "alteryxtest" with the name of your bucket, "ExcelTest.zip" with the name of your ZIP file and enter in the correct location to copy the file to.
In the second step, you will use the Run Command Tool to do the following:
Write out the batch file ("Write Source")
Run the batch file created in the previous step ("Run External Program")
Read the file into the workflow ("Read Results")
When entering in the "Read Results" section, your ZIP file will not exist at this point so you cannot simply navigate to and select the file. So, you have two options:
Click on the "Input" button and enter in the full path of where you are copying the ZIP file (found on line 3 of the Text Input tool) along with the file name, a pipe character, and then in brackets, the sheet name. For Example:
Run the workflow once without the "Read Results" section completed in order to copy the ZIP file from the S3 bucket. Then, click on the "Read Results" button and navigate to the ZIP file and choose the Archive file to read it.
This same workflow can be used to read other archived files as well. However, you will have to make slight adjustments to the "Read Results" section of the Run Command tool. For example, if reading in a CSV file, you would simply include the archived file name. Since a CSV file does not have "sheets", the bracketed sheet name is not needed.
I plan to create a simple macro with a user interface that will do the same thing. Once complete, I will post it in the reply section.
Thanks for reading!
When writing to Excel, have you ever had the need to populate a column with values that come from formulas referencing other cells in your data? You’ve probably noticed that when writing to either .xlsx or .xls Excel file formats that your output still appears as a string, rather than a calculated value, like the below:
Each formula cell has the proper cell references and syntax but will need to be interacted with (add or remove a space in the formula and hit enter on your keyboard) to calculate the true formula value.
When expanding on the string you’ll see there’s an apostrophe (‘) preceding the formula and you’ll need to remove each and hit enter on your keyboard for the formula to execute.
In the attached v10.1 workflow, Writing to Excel with Formulas.yxmd, we provide a sample formula expression to build formulas with basic cell indexing (using a record ID field) and the ability to write formulas that can be interpreted by Excel. This is done by leveraging the .csv file format; when excel opens this output file type it automatically recognizes the formula syntax and will interpret it as such – populating each cell with the true formula calculated value:
Suppose that your spreadsheet has multiple sheets with the same structure and you would like to read several sheets into your module at once. In this case, the preferred alternative is to use the Dynamic Input tool.
How do I output to an Excel template file? It is possible to output your data to an existing Excel document that already has modified formats and column names. For example, the below Excel file has existing data in the first 4 rows. If you wanted to add addresses to this file while keeping the first 4 rows, the first step would be to highlight the area you want to write to. If you don’t know the exact length/width of your data, I would recommend going large: Once you have your desired area highlighted, right-click and choose the Define Name… option: A popup box will appear, enter in a name of your choosing and click OK: You also need to make sure that the sheet you are saving to doesn’t contain any spaces in the sheet name. Once verified, save the template and close out: Below is an example of the sample data that will be added to the above template: In Alteryx, use a Input tool to point to the data you would like to use to update the template file: In the Output, you will want to choose the template file, which will cause the below message to appear, choose yes to overwrite: When saving to Excel, the below window will popup, enter the name you used for the range you highlighted in the template file: After clicking OK, the Output configuration area will populate. Change the Output Options to Delete Data & Append: You can now run the module. Once the module is finished, you can open the updated template file, you should see your previously formatted rows/columns plus the new data you wanted to append: If you set a format to the range you named (color, text style, bold, etc), Excel will keep it so that the data you are writing to the file will appear with the specified format.
If you have two or more files, different structure, and you would like to output each file into a separate tabs in an Excel spreadsheet. You could use the table tool to create snippets and the Layout tool to create sections breaks. Bring in your files using the Input tool and connect them to Table tools to create the snippets. Finally, Join them by record position. The Layout Tool properties should look as follows. Select Vertical with Section Breaks for the Orientation setting. The output will show each file in a separate tab:
Multiple files can easily be combined into a single Excel file containing multiple worksheets (or tabs). The format of the files do not need to be the same; they can be completely different.
In the example attached to this article, two files, a Customer .csv file and a Store .xlsx file, are output to a single Excel spreadsheet (.xlsx format) with one tab for Customer and another for Store. These files contain different data elements and have different layouts.
The first step is to create a new field in each dataset (e.g.: 'TabName') and set the expression to the name you want each worksheet to have. In the attached example, 'TabName' is set to 'Customer' and 'Store,' respectively for each dataset.
When you get ready to output your data, use an output tool for each dataset and configure them in the same way. The file format is 'Microsoft Excel (*.xlsx)' and when prompted to select a worksheet for the output, select 'Sheet1'. Check the 'Take File/Table Name' checkbox located at the bottom of the configuration window. Select 'Change File/Table Name' in the dropdown and under 'Field Containing Name or Part of File Name' enter 'TabName'. You probably don't want TabName in your final output, so uncheck the 'Keep Field in Output' checkbox.