This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
While the Join tool is easily one of the most used tools in Alteryx, it can also be one of the most misunderstood. This is even more likely true if a new user hasn’t previously used joins in any other data-manipulating platform or they are joining big tables where they might not be keeping track of the records inside the fields they are joining on.
For most tools that already have “dynamic” in the name, it would be redundant to call them one of the most dynamic tools in the Designer. That’s not the case for Dynamic Input. With basic configuration, the Dynamic Input Tool allows you to specify a template (this can be a file or database table) and input any number of tables that match that template format (shape/schema) by reading in a list of other sources or modifying SQL queries. This is especially useful for periodic data sets, but the use of the tool goes far beyond its basic configuration. To aid in your data blending, we’ve gone ahead and cataloged a handful of uses that make the Dynamic Input Tool so versatile:
The Fuzzy Match Tool provides some pretty amazing flexibility for string joins with inexact values – usually in the case of names, addresses, phone numbers, or zip codes because many of the pre-configured match styles are designed around the formats of those types of string structures. However, taking advantage of the custom match style and carefully configuring the tool specific to human entered keyword strings in your data can also allow you to use the loose string matching feature of the tool to match those values to cleaner dictionary keyword strings. If done properly, it can help you take otherwise unusable strings and, matching by each individual word, recombine your human entered data to a standardized format that can be used in more advanced analyses.
As long as you know where to look, data has all the answers. Sometimes, though, those answers aren’t clear as day. More often than not, they need to be communicated in an effective format - a format that can let the data talk and highlight the important motifs for you. Another favorite of the Reporting Tool Category , the Charting Tool can do just that by adding expressive visuals to any report or presentation. Offering an exhaustive list of charts to choose from (area, stacked area, column, stacked column, bar, stacked bar, line, tornado, pareto , box and whisker, scatter, bubble, polar, radar, pie), the Charting Tool will give you the ability to add descriptive visuals, with legends and even watermarks, to your reporting workflows that will help you find the answers in your data.
Data Integrity refers to the accuracy and consistency of data stored in a database, data warehouse, data mart or other construct, and it is a fundamental component of any analytic workflow. In Alteryx, creating a macro to compare expected values to actual values in your data is quite simple and provides a quality control check before producing a visual report. Let me show you how to build this.
The two inputs represent the actual and expected values in your data. These data streams are passed through a Record ID tool to keep positional integrity and then passed on to the Transpose tool to create two columns. The first column contains the field names and the second column shows the values within each field. This data is then passed on to a join, matching on Record ID and the Name of the field, in order to compare each value. Lastly, if the data does not match from expected to actual, a custom message will appear in the results messages alerting the user where the mismatch happened within the dataset. The image below shows the error message produced if values differ across datasets.
Web scraping, the process of extracting information (usually tabulated) from websites, is an extremely useful approach to still gather web-hosted data that isn’t supplied via APIs. In many cases, if the data you are looking for is stand-alone or captured completely on one page (no need for dynamic API queries), it is even faster than developing direct API connections to collect.
The Directory Tool gives you a data-stream input that contains information about the files and folders (file name; file date; last modified, etc.) for the location of your choice, which you can then use for more complex interactions with the file system. Basically, the Directory Tool could also finally help me track down my keys - not just where I put the keys in the house, but also how long they've been there, and when they were last moved.
The Sample Tool allows you selectively pass patterns, block excerpts, or samples of your records (or groups of records) in your dataset: the first N, last N, skipping the first N, 1 of every N, random 1 in N chance for each record to pass, and first N%. Using these options can come in the clutch pretty often in data preparation – that’s why you’ll find it in our Favorites Category, and for good reason. While a great tool to sample your data sets, you can also use it for:
You know what really stinks? Working with addresses that aren’t standardized or verified. Whether human-input, or one of the many address formatting standards in the U.S., being stuck with an address you can’t either (1) identify or (2) ensure it exists can be a real pain in the… well…
CASS is here to help!
The ConsumerView Matching macro enables users to match their customer file to the Experian ConsumerView data. Starting with customer information such as name and address you can leverage the ConsumerView macro in Alteryx to append a variety of information about your customers such as household segmentation, home purchase price, presence of children in a home, estimated education and income levels, length of residence, and many more!
The Field Info Tool is another one of the gems hidden in the Developer Tool Category – however don’t be intimidated, this is a tool for all of us! The purpose of the Field Info Tool is to give you the information about the fields in your data in a way that you can use down-stream as part of your workflow. There are no settings to configure, so just drop it on your canvas and you’re good to go!
Believe it or not, data can be beautiful. Take your black and white data points and add some color to them in visuals with the suite of tools found in the Reporting Category https://help.alteryx.com/current/index.htm#Getting_Started/AllTools.htm#Report_Presentation_Tools ! If you’re looking to create reports, presentations, images, or simply output data with a bang, you can use the Render Tool https://help.alteryx.com/current/PortfolioComposerRender.htm paired with other Reporting Tools to create HTML files (*.html), Composer files (*.pcxml), PDF documents (*.pdf), RTF documents (*.rtf), Word documents (*.docx), Excel documents (*.xlsx), MHTML files (*.mht), Power Point presentations (*.pptx), PNG images (*.html), and even Zip files (*.zip) – packed with formatting and visual aesthetic that’ll make any data-geek’s mouth water.
When you’re frequently writing and rewriting data to Excel spreadsheets that you use for Excel graphs and charts, it can quickly become a hassle to make and remake your reporting objects to keep them up-to-date so you’re visualizing the most recent data. A best practice to keep the hassle out of the process exists, though! If you keep your plots isolated to their own spreadsheet, referencing cell values in another sheet used to capture your data, you can simply overwrite the source data sheet and your plots will update automatically upon launching Excel. In the example below (attached in the v10.6 workflow Dynamically Update Reporting from Excel Spreadsheets.yxzp) we’ve included the workaround to make your Excel outputs seamless.
The Auto Field Tool : a tool so easy you don’t have to do anything – just put it on your canvas and viola. Automatically optimized data types. If you’re running into data type related issues and errors in your workflows, or just looking to add some speed or reduce the occupied disk space your data is hoarding – look no further than the Preparation Tool Category ’s Auto Field Tool, which reads through all the records of an input and sets the field type to the smallest possible size relative to the data contained within the column.
This article provides step by step instructions for setting up a standard data install. There is already a fantastic article about Network Installs you can check out if you're interested.
A standard data install is pretty straight forward. Plug in the external drive and double click on the DataInstall.exe file to launch the installer.
When the welcome screen comes up, go ahead and click Next:
Read and Accept the license agreement on the next screen and click Next again:
Select the data sets you would like to install. If you want all of them just click the All button on the right. Otherwise, you can select individual datasets by selecting the check box next to them:
Next, choose any previously installed datasets that you would like to remove by selecting them in the tree structure similar to the previous screen. You don't have to choose anything here if you want to keep everything, however, keep in mind that the data bundle is very large and you may not have enough space to keep multiple vintages installed locally.
Finally, browse to the file path you would like to install the data to. The default path will be auto-populated but if you'd like to install it somewhere else just update the path:
Once you set your path, click Finish and sit back and relax while your data installs.
Question I know I can input .csv and other delimited text files - what about text (.txt) files?
Answer Yes - you can input text (.txt) files! The approach described below applies to all delimited text files - and can also input other file types (like XML) as plain text.
Simply point your Input Data Tool to a file of your choice and specify the configuration options below:
The approach uses .csv as the File Format and the \0 (none) delimiter to keep from delimiting the file, inputting it as a single body of text. Since the whole of the file is being input as one string field, you'll have to enter an adequate string length to capture all the file's contents (9999 in the above - you can use an Auto Field Tool to minimize this after inputting). This will usually mean keeping from using the first row of the file as field names, so pay mind to Option 6 (unchecked).
Once the file is input as a single block of text, the Text to Columns Tool can make short work of parsing it into a tabulated format!
The technique shown above can also help to workaround common input errors from delimited files.
Question Can I read in an Excel file located in a zipped archive file from Amazon S3?
Answer Unfortunately, this is not an option within the Amazon S3 Download Tool, as it only allows you to choose between CSV, DBF and YXDB files. However, this is possible within Alteryx with the use of a simple workflow utilizing a three line batch file, the Run Command Tool (master it here), and the AWS Command Line Interface (CLI).
In order to use the CLI, you must first download it and configure its settings. Please visit this page for information on how to do that. Once that is setup, you simply need to setup the batch file and configure the Run Command Tool.
In the first step, you will use a Text Input Tool to write the batch file code. This code will use the CLI to copy the ZIP file from the S3 bucket to a locally accessible drive. Configure the Text Input Tool as follows:
Make sure that line 2 points to where your CLI is installed.
In line 3, replace "alteryxtest" with the name of your bucket, "ExcelTest.zip" with the name of your ZIP file and enter in the correct location to copy the file to.
In the second step, you will use the Run Command Tool to do the following:
Write out the batch file ("Write Source")
Run the batch file created in the previous step ("Run External Program")
Read the file into the workflow ("Read Results")
When entering in the "Read Results" section, your ZIP file will not exist at this point so you cannot simply navigate to and select the file. So, you have two options:
Click on the "Input" button and enter in the full path of where you are copying the ZIP file (found on line 3 of the Text Input tool) along with the file name, a pipe character, and then in brackets, the sheet name. For Example:
Run the workflow once without the "Read Results" section completed in order to copy the ZIP file from the S3 bucket. Then, click on the "Read Results" button and navigate to the ZIP file and choose the Archive file to read it.
This same workflow can be used to read other archived files as well. However, you will have to make slight adjustments to the "Read Results" section of the Run Command tool. For example, if reading in a CSV file, you would simply include the archived file name. Since a CSV file does not have "sheets", the bracketed sheet name is not needed.
I plan to create a simple macro with a user interface that will do the same thing. Once complete, I will post it in the reply section.
Thanks for reading!