Looking to install additional R packages? Here's how! Download the Install R Packages app found in the Predictive District of the Gallery, unzip the .yxzp, and run the app. Provide a comma-delimited list of the packages you'd like to install. Packages are installed in the user's personal R library folder, which is the folder that R searches by default to find available R packages. If installing to a different folder, ensure you have write permissions to that folder. In some cases, due to firewall settings and proxy server use, R's default method for accessing the Internet does not work; in these cases, R's Internet2 option will usually address the issue, and you can select this option in the app. And don't forget to load each package in your R code using library(). Happy Alteryx-ing!
The Sample Tool allows you to selectively pass patterns, block excerpts, or samples of the records (or groups of records) in your dataset: the first N, the last N, skipping the first N, 1 of every N, a random 1 in N chance for each record to pass, and the first N%. These options come in clutch pretty often in data preparation – that’s why you’ll find the tool in our Favorites Category. While a great tool to sample your data sets, you can also use it for:
Scenario: You've been given data for a new project and it contains lots of extra (and unnecessary) rows before you even get to the information you need to work with. Look familiar? For many Alteryx users, this situation is all too common. Luckily, there's a pretty easy way to resolve this issue using the Sample and Dynamic Rename tools! Method: To demonstrate this approach, we'll use some sample data that has extraneous information and space at the top (Rows 1-4) of the spreadsheet in Figure 1 (below). While the information itself might be important, it's going to interfere with our data analysis. What we really want to see is the information in Row 5 as our header name and the information from Row 6 onwards to be our data. Figure 1: The data in rows 1-4, as seen in Excel, should not be included in the data analysis. Rather than manually re-format our dataset, we'll bring it into Alteryx and let the data preparation begin! Using an Input Tool, we'll navigate to the location of our data file. The tool gives us a preview of what to expect when bringing in the data (Figure 2). This format is nowhere near perfect, but we still have a few tricks up our sleeve! Figure 2: The Input Tool shows how the data will be brought into Alteryx. Our heading is not correct, and we still have a few lines of data (in light yellow) to eliminate while keeping the data we want to analyze (in dark yellow). Because the Input Tool has used the first spreadsheet row as field names, the remaining extraneous rows arrive as the first three records, and the header we want arrives as the fourth record. A quick visual assessment confirms that we'll need to skip the first three records (the information in the fourth record will become our field names). We can remove these records using a Sample Tool. In the Sample Tool configuration (Figure 3), we'll opt to "Skip the 1st N Records"; in this case, N will be equal to 3. Figure 3: Set the number of records to skip, or remove, from the top of the dataset. Now that we've removed the first 3 records, we are much closer to the version of the data format we'd like to work with. 
The data we'd like to use as the field names (Number, FirstName and State) are now in the first row of data. We'll use the Dynamic Rename Tool to re-name our fields using the option to "Take Fields from the First Row of Data" (Figure 4). And, voila!! Our data is now ready to use for the next steps of our analyses. Figure 4: After removing unwanted rows of data and re-naming the fields, our data is ready for further analyses. *See the attached sample workflow (v10.5) for an example of this process.
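For readers who think in code, the two steps above (skip the first N records, then take field names from the first remaining record) can be sketched in plain Python. This is only an analogy, not how the Sample or Dynamic Rename tools are implemented; the function name `skip_and_promote` and the placeholder records are hypothetical, with only the field names Number, FirstName and State taken from the article.

```python
def skip_and_promote(records, skip):
    """Drop the first `skip` records, then use the next record as field names."""
    remaining = records[skip:]
    if not remaining:
        return [], []
    header, data = remaining[0], remaining[1:]
    # Pair each value with its field name, one dict per record.
    rows = [dict(zip(header, row)) for row in data]
    return header, rows


# Hypothetical records as they might arrive after the Input Tool:
# three unwanted records, then the real header, then the data.
records = [
    ["Report title", "", ""],
    ["Prepared for analysis", "", ""],
    ["", "", ""],
    ["Number", "FirstName", "State"],
    ["1", "Ana", "CO"],
    ["2", "Ben", "TX"],
]

header, rows = skip_and_promote(records, skip=3)
print(header)
print(rows)
```

Keeping the skip count as a parameter mirrors the N setting in the Sample Tool, so the same sketch works when a file has a different number of leading rows.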
The Fuzzy Match Tool has the ability to match first names against a set of Nicknames to help return better matches. The Nickname table (which can be found at C:\Program Files\Alteryx\bin\RuntimeData\FuzzyMatch\Nicknames) is used as a lookup within the Fuzzy Match tool when you select it as an option. Selecting “Name w/ Nickname” as your Match Style automatically selects the Common Nicknames table, but often users would like to add to this list or even create their own custom table. This article will walk you through how to edit this list, and provide you with some tips and tricks when matching with nicknames.
Creating a Nickname table
The nickname table is installed by default in C:\Program Files\Alteryx\bin\RuntimeData\FuzzyMatch\Nicknames and is saved as an Alteryx database file (.yxdb). We can easily pull this into Alteryx to add additional names, or we can even generate our own table. The .yxdb file contains 2 fields:
GroupName              Name
Full Name goes here    Nickname goes here
William                Bill
Adding Additional Names:
Creating your own file:
Once the file is created, place the .yxdb in the directory above. You should now be able to see multiple tables available from the dropdown within the tool. See the attached v10.6 workflow for an example of the above!
Tips and Tricks when working with Nicknames
Set Generate Keys to “None” when using the “Name w/ Nickname” match style if you have the first name in its own field. If the full name is contained in a single field (John Smith or Smith, John), you will want to select a method to Generate Keys and check the box “Generate Keys for Each Word”. The “Soundex” method of generating keys is generally preferred when working with names.
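To illustrate the idea behind a two-field nickname table, here is a minimal Python sketch that maps each Name to its GroupName and treats two first names as candidates for a match when they resolve to the same group. It is not the Fuzzy Match Tool's actual algorithm. Only the William/Bill pair comes from the article; the other rows and the function names are hypothetical.

```python
# Rows in the same (GroupName, Name) layout as the nickname table.
# Only ("William", "Bill") is from the article; the rest are made-up examples.
NICKNAME_ROWS = [
    ("William", "Bill"),
    ("William", "Will"),
    ("Robert", "Bob"),
]


def build_lookup(rows):
    """Map every nickname (and every group name) to its lower-cased group name."""
    lookup = {}
    for group, name in rows:
        lookup[name.lower()] = group.lower()
        lookup.setdefault(group.lower(), group.lower())
    return lookup


def same_name_group(first, second, lookup):
    """True when both names resolve to the same group; unknown names map to themselves."""
    group_a = lookup.get(first.lower(), first.lower())
    group_b = lookup.get(second.lower(), second.lower())
    return group_a == group_b


lookup = build_lookup(NICKNAME_ROWS)
print(same_name_group("Bill", "William", lookup))
print(same_name_group("Bob", "Bill", lookup))
```

Adding a row to `NICKNAME_ROWS` plays the same role as adding a record to a custom nickname table: the lookup grows, and more name variants resolve to one group.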
A common application in spatial analytics is to visualize and analyze data in multi-ring trade areas. This type of analysis is helpful for analyzing data at incrementally increasing distances from a spatial object. Examples include quantifying the number of customers within a 10, 20, and 30 minute drivetime, analyzing the demographics of a population within certain distances of a location, or visualizing the strength of a cell tower signal with increasing distance from the tower. Depending on the analysis to be done, the multiple Trade Areas may need to be configured as non-overlapping (separate rings) as opposed to nested as concentric circles. Using the Trade Area tool, both types of spatial objects can be created! Creating Multiple Trade Areas (Overlapping) To create multiple Trade Areas as nested concentric circles, select the option to specify the Trade Area radius as a 'Specific Value' and enter the radii for each Trade Area. A polygon output will be created for each specified Trade Area radius. The sample configuration (Figure 1) creates three polygons (5 mile radius, 10 mile radius and 15 mile radius). Polygons are visualized in the order that they are listed. To create a "bullseye", list the radii for the Trade Areas in descending order. Note that the larger Trade Areas include the areas of the smaller Trade Areas (i.e., the 10 mile radius Trade Area includes the same area specified by the 5 mile Trade Area). To avoid "double counting" of area, consider creating non-overlapping trade areas (described below). Figure 1: Create multiple Trade Areas around a point spatial object with the Trade Area tool. List the multiple radii (and specify the unit of miles, kilometers or DriveTime minutes), separating each with a comma. Creating Non-Overlapping Trade Areas: To create multiple non-overlapping Trade Areas ("doughnuts"), select the option to specify the distance ranges as 'Specific Values' and enter the distance ranges for each ring. 
Distance intervals should be specified as ranges (0-5 and 5-10), with each range separated by a comma (Figure 2). As a result, each ring is created as an individual polygon that begins at the distance specified by the minimum of the range and ends at the distance specified by the maximum of the range. Figure 2: Each "doughnut" is specified as a distance range to create non-overlapping spatial objects. The highlighted "doughnut" below represents the second distance range (5-10 miles) from a point spatial object. * For additional information on tool configuration, see the attached workflow (created in v10.6).
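The "double counting" point can be made concrete with a little arithmetic. The sketch below assumes simple circles on a flat plane (drivetime trade areas are not circular, so this applies only to radius-based areas) and compares the summed area of nested circles with the summed area of non-overlapping rings. The 10-15 range extends the article's 0-5 and 5-10 example, and all function names are hypothetical.

```python
import math


def parse_ranges(spec):
    """Parse a comma-separated list of ranges such as '0-5, 5-10' into (min, max) pairs."""
    ranges = []
    for part in spec.split(","):
        low, high = part.strip().split("-")
        ranges.append((float(low), float(high)))
    return ranges


def circle_area(radius):
    return math.pi * radius ** 2


def ring_area(inner, outer):
    """Area of a 'doughnut' between an inner and an outer radius."""
    return circle_area(outer) - circle_area(inner)


# Nested circles: the 10 mile circle contains the 5 mile circle, and so on.
radii = [5, 10, 15]
nested_total = sum(circle_area(r) for r in radii)

# Non-overlapping rings: each piece of ground is counted exactly once.
rings = parse_ranges("0-5, 5-10, 10-15")
ring_total = sum(ring_area(low, high) for low, high in rings)

print(f"nested circles total: {nested_total:.1f} square miles")
print(f"rings total:          {ring_total:.1f} square miles")
print(f"largest circle alone: {circle_area(15):.1f} square miles")
```

The ring total equals the area of the largest circle, while the nested total is larger because the inner areas are counted more than once.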
The Join Tool is the quintessential tool for data blending within Alteryx. As such, it is also one of the most widely used tools. The Join Tool allows you to join data together from two different sources in two different ways: by record position and by specific fields. Selecting by record position attaches the two datasets by matching each record to the record in the same position in the other dataset. Thus record 1 of the left dataset will be in the same row as record 1 of the right dataset in the J output, and so on. If one dataset has more records than the other, the extra records will not be joined and will be placed in their corresponding left or right output (L or R). Joining by specific fields will match records up based on one or more specific fields. This article goes into how that option works in more depth and detail. I highly recommend it as a read, as it covers frequent behaviors of the tool that you might run into.
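The join-by-record-position behavior described above can be sketched in a few lines of Python. This is an analogy only, with a hypothetical function name and made-up records: records are paired by position into a J list, and whatever is left over on either side lands in an L or R list.

```python
def join_by_position(left, right):
    """Pair records by position; return (joined, left_only, right_only)."""
    paired = min(len(left), len(right))
    # zip stops at the shorter input, so only positions present on both sides are joined.
    joined = [l + r for l, r in zip(left, right)]
    return joined, left[paired:], right[paired:]


# Hypothetical inputs: the left side has one more record than the right.
left = [("a",), ("b",), ("c",)]
right = [(1,), (2,)]

j_output, l_output, r_output = join_by_position(left, right)
print("J:", j_output)
print("L:", l_output)
print("R:", r_output)
```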
The Email Tool is a tremendously useful shortcut when it comes time to disseminate your analyses and other results straight from your workflow. However, in order to do so, it must communicate using Simple Mail Transfer Protocol (SMTP), which is often restricted by IT infrastructure and firewalls to protect organizations from spam. As a result, many users excited to try the tool get the direct, yet demoralizing, error below (among others): “SMTP Failed.” That’s why we’ve detailed in this article the steps you can take to investigate what, exactly, is giving you trouble.
Autodetected SMTP server
When using autodetect SMTP, Alteryx communicates with destination mail servers directly, acting as its own mail server. If autodetect isn’t working, this usually implies firewall restrictions, as it is quite common for IT to block SMTP from any machine other than the company's SMTP server. You can check that autodetect’s default port (25) is open using the Telnet instructions in the section below.
Manually-entered SMTP server
First, make sure a colon and port number are appended to the server name. Does this SMTP server use SSL/TLS or require username/password authentication? If so, unless the SMTP server uses Windows authentication, you won’t be able to use the Email Tool, as SSL and TLS are not yet supported through the tool. You can, however, look into other approaches to sending emails in the Designer that can accommodate those requirements. If not, do you have the required ports open in your network firewall? You can check with your IT team for port numbers and statuses, but the default ports you can check yourself are usually 25, 445, 465, 587, and 993. You can check to see if a server and port are open using the Telnet utility; if you have Telnet installed, open the command prompt and simply type telnet. If you do not see the second prompt above then you’ll have to install a third-party Telnet/SSH client like PuTTY. 
From either the Telnet prompt or client, you can open a connection to the server and port to test: In Telnet, connect to the server and port using the command below. In PuTTY, opening the port will look like the following. Either approach will then send you to the following prompt. Then use these commands (<ENTER> is the enter key) to send a test email that, if received, will indicate that your port is open:
HELO <ENTER>
mail from: <ENTER>
rcpt to: <ENTER>
data <ENTER>
subject: <ENTER>
. <ENTER>
To send the email, you must end the body by hitting the enter key (<ENTER>), then period, then enter again (please note that after specifying your subject you must also press the enter key twice – not doing so may cause the message body to be dropped). The test should look something like the below: If the email sends and the recipient at the rcpt to address confirms receipt, then your port is open. Otherwise, you should receive an error that should help your IT team diagnose why the traffic is being blocked. Use the steps above to determine likely causes for the error and you’ll be able to take steps to get the Email Tool unrestricted in your network. Once that happens, bid adieu to whatever repetitious emails you might have to send in the future!
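If Telnet is not available, a rough equivalent of the "is this server and port reachable?" check can be sketched with Python's standard socket module. This only tests whether a TCP connection can be opened; it does not send a test email or prove that the server will accept mail. The function name and the host in the usage note are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical server name, replace with your own SMTP server):
# print(port_is_open("smtp.example.com", 25))
```

A False result points to the same causes discussed above (a blocked port or an unreachable server), and the Telnet session is still the better tool for seeing the server's actual responses.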
Question Why did all of my scheduled workflows fail? What does the error “Logon Failure: unknown user name or bad password” mean? Answer This error occurs because the credentials used for the scheduler are no longer valid and have most likely expired. To fix this, the Run As setting has to be updated on the scheduler machine. Someone with local admin rights to the machine should go to the System Settings (Alteryx > Options > Advanced Options > System Settings), click Next until the “Run the Worker as a Different User” menu is shown, and update those credentials.
The Association Analysis Tool allows you to choose any numerical fields and assesses the level of correlation between those fields. You can use the Pearson product-moment correlation, Spearman rank-order correlation, or Hoeffding's D statistic to perform your analysis. You also have the option of doing an in-depth analysis of your target variable in relation to the other numerical fields. After you’ve run through the tool, you will have two outputs:
The Date Time Now Tool is part of the Input Tool Category and it is actually a macro encapsulating other Alteryx tools. To use it, only one selection needs to be made: an output format. That's it, then you can go about your business. You also have the option to output the time with that date.
Date/Time data can appear in your data in string formats (text fields) or date formats. The DateTime Tool standardizes and formats such data so that it can be used in expressions and functions from the Formula or Filter Tools (e.g. calculating the number of days that have elapsed since a start date). It can also be used to convert dates in datetime format to strings to use for reporting purposes.
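As a point of comparison outside Alteryx, the same two ideas (turning a date string into a real date so it can be used in calculations, and turning a date back into a string for reporting) look like this with Python's standard datetime module. The input format, the sample dates, and the function names are assumptions chosen for illustration.

```python
from datetime import date, datetime


def parse_date(text, fmt="%m/%d/%Y"):
    """Convert a date string in the given format into a date object."""
    return datetime.strptime(text, fmt).date()


def days_elapsed(start_text, end, fmt="%m/%d/%Y"):
    """Number of days between a start date given as a string and an end date."""
    return (end - parse_date(start_text, fmt)).days


# Hypothetical values: a start date stored as text and a fixed reporting date.
start_text = "01/15/2016"
as_of = date(2016, 3, 1)

elapsed = days_elapsed(start_text, as_of)
report_label = as_of.strftime("%Y/%m/%d")  # date back to a string for reporting

print(elapsed)
print(report_label)
```

Subtracting the two values only works once the text has been parsed into a date, which is the same reason a string field needs converting before date functions can use it.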
Let's start with the basics of how to create a report map in Alteryx. To start off, ensure that the layers you want to show in your map have a spatial object field. This can be checked by placing a Select Tool and confirming that there is a column of type 'SpatialObj'.
The Union Tool, the aptly named join category tool with DNA on it, accepts multiple input streams of data and combines them into one unified data stream. Whereas the Join Tool combines datasets horizontally (either by specific fields or by record position), the Union Tool combines datasets vertically, not unlike how two nucleic acid strands are unified to form the double helical DNA.
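The horizontal-versus-vertical distinction can be sketched in Python. The example below stacks several streams of records on top of each other, aligning values by field name and leaving a gap (None) where a stream lacks a field. Aligning by name is an assumption made for this sketch, and the function name and records are hypothetical.

```python
def union_by_name(*streams):
    """Stack records from all streams vertically, aligning values by field name."""
    # Collect field names in first-seen order across every stream.
    fields = []
    for stream in streams:
        for record in stream:
            for name in record:
                if name not in fields:
                    fields.append(name)
    # Emit every record with the full field list; missing fields become None.
    return [
        {name: record.get(name) for name in fields}
        for stream in streams
        for record in stream
    ]


# Hypothetical input streams with partly different fields.
stream_a = [{"ID": 1, "State": "CO"}]
stream_b = [{"ID": 2, "State": "TX", "Sales": 100}]

unified = union_by_name(stream_a, stream_b)
for record in unified:
    print(record)
```

The row count of the result is the sum of the input row counts, which is what "combining vertically" means, in contrast to a join, which widens records instead.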
The RegEx tool is kind of like the Swiss Army Knife of parsing in Alteryx; there are a whole lot of ways you can use it to do things faster or more effectively, but even if you just use the blade it's still immensely useful. Sometimes that's all you need, but if you do take the time to figure out how to use a few other tools in that knife, you'll start to see that there isn't much you can't do with it.
The Multi-Field Formula Tool offers the same functionality as the Formula Tool, but offers the added benefit of applying a function across multiple fields of data all at once. Gone are the days of writing the same function for multiple fields. Say there are four fields with dollar signs ($) that need to be removed. It could be done with a Formula Tool and a function written for each field:
Linear regression is a statistical approach that seeks to model the relationship between a dependent (target) variable and one or more predictor variables. It is one of the oldest forms of regression and its applications throughout history have been endless for modeling all kinds of phenomena. In linear regression, a line of best fit is calculated using the least squares method. This linear equation is then used to calculate projected values for the target variable given a set of new values for the predictor variables.
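For the single-predictor case, the least squares line of best fit has a closed form, and a minimal Python sketch of it follows. The data points are made up for illustration and the function names are hypothetical; a regression with several predictors would need a matrix-based solver rather than these two formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Projected target value for a new predictor value."""
    return slope * x + intercept


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
projected = predict(5, slope, intercept)
print(slope, intercept, projected)
```

The last step is the "projected values" part of the description: once the slope and intercept are known, any new predictor value can be plugged into the line.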
The Message Tool within the Alteryx Designer is your own personal car alarm. This tool can provide you with warnings or errors when your data doesn't meet user-specified criteria, or it can be set up to tell you when data does not match. The Message Tool can be set up to evaluate records before, during, and after they have passed through the tool itself. This makes it useful for evaluating your dataset at different parts of your workflow.
Data blending, transformation and cleansing... oh my! Whether you're looking to apply a mathematical formula to your numeric data, perform string operations on your text fields (like removing unwanted characters), or aggregate your spatial data (among many other things!), the Formula Tool is the place to start. With the examples provided below, you should be on your way to harnessing the many functions of the Formula Tool:
Oftentimes in data preparation, the need for order in your records will arise. When that situation occurs, the Sort Tool has your back. It’s just that sort of tool. Effortlessly arranging your records, be it in alphabetical, numeric, or chronological order, is not quite a mind-numbingly complex operation, but it has ample utility. Sorting your records upstream of many tools can even optimize processing time. The fairly simple use cases below are techniques that frequently pop up in the data blending trenches:
There’s a lot going on in the world of analytics. Endless data stores and insight are at the other end of an internet connection and, as analysts, we’re always in on the action. When you're in the thick of the fray with data whizzing by at lightning speed, being equipped with the right tools is a must. Like you, Alteryx also likes to live dangerously, and we’re always ready for action.