Alteryx Knowledge Base

Definitive answers from Designer experts.
This article describes and explains the outputs of the Decision Tree Tool.
View full article
Welcome to the closing chapter of our voyage through the Pre-Predictive series! This has been a four-part journey introducing you to the thrilling world of data investigation. This section covers the plotting tools included in the Data Investigation Toolbox.
View full article
Question: What are some good resources to help figure out how to estimate a model, know whether results are good or bad, and understand regression reports?

Answer: Alteryx focuses on how the tools function in Designer. The help pages for each tool explain how to use the tools, with reference links to assist in understanding the methodology. In addition, here are links to some outside and Alteryx resources:

Outside Resources
* Online courses (a mix of recorded presentations and weekly practice exercises); many are free, with the option of paying a fee for a certificate: https://www.coursera.org/courses?query=statistics
* Book (free online, with video): https://www.openintro.org/stat/textbook.php
* Websites (with extensive links): www.analyticsvidhya.com, www.datasciencecentral.com

Alteryx Resources
* Alteryx has a series of four recorded webinars covering the predictive tools: http://www.alteryx.com/virtual-training
* Community!
View full article
With the release of 10.6 came awesome new features, and an upgrade in the underlying R version (from 3.1.3, code-named "Smooth Sidewalk", to 3.2.3, code-named "Wooden Christmas-Tree"). Using an incompatible R version will cause errors in your R macros.

Simply make sure that your Predictive Tools download is the version compatible with your Alteryx Designer version: users on 10.5 should continue to use the R 3.1.3 version.

When using Alteryx with Microsoft Revolution R Enterprise, a separate Predictive Tools install is required. For details, see the Alteryx and Revolution Analytics Integration Guide. And remember to use the non-Admin Predictive version with non-Admin versions of Alteryx Designer.

To install Predictive Tools for Alteryx 10.0, go to Previous Releases. For Alteryx 9.5, within Designer, go to Help > Install Predictive Tools.

Happy Alteryx-ing!
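If you're unsure which R version your Predictive Tools install is actually running, one quick check is to drop an R tool on the canvas and write the version string back to Alteryx. This is a minimal sketch using base R's R.version.string and the write.Alteryx function shown elsewhere in these articles:

# Report the R version in use, e.g. "R version 3.2.3 (2015-12-10)"
write.Alteryx(data.frame(r_version = R.version.string), nOutput = 1)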
View full article
An explanation of stochastic processes, pseudorandom number generators, and their existence in Alteryx.
View full article
We love helping users be successful with Alteryx, and this means providing a ton of great resources for getting started, learning more, and keeping you up to date with all the amazing stuff we're doing here at Alteryx… and the most compelling is Predictive!

Check out the Predictive District on the Gallery. There are great macros, apps, and sample workflows to demonstrate some nifty new tools. This post by DrDan on the Analytics Blog gives an overview of what's currently available – stay tuned for additions!

One of my favorites is the Predictive Analytics Starter Kit Volume 1. It enables you to learn the fundamentals of key predictive models through an interactive guided experience. Examples include Linear Regression, Logistic Regression, and AB Testing; each demonstrates the steps necessary to develop the dataset needed for analysis, and then how to actually build these predictive models yourself.

With v10.6, we introduced the Prescriptive Tool Category, comprising the Optimization and Simulation tools, to assist with determining the best course of action or outcome for a particular situation or set of scenarios. The Engine Works Blog has an introduction to this toolset, plus an extensive use case demonstration. If you need more Optimization and Simulation action, there are several sample workflows, including Fantasy Sports Lineups (hey, sports fans – blog post here!), a mixing problem, workforce scheduling, and more!

Speaking of use cases, the software itself contains a plethora of predictive sample workflows, and the installed Starter Kits show up here too: Help > Sample Workflows > Predictive Analytics. Of course, don't forget the Predictive Analytics help pages for overviews and configuration tips.

Visit our Product Training page for On-Demand and Virtual webinars on everything Predictive – regression modelling, cluster analysis, time series… As always, please begin with Data Prep and Investigation! Can I mention the Field Summary Tool enough times?

Want to show off the interactive visualizations from the models you've built? This Knowledge Base post shows you how. Another Engine Works post outlines how to build your own Custom Interactive Visualizations (Part 1 and counting…).

For the most in-depth, resource-rich training on leveraging predictive analytics to answer your business questions, consider the Udacity Predictive Analytics for Business NanoDegree. It consists of seven courses focused on selecting the right methodology, data preparation, and data visualization, as well as four courses that will equip you to use predictive analytics to answer your business problems.

But really, it all starts with the Community. Cruise the Knowledge Base posts, search for Predictive or other favorite keywords, follow the blogs… and for the love of Ned, just play with the software! It's how we learn :)

Happy Alteryx-ing!
View full article
How to save your predictive model.
View full article
If you are building a predictive model, inevitably you will want to analyze the effect that your independent variables have on your dependent variable. This article is meant to shed some light on the Alteryx-specific options for this type of analysis!
View full article
A common task that analysts run into (and a good practice when analyzing data) is determining whether the means of two sampled groups are significantly different. When this question arises, the Test of Means tool is right for you! To demonstrate how to configure this tool and how to interpret the results, a workflow has been attached. The attached workflow (v11.7) compares the amount of money that customers spent across different regions in the US. The Dollars_Spent field identifies the amount of money an individual spent, and the Region field identifies the region in which the individual resides (NORTH, SOUTH, EAST, WEST).
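For intuition, the same kind of comparison can be sketched in plain R with a two-sample t-test on two of the regions. This is a minimal sketch only: the field names Dollars_Spent and Region come from the example above, but the data here is made up rather than taken from the attached workflow.

# Made-up spending data for two regions
set.seed(1)
spend <- data.frame(
  Region = rep(c("NORTH", "SOUTH"), each = 50),
  Dollars_Spent = c(rnorm(50, mean = 120, sd = 20),
                    rnorm(50, mean = 110, sd = 20))
)
# Two-sample t-test: is mean spending significantly different between the regions?
t.test(Dollars_Spent ~ Region, data = spend)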
View full article
Welcome to Part 2 of the Pre-Predictive series! After a strong start but long hiatus, we will be resuming our tour of the Data Investigation Tools. This section will cover the Frequency Table, Contingency Table and Distribution Analysis Tools.
View full article
Regression analysis is widely used for prediction and forecasting. Alteryx customers use these statistical tools to understand risk, fraud, customer retention and pricing, among many other business needs.
View full article
Impute missing data with dynamic values other than the field's mean, median or mode.
View full article
The TS Filler tool takes a stream of time series data and fills in any gaps in the series.
View full article
We had a fun question in the Solutions Center at Inspire this year. A customer showed us the sample workflow for network analysis and asked whether it is possible to create one that has a picture for each icon. We've provided a sample workflow (attached) that you can use as a reference to add as many pictures as needed. Please note that the field names are case sensitive. Happy Visualizing!
View full article
My code runs in R, but not in the R Tool?
View full article
With Alteryx, the power to blend, clean, and perform advanced analytics on disparate data is as easy as dragging and dropping tools with the click of a mouse. The release of Alteryx Designer 10.0 expanded our predictive analytics tools to include the MB Affinity, Network Analysis, In-DB Linear Regression, and In-DB Logistic Regression tools. Alteryx also has the flexibility to add R packages not integrated with our robust collection of predictive tools. One of our clients asked where they can find exponential (non-linear) regression. The recommended R package for this type of analysis is "nlstools". For more information on this R package, see: https://cran.r-project.org/web/packages/nlstools/index.html   Also, please find the link to the Alteryx Gallery app below for reference on how to install additional R packages. If you need help with the process, please let us know and we can get our R experts in Customer Support to assist with the setup. Just email us at support@alteryx.com.   https://gallery.alteryx.com/#!app/Install-R-Packages/57bb2a58a248970b4472c2e6   Tony Moses, Client Services Representative
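As a rough illustration of what this looks like in R code (the data and starting values below are invented, not taken from the nlstools documentation), an exponential model is typically fit with base R's nls(), with nlstools then supplying diagnostics such as confint2():

# Made-up data following an exponential trend
set.seed(7)
df <- data.frame(x = 1:50)
df$y <- 2.5 * exp(0.08 * df$x) + rnorm(50, sd = 2)

# Fit y = a * exp(b * x) with base R's nls(); starting values are rough guesses
fit <- nls(y ~ a * exp(b * x), data = df, start = list(a = 1, b = 0.1))
summary(fit)

library(nlstools)   # assumes the package has been installed (see the Gallery app above)
confint2(fit)       # approximate confidence intervals for the parameters a and b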
View full article
This post is part of the "Guide to Creating Your Own R-Based Macro" series.

In the case of this macro, polishing involves several elements: adding error checking, adding a reporting capability to the macro, documenting the workflow of the macro, making the connections for the interface tools "wireless", and providing the macro with its own icon.

In terms of error checking, a lot of what typically would need to be addressed is handled by Alteryx's interface tools, which limit user input to only appropriate data types. There are two other possible user input errors for this macro: the user may neglect to select any potential predictor fields, or may not select any of the three entropy measures. Adding error checking involves adding Error Message tools and connecting them to the appropriate interface tools (in this case the List Box tool used to select predictors, and the three Check Box tools used to select importance measures). The Error Message tools determine whether predictors and/or predictor importance measures have been selected, and return an error message if one or both have not been provided by the user. Examining the Error Message tools in the macro accompanying the introductory post in this series is the best way to see how this is done.

Adding a report requires both some additional lines of code in the R tool and some additional tools on the macro's canvas. One construct that is very common in the R code used in the predictive macros packaged with Alteryx is the use of what are really key-value pair tables. A common one is often labeled grp_out in the R code, and contains the fields grp (the labels) and out (the values); it is used to bundle report elements together in a way that allows them to be easily sent from R to Alteryx, and easily manipulated within Alteryx to create a report. To assist in accomplishing these objectives, there are R "helper functions" included in the AlteryxRDataX package (an open source R package that is part of the Alteryx predictive installer) to quickly format data in a convenient way. In addition, there is a set of "supporting macros" that help format the data coming from R within Alteryx. Often these tools help address differences in the "dimensionality" of outputs. In this case, we want to include in the report the name of the target field that is the focus of the analysis, which is only a single data item. In contrast, the number of reported measures depends on both the set of potential predictors specified by the user (which needs to be one or more) and the set of measures to report (which can range from one to three).

Ultimately, R data frames (or certain types of lists) are passed as tables to Alteryx, so all the data elements need to have the same number of fields when passed as a single data frame/table. The use of the grp_out table allows this to be accomplished. To make things more concrete, the target field name is passed in the first row of the R data frame, with its label, "target field", contained in the "grp" field and the actual name contained in the "out" field. The header row and the data rows of the table require a bit more processing. Each row of values is converted into a string, which consists of a (numerically rounded) set of table values in a pipe ("|") delimited string. There is an R helper function to accomplish this, named matrixPipeDelim.
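Purely for intuition, here is a minimal sketch of what a pipe-delimiting helper along these lines might do; the real matrixPipeDelim shipped in the AlteryxRDataX package may differ in its rounding behavior and signature.

# Illustrative stand-in only: round each value and collapse every matrix row
# into a single "|"-delimited string
matrixPipeDelim_sketch <- function(m, digits = 4) {
  apply(round(m, digits), 1, paste, collapse = "|")
}

# Example: a 2 x 3 matrix becomes two strings such as "0.1234|0.5678|0.9012"
matrixPipeDelim_sketch(matrix(runif(6), nrow = 2))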
The pipe-delimited string is the value of "out" for each row of the table, and the label contained in the "grp" field is "table header" for the table header and "table data" for each row (potential predictor) in the data. The code used in the macro's R tool to create the grp_out table is given below, and shows the use of the matrixPipeDelim function as part of a character string manipulation operation:

## Create a grp_out key-value pair table for creating a report
## using Alteryx's reporting tools
# The grp (key) field
grp <- c("target field", "table header", rep("table data", length(names(the_data)) - 1))
# The out (value) field
out <- paste(as.character(the_output[[1]]), matrixPipeDelim(as.matrix(the_output[-1])), sep = "|")
out <- c(the_target, paste(names(the_output), collapse = "|"), out)
grp_out <- data.frame(grp, out)
write.Alteryx(grp_out, nOutput = 2)

The portion of the macro that creates the report is shown in Figure 1. The Filter tool makes use of the "grp" field to split the target name from the table of importance weight measures. The name of the target variable is sent to a Report Text tool to create a report title, while the table of importance weight measures is sent to the Pipe to Table Rows supporting macro to convert the pipe-delimited strings into an actual table. The Pipe to Table Rows macro is located in the directory C:\Program Files\Alteryx\bin\RuntimeData\Macros in a standard Alteryx installation.

Figure 1: The reporting portion of the macro

There are three ways comments can be inserted into an Alteryx macro to document the underlying process: the Text Comment tool, the Tool Container tool, and the annotation capability of standard Alteryx tools. The ways in which these tools are used reflect personal taste to some extent. Personally, I'm inclined to make use of the annotation capability of Alteryx tools, since I can restructure a workflow without having to move a number of Text Comment tools around as well. Other people make heavy use of Text Comment tools. To get to the annotation panel of a standard Alteryx tool, press the "pencil" icon in the tool's configuration window, as shown in Figure 2.

Figure 2: The annotation panel of an Alteryx tool

By right-clicking on the input and output nodes of the interface tools, a context menu appears that allows the user to make the connection to or from that tool wireless.

The final bit of polishing is giving the macro a new icon. By default, the icon a new macro receives is a blue circle. You can either use this or other generic icons provided with Alteryx, create a completely new icon from scratch, or use clip art images from the Internet or other sources. Ideally, the icon has some connection to the tool. In this case, we are creating a macro to provide entropy-based measures, so something that conveys entropy seems like a good choice. An image that always conveyed entropy to me is the artist Salvador Dali's melting pocket watches from the painting The Persistence of Memory. After a bit of searching, I found an image of a melting pocket watch that works well as an icon. To use the icon, go to the Interface Designer for the macro and press the wrench icon; from there, you can change the macro's icon, as shown in Figure 3.

Figure 3: Using the "Wrench" panel of an Alteryx macro to select an icon

The workflow of the completed macro is shown in Figure 4.

Figure 4: The completed Entropy Importance macro

It took me about an hour and 20 minutes to create the base version of the macro, and a little over an hour to polish it, for a total of around two and a half hours from start to finish.
View full article
This post is part of the "Guide to Creating Your Own R-Based Macro" series.

The workflow is now ready to be converted into a macro. To do this, click on the canvas, go to the Workflow tab of the Workflow - Configuration window, and select the Macro radio button to convert the workflow into a Standard Macro. At this point you will want to use the menu option File > Save As... to save the file to .yxmc format. My original workflow was saved to the file Entropy_Importance.yxmd, and I saved the macro to the file Entropy_Importance.yxmc.

We are now ready to add the user interface elements to the macro, and make several other changes. Figure 1 shows the final version of the basic macro.

Figure 1: The basic macro

As the figure suggests, the major changes are the addition of a number of interface tools, the conversion of the Text Input tool to a Macro Input tool, and the replacement of the original workflow's Browse tool with a Macro Output tool. All of these tools fall under the Interface tool group.

I won't go into great detail, but I do want to give an overview of what is going on with the interface tools in the macro. Starting from the top left of the canvas, the first interface tool is a Drop Down tool that allows the user to select the target variable for the analysis. It is configured to allow only string-type fields (which are converted to categorical variables in R) to be selected. The Action tool that it connects to modifies the upper Select tool to filter out all fields except the target field.

Moving to the right, the List Box tool allows the user to select a set of predictors. Within the tool's configuration, only numeric variables (various integer, float, fixed decimal, and double types) are allowed to appear in the user interface. The Action tool associated with it modifies the lower Select tool based on the user's selection.

The final three tools as you move to the right in the canvas are Check Box tools, which, if checked, indicate that a particular measure will be calculated. As you may have guessed, the macro will provide not only the information gain measure, but also the option of including the gain ratio and symmetrical uncertainty entropy-based measures.

Given the above, the code within the R tool (provided below) has gone through some alterations to allow for this additional functionality.
In addition, the code example illustrates how the user's input to the Check Box tools can be used as "question constants" in an R tool's code:

# Load the FSelector package
suppressWarnings(library(FSelector))
# Read in the data from Alteryx into R
the_data <- read.Alteryx("#1")
# Create a string of the potential predictors separated by plus signs
the_preds <- paste(names(the_data)[-1], collapse = " + ")
# Get the name of the target field
the_target <- names(the_data)[1]
# Create a formula expression from the names of the target and predictors
the_form <- as.formula(paste(the_target, the_preds, sep = " ~ "))
# Initialize the output data frame
the_output <- data.frame(Field = names(the_data[-1]))
col_names <- "Field"
# Calculate the entropy-based measure(s) selected by the user
# via the "question constants"
if ('%Question.info.gain%' == "True") {
    out <- information.gain(the_form, the_data)
    the_output <- cbind(the_output, out[[1]])
    col_names <- c(col_names, "Information Gain")
}
if ('%Question.gain.ratio%' == "True") {
    out <- gain.ratio(the_form, the_data)
    the_output <- cbind(the_output, out[[1]])
    col_names <- c(col_names, "Gain Ratio")
}
if ('%Question.symm.uncertainty%' == "True") {
    out <- symmetrical.uncertainty(the_form, the_data)
    the_output <- cbind(the_output, out[[1]])
    col_names <- c(col_names, "Symmetrical Uncertainty")
}
# Prepare the final output
names(the_output) <- col_names
# Output the results
write.Alteryx(the_output)

It is now time to test whether the basic macro works as expected in a workflow using different data. For the test workflow I decided to work with the Bank Marketing dataset from the UC Irvine Machine Learning Archive. The full dataset was used, which comes in CSV format, so the Auto Field tool was used to set appropriate field types. In addition, one of the predictor fields (pdays) is the number of days since a prospective customer was previously contacted with an offer to invest in a term savings account; those who were never contacted for this product were given a code of -1. Given this, a Filter tool is used to separate the data into those who have, and those who have not, received a past telemarketing offer for a term savings account. Finally, the basic macro was inserted into the workflow twice (by right-clicking on the canvas and inserting the macro twice) and used against both of the data streams coming from the Filter tool, with a Browse tool attached to each. The completed version of the test workflow is shown in Figure 2.

Figure 2: Test workflow

Frequently, things will work as expected in the workflow contained within the macro but not when the macro is used in a new workflow, so a test workflow like this should allow you to find any major errors in your macro.
View full article
This post is part of the "Guide to Creating Your Own R-Based Macro" series.

Now that we have the needed R packages installed, we can use them in an Alteryx workflow. The real purpose of this workflow is to begin to put together the macro itself. As a result, there will be some minor differences between this workflow and the one you would likely create if you didn't plan on using it as the basis for developing a macro. The starting workflow of the macro is shown in Figure 1.

Figure 1: The Initial Workflow

The data used in this macro (contained in a Text Input tool) is Fisher's well-known Iris dataset. The data consists of the length and width of both the petals and sepals of individuals from three species of the Iris flower family. In this instance we want to know how important these four measures are in determining the species to which a particular flower belongs. While this dataset is pretty far afield from a business application, it is a nice dataset to work with for creating this macro since it is small (150 rows and five fields) and represents the correct case (a categorical target, species, and numeric predictors, the length and width measurements).

The basic workflow consists of only six tools. A Text Input tool contains the Iris data, which feeds into two Select tools. The upper of the Select tools selects out the target field (Species), while the second selects the potential predictor fields to be examined. The downstream Join tool brings the data back together so that the first column contains the target and the subsequent columns contain the potential predictors to be examined.

This combination of three tools would be somewhat out of place in a standard (non-macro) workflow. In general, column position does not matter, and even if it did, a single Select tool could be used to alter column position. However, in this case we will alter the position of columns based on a user's choices in the final macro's user interface, and the use of two Select tools allows us to accomplish this task.

The data flowing into the R tool now consists of only the target field (the first column) and the selected numeric predictors in the remaining columns. The R tool contains the following lines of code:

# Load the FSelector package
suppressWarnings(library(FSelector))
# Read in the data from Alteryx into R
the_data <- read.Alteryx("#1")
# Create a string of the potential predictors separated by plus signs
the_preds <- paste(names(the_data)[-1], collapse = " + ")
# Get the name of the target field
the_target <- names(the_data)[1]
# Create a formula expression from the names of the target and predictors
the_form <- as.formula(paste(the_target, the_preds, sep = " ~ "))
# Get the information gain measures
out1 <- information.gain(the_form, the_data)
# Prepare the results for output
out <- data.frame(a = names(the_data)[-1], b = out1[[1]])
names(out) <- c("Field", "Information Gain")
# Output the results
write.Alteryx(out)

The R code is fairly straightforward, with the possible exception of how the locations of values are indexed. For example, the code snippet names(the_data)[-1] takes all the provided field names except the first one (the [-1] index), which is the target field. The code snippet out1[[1]] obtains the first (and only) column of the data frame returned by the information.gain R function.

The contents of the Browse tool (the sixth and last tool in the workflow) are the results of the analysis.
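As a quick, standalone illustration of that indexing (plain R, run outside the R tool; the small demo data frame below simply stands in for the_data, with the target first and predictors after):

demo <- data.frame(Species = c("setosa", "versicolor"),
                   Petal.Length = c(1.4, 4.7),
                   Petal.Width = c(0.2, 1.4))
names(demo)[-1]   # drops the first name: "Petal.Length" "Petal.Width"
demo[[1]]         # extracts the first column as a vector: "setosa" "versicolor"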
View full article
This post is part of the "Guide to Creating Your Own R-Based Macro" series.

There are two major repositories of R packages, CRAN (the Comprehensive R Archive Network) and Bioconductor. The Bioconductor repository has over 1000 packages, which are focused specifically on bioinformatics-related applications, while CRAN does not focus on a specific application area and has over 6000 contributed packages. In general, the functionality you will want to bring to Alteryx via R will be from a package in the CRAN repository.

With over 6000 packages, searching for a CRAN package with specific functionality by browsing through the contents of the CRAN repository is not very practical. The two ways I recommend finding a relevant package are looking at the appropriate "Task View" (a description of available packages that address a particular application area) or doing a web search on the feature you are hoping to obtain, with "R" added to the search string.

For this macro, I used the web search approach and entered the search string "entropy information gain R" into my preferred search engine. The first hit was a link to the CRAN package FSelector. Examining the documentation for this package revealed that it delivers the desired functionality through a function called information.gain, and that this is one of three entropy-based measures the package provides (the other two are the gain ratio and symmetrical uncertainty). All three of these functions take as arguments a formula of the form

target ~ predictor1 + predictor2 + ... + predictorN

and an R data frame (R's equivalent of a data table) containing the data. The output of each of these functions is a data frame that contains a single column with the value of the selected measure and one row for each of the predictor fields. The predictor field names are contained in the row.names metadata element of the data frame. We will make use of this information in creating an Alteryx macro to wrap this functionality.

The FSelector package provides exactly what we need, so it is time to install it. There are a number of ways to install an R package so that it can be used with Alteryx. The one complication that can arise is on machines where multiple copies of R are installed. For users not using Microsoft R, the Alteryx predictive installer places the R executables within the Alteryx installation (usually C:\Program Files\Alteryx). To make sure you are installing packages into the version of R that Alteryx is using, open a command prompt and enter the command

"C:\Program Files\Alteryx\R-3.3.2\bin\x64\Rgui.exe"

making sure to use the quotes. This will bring up the R console program. In the console window, type the command

install.packages("FSelector")

This will bring up a GUI asking you to select a CRAN mirror from which to download the package, along with its dependencies (there are several). Select a mirror that is geographically close to you for best performance. In addition, the FSelector package makes use of several other packages that call Java, so you also need to have a JVM installed on your computer to create and use this macro (I'd recommend the Windows x64 Offline version available here).

Once R is done downloading and installing the packages, make sure that FSelector and all its dependencies were correctly installed. To do this, enter the following command in the R console:

library(FSelector)

This will cause R to load the FSelector package.
If you did get an error message that some packages were not available (one possibility is the RWekajars package), install them using the install.packages command in the R Console. Once the needed packages have been installed, you can exit the R Console program.
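If you want to confirm where packages are being installed, or to add a missing dependency by hand, something along these lines can help. This is a minimal sketch run in the same Rgui.exe console as above; RWekajars is just the example dependency mentioned earlier.

.libPaths()                     # shows the library paths this R installation uses
install.packages("RWekajars")   # install a missing dependency by name
library(FSelector)              # re-check that the package now loads cleanly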
View full article