As most of us can agree, predictive models can be extremely useful. Predictive models can help companies allocate their limited marketing budget to the most profitable group of customers, help non-profit organizations find the donors most willing to give to their cause, or even estimate the probability that a student will be admitted to a given school. A well-designed predictive model can help us make smart and cost-effective business decisions.
You may have run across this error when using the HTML plugin predictive tools (Linear Regression, Logistic Regression, Decision Tree):
Logistic Regression: Error in searchDir(dbDir, lang) : Logistic Regression: Expecting a single string value: [type=NULL; extent=0]
In 2018.2, this can happen when you previously had an Admin version of Designer installed but have since uninstalled it. Once you install the 2018.2 non-Admin version with Predictive tools, these errors occur.
Help is on the way! (In the form of suggestions and an upcoming stable release.) You have several options. First, you can install an Admin version of Designer concurrently - 11.8, 2018.1, 2018.2, etc.
Last-ditch effort: delete registry keys. This is not recommended - only delete keys if you cannot install a current version, or cannot wait until the next stable update.
Step 0) Save your license key somewhere easy to find: Options > Manage Licenses
Step 1) Open the Registry Editor (type regedit into your Windows search bar) and delete the following directory:
Now, go predict stuff! Happy Alteryx-ing.
Logistic Regression is different from other types of regression because it creates predictions within a range of 0-1 and it does not assume that the predictor variables have a constant marginal effect on the target variable - making it applicable to many dichotomous problems including: estimating the probability that a student will graduate, the probability that a voter will vote for a specific candidate, or the probability that someone will respond to a marketing campaign.
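To make the two properties above concrete, here is a minimal Python sketch. The intercept and slope are hypothetical values picked for illustration, not coefficients from any fitted Alteryx model. It shows that the logistic function keeps predictions between 0 and 1, and that a one-unit change in the predictor moves the predicted probability by different amounts depending on where you start.

```python
import math


def logistic(z):
    """Map a real-valued score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = -4.0
SLOPE = 0.8


def predict_probability(x):
    """Predicted probability of the positive class for predictor value x."""
    return logistic(INTERCEPT + SLOPE * x)


def marginal_effect(x, step=1.0):
    """Change in predicted probability when x increases by `step`."""
    return predict_probability(x + step) - predict_probability(x)


if __name__ == "__main__":
    for x in (0, 5, 10):
        print(x, round(predict_probability(x), 3), round(marginal_effect(x), 3))
```

The same one-unit increase has the largest effect near the middle of the curve and a much smaller one near either end, which is what "no constant marginal effect" means in practice.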
A common concern in predictive modeling is whether a model has been overfit. In statistics, overfitting refers to the phenomenon in which an analytical model corresponds too closely (or exactly) to a specific data set, and therefore may fail when applied to additional data or future observations. One common method that can be used to mitigate overfitting is regularization. Regularization places controls on how large the coefficients of the predictor variables grow. In Alteryx, the option of implementing regularized regression is available for the Linear Regression and Logistic Regression Tools.
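As a rough illustration of how regularization limits coefficient size, the sketch below uses the closed-form ridge (L2) solution for a single predictor with no intercept. This is a textbook simplification under stated assumptions, not a description of how the Alteryx tools implement regularization. The function name and the sample data are made up for the example.

```python
def ridge_slope(xs, ys, penalty):
    """Ridge slope for one predictor, no intercept.

    Minimizes sum((y - b*x)^2) + penalty * b^2, which gives
    b = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


if __name__ == "__main__":
    # Made-up data roughly following y = 2x.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.1, 3.9, 6.2, 7.8]
    for penalty in (0.0, 10.0, 100.0):
        print(penalty, round(ridge_slope(xs, ys, penalty), 4))
```

With a penalty of zero this is ordinary least squares. As the penalty grows, the slope is pulled toward zero, which is the "control on how large the coefficients grow" described above.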
The subtitle to this article should be a short novel on configuring the Decision Tree Tool in Alteryx. The initial configuration of the tool is very simple, but if you choose to customize the configuration of the tool at all, it can get complicated quickly. In this article, I am focusing on the configuration of the Tool. However, because it is a Tool Mastery, I am covering everything within the configuration of the tool.
Overview: I wrote this as a short example of how one might use Alteryx to write a further Alteryx module to do complicated or repetitive tasks dynamically that would be difficult to do through the front end.
This module will automatically produce another Alteryx module that will do frequency statistics for a file. For files with lots of columns, this should save the time spent manually adding a Summarize tool for each column. It also saves transposing the file (which for large files is very slow to run). Instructions:
Change the input to that module to whichever file you like (or use Testing.yxmd which is provided)
Run it – this will create the Result.yxmd module
Open Result.yxmd – and change the input in the module to be the same file you used in step 2
Change the output if necessary (it defaults to an Alteryx database)
At the moment it does deal with &’s and single quotes in files, but won’t do anything clever like do stats on substrings for long fields.
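For readers who want to see what "frequency statistics for a file" looks like outside Alteryx, here is a small Python sketch that counts how often each value appears in each column. It is not the generated Result.yxmd module, only an illustration of the output it aims for. The sample CSV text and function names are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def column_frequencies(rows):
    """Count occurrences of each value per column.

    `rows` is any iterable of dicts mapping column name to value.
    """
    counts = defaultdict(Counter)
    for row in rows:
        for column, value in row.items():
            counts[column][value] += 1
    return counts


def frequencies_from_csv_text(text):
    """Parse CSV text with a header row and return per-column counts."""
    return column_frequencies(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data.
SAMPLE_CSV = "city,plan\nLeeds,basic\nYork,basic\nLeeds,premium\n"

if __name__ == "__main__":
    for column, counter in frequencies_from_csv_text(SAMPLE_CSV).items():
        print(column, counter.most_common())
```

Like the generated module, this handles every column in one pass without transposing the data first.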
I hope this inspires people to use this technique and build on the module I’ve built.
The humble histogram is something many people are first exposed to in grade school. Histograms are a type of bar graph that display the distribution of continuous numerical data. Histograms are sometimes confused with bar charts, which are plots of categorical variables.
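The difference from a bar chart comes down to binning: a histogram first cuts a continuous range into intervals and then counts the values in each. A minimal sketch of that step, assuming equal-width bins (the function name and sample values are made up for illustration):

```python
def histogram_counts(values, bin_count):
    """Count values falling into `bin_count` equal-width bins.

    Bins span min(values) to max(values); the maximum value is
    placed in the last bin.
    """
    lo, hi = min(values), max(values)
    counts = [0] * bin_count
    if lo == hi:
        # All values identical: everything lands in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bin_count
    for v in values:
        index = min(int((v - lo) / width), bin_count - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    data = [1.0, 1.5, 2.0, 2.5, 3.0, 9.0, 10.0]
    print(histogram_counts(data, 3))
```

Each count becomes the height of one bar, and the bars sit side by side because the underlying axis is continuous.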
Time series forecasting is using a model to predict future values based on previously observed values. In a time series forecast, the prediction is based on history and we are assuming the future will resemble the past. We project current trends using existing data.
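The simplest version of "projecting current trends" is to fit a straight line through the history and extend it. The sketch below does that with ordinary least squares on the time index. It is only an illustration of the idea, assuming a linear trend and evenly spaced observations, and is not the method used by any particular Alteryx time series tool.

```python
def linear_trend_forecast(history, steps_ahead):
    """Fit y = intercept + slope * t to the history and extrapolate.

    `history` is a list of evenly spaced observations (t = 0, 1, 2, ...).
    Returns a list of `steps_ahead` forecast values.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    slope = numerator / denominator
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n - 1 + k) for k in range(1, steps_ahead + 1)]


if __name__ == "__main__":
    print(linear_trend_forecast([10, 12, 14, 16], 2))
```

The forecast is only as good as the assumption that the future resembles the past: if the trend changes, a line fitted to the history will keep pointing in the old direction.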
The Field Summary Tool analyzes data and creates a summary report containing descriptive statistics of data in selected columns. It’s a great tool to use when you want to make sure your data is structured correctly before running any further analysis, most notably with the suite of models that can be generated with the Predictive Tools.
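As a rough idea of the kind of check such a summary enables, here is a small Python sketch that computes a few descriptive statistics for one numeric column, including a count of missing values. The set of statistics and the function name are assumptions for illustration, not the actual contents of the Field Summary report.

```python
import statistics


def summarize_field(values):
    """Return basic descriptive statistics for a numeric column.

    Missing entries are represented as None and reported separately.
    """
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present:
        summary.update(
            min=min(present),
            max=max(present),
            mean=statistics.mean(present),
            median=statistics.median(present),
        )
    return summary


if __name__ == "__main__":
    print(summarize_field([1, 2, None, 3]))
```

A high missing count or an implausible minimum or maximum is usually a sign that the data needs cleaning before it goes into a model.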