on 02-01-2018 02:34 PM - edited a week ago by SydneyF
If you are building a predictive model, inevitably you will want to analyze the effect that your independent variables have on your dependent variable. This article is meant to shed some light on the Alteryx-specific options for this type of analysis. The options for analyzing these effects will vary depending on what type of model you are using, so I will run through the different options for each predictive tool. The attached v11.7 workflow has examples for each model.
1. Coefficients
The first and probably easiest way to analyze a partial effect is simply to view the coefficient. Directly interpreting the coefficient as the partial effect that an independent variable has on the dependent variable is typically only appropriate for the Linear Regression model.
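To see why a linear model's coefficient can be read directly as a partial effect, here is a minimal Python sketch. The data and the names `fit_simple_ols` and `predict` are hypothetical (they are not taken from the attached workflow), and the fit is a hand-written one-predictor least-squares estimate rather than the Linear Regression tool itself.

```python
# Hypothetical data: one predictor (x) and one target (y); not from the workflow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_simple_ols(x, y)

def predict(value):
    """Score a single value with the fitted line."""
    return intercept + slope * value

# Raising x by one unit changes the prediction by the slope,
# no matter which starting value is used.
effect_at_2 = predict(3.0) - predict(2.0)
effect_at_10 = predict(11.0) - predict(10.0)
print(slope, effect_at_2, effect_at_10)
```

Because the change in the prediction is the same at every starting point, the coefficient alone summarizes the effect. That property does not carry over to the non-linear models discussed in the later sections.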
Using the examples in the attached workflow, we see that this is relatively easy:
A. Linear Regression
We can find the coefficients in the R Output or the I Output of the Linear Regression tool.
R output:
From these results, the coefficients can be interpreted as follows:
2. Plots
Analyzing the Effect Plots that are included in some of the predictive tools is relatively easy and intuitive. This option is provided for the following tools in Alteryx: Logistic Regression tool, Naïve Bayes Classifier tool, Neural Network tool, Boosted Model tool, and the Spline Model tool.
A. Logistic Regression (Conditional Density Plot)
The I output of the Logistic Regression tool provides an interactive Conditional-Density Plot. This plot allows you to view the conditional probability of your dependent variable as it relates to a non-categorical independent variable. It is worth noting that the Conditional-Density Plot shows the probability of a response of No (or 0).
You can find this plot by clicking on the Conditional-Density Plots option of the I output.
I Output:
Since this is an interactive output, you can hover your mouse over the graph to reveal the probability that someone does not donate relative to the number of degrees they have.
From these results, we see that the estimated probability that someone does not donate is approximately 0.5451542 (or 54.51542%) when someone has 1-1.5 degrees, ceteris paribus (C.P.). We can also see that the probability that someone does not donate is approximately 0.3674242 (or 36.74242%) when someone has 1.5-2.5 degrees (C.P.).
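As a rough illustration of what a conditional probability over ranges of a numeric predictor means, the sketch below estimates the share of No responses within each bin. The records, bin edges, and the function name `prob_no_by_bin` are hypothetical, and the plain proportion used here is an assumption for illustration; the Logistic Regression tool may estimate and smooth these probabilities differently.

```python
# Hypothetical records: (number of degrees, donated?) pairs; not from the workflow.
records = [
    (1.0, "No"), (1.0, "Yes"), (1.0, "No"), (1.0, "No"),
    (2.0, "Yes"), (2.0, "No"), (2.0, "Yes"),
    (3.0, "Yes"), (3.0, "Yes"), (3.0, "No"),
]

# Bin edges for the predictor, given as half-open intervals [low, high).
bins = [(0.5, 1.5), (1.5, 2.5), (2.5, 3.5)]

def prob_no_by_bin(rows, edges):
    """Estimate P(response == 'No' | predictor in bin) as a simple proportion."""
    result = {}
    for low, high in edges:
        in_bin = [resp for value, resp in rows if low <= value < high]
        if in_bin:  # skip empty bins to avoid dividing by zero
            result[(low, high)] = in_bin.count("No") / len(in_bin)
    return result

for interval, p in prob_no_by_bin(records, bins).items():
    print(interval, round(p, 4))
```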
B. Naïve Bayes Classifier (Effect Plots)
The R output of the Naïve Bayes Classifier tool provides an Effect Plot for each predictor variable used in the model. It is worth noting that, unlike the Conditional-Density Plot, the Effect Plots show the probability of a response of Yes (or 1).
These graphs are not interactive, so the probabilities are not exact. This option is automatically included in the R output.
R Output:
This graph reveals that the estimated probability that someone does respond is approximately 0.58 (58%) when someone has 1 degree (C.P.) and that the probability that someone does respond is approximately 0.35 (35%) when someone has 1.5-2.5 degrees (C.P.).
NOTE: If you are estimating a model that includes many independent variables, you may have to click on the Records arrow in order to view the Effects Plots.
C. Neural Network (Effect Plot)
The R output of the Neural Network tool provides an Effect Plot for each predictor variable used in the model. These graphs are not interactive, so the probabilities are not exact. This option must be selected in the configuration of the tool.
Include Effects Plots:
R Output:
This graph reveals that the estimated probability that someone does respond is approximately 0.41 (41%) when someone has 1 degree (C.P.) and that the probability that someone does respond is approximately 0.47 (47%) when someone has 1.5-2.5 degrees (C.P.).
D. Boosted Model (Effect Plot)
The R output of the Boosted Model tool provides an Effect Plot for each predictor variable used in the model. These graphs are not interactive, so the probabilities are not exact. This option must be selected in the configuration of the tool.
Include Effects Plots:
R Output:
This graph reveals that the estimated probability that someone does respond is approximately 0.46 (46%) when someone has 1 degree (C.P.) and that the probability that someone does respond is approximately 0.62 (62%) when someone has 1.5-2.5 degrees (C.P.).
E. Spline Model (Effects Plot)
The R output of the Spline Model tool provides an Effect Plot for each predictor variable used in the model. These graphs are not interactive, so the probabilities are not exact. This option must be selected in the configuration of the tool.
Include Effects Plots:
R Output:
This graph reveals that the estimated probability that someone does respond is approximately 0.45 (45%) when someone has 1 degree (C.P.) and that the probability that someone does respond is approximately 0.62 (62%) when someone has 1.5-2.5 degrees (C.P.).
3. Scoring
When all else fails, you can always score your data! This process is relatively simple:
1. Score the original values of your data.
2. Score your data again after making the desired change to the independent variable.
3. View the change in the score for each individual record.
4. View the average change in score across all records (the Average Partial Effect, or APE).
Since the Decision Tree tool, Forest Model tool, Count Regression tool, Support Vector Machine tool, and Gamma Regression tool do not have Effect Plots, and their coefficients cannot be directly interpreted, this option is demonstrated with these tools in the attached workflow.
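The four scoring steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `score` is a made-up logistic curve standing in for whatever fitted model you are scoring, and `average_partial_effect` is a hypothetical helper, not an Alteryx function. In the workflow, the same logic is built with the Score and Summarize tools.

```python
import math

# Hypothetical fitted model: a logistic curve in one predictor ("degrees").
# The coefficients are made up for illustration only.
def score(degrees):
    return 1.0 / (1.0 + math.exp(-(-1.0 + 0.4 * degrees)))

def average_partial_effect(model, values, delta=1.0):
    """Score each record before and after adding delta, then average the change."""
    before = [model(v) for v in values]               # step 1: original scores
    after = [model(v + delta) for v in values]        # step 2: scores after the change
    changes = [a - b for a, b in zip(after, before)]  # step 3: per-record change
    ape = sum(changes) / len(changes)                 # step 4: average change (APE)
    return changes, ape

degrees = [0, 1, 1, 2, 3]
changes, ape = average_partial_effect(score, degrees)
print([round(c, 4) for c in changes], round(ape, 4))
```

The per-record changes differ from one another because the model is non-linear. This is why the APE is reported as an average rather than read from a single coefficient.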
A. Decision Tree
When we compare the individual scores, we see that records 1 through 7 are unchanged after we increase the number of cylinders by 1, but record 8 decreases from 26.663636 to 16.47619.
The output of our Summarize tool reveals that the Average Partial Effect (APE) is -3.443006; that is, predicted MPG decreases by 3.443006 on average when a cylinder is added.
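A tree's predictions are piecewise constant, so a one-unit change only moves a record's score when it pushes that record across a split. The sketch below illustrates this with a hypothetical one-split tree; the threshold, predicted values, and cylinder counts are all made up and are not the tree from the attached workflow.

```python
# Hypothetical one-split tree: predicted MPG depends only on which side of
# the threshold the cylinder count falls. All numbers are made up.
def tree_predict(cylinders):
    return 26.0 if cylinders < 5 else 16.0

cylinders = [6, 8, 8, 6, 8, 6, 8, 4]
before = [tree_predict(c) for c in cylinders]
after = [tree_predict(c + 1) for c in cylinders]

# Only the record that crosses the split (4 -> 5 cylinders) changes.
tree_changes = [a - b for a, b in zip(after, before)]
tree_ape = sum(tree_changes) / len(tree_changes)
print(tree_changes, tree_ape)
```

Here the APE is driven entirely by the one record that crosses the split, so it is worth looking at the individual changes alongside the average.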
B. Forest Model
When we compare the individual scores, we see that record 1 increases from 0.216 (21.6%) to 0.24 (24.0%) when the number of degrees is increased by 1.
The output of our Summarize tool reveals that the Average Partial Effect (APE) is 0.010182; that is, the probability of a response increases by 1.0182 percentage points on average when a degree is added.
C. Support Vector Machine
When we compare the individual scores, we see that record 1 increases from 0.533969 (53.3969%) to 0.587181 (58.7181%) when the number of degrees is increased by 1.
The output of our Summarize tool reveals that the Average Partial Effect (APE) is 0.05625; that is, the probability of a response increases by 5.625 percentage points on average when a degree is added.
D. Count Regression
When we compare the individual scores, we see that the predicted number of claims for record 1 increases from 1.658907 to 1.671996 when the average cost of a claim is increased by 1.
The output of our Summarize tool reveals that the Average Partial Effect (APE) is 0.573603; that is, the predicted number of claims increases by 0.573603 on average when the average cost is increased by 1.
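The change for a single record can differ considerably from the average. One possible reason, assuming the count model uses a log link (as is typical for Poisson-type regressions), is that the same one-unit increase produces a larger change for records with larger predictions. The coefficients `B0` and `B1` below are hypothetical and are not estimated from the workflow's data.

```python
import math

# Hypothetical log-link coefficients; not estimated from the workflow's data.
B0, B1 = 0.2, 0.05

def predict_count(x):
    """Predicted count under a log link: exp(B0 + B1 * x)."""
    return math.exp(B0 + B1 * x)

xs = [1.0, 10.0, 40.0]
count_changes = [predict_count(x + 1) - predict_count(x) for x in xs]

# Each change equals the current prediction times (exp(B1) - 1), so records
# with larger predictions move more for the same one-unit increase.
ratios = [c / predict_count(x) for c, x in zip(count_changes, xs)]
print([round(c, 4) for c in count_changes], [round(r, 4) for r in ratios])
```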
E. Gamma Regression
When we compare the individual scores, we see that the predicted average cost increases from 203.770014 to 203.807904 when the number of claims is increased by 1.
The output of our Summarize tool reveals that the Average Partial Effect (APE) is 0.046081; that is, the predicted average cost increases by 0.046081 on average when the number of claims is increased by 1.