SOLVED

Stepwise logistic regression - which output report model?

WonderHog
7 - Meteor

When I run a stepwise logistic regression, the Report outputs of the Logistic Regression tool and the Stepwise tool show different sets of coefficients. Which of the two models is the correct one to rely on? Or is the answer neither?

 

Thanks

AmeliaG
Alteryx

Hi @WonderHog,

 

Thanks for your question! The coefficients may differ because the two models can contain a different mix of variables. The Stepwise tool adds and removes variables as it searches for a better-fitting model, so it can end up with a different set than the one you entered in the Logistic Regression tool. You can use the 'Lift Chart' tool to compare the two models. There is a great example in the sample workflows (Help > Sample Workflows > Predictive > Lift Chart) which demonstrates how to do this.

WonderHog
7 - Meteor

Thanks Amelia.  So the report from the logistic regression tool is for all entered variables and the report from the stepwise tool is after elimination of variables, correct?

Is there a way to control the p-value for elimination/inclusion of variables in the stepwise process?

 

Thanks

SydneyF
Alteryx Alumni (Retired)

Hi @WonderHog,

 

Yes, the model built by the Logistic Regression Tool includes all of your selected variables (a "full" model), and the model built by the Stepwise Tool uses a subset of those variables (a "reduced" model). The Stepwise Tool chooses which variables to keep using either the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), and you can select which of the two it uses. The selection is driven by that criterion rather than by a p-value cutoff.
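To give a rough feel for how AIC and BIC trade fit against model size, here is a minimal Python sketch using the standard definitions (AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L). The log-likelihoods, parameter counts, and sample size are made-up numbers for illustration, not output from any Alteryx tool, and the function names are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

def best_model(fits, n_obs, criterion):
    """Return the name of the fit with the lowest AIC or BIC."""
    def score(name):
        fit = fits[name]
        if criterion == "AIC":
            return aic(fit["loglik"], fit["k"])
        return bic(fit["loglik"], fit["k"], n_obs)
    return min(fits, key=score)

# Hypothetical fits: "k" counts estimated coefficients, including the intercept.
fits = {
    "full": {"loglik": -100.0, "k": 6},
    "reduced": {"loglik": -103.0, "k": 4},
}
n_obs = 500  # assumed number of records

for criterion in ("AIC", "BIC"):
    print(criterion, "prefers the", best_model(fits, n_obs, criterion), "model")
```

With these made-up numbers, AIC favors the full model (212 vs. 214) while BIC favors the reduced one, because BIC's penalty per parameter, ln(500) ≈ 6.2, is larger than AIC's penalty of 2. So the choice of criterion can change which variables survive.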

 

To determine which model to use, I would also consider the Nested Test Tool. It is designed specifically for comparing two models where one uses a subset of the variables in the other (a reduced model and a full model).

 

All else being equal, the more parsimonious model is preferable. If there is no significant difference in explanatory power between the model with all variables and the model with the variables selected by the Stepwise Tool, it is best to go with the model created by the Stepwise Tool. However, if a significant amount of explanatory power is lost by removing the variables that the Stepwise Tool dropped, it is best to use the model generated by the Logistic Regression Tool.
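One common way to test whether a reduced logistic model loses significant explanatory power is a likelihood-ratio test. The sketch below illustrates that idea in plain Python. Whether the Nested Test Tool performs exactly this computation is an assumption on my part, and the log-likelihoods and parameter counts are made-up numbers; the function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points for df = 1 (odd) and df = 2 (even).
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Step up with Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def likelihood_ratio_test(loglik_full, loglik_reduced, k_full, k_reduced):
    """Compare nested models; a small p-value favors the full model."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    df = k_full - k_reduced  # number of variables dropped
    return stat, df, chi2_sf(stat, df)

# Hypothetical fits: full model with 6 coefficients, reduced model with 4.
stat, df, p_value = likelihood_ratio_test(-100.0, -103.0, 6, 4)
print(f"LR statistic = {stat:.2f}, df = {df}, p = {p_value:.4f}")

alpha = 0.05  # assumed significance level
if p_value < alpha:
    print("Dropped variables carry significant explanatory power: keep the full model.")
else:
    print("No significant loss: prefer the reduced (more parsimonious) model.")
```

With these made-up inputs the statistic is 6.0 on 2 degrees of freedom, giving a p-value of about 0.05, so this example sits right on the usual borderline. In practice the decision also depends on the significance level you choose.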
