When I run a stepwise logistic regression, the Report outputs of the Logistic Regression tool and the Stepwise tool show different sets of coefficients for the model. Which one is the correct output model to rely on? Or is the answer none of the above?
Thanks
Hi @WonderHog,
Thanks for your question! The coefficients are probably different because the two models use a different mix of variables. The Stepwise tool adds and removes variables as it sees fit, so its final model can contain a different set of variables. You can use the 'Lift Chart' to compare the two models. There is a great example in the sample workflows (Help > Sample Workflows > Predictive > Lift Chart) which demonstrates how to do this.
Thanks Amelia. So the report from the logistic regression tool is for all entered variables and the report from the stepwise tool is after elimination of variables, correct?
Is there a way to control the p-value for elimination/inclusion of variables in the stepwise process?
Thanks
Hi @WonderHog,
Yes, the model built by the Logistic Regression Tool includes all of your selected variables (a "full" model), and the model built by the Stepwise Tool uses a subset of those variables (a "reduced" model). The Stepwise Tool decides which variables to keep using either the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), whichever you select in the tool, rather than a p-value cutoff.
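In case it helps to see what those two criteria measure, here is a minimal Python sketch of the standard AIC and BIC formulas. The log-likelihoods, parameter counts, and record count are made-up numbers for illustration, not output from any Alteryx tool, and the function names are my own.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits on the same 500 records (illustrative values only).
n_obs = 500
full = {"log_likelihood": -240.0, "n_params": 9}     # intercept + 8 predictors
reduced = {"log_likelihood": -242.5, "n_params": 5}  # intercept + 4 predictors

for name, m in (("full", full), ("reduced", reduced)):
    a = aic(m["log_likelihood"], m["n_params"])
    b = bic(m["log_likelihood"], m["n_params"], n_obs)
    print(f"{name}: AIC={a:.2f}  BIC={b:.2f}")
```

Both criteria reward fit and penalize the number of parameters. BIC's penalty grows with the number of records, so it tends to keep fewer variables than AIC.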
To determine which model to use, I would also consider using the Nested Test Tool. It is designed specifically for comparing two models where the variables in one model (the reduced model) are a subset of the variables in the other (the full model).
All else equal, the more parsimonious model is preferable. So if there is not a significant difference in explanatory power between the model with all variables and the model with the variables selected by the Stepwise Tool, it is best to go with the model created by the Stepwise Tool. However, if a significant amount of explanatory power is lost by dropping the variables that the Stepwise Tool removed, it is best to keep the model generated by the Logistic Regression Tool.
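To make the full-versus-reduced comparison concrete, here is a rough Python sketch of a likelihood-ratio test, which is one common way to compare nested logistic models. I am assuming this is similar in spirit to what the Nested Test Tool reports, not describing its exact internals. The log-likelihood values are hypothetical, and `chi2_sf` is a small helper written for this example so that nothing outside the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

def likelihood_ratio_test(ll_full, ll_reduced, df_diff):
    """Return (statistic, p_value) for a reduced model nested in a full model."""
    stat = 2 * (ll_full - ll_reduced)
    return stat, chi2_sf(stat, df_diff)

# Hypothetical values: the full model has 4 more coefficients than the reduced one.
stat, p_value = likelihood_ratio_test(ll_full=-240.0, ll_reduced=-242.5, df_diff=4)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.3f}")
if p_value > 0.05:
    print("No significant loss of fit: the reduced (stepwise) model is reasonable.")
else:
    print("Significant loss of fit: prefer the full model.")
```

A large p-value here means the dropped variables were not adding much explanatory power, which supports using the reduced model. A small p-value supports keeping the full model.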