Hi guys,
I'm trying to learn how to perform a logistic regression correctly.
Let's say I have datasource1, which contains historical data with 5 fields:
Field A is my target variable (yes/no).
Fields B and C are qualitative.
Fields D and E are numeric (quantitative).
I also have datasource2, with fresh data that I want to score using my model.
What's the best approach?
1) Should I just use the Regression Tool with all my other fields (B to E) as predictor variables and see which ones are significant (i.e. the number of *** in the regression report)?
2) Or should I first perform an association analysis to decide which fields are significant (a contingency table for the qualitative fields and Pearson/Spearman correlation for the quantitative ones), and then feed only the significant fields into the Regression Tool as predictor variables?
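
To make option 2 more concrete, here is a rough sketch in plain Python of the kind of screening I have in mind. It is not the Regression Tool itself. The toy values and the helper names (chi_square, pearson) are made up for illustration, and it only computes the raw statistics, not p-values.

```python
from collections import Counter
from math import sqrt


def chi_square(xs, ys):
    """Pearson chi-square statistic for the contingency table of two categorical lists."""
    n = len(xs)
    observed = Counter(zip(xs, ys))   # cell counts of the contingency table
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy rows standing in for datasource1 (values are invented).
field_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]      # target
field_b = ["red", "red", "blue", "blue", "red", "blue", "blue", "red"]  # qualitative
field_d = [5.1, 4.8, 2.0, 2.3, 5.5, 1.9, 4.9, 2.2]                  # quantitative

# Encode the yes/no target as 1/0 so it can be correlated with a numeric field.
target = [1 if a == "yes" else 0 for a in field_a]

chi2_b = chi_square(field_b, field_a)   # association of B with the target
r_d = pearson(field_d, target)          # correlation of D with the target

print(f"chi-square(B, A) = {chi2_b:.3f}")
print(f"pearson(D, A)    = {r_d:.3f}")
```

Under option 2, I would then keep only the fields whose statistic passes some threshold and pass those to the regression.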