
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Logistic Regression...?

IJH34
8 - Asteroid

This is more of a statistics question than an Alteryx question.

 

I am currently working on credit line segmentation, identifying the characteristics of high-limit borrowers compared to lower-limit borrowers. Suppose I segment credit lines by credit limit, for example:

$5,000-$8,000

$8,000-$10,000

$10,000-$15,000

$15,000-$20,000

$20,000-$30,000

Can I treat these as a categorical variable and run a logistic regression against the different segments? If not, what would you recommend?

 

3 REPLIES
JohnJPS
15 - Aurora

Hi, yes, if your interest is high vs. low, you can absolutely treat it as a logistic regression problem. In general, any continuous predictor variable can be split into segments and treated as a factor, if desired. The only downside is that you can lose information: e.g. $1 is much less than $100,000, but if $1 goes into bin "A" and $100,000 goes into bin "F", there is no "much less than" relationship between the bins. That's something to consider. Maybe convert "bin" to an integer value so the ordering is preserved.
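
To make the binning point concrete, here is a minimal Python sketch that maps a credit limit to an ordered integer bin (1 to 5) using the hypothetical segments from the question. The function names `bin_index` and `bin_label` are made up for illustration, and treating each upper edge as exclusive is an assumption, since the question does not say which side of a boundary is inclusive.

```python
from bisect import bisect_right

# Segment boundaries from the question. Assumption: each bin includes its
# lower edge and excludes its upper edge.
EDGES = [5_000, 8_000, 10_000, 15_000, 20_000, 30_000]
LABELS = [
    "$5,000-$8,000",
    "$8,000-$10,000",
    "$10,000-$15,000",
    "$15,000-$20,000",
    "$20,000-$30,000",
]

def bin_index(limit):
    """Return 1..5 for the ordered segment containing `limit`, or None if out of range."""
    if limit < EDGES[0] or limit >= EDGES[-1]:
        return None
    # Count how many edges are <= limit; that count is the 1-based bin number.
    return bisect_right(EDGES, limit)

def bin_label(limit):
    """Return the text label of the segment, or None if out of range."""
    idx = bin_index(limit)
    return None if idx is None else LABELS[idx - 1]

# Two different limits in the same bin become indistinguishable (information loss),
# but the integer code still records that bin 1 sits below bin 5.
print(bin_index(5_200), bin_index(7_900), bin_index(25_000))
print(bin_label(12_000))
```

The integer code keeps the "less than" relationship between segments that plain letter labels such as "A" and "F" would drop.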

Philip
12 - Quasar

Hi @IJH34

 

Yes, it would be considered a categorical variable. The other option would be to assign an increasing value from 1 to 5 to each segment and treat it as an ordinal variable. If you do use it as a categorical variable, I would still assign each segment an increasing numerical value so that when R chooses the base category to compare the other categories against, it picks the one you want instead of one you don't want.
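
A rough illustration of the base-category point, written in Python rather than R. It assumes that the levels are ordered by sorting their values and that the first level becomes the base, which is how R's default factor and treatment coding behave as far as I know. `dummy_code` is a hypothetical helper, not a library function.

```python
# Segment labels from the question, listed in increasing order of credit limit.
LABELS = [
    "$5,000-$8,000",
    "$8,000-$10,000",
    "$10,000-$15,000",
    "$15,000-$20,000",
    "$20,000-$30,000",
]

def dummy_code(value, levels):
    """Treatment coding: levels[0] is the base category and maps to all zeros."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]

# Sorting the raw labels as text puts a middle segment first, because
# "$1..." sorts before "$5..." and "$8...".
text_levels = sorted(LABELS)
text_base = text_levels[0]

# Assigning 1..5 in the intended order makes the lowest segment the base.
codes = {label: i for i, label in enumerate(LABELS, start=1)}
code_levels = sorted(codes.values())
code_base = code_levels[0]

print(text_base)
print(code_base, dummy_code(codes["$10,000-$15,000"], code_levels))
```

With the text labels the base would be the $10,000-$15,000 segment. With the numeric codes it is segment 1 ($5,000-$8,000), so every coefficient is read relative to the lowest segment.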

 

A different question: how did you arrive at the categories? Is there a hypothesis or business rule that supports these groupings?

IJH34
8 - Asteroid

Philip,

 

Thank you for your solution. Makes complete sense!

The segments in the question are just hypothetical; however, I've been able to bin the segments appropriately for actual modeling purposes using statistical reasoning.
