Alteryx Designer Desktop Discussions

SOLVED

Input files in logistic regression tools

FranzMazza
6 - Meteoroid

Dear all,


I have an Excel file with 3 numeric variables (2 predictors and 1 binary 0/1 target), and I need to look for correlations between the two predictors and the 0/1 target.

However, after importing the file, the Logistic Regression tool does not allow me to select the target variable. I checked, and the field type is numeric (Double) for all three of them.

 

Can you please help me with this? Is there something wrong in the input?

 

Thanks,
Francesco 

11 REPLIES
danilang
19 - Altair

@FranzMazza 

 

Can you share your input files, or at least a sample of each of them?

 

Dan

FranzMazza
6 - Meteoroid

Hi Dan,


Here you can find the input file I am trying to use.

Thanks for your help,
Francesco 

RolandSchubert
16 - Nebula

Hi @FranzMazza ,

 

The target variable of the logistic regression has to be a binary categorical variable (yes/no, success/fail), so its field type must be String or V_String.

Best regards

 

Roland

RolandSchubert
16 - Nebula

It should work if you convert the field "Default" from Double to a String type.
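In Designer the type change is made inside the workflow (typically with a Select tool). As a rough illustration of what the conversion does to the data, here is a minimal Python sketch. The sample rows and the helper name `target_to_string` are hypothetical; only the field names P1, P2 and Default come from this thread.

```python
# Hypothetical rows mirroring the thread's fields:
# P1 and P2 are predictors, Default is the 0/1 target stored as a double.
rows = [
    {"P1": 0.5, "P2": 4.0, "Default": 0.0},
    {"P1": 1.5, "P2": 3.5, "Default": 1.0},
]


def target_to_string(row, field="Default"):
    """Return a copy of the row with the numeric 0/1 target stored as text."""
    value = row[field]
    if value not in (0, 1):
        raise ValueError(f"{field} must be 0 or 1, got {value!r}")
    converted = dict(row)  # copy so the original row is left untouched
    converted[field] = str(int(value))  # 1.0 becomes "1", 0.0 becomes "0"
    return converted


converted_rows = [target_to_string(r) for r in rows]
print(converted_rows)
```

After the conversion the target holds two category labels ("0" and "1") rather than numbers, which is the kind of binary categorical field described above.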

FranzMazza
6 - Meteoroid

Thanks,

It works now. Could you also help me get the right output?
I am expecting the betas (regression coefficients) as well as the p-values, R-squared, and so on.
Where can I find these?


Thanks,
Francesco  

RolandSchubert
16 - Nebula

You have to add Browse tools to the "R" and "I" anchors of the Logistic Regression tool; you will see values like R-squared there.
To get the probabilities of default, you have to add a Score tool. I've attached a simple workflow showing the setup.

FranzMazza
6 - Meteoroid

Thanks, 

This is really useful. I don't know why the Browse tool did not show any results in the workflow I created (same as yours).

The X_0 and X_1 I see after the Score tool are the "betas" of non-default/default, based on both variables P1 and P2, correct?


Would it be possible to switch from a multivariate to a univariate analysis, e.g. testing only one variable at a time?

Thanks again,
Francesco 

RolandSchubert
16 - Nebula

It's possible to switch to a univariate analysis by simply selecting only one predictor variable (at least one must be selected).
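To make "one predictor at a time" concrete outside Designer, here is a minimal Python sketch that fits a separate one-variable logistic model for P1 and for P2. The data are invented for illustration, and the fit uses plain gradient ascent on the log-likelihood, which is a simple stand-in and not the algorithm the Logistic Regression tool uses.

```python
import math

# Hypothetical sample; field names P1, P2 and Default follow the thread,
# with Default stored as a string as required by the tool.
rows = [
    {"P1": 0.5, "P2": 4.0, "Default": "0"},
    {"P1": 1.0, "P2": 3.0, "Default": "0"},
    {"P1": 1.5, "P2": 3.5, "Default": "1"},
    {"P1": 2.0, "P2": 2.5, "Default": "0"},
    {"P1": 2.5, "P2": 2.0, "Default": "1"},
    {"P1": 3.0, "P2": 1.0, "Default": "0"},
    {"P1": 3.5, "P2": 3.2, "Default": "1"},
    {"P1": 4.0, "P2": 0.5, "Default": "1"},
]


def fit_one_predictor(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent (illustrative only)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


ys = [int(r["Default"]) for r in rows]
results = {
    name: fit_one_predictor([r[name] for r in rows], ys)
    for name in ("P1", "P2")  # one univariate model per predictor
}
for name, (b0, b1) in results.items():
    print(f"{name}: intercept={b0:.3f}, coefficient={b1:.3f}")
```

Each entry in `results` corresponds to running the tool once with a single predictor selected, so the coefficients can be compared across the univariate models.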

X_0 and X_1 are the probabilities for non default/default based on the model (applying the model to the data).

 

If you are looking for "betas" in the sense of regression coefficients (as in scikit-learn), you'll find the intercept and coefficients in the Browse tool attached to the report anchor.
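The difference between the coefficients and the scored probabilities can be sketched in a few lines of Python. This assumes a standard logit model; the intercept and coefficient values below are hypothetical and not taken from the attached workflow, and the function name `score` is only for illustration.

```python
import math


def score(intercept, coefs, predictors):
    """Return (X_0, X_1): probability of non-default and of default for one record."""
    # Linear predictor: intercept plus the sum of beta * value for each predictor.
    z = intercept + sum(b * x for b, x in zip(coefs, predictors))
    p_default = 1.0 / (1.0 + math.exp(-z))  # logistic (inverse logit) function
    return 1.0 - p_default, p_default


# Hypothetical betas, as they might be read from the report output.
intercept = -2.0
coefs = [0.8, -0.3]  # coefficients for P1 and P2

x0, x1 = score(intercept, coefs, [2.5, 1.0])  # one record with P1=2.5, P2=1.0
print(round(x0, 3), round(x1, 3))
```

The betas are fixed for the whole model, while X_0 and X_1 change from record to record and always sum to 1.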

FranzMazza
6 - Meteoroid

Thanks a lot for your help.
I can now fully study all the regression parameters. 

However, where can I find how the output metrics (accuracy, precision, recall, F1, optimal probability cutoff) are calculated?


Thanks again,
Francesco 
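For reference, the usual textbook definitions of these metrics can be sketched in Python as below. This thread does not confirm that Alteryx computes them exactly this way. The labels and probabilities are invented, and choosing the cutoff that maximizes F1 is only one possible criterion, assumed here for illustration.

```python
def confusion_counts(y_true, p_default, cutoff):
    """Count true/false positives and negatives, with 'default' (1) as the positive class."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, p_default):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(y_true, p_default, cutoff=0.5):
    """Accuracy, precision, recall and F1 at a given probability cutoff."""
    tp, fp, tn, fn = confusion_counts(y_true, p_default, cutoff)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


def best_cutoff(y_true, p_default, candidates):
    """Pick the candidate cutoff with the highest F1 (an assumed criterion)."""
    return max(candidates, key=lambda c: metrics(y_true, p_default, c)["f1"])


# Hypothetical actual outcomes and scored default probabilities (X_1).
y_true = [0, 0, 1, 0, 1, 0, 1, 1]
p_default = [0.1, 0.3, 0.4, 0.45, 0.6, 0.7, 0.8, 0.9]

at_half = metrics(y_true, p_default, cutoff=0.5)
chosen = best_cutoff(y_true, p_default, [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
print(at_half, chosen)
```

All four metrics depend on the cutoff, which is why a tool that reports them usually also reports the cutoff it used.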
