SOLVED

Calculating Gini index (AUC ROC) for non-standard prediction model

Denikin
6 - Meteoroid

Hello,

From my data I constructed a dictionary-based prediction model based on a single independent variable.

That is, my model Y = f(X) is a set of about 100 bins over the X variable, each mapped to a fixed value of Y, e.g.

[0.1;0.5] -> 3.4

[0.5;0.7] -> 5.3
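
A minimal sketch of how such a bin dictionary could be applied to new X values in base R. Only the two bins shown above come from the post; the function name predict_binned is hypothetical, and sending a boundary value such as 0.5 to the upper bin is an assumption, since the bins as written share that endpoint.

```r
# Bin edges and fitted values; only these two bins appear in the post.
breaks <- c(0.1, 0.5, 0.7)
values <- c(3.4, 5.3)

# Hypothetical helper: map each x to the fixed value of the bin containing it.
predict_binned <- function(x, breaks, values) {
  # findInterval returns 0 below the first edge and length(breaks) above the last;
  # rightmost.closed = TRUE puts x == max(breaks) into the last bin.
  idx <- findInterval(x, breaks, rightmost.closed = TRUE)
  # Anything outside the covered range gets no prediction.
  idx[idx < 1 | idx > length(values)] <- NA
  values[idx]
}

predict_binned(c(0.2, 0.6, 0.9), breaks, values)
```

Values outside the covered range come back as NA here; a real model would need its own rule for them.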

 

Now I have to calculate the Gini index (AUC ROC) for this dictionary-based prediction model.

As I understand it, I cannot use the "Lift Chart" or "Model Comparison" tools, because my model was not built with the Alteryx model-building tools (e.g. Linear Regression, Neural Network).

"Drawing" the ROC curve and calculating it bottom-up seems very troublesome.

Another idea I had was to call an R function (e.g. https://www.rdocumentation.org/packages/MLmetrics/versions/1.1.1/topics/Gini) through the R tool from the Developer category in Alteryx. However, this will also take some effort, as I haven't called R functions from Alteryx before.

Does anyone have other solution ideas?

 

Thanks in advance!

3 REPLIES
SydneyF
Alteryx Alumni (Retired)

Hi @Denikin

 

You are correct: because your model is not saved as a Model Object (an output of the Alteryx Predictive Tools), you will not be able to use the Lift Chart or Model Comparison tools, as they require a Model Object as an input.

 

I also believe you have identified the two best courses of action for calculating the Gini coefficient in Alteryx without a Model Object: either building it from the bottom up by generating a Lorenz curve, or writing custom R code.
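
For the bottom-up route, the curve does not necessarily have to be drawn. Below is a minimal base R sketch, assuming a binary 0/1 outcome and the commonly used identity Gini = 2 * AUC - 1, with AUC obtained from the rank-based (Mann-Whitney) formula. The function name gini_from_auc and the sample vectors are hypothetical. Tie handling here uses average ranks, which may not match what MLmetrics::Gini does when many records share the same score, as they will with a binned model.

```r
# Hypothetical helper: Gini index from scores and observed 0/1 outcomes.
gini_from_auc <- function(score, label) {
  stopifnot(length(score) == length(label), all(label %in% c(0, 1)))
  n_pos <- as.numeric(sum(label == 1))
  n_neg <- as.numeric(sum(label == 0))
  stopifnot(n_pos > 0, n_neg > 0)

  # Average ranks, so tied scores count as half a concordant pair.
  r <- rank(score)

  # Mann-Whitney form of the area under the ROC curve.
  auc <- (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  2 * auc - 1
}

# Made-up example data.
score <- c(0.10, 0.40, 0.35, 0.80)
label <- c(0, 0, 1, 1)
gini_from_auc(score, label)
```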

 

I personally think that the easier of the two would be to create the custom R code. The Gini() function comes from the MLmetrics R package, which is included with the Alteryx Predictive Tools installation. This means you will not need to install any new packages to access the function; if you have the Predictive Tools installed, it is already on your machine.

 

The arguments of the function also seem to be relatively straightforward. You would need to bring in a data frame with two fields: one with the predicted probabilities (scores) output by your model, and the other with the observed outcomes coded as 1s and 0s (the function's documentation describes y_true as the ground-truth 0-1 labels). Both fields should have a numeric data type.

 

The code would look something like this. All you need to do is read in the data with the read.Alteryx function, call the Gini function on your data (COLUMNNAME and COLNAME2 should be replaced with the correct field names), and then write out the result with the write.Alteryx function.

 

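# read the data connected to input #1 of the R tool as a data frame
data <- read.Alteryx("#1", mode="data.frame")

# compute the Gini index; replace COLUMNNAME (predictions) and COLNAME2 (0/1 labels) with your field names
gini.index <- MLmetrics::Gini(y_pred = data$COLUMNNAME, y_true = data$COLNAME2)

# write the Gini index to output anchor 1, wrapped in a data frame so the field gets a name
write.Alteryx(data.frame(Gini = gini.index), 1)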
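
Because both fields need to be numeric, a small check before the Gini call can help in case a field arrives as text. The following is only a sketch in base R; the helper name prepare_gini_input and its arguments pred_col and label_col are hypothetical and not part of Alteryx or MLmetrics.

```r
# Hypothetical helper: coerce the two fields to numeric and validate the 0/1 field.
prepare_gini_input <- function(df, pred_col, label_col) {
  stopifnot(pred_col %in% names(df), label_col %in% names(df))

  # as.character first, so factors convert by label rather than by level code.
  pred  <- suppressWarnings(as.numeric(as.character(df[[pred_col]])))
  label <- suppressWarnings(as.numeric(as.character(df[[label_col]])))

  # Drop records where either field is missing or could not be converted.
  keep <- !is.na(pred) & !is.na(label)
  if (!all(label[keep] %in% c(0, 1))) {
    stop("label field must contain only 0 and 1")
  }

  data.frame(pred = pred[keep], label = label[keep])
}

# Made-up example: fields arriving as strings, one record incomplete.
raw <- data.frame(p = c("0.2", "0.9", NA), y = c("0", "1", "1"),
                  stringsAsFactors = FALSE)
clean <- prepare_gini_input(raw, "p", "y")
```

The cleaned pred and label columns could then be passed as y_pred and y_true.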

 

Does this all make sense? Please let me know if you have any questions!

 

Thanks!

 

 

Denikin
6 - Meteoroid

Sydney, thanks for the guidance on using R!

Before digging into invoking R from Alteryx I found a simplified way to calculate the Gini index for a model.

A description is here: http://mathspace.pl/matematyka/wskaznik-giniego-na-bazie-wartosci-oczekiwanej-tips-tricks-na-krzywyc... (in Polish only)

Denikin
6 - Meteoroid

Sydney, I also tested your R snippet to validate my approach, and it worked smoothly. Thanks a lot!
