SOLVED

Variable importance plot - Boosted model - Values

Atom

I am trying to get the variable importance values shown in the boosted model's Variable Importance plot, but the plot doesn't display them. How can I extract these values so I can use them in further calculations?

[Attached image: 2019-06-11_10-43-42.png]

Sr. Data Science Content Engineer

Hi @amrutas

 

The image you've posted looks like it came from a Forest model. Are you trying to extract the variable importance values from a boosted model, random forest, or both? I ask because the process to do so will be slightly different depending on the model.

 

Either way, at a high level you will want to:

 

1. Connect an R tool to the O anchor of the tool you used to generate your model. The O stands for Object, and this is where the model object is streamed onto the canvas.

2. Write R code into the tool to read in the model and unserialize the object.

3. Load the R package that corresponds to your model: randomForest for a random forest model, gbm for a boosted model.

4. Apply the corresponding R function to the model to extract the variable importance metric. This is importance() for a random forest and summary() for boosted models.

5. Write the resulting data back out to Alteryx.

 

 

The code to do this for a random forest model:

 

 

# Read in serialized model object
modelObj <- read.Alteryx("#1", mode="data.frame")

# Unserialize model
model <- unserializeObject(as.character(modelObj$Object[1]))

# load library
library(randomForest)

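# extract feature importance (a matrix with one row per variable);
# stored as imp so the importance() function is not shadowed
imp <- importance(model)

# convert to data frame, keeping the variable names as a column
output <- data.frame(var = row.names(imp), imp)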

# write out data frame
write.Alteryx(output, 1)
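
If you want to see what importance() returns before wiring this into the R tool, here is a minimal sketch you can run in plain R, outside Alteryx. It assumes the randomForest package is installed and uses the built-in iris data purely as a stand-in; the names fit, imp and out are just for illustration and are not part of the workflow.

```r
# Stand-alone illustration (not Alteryx-specific): fit a small forest
# on the built-in iris data and inspect the importance table.
library(randomForest)

set.seed(42)  # make the example reproducible
fit <- randomForest(Species ~ ., data = iris, ntree = 100, importance = TRUE)

# importance() returns a numeric matrix: one row per predictor,
# one column per importance measure
imp <- importance(fit)

# Same conversion as in the R tool code: variable names become a column
out <- data.frame(var = row.names(imp), imp, row.names = NULL)
print(out)
```

Which columns you get depends on how the forest was trained. With importance = TRUE a classification forest reports MeanDecreaseAccuracy as well as MeanDecreaseGini; a forest trained without it reports only the node impurity measure.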

 

 

 

The code to do this for a boosted model:

 

 

# Read in serialized model object
modelObj <- read.Alteryx("#1", mode="data.frame")

# Unserialize model
model <- unserializeObject(as.character(modelObj$Object[1]))

# load package
library(gbm)

# extract variable importance (plotit = FALSE returns the table without redrawing the plot)
output <- summary(model, plotit = FALSE)

# write out to Alteryx
write.Alteryx(output, 1)
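
Since the goal is to use the values in further calculations, here is a rough sketch of what that could look like once you have the table. For a gbm model, summary() returns a data frame with a var column and a rel.inf column (relative influence, scaled to sum to 100 by default). The numbers and variable names below are made up for illustration, and the 80% cutoff is an arbitrary assumption, not a recommendation.

```r
# Hypothetical values standing in for the data frame returned by summary()
# on a gbm model (columns: var, rel.inf). Not real model output.
vi <- data.frame(
  var = c("income", "age", "region", "tenure"),
  rel.inf = c(30, 40, 10, 20),
  stringsAsFactors = FALSE
)

# Sort from most to least influential
vi <- vi[order(-vi$rel.inf), ]

# Convert to a 0-1 share and accumulate it down the ranking
vi$share <- vi$rel.inf / sum(vi$rel.inf)
vi$cum_share <- cumsum(vi$share)

# Keep the variables that together account for at most 80% of the influence
top_vars <- vi$var[vi$cum_share <= 0.8]
print(vi)
print(top_vars)
```

The same calculations can of course be done downstream with regular Alteryx tools once the table has been written out of the R tool.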

 

 

I've attached a workflow that demonstrates the process for both a boosted model and a random forest model. Please be sure to read the documentation on each of the functions used to extract the variable importance metrics (hyperlinked above).

 

Hope this helps!

 

Sydney
