First: I used the Forest Model tool to get the variable importance plot. There is much to learn here about what this plot is saying (thanks again, @SydneyF).
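For anyone wondering what the "mean decrease Gini" in that plot measures, here is a minimal sketch of the underlying idea for a single split. It is an illustration only, not what the Forest Model tool runs internally: the function names (`gini_impurity`, `gini_decrease`) and the toy data are made up for this example. A forest averages decreases like this one over every split that uses a given variable.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(labels, goes_left):
    """Drop in Gini impurity when the rows are split into two child nodes.

    goes_left[i] is True if row i is sent to the left child.
    """
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    n = len(labels)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical toy target (stand-in for H0) and two candidate splits.
h0 = [1, 1, 1, 0, 0, 0]
informative_split = [True, True, True, False, False, False]
weak_split = [True, False, True, False, True, False]

informative_gain = gini_decrease(h0, informative_split)
weak_gain = gini_decrease(h0, weak_split)
print(informative_gain, weak_gain)
```

A variable whose splits separate the classes cleanly gets a large decrease; one whose splits barely change the class mix gets a decrease near zero, which is why it ends up at the bottom of the plot.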
Then you need to compare two models that predict H0. Since H0 is a binary variable, this calls for logistic regression, which means we can use the Nested Test tool to spot the difference between the two models.
That gives the chi-squared difference caused by removing F_38.
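The chi-squared difference between two nested logistic regressions is usually a likelihood-ratio statistic, so here is a rough sketch of that calculation. I am assuming this matches what the Nested Test tool reports, which I have not confirmed. The function name `nested_chi_square` and the log-likelihood values are hypothetical, not numbers from the challenge data.

```python
import math


def nested_chi_square(loglik_full, loglik_reduced):
    """Likelihood-ratio test for two nested models differing by ONE term.

    Returns (chi-squared statistic, p-value). The p-value formula below
    is only valid for 1 degree of freedom, i.e. one removed variable.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: full model vs. model without one variable.
stat, p_value = nested_chi_square(loglik_full=-100.0, loglik_reduced=-102.0)
print(f"chi-squared = {stat:.3f}, p = {p_value:.4f}")
```

A large statistic (small p-value) means dropping the variable made the fit noticeably worse. If more than one variable is removed, the degrees of freedom go up and the p-value needs a general chi-squared survival function instead of the `erfc` shortcut.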
Definitely hard! I figured out with Google to use the forest model for the mean decrease Gini coefficient, but then wasn't sure which model to use and how to find the chi square comparison. After getting several R errors when using various tools and connecting things in the wrong way, I used @PJDit's spoiler to see it was the logistic regression and nested test tools, which I don't think I've used before.
The first part was pretty simple since only a couple of tools have variable importance plots. But I went down a rat hole trying to build the model, because I went with random forest and the results came out exactly the same for the 9-variable and 10-variable cases. That's when I went back to my standby, logistic regression, and saw some differences in the Model Comparison tool. The chi-squared part had me thrown for a bit. I did find the calculations and was ready to go that way, but then stumbled across the Nested Test tool, which made it a lot easier. My solution winds up being similar to the one provided, although I didn't verify the independence of the variables.
I love these predictive challenges as I learn so, so much! Through good luck (and Google) I made it through to the last bit, and then had to peek to see which tool you used to score the models and get the chi-squared value. I'd be tempted by this (after a little more practice) in the Expert! Thank you
I have quite mixed feelings about this one. Unless you specifically do this use case, I don't think there is a way of knowing which tool to use. Alteryx documentation for chi-sq and chi-squared doesn't mention any tools apart from contingency table (which is something I tried to use) - this includes documentation for the nested test tool.
Having said that, it's a nice challenge to explore the predictive tools a bit more, and I've learned a new tool! I will definitely pay more attention to tool descriptions as well.