I'm new to Alteryx and data science/statistics, so please be gentle. :)
I am working with HR data and I have been able to use the Forest Model to produce a probability that an employee will leave the company. I can see which variables are important (using the Variable Importance Plot), so I have an idea of which ones have a stronger influence.
Is there a way to determine how changing one of the variables would affect that probability? For example, if I gave an individual a higher bonus, their chance of leaving would go down.
The goal is to create a dashboard (using Qlik View) for leadership that says: here are the employees that are most likely to leave the company. If you change these variables (give a higher bonus, move them closer, etc.) this will increase their chances of staying longer with the company.
Is this something that I can do with Alteryx? My gut feeling is that this is a big undertaking that can't be produced too easily without a much deeper understanding of statistics.
This is absolutely something you can do with Alteryx! In fact, there was an article that just got published on the community that discusses the tools that Alteryx provides for analyzing the partial effect that an independent variable has on a dependent variable. Here is the article: Analyzing Partial Effects in Alteryx
As described in the article, with the Forest model you would need to compare the scored values of your data: score a record as it is, change the variable you are interested in, score it again, and look at the difference. However, if you are comfortable with the Boosted model (this model is also derived from Decision Trees, so it behaves in a somewhat similar manner to the Forest model), you can create a model and include Marginal Effects plots for this analysis. Or if you are comfortable with Logistic Regression, which involves manually choosing variables until you determine that the model has reliable results, you can look at the Conditional-Density Plots.
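To make the "compare the scored values" idea concrete, here is a minimal sketch in Python rather than an Alteryx workflow. Everything in it is an assumption for illustration: `toy_attrition_score` is a made-up stand-in for whatever model you trained (it is not a forest and not anything Alteryx produces), the field names `bonus` and `commute_km` are hypothetical, and the coefficients are invented. The only part that matters is `what_if`, which scores a record, changes one field on a copy, scores it again, and reports the difference.

```python
import math
from typing import Callable, Dict, Tuple

Record = Dict[str, float]


def toy_attrition_score(rec: Record) -> float:
    """Hypothetical stand-in for a trained model's scoring step.

    Returns a made-up probability of leaving; the coefficients are
    invented for illustration and do not come from any real data.
    """
    z = 0.5 - 0.0001 * rec["bonus"] + 0.03 * rec["commute_km"]
    return 1.0 / (1.0 + math.exp(-z))


def what_if(
    score: Callable[[Record], float],
    rec: Record,
    field: str,
    new_value: float,
) -> Tuple[float, float, float]:
    """Score a record before and after changing a single field.

    The original record is left untouched; the change is applied to a copy.
    Returns (probability_before, probability_after, difference).
    """
    changed = dict(rec)
    changed[field] = new_value
    before = score(rec)
    after = score(changed)
    return before, after, after - before


# Hypothetical employee record; field names are assumptions.
employee = {"bonus": 2000.0, "commute_km": 40.0}
before, after, delta = what_if(toy_attrition_score, employee, "bonus", 6000.0)
print(f"before={before:.3f} after={after:.3f} change={delta:+.3f}")
```

In a workflow, the equivalent would be to duplicate the rows, overwrite the variable of interest on the copy, score both sets against the same model, and join them back on the employee ID to compute the change. Keep in mind that the difference tells you how the model's prediction moves, which reflects patterns in the historical data and is not a guarantee of what will happen if you actually change the bonus.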