Hi All,
First post. Here goes.
So I have a number of different dimensions (or categories, excuse my Tableau speak) and I want to understand which combinations of these dimensions are important in driving a target metric. My sense is that my problem is with framing the question, so let me use an example and we might be able to figure out the detail:
Let's say I am the support department of a company:
I want to test the validity of these hypotheses. The difficulty is that these groups are not mutually exclusive, so teasing out the relationships is hard. Ultimately I want to create profiles (clusters, I guess) that have different case-generating behaviors.
Why? So that I can then go on to predict the case volumes I can expect if the number of customers within a specific profile increases in the future.
Any help would be appreciated.
Many thanks,
Do I need a forest model... maybe
You will likely need to run and analyze a handful of different statistical tools and their associated outputs to arrive at a concise answer or expression for your situation.
As you've noted, there are probably individual variable features (age might be a driver alone) and combination variable features (a certain product within a certain age group is particularly problematic perhaps). Different models are good at different things.
Your end result will probably be some sort of cluster: that is what captures the confluence of features that truly drives your situation.
I would recommend running lower-level statistical models to remove the noise from your data set, and then using something like a k-means cluster for your final result once you have removed the variables that aren't really drivers.
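To make the cluster-then-forecast idea concrete, here is a minimal sketch in plain Python rather than Alteryx. Everything in it is an assumption for illustration only: the two features (`tenure_months`, `products_owned`), the six toy customers, their case counts, and the future head counts are all made up, and the `kmeans` helper is a bare-bones version seeded with the first k points so the result is reproducible. On real data you would scale the features first and use a proper clustering tool.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Bare-bones k-means. Seeds with the first k points (an assumption
    made here only so the toy result is reproducible)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers: (tenure_months, products_owned) and support cases raised.
features = [(2, 3), (3, 4), (1, 3), (36, 1), (40, 1), (38, 2)]
cases = [9, 8, 10, 1, 2, 3]

labels, centroids = kmeans(features, k=2)

# Average cases per customer within each profile (cluster).
rates = {}
for c in sorted(set(labels)):
    member_cases = [n for n, lab in zip(cases, labels) if lab == c]
    rates[c] = sum(member_cases) / len(member_cases)

# Hypothetical future customer counts per profile.
future_counts = {0: 5, 1: 10}
forecast = sum(future_counts[c] * rates[c] for c in future_counts)

print(labels, rates, forecast)
```

The last few lines are the part that answers the "why": once each profile has a cases-per-customer rate, the expected volume is just the future customer count in each profile multiplied by that rate, summed over profiles.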
Breaking this apart: I would look at the correlation between individual factors and your target variable (case volume), then run a forest model on the factors that survived the first step, then roll this up into a clustering model.
Long story short, statistics is often a journey-like process, rarely a single step, but all of these steps and associated tools are available within the Alteryx suite.
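As a rough illustration of the first step only (the correlation screen, not the forest or clustering stages), here is a small sketch in plain Python. The flag names (`is_new`, `has_product_a`, `is_enterprise`), the eight toy customers, and the 0.3 cut-off are all hypothetical. Because the groups are not mutually exclusive, each membership is a separate 0/1 column, and a combination is tested by multiplying two flags into an interaction column.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def screen_features(rows, target, threshold):
    """Correlate every column with the target and keep the strong ones."""
    names = [k for k in rows[0] if k != target]
    ys = [r[target] for r in rows]
    scores = {f: pearson([r[f] for r in rows], ys) for f in names}
    kept = [f for f in names if abs(scores[f]) >= threshold]
    return scores, kept

# Hypothetical toy data: one row per customer, 0/1 flags for overlapping groups.
customers = [
    {"is_new": 1, "has_product_a": 1, "is_enterprise": 0, "cases": 9},
    {"is_new": 1, "has_product_a": 1, "is_enterprise": 1, "cases": 8},
    {"is_new": 1, "has_product_a": 0, "is_enterprise": 0, "cases": 3},
    {"is_new": 1, "has_product_a": 0, "is_enterprise": 1, "cases": 2},
    {"is_new": 0, "has_product_a": 1, "is_enterprise": 0, "cases": 3},
    {"is_new": 0, "has_product_a": 1, "is_enterprise": 1, "cases": 4},
    {"is_new": 0, "has_product_a": 0, "is_enterprise": 0, "cases": 1},
    {"is_new": 0, "has_product_a": 0, "is_enterprise": 1, "cases": 2},
]

# Interaction column: 1 only when the customer is in both groups.
rows = [{**r, "new_and_product_a": r["is_new"] * r["has_product_a"]}
        for r in customers]

scores, kept = screen_features(rows, target="cases", threshold=0.3)

for name, score in scores.items():
    print(f"{name}: {score:+.2f}")
print("kept:", kept)
```

In this made-up data the interaction column correlates with case volume more strongly than either flag on its own, which is the kind of "combination" effect the screen is meant to surface. A simple correlation will miss non-linear effects, which is one reason to follow it with a forest model.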
Let me know if this helps!
I second @ZacharyM's comments. Here are a few other tools to research: Model Comparison (not installed by default, in Alteryx Gallery), MB Rules, and of course random forest and decision tree.