
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Which dimension (category) combinations are driving a metric?


Hi All,

 

First post. Here goes. 

 

I have a number of different dimensions (or categories, excuse my Tableau speak) and I want to understand which combinations of these dimensions are important in driving a target metric. My sense is that my problem is with framing the question, so let me use an example and we might be able to figure out the detail:

 

Let's say I am the support department of a company:

 

  • I have a list of customers calling in and the number of support cases they raise in addition to some information about the customers:
    • What products they have purchased
    • How old they are
    • Where they are from
    • How educated they are
    • How long have they been a customer
    • Have they attended a training webinar or event.
  • The hypotheses within the business are that:
    • Customers that are new and young raise more support cases.
    • Customers with a specific set of products raise more support cases.
    • Customers from some specific regions are challenging and raise more support cases.
    • Customers that are trained raise fewer support cases.

I want to test the validity of these hypotheses. The difficulty is that these groups are not mutually exclusive, so teasing out the relationships is hard. Ultimately I want to create profiles (clusters, I guess) that have different case-generating behaviors.
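One rough way to start teasing overlapping groups apart is to average the case count for every combination of dimension values, rather than looking at one dimension at a time. The sketch below is plain Python, not an Alteryx workflow, and the customer records, the field names (`new`, `young`, `trained`, `cases`) and the helper `mean_cases_by_combination` are all made up for illustration:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records, one dict per customer (values are invented).
customers = [
    {"new": True,  "young": True,  "trained": False, "cases": 9},
    {"new": True,  "young": False, "trained": False, "cases": 5},
    {"new": False, "young": True,  "trained": True,  "cases": 2},
    {"new": False, "young": False, "trained": True,  "cases": 1},
    {"new": True,  "young": True,  "trained": True,  "cases": 6},
    {"new": False, "young": False, "trained": False, "cases": 3},
]
dimensions = ["new", "young", "trained"]


def mean_cases_by_combination(rows, dims, size):
    """Average case count for every value combination of `size` dimensions."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of cases, customer count]
    for combo in combinations(dims, size):
        for row in rows:
            key = tuple((d, row[d]) for d in combo)
            totals[key][0] += row["cases"]
            totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Compare pairs of dimensions; size=1 gives the single-dimension view.
pair_means = mean_cases_by_combination(customers, dimensions, 2)
for key, mean in sorted(pair_means.items(), key=lambda kv: -kv[1]):
    print(key, round(mean, 2))
```

With real data the same customer lands in many combinations at once, so this only shows where to look; it does not by itself say which dimension is responsible.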

 

Why? So that I can then go on to predict the case volumes I can expect if the number of customers within a specific profile increases in the future. 
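To make the end goal concrete: once each profile has an average case rate, the forecast is just customers per profile multiplied by that rate, summed over profiles. A minimal sketch in plain Python, where the profile names, rates and customer counts are invented placeholders:

```python
# Hypothetical average cases per customer per month for each profile.
profile_rate = {"new_untrained": 4.0, "new_trained": 2.5, "established": 1.0}

# Hypothetical customer counts per profile, today and in a growth scenario.
current_counts = {"new_untrained": 100, "new_trained": 50, "established": 200}
future_counts = {"new_untrained": 150, "new_trained": 50, "established": 200}


def expected_cases(counts, rates):
    """Expected case volume: customers in each profile times that profile's rate."""
    return sum(counts[profile] * rates[profile] for profile in counts)


print("current:", expected_cases(current_counts, profile_rate))
print("future: ", expected_cases(future_counts, profile_rate))
```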

 

Any help would be appreciated. 

 

Many thanks,

 

 

 

 

 

Do I need a forest model... maybe

Alteryx

You will likely need to run and analyze a handful of different statistical tools and their associated outputs to arrive at a concise answer or expression for your situation.

 

As you've noted, there are probably individual variable features (age might be a driver alone) and combination variable features (a certain product within a certain age group is particularly problematic perhaps). Different models are good at different things.

 

Your end result will probably be some sort of cluster, since a cluster is what describes the confluence of features that actually drives your case volume.

 

I would recommend running lower-level statistical models first to remove the noise from your data set, and then using something like a k-means cluster for your final result once you have dropped the variables that are not drivers.

 

Breaking this apart: I would look at the correlation between each individual factor and your target variable (case volume), then build a forest model on the factors that survived the first step, then roll the result up into a clustering model.
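The first of those steps, screening individual factors by their correlation with the target, can be sketched outside Alteryx in a few lines of plain Python. Everything here is illustrative: the factor names, the toy values and the 0.3 cutoff are assumptions, not recommended settings.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


# Hypothetical data: one value per customer, in the same order in every list.
cases = [9, 5, 2, 1, 6, 3]
factors = {
    "trained": [0, 0, 1, 1, 1, 0],          # 1 = attended training
    "tenure_months": [1, 3, 40, 60, 2, 30],
    "constant_flag": [1, 1, 1, 1, 1, 1],    # carries no information
}

THRESHOLD = 0.3  # assumed cutoff for this sketch only

correlations = {name: pearson(values, cases) for name, values in factors.items()}
kept = [name for name, r in correlations.items() if abs(r) >= THRESHOLD]

for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
print("kept for the next step:", kept)
```

Correlation only picks up single-factor, roughly linear relationships, which is why the combination effects are left to the forest model in the next step.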

 

Long story short, statistics is often a journey-like process and rarely a single step, but all of these steps and their associated tools are available within the Alteryx suite.

 

Let me know if this helps!

I second @ZacharyM's comments. Here are a few other tools to research: Model Comparison (not installed by default; available in the Alteryx Gallery), MB Rules, and of course random forest and decision tree.
