
Alteryx Designer Desktop Discussions

SOLVED

Which dimension (category) combinations are driving a metric?

jparkerrandall
5 - Atom

Hi All,

 

First post. Here goes. 

 

So I have a number of different dimensions (or categories; excuse my Tableau speak) and I want to understand which combinations of these dimensions are important in driving a target metric. My sense is that my problem is with framing the question, so let me use an example and we might be able to figure out the detail:

 

Let's say I am the support department of a company:

 

  • I have a list of customers calling in and the number of support cases they raise, plus some information about the customers:
    • What products they have purchased
    • How old they are
    • Where they are from
    • How educated they are
    • How long they have been a customer
    • Whether they have attended a training webinar or event
  • The hypotheses within the business are that:
    • Customers that are new and young raise more support cases.
    • Customers with a specific set of products raise more support cases.
    • Customers from some specific regions are challenging and raise more support cases
    • Customers that are trained raise fewer support cases.

I want to test the validity of these hypotheses. The challenge is that these groups are not mutually exclusive, so teasing out the relationships is difficult. Ultimately I want to create profiles (clusters, I guess) that have different case-generating behaviors.
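
As a first look at overlapping groups, one option is to compute the mean number of cases per customer for every combination of dimension values and compare it with the overall mean. The Python sketch below is only an illustration outside Alteryx: the field names (tenure, age, trained) and all the numbers are invented, and a real analysis would also need enough customers in each combination for the means to be trustworthy.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"tenure": "new", "age": "young", "trained": "no",  "cases": 9},
    {"tenure": "new", "age": "young", "trained": "yes", "cases": 4},
    {"tenure": "new", "age": "older", "trained": "no",  "cases": 5},
    {"tenure": "old", "age": "young", "trained": "no",  "cases": 3},
    {"tenure": "old", "age": "older", "trained": "yes", "cases": 1},
    {"tenure": "old", "age": "older", "trained": "no",  "cases": 2},
]
dimensions = ["tenure", "age", "trained"]


def mean_cases_by_combo(rows, dims, max_size=2):
    """Mean cases per customer for each value combination of 1..max_size dimensions."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of cases, customer count]
    for size in range(1, max_size + 1):
        for combo in combinations(dims, size):
            for row in rows:
                key = tuple((d, row[d]) for d in combo)
                totals[key][0] += row["cases"]
                totals[key][1] += 1
    return {key: (s / n, n) for key, (s, n) in totals.items()}


overall = sum(r["cases"] for r in customers) / len(customers)
results = mean_cases_by_combo(customers, dimensions)

# Rank combinations from highest to lowest mean cases per customer.
for key, (mean, n) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(key, f"mean={mean:.2f}", f"n={n}", f"vs overall={mean - overall:+.2f}")
```

Because each customer is counted in every combination they belong to, this table shows where cases concentrate but does not separate the effect of one dimension from another; that is what the modelling steps are for.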

 

Why? So that I can then go on to predict the case volumes I can expect if the number of customers within a specific profile increases in the future. 
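
To make that goal concrete: once each profile has an observed rate of cases per customer, the expected volume is the sum over profiles of customers times rate. A minimal Python sketch follows, assuming hypothetical profile names, customer counts and rates (none of these come from real data), and assuming the per-customer rate stays constant as a profile grows.

```python
# Hypothetical profiles: current customer count and observed cases per customer
# per month. All names and numbers are made up for illustration.
profiles = {
    "new_young_untrained": {"customers": 200, "cases_per_customer": 1.5},
    "new_trained": {"customers": 150, "cases_per_customer": 0.5},
    "established": {"customers": 650, "cases_per_customer": 0.2},
}


def expected_cases(profiles, growth=None):
    """Sum of customers * rate per profile; growth maps profile -> fractional increase."""
    growth = growth or {}
    total = 0.0
    for name, p in profiles.items():
        customers = p["customers"] * (1 + growth.get(name, 0.0))
        total += customers * p["cases_per_customer"]
    return total


baseline = expected_cases(profiles)
# Scenario: the new, young, untrained profile grows by 50%.
scenario = expected_cases(profiles, {"new_young_untrained": 0.5})
print(f"baseline={baseline:.0f} scenario={scenario:.0f}")
```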

 

Any help would be appreciated. 

 

Many thanks,

 

 

 

 

 

3 REPLIES
jparkerrandall
5 - Atom

Do I need a forest model... maybe?

ZacharyM
Alteryx Alumni (Retired)

You will likely need to run and analyze a handful of different statistical tools and their associated outputs to arrive at a concise answer or expression for your situation.

 

As you've noted, there are probably individual variable features (age might be a driver alone) and combination variable features (a certain product within a certain age group is particularly problematic perhaps). Different models are good at different things.

 

Your end result will probably be some sort of cluster: that is what describes the confluence of features that truly drives your situation.

 

I would recommend running lower-level statistical models to remove the noise from your data set, then using something like a k-means cluster for your final result once you have removed the variables that aren't drivers.
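
For readers who want to see what the k-means step does, here is a minimal Python sketch of the standard Lloyd's algorithm, not the Alteryx tool itself. The two features (a tenure score scaled to 0..1 and a trained flag), the data points and the case counts are all hypothetical, and the initial centroids are simply the first k points so the run is repeatable.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical features per customer: (tenure scaled to 0..1, trained flag).
points = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.0), (0.8, 1.0), (0.15, 0.0), (0.95, 1.0)]
cases = [8, 1, 7, 2, 9, 1]  # made-up support cases per customer

labels, centroids = kmeans(points, k=2)

# Describe each cluster by its average case count.
cluster_means = {}
for c in range(2):
    member_cases = [n for n, lab in zip(cases, labels) if lab == c]
    cluster_means[c] = sum(member_cases) / len(member_cases)
print(labels, cluster_means)
```

Note that the clustering only sees the features, not the case counts; comparing case counts across the resulting clusters afterwards is what turns them into profiles. Features on very different scales should be normalised first, since k-means relies on distances.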

 

Breaking this apart: I would look at the correlation between individual factors and your target variable (call volume), then run a forest model on the factors that survive the first step, then roll this up into a clustering model.
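
The first of those steps, screening individual factors by correlation, can be sketched in a few lines of Python. This is an illustration under assumptions rather than the Alteryx workflow: the feature names, the data and the 0.3 cutoff are all hypothetical, and Pearson correlation only picks up linear relationships, so a factor that matters only in combination with another could be dropped here by mistake.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical per-customer features and case counts (illustrative values only).
features = {
    "tenure_years": [0.5, 1, 2, 4, 6, 8],
    "attended_training": [0, 0, 1, 0, 1, 1],
    "account_id_last_digit": [5, 2, 5, 2, 2, 5],  # deliberately meaningless
}
cases = [9, 8, 4, 5, 2, 1]

threshold = 0.3  # arbitrary cutoff chosen for this sketch
scores = {name: pearson(vals, cases) for name, vals in features.items()}
kept = [name for name, r in scores.items() if abs(r) >= threshold]

for name, r in scores.items():
    print(f"{name}: r={r:+.2f}")
print("kept for the next step:", kept)
```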

 

Long story short, statistics is often a journey-like process, rarely a single step, but all of these steps and associated tools are available within the Alteryx suite.

 

Let me know if this helps!

OldDogNewTricks
10 - Fireball

I second @ZacharyM's comments. Here are a few other tools to research: Model Comparison (not installed by default; available in the Alteryx Gallery), MB Rules, and of course random forest and decision tree.
