Alteryx Designer Desktop Discussions

jcalvo92 · ‎04-22-2019

Hi, newbie here.

I am trying to solve this problem with Alteryx: we have two variables, a continuous predictor variable and a categorical target variable with only two values. We are looking for a simple rule or set of rules that splits the continuous predictor into intervals, placing the cut points where they maximize the discriminatory power.

Example:

Variable\Register	1	2	3	4	5	6	7	8	9
Target A	Type1	Type1	Type2	Type2	Type1	Type2	Type1	Type1	Type1
Predictor B	12	15	21	26	27	30	67	78	98

We can use the Decision Tree tool and decompose the tree into a rule-based model through the C5.0 algorithm, which gives this solution:

If B <= 15 then Type1
If B > 15 and B <= 30 then Type2
If B > 30 then Type1
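As a quick check, the three rules above can be applied to the example table. This is a minimal sketch in plain Python (not Alteryx); the names `classify`, `target` and `predictor` are made up for illustration.

```python
# Example data copied from the table in the post.
target = ["Type1", "Type1", "Type2", "Type2", "Type1",
          "Type2", "Type1", "Type1", "Type1"]
predictor = [12, 15, 21, 26, 27, 30, 67, 78, 98]


def classify(b):
    """Apply the three rules quoted above to one predictor value."""
    if b <= 15:
        return "Type1"
    if b <= 30:
        return "Type2"
    return "Type1"


predicted = [classify(b) for b in predictor]

# Collect the rows where the rule disagrees with the recorded target.
misses = [(b, t, p) for b, t, p in zip(predictor, target, predicted) if t != p]

print(f"{len(predictor) - len(misses)} of {len(predictor)} correct")
print(misses)
```

On this table the rules get 8 of 9 rows right; the one miss is register 5 (B = 27, recorded as Type1 but falling inside the Type2 interval).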

However, with a real case, I tend to get an error: "Decision Tree (12): Decision Tree: Error in apply(prob, 1, max) : dim(X) must have a positive length". Other times I get no error but no rules either: every entry gets the same classification.

I had a look, and it may be that the variable does not provide enough information to grow the tree. The rpart package caps the depth that the tree grows to by setting default limits.
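For context, the defaults in rpart's `rpart.control` include `minsplit = 20` (a node needs at least 20 observations before a split is attempted), `cp = 0.01` and `maxdepth = 30`. The sketch below is not rpart and not what the Decision Tree tool runs; it is a hypothetical plain-Python illustration of how a minimum-node-size limit alone can leave a small data set with no split at all. `gini`, `best_split` and the `min_split` parameter are illustrative names.

```python
from collections import Counter

# Example data copied from the table in the post.
target = ["Type1", "Type1", "Type2", "Type2", "Type1",
          "Type2", "Type1", "Type1", "Type1"]
predictor = [12, 15, 21, 26, 27, 30, 67, 78, 98]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels, min_split=20):
    """Return (cut_point, weighted_gini) of the best single split, or None.

    min_split mimics the idea behind rpart's minsplit: a node with fewer
    observations than this is never split (assumption: illustration only).
    """
    n = len(values)
    if n < min_split:
        return None
    pairs = sorted(zip(values, labels))
    parent = gini(labels)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < parent and (best is None or weighted < best[1]):
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, weighted)
    return best


print(best_split(predictor, target))               # default limit: no split
print(best_split(predictor, target, min_split=2))  # relaxed limit
```

With the nine example rows, the default limit returns no split, while the relaxed limit finds a first cut between 30 and 67. If the real data set is small, or the tool passes similar limits through to rpart, that may be one reason every entry ends up with the same classification.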

How can I get around this without dealing with the R code inside Decision Tree?

Also… any idea about how to solve this problem, with Alteryx, without using the Decision Tree tool?

Thanks,

Javi

dsrajat · ‎04-29-2019

From what I understand, your approach has a problem. You clearly have two classes of output (Type1 and Type2), so it is more feasible to use logistic regression or Naive Bayes to solve this. A decision tree would need more data and more features to work.

Hope this helps.

dsrajat · ‎04-29-2019

Guess what! The Naive Bayes Classifier solved this problem. See the attached picture for the detailed answer.
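The attached picture is not reproduced here, so the following is only a rough sketch of the idea, not the workflow from the reply: a one-feature Gaussian Naive Bayes classifier written in plain Python on the example table from the question. The function names (`fit`, `log_score`, `predict`) and the use of population variance are assumptions made for illustration.

```python
import math
from statistics import mean, pvariance

# Example data copied from the table in the question.
target = ["Type1", "Type1", "Type2", "Type2", "Type1",
          "Type2", "Type1", "Type1", "Type1"]
predictor = [12, 15, 21, 26, 27, 30, 67, 78, 98]


def fit(values, labels):
    """Estimate prior, mean and variance of the predictor for each class."""
    model = {}
    for cls in sorted(set(labels)):
        xs = [float(v) for v, lab in zip(values, labels) if lab == cls]
        # Note: a class with a single row would give variance 0 and
        # break the log below; a real implementation needs smoothing.
        model[cls] = (len(xs) / len(values), mean(xs), pvariance(xs))
    return model


def log_score(x, prior, mu, var):
    """Log of prior times Gaussian likelihood for one class."""
    return (math.log(prior)
            - 0.5 * math.log(2 * math.pi * var)
            - (x - mu) ** 2 / (2 * var))


def predict(model, x):
    """Pick the class with the highest log score."""
    return max(model, key=lambda cls: log_score(x, *model[cls]))


model = fit(predictor, target)
for b, actual in zip(predictor, target):
    print(b, actual, predict(model, b))
```

Because each class gets its own variance, the boundary between the classes is quadratic in B, so the Type2 region can come out as an interval with Type1 on both sides, which is the shape of rule the question asks for.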

jcalvo92 · ‎04-29-2019

Great! That was what I was looking for

Thank you, @dsrajat

dsrajat · ‎04-29-2019

You are welcome.
