Hi, newbie here.
I am trying to solve this problem with Alteryx: we have two variables, a continuous predictor variable and a categorical target variable with only two values. We are looking for a simple rule, or set of rules, that splits the continuous predictor at the cut points that maximize its discriminatory power.
Example:
Variable\Register | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Target A | Type1 | Type1 | Type2 | Type2 | Type1 | Type2 | Type1 | Type1 | Type1 |
Predictor B | 12 | 15 | 21 | 26 | 27 | 30 | 67 | 78 | 98 |
We can use the Decision Tree tool and decompose the tree into rule-based models through the C5.0 algorithm, getting this solution
However, with a real case, I tend to get an Error: Decision Tree (12): Decision Tree: Error in apply(prob, 1, max) : dim(X) must have a positive length. Other times, I get no errors but no rules either: all entries get the same classification.
I had a look at it, and it may be that the variable does not provide enough information to grow the tree. The rpart package also limits how far the tree can grow through its default control settings.
How can I get around this without dealing with the R code inside Decision Tree?
Also… any idea about how to solve this problem, with Alteryx, without using the Decision Tree tool?
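For reference, here is a minimal sketch of the kind of cut-point search described above, written in plain Python rather than with the Decision Tree tool. It assumes the two columns are available as lists and that a scripting step can be used in the workflow. The function names (`gini`, `best_split`, `find_cuts`) and the choice of weighted Gini impurity as the split criterion are illustrative assumptions, not what the Decision Tree tool, rpart, or C5.0 do internally.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best single cut, or None."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut is possible between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = ((lo + hi) / 2, score)  # cut at the midpoint
    return best

def find_cuts(values, labels, depth=2):
    """Recursively collect cut points, up to `depth` levels of splitting."""
    if depth == 0 or len(set(labels)) < 2:
        return []
    split = best_split(values, labels)
    if split is None:
        return []
    threshold = split[0]
    cuts = [threshold]
    for keep_left in (True, False):
        part = [(v, l) for v, l in zip(values, labels)
                if (v < threshold) == keep_left]
        cuts += find_cuts([v for v, _ in part], [l for _, l in part], depth - 1)
    return sorted(cuts)

# Example data from the table above
predictor_b = [12, 15, 21, 26, 27, 30, 67, 78, 98]
target_a = ["Type1", "Type1", "Type2", "Type2", "Type1",
            "Type2", "Type1", "Type1", "Type1"]

print(find_cuts(predictor_b, target_a))  # [18.0, 48.5]
```

On the example this gives two cut points, which reads as: B below 18 is Type1, B between 18 and 48.5 is mostly Type2, and B above 48.5 is Type1. Unlike rpart, this sketch has no minimum node size, so on real data it will happily split on very few records; a minimum-size check would be a sensible addition.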
Thanks,
Javi
As I understand it, your approach has a problem. Your target clearly has two classes (Type1 and Type2), so logistic regression or Naive Bayes is a better fit for this problem. A decision tree would need more data and more features to work.
Hope this helps.
You are welcome.