Categorizing numeric predictors
Hi, newbie here.
I am trying to solve this problem with Alteryx: we have two variables, a continuous predictor and a categorical target with only two values. We are looking for a simple rule, or set of rules, that bins the continuous predictor at the cut points that maximize its discriminatory power.
Example:
| Variable \ Register | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Target A | Type1 | Type1 | Type2 | Type2 | Type1 | Type2 | Type1 | Type1 | Type1 |
| Predictor B | 12 | 15 | 21 | 26 | 27 | 30 | 67 | 78 | 98 |
We can use the Decision Tree tool and decompose the tree into rule-based models with the C5.0 algorithm, which yields this solution:
- If B <= 15 then Type1
- If B > 15 and B <= 30 then Type2
- If B > 30 then Type1
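Outside Alteryx, the kind of two-cut rule above can be sketched in plain Python as a brute-force search over candidate thresholds. This is only an illustration on the toy data from the table, not the C5.0 procedure itself; the cut points it finds may differ from C5.0's, but they partition the sample the same way:

```python
# Brute-force search for two cut points on B that best separate the classes.
# Plain-Python illustration of the rule shape "Type1 / Type2 / Type1";
# the data below is the toy example from the post, not real data.
from itertools import combinations

B      = [12, 15, 21, 26, 27, 30, 67, 78, 98]
target = ["Type1", "Type1", "Type2", "Type2", "Type1",
          "Type2", "Type1", "Type1", "Type1"]

def accuracy(lo, hi):
    """Score the rule: B <= lo -> Type1, lo < B <= hi -> Type2, B > hi -> Type1."""
    pred = ["Type1" if b <= lo else "Type2" if b <= hi else "Type1" for b in B]
    return sum(p == t for p, t in zip(pred, target)) / len(B)

# Candidate cuts are midpoints between consecutive values of B.
vals = sorted(B)
cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])]

# Try every pair of cuts and keep the most accurate rule.
best = max(combinations(cuts, 2), key=lambda c: accuracy(*c))
print("best cuts:", best, "accuracy:", accuracy(*best))
```

On this toy sample no pair of cuts can be perfect (register 5, B = 27, sits inside the Type2 band), so the best rule, like the C5.0 one, misclassifies exactly one record.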
However, on a real case I tend to get an error: `Decision Tree (12): Decision Tree: Error in apply(prob, 1, max) : dim(X) must have a positive length`. Other times I get no error but no rules either: every record receives the same classification.
From what I have read, it may be that the variable does not provide enough information to grow the tree: the rpart package limits tree growth through its default controls (for example, the complexity parameter cp = 0.01 and minsplit = 20), so splits that do not improve the fit enough are never made.
How can I get around this without editing the R code inside the Decision Tree tool?
Also… any idea about how to solve this problem, with Alteryx, without using the Decision Tree tool?
Thanks,
Javi
From what I understand, there is a problem with your approach. You clearly have two output classes (Type1 and Type2), so it is more feasible to use logistic regression or Naive Bayes here; a decision tree would need more data and more features to work well.
Hope this helps.
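To make the Naive Bayes suggestion concrete, here is a minimal hand-rolled Gaussian Naive Bayes in plain Python (not the Alteryx Naive Bayes tool, and using the toy data from the question). With one numeric predictor it amounts to fitting a mean and variance per class and picking the class with the highest prior times likelihood:

```python
# Hand-rolled Gaussian Naive Bayes for one numeric predictor and two classes.
# Plain-Python sketch of the suggestion above; data is the toy example.
import math
from collections import defaultdict

B      = [12, 15, 21, 26, 27, 30, 67, 78, 98]
target = ["Type1", "Type1", "Type2", "Type2", "Type1",
          "Type2", "Type1", "Type1", "Type1"]

# Group the predictor values by class.
groups = defaultdict(list)
for b, t in zip(B, target):
    groups[t].append(b)

# Per-class mean, variance (population), and prior probability.
params = {}
for cls, vals in groups.items():
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    params[cls] = (mean, var, len(vals) / len(B))

def predict(b):
    """Pick the class with the highest prior * Gaussian likelihood."""
    def score(cls):
        mean, var, prior = params[cls]
        density = math.exp(-(b - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        return prior * density
    return max(params, key=score)

print([predict(b) for b in B])
```

Because the Type2 values cluster tightly around the mid-20s while Type1 is spread out, the Gaussian model naturally carves out the same middle band the decision-tree rules did.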
Guess what! The Naive Bayes Classifier solved this problem. See the attached picture for the detailed answer.
You're welcome!
