It is possible to specify splits in the Decision Tree building process in Alteryx by essentially using the Decision Tree tools to create a Decision Tree "by hand".
The steps to do this are as follows:
1. Train a tree with "Age" as your only predictor variable. Set "The maximum allowed depth of any node in the final tree" to 2 (1 would be better, but 2 is the lowest the tool will allow), and set "The minimum number of records needed to allow for a split" to the number of records in your dataset (this is another way to ensure the decision tree creates only one split). Both of these options are in the HyperParameters drop-down, in the Model tab of the Customize window.
2. Using the Report output of the Decision Tree tool, identify the split threshold(s) that define the leaves.
3. Use a filter tool, splitting the data based on "Age", matching the split threshold of the Decision Tree Report.
4. Create "subtrees" for your left and right branches with Decision Tree Tools.
This process will allow you to specify splits in the Decision Tree building process. You can repeat these steps downstream for each split if you would like to.
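To make the logic of steps 1 to 3 concrete outside of Alteryx, here is a rough sketch in plain Python. It is not what the Decision Tree tool runs internally; it just finds the single best "Age" threshold by weighted Gini impurity and then splits the records the way a Filter tool would. The toy records and the "Bought" target column are made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(rows, feature, target):
    """Return the threshold on `feature` that minimises weighted Gini impurity."""
    values = sorted({r[feature] for r in rows})
    best, best_score = None, float("inf")
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r[target] for r in rows if r[feature] < t]
        right = [r[target] for r in rows if r[feature] >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best, best_score = t, score
    return best

def split_rows(rows, feature, threshold):
    """Mimic a filter: rows below the threshold go left, the rest go right."""
    left = [r for r in rows if r[feature] < threshold]
    right = [r for r in rows if r[feature] >= threshold]
    return left, right

# Hypothetical toy data; "Bought" is an assumed target column.
rows = [
    {"Age": 22, "Bought": "no"},
    {"Age": 25, "Bought": "no"},
    {"Age": 31, "Bought": "no"},
    {"Age": 38, "Bought": "yes"},
    {"Age": 45, "Bought": "yes"},
    {"Age": 52, "Bought": "yes"},
]
threshold = best_threshold(rows, "Age", "Bought")
left, right = split_rows(rows, "Age", threshold)
```

Each of `left` and `right` would then be the training data for its own subtree, which corresponds to step 4.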
Another option might be to create bins (e.g., 0-20, 20-40, 40-60 etc.) for your age data, and subset the data for each of the age bins, then train a separate decision tree on each of these segments.
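As a small illustration of the binning idea, the sketch below assigns each record to a 20-year age bin and groups the records by bin. One assumption to flag: the bins are treated as half-open, so an age of exactly 20 lands in "20-40" rather than "0-20". The helper names are just for this example.

```python
from collections import defaultdict

def age_bin(age, width=20):
    """Label the half-open bin [lo, lo + width) that contains `age`."""
    lo = int(age // width) * width
    return f"{lo}-{lo + width}"

def segment_by_bin(rows, width=20):
    """Group records by age bin; each group would get its own tree."""
    segments = defaultdict(list)
    for r in rows:
        segments[age_bin(r["Age"], width)].append(r)
    return dict(segments)

# Hypothetical ages, chosen to include values on bin edges.
rows = [{"Age": a} for a in (5, 19, 20, 39, 40, 61)]
segments = segment_by_bin(rows)
```

Each entry in `segments` is the subset you would feed to a separate Decision Tree tool.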
Only you know your data and your use case, but I want to mention that when a decision tree is built with all of your predictor variables, at every iteration the Decision Tree tool chooses the best variable for splitting (based on either the Gini coefficient or the Information Index, depending on how the tool is configured). This means that if your data includes a predictor variable that separates the classes better than Age does, that variable will be chosen first by the Decision Tree tool.
Does this all make sense to you? There is a Stack Overflow post that discusses this process in R if you are interested in seeing additional information.
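Here is a toy illustration of that point, again in plain Python rather than anything the tool itself runs. It scores the best single split for each predictor by weighted Gini impurity and picks the lowest. The "Income" column and the data are invented so that Income separates the classes perfectly while Age does not.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split_score(rows, feature, target):
    """Lowest weighted Gini impurity achievable with one threshold on `feature`."""
    values = sorted({r[feature] for r in rows})
    best_score = float("inf")
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r[target] for r in rows if r[feature] < t]
        right = [r[target] for r in rows if r[feature] >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        best_score = min(best_score, score)
    return best_score

def choose_root(rows, features, target):
    """Pick the feature whose best split leaves the least impurity."""
    scores = {f: best_split_score(rows, f, target) for f in features}
    return min(scores, key=scores.get), scores

# Hypothetical data: Income separates the classes cleanly, Age does not.
rows = [
    {"Age": 22, "Income": 30, "Bought": "no"},
    {"Age": 48, "Income": 35, "Bought": "no"},
    {"Age": 31, "Income": 40, "Bought": "no"},
    {"Age": 27, "Income": 70, "Bought": "yes"},
    {"Age": 55, "Income": 80, "Bought": "yes"},
    {"Age": 40, "Income": 90, "Bought": "yes"},
]
root, scores = choose_root(rows, ["Age", "Income"], "Bought")
```

With this data the first split lands on Income, which is why forcing an Age split needs the by-hand approach.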
Please let me know what you think, or if you have further questions!
Off the top of my head, I think there is a way to have Alteryx pass the parameters of a Decision Tree to other tools, but it would require custom R code. The outputs of the Decision Tree Tool are a serialized R model object, a Report, and an Interactive Report. You could potentially extract the information from the R model object, but this would take an R Tool and code that would unserialize the model object and then extract and output the parameters you are interested in as a data frame.
If you had that piece working, you could create a Dynamic Filter Macro with the extracted decision tree parameters as one input and your data as your standard input.
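To show what "dynamic" means here, the sketch below filters records using a split that arrives as data instead of being typed into the filter expression. The record layout (`variable`, `operator`, `threshold`) is purely hypothetical; whatever the R code outputs would need to be mapped onto something like it.

```python
import operator

# Comparison operators the hypothetical split record may name.
OPS = {"<": operator.lt, ">=": operator.ge}

def dynamic_filter(rows, split):
    """Return (true_rows, false_rows) for one split record."""
    test = OPS[split["operator"]]
    true_rows, false_rows = [], []
    for r in rows:
        if test(r[split["variable"]], split["threshold"]):
            true_rows.append(r)
        else:
            false_rows.append(r)
    return true_rows, false_rows

# Hypothetical extracted split and a few records to filter.
split = {"variable": "Age", "operator": "<", "threshold": 34.5}
rows = [{"Age": 22}, {"Age": 41}, {"Age": 34.5}]
true_rows, false_rows = dynamic_filter(rows, split)
```

The two returned lists play the role of the True and False outputs of a Filter tool.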
Does that make sense? I haven't tested this yet, but I think it would be the best strategy.