Hi
I'm starting to work with decision trees, and to begin I looked into the decision tree workflow already provided in the Alteryx workflow samples (Sample Workflow -> Predictive Analytics -> 6 Decision Tree).
After running the workflow and looking at the Browse tool on the I output, I ran into an issue understanding how the Precision and Recall values (under the Summary tab) are calculated.
Combining the numbers from the Misclassifications tab with the definitions of Precision and Recall (per Wikipedia), I would expect:
Precision = True Positive / (True Positive + False Positive) = 123 / (123+26) = 0.8255 = 82.55% (different from the 75.7% in the Summary tab)
Recall = True Positive / (True Positive + False Negative) = 123 / (123+70) = 0.6373 = 63.73% (different from the 53.6% in the Summary tab)
Is there something I'm misunderstanding in these calculations? Perhaps some correction factor?
Regards Tue
Weirdly, this decision tree treats the "No" outcome as the positive class. I guess that can be true in this world.
|            | Predicted No | Predicted Yes | Sum |
| Actual No  | 81           | 70            | 151 |
| Actual Yes | 26           | 123           | 149 |
| Sum        | 107          | 193           | 300 |
If you read the confusion matrix above with "No" as the positive class, the calculations become:
Precision = 81 / (81+26) = 75.7%
Recall = 81 / (81+70) = 53.6%
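The two sets of numbers in this thread come from the same counts, just with a different choice of positive class. A minimal sketch (plain Python, not the Alteryx/R tool itself) that reproduces both:

```python
def precision_recall(tp, fp, fn):
    """Standard definitions: precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# "Yes" as the positive class, using the counts from the original post:
print(precision_recall(tp=123, fp=26, fn=70))  # -> (0.8255..., 0.6373...)

# "No" as the positive class, as the Summary tab apparently does:
print(precision_recall(tp=81, fp=26, fn=70))   # -> (0.7570..., 0.5364...)
```

Running this shows 82.55%/63.73% for one labeling and 75.7%/53.6% for the other, which is exactly the discrepancy the original post describes.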
I guess the real question would be: "how can I flag a condition as a true positive in a CART model so that metrics are calculated properly?" That I do not know.
Ahh, guess that makes sense... somehow.
Thanks for the clarification.
Just a follow-up question to this thread: does anyone know how to specify which outcome/class is treated as the "positive" one in the Decision Tree macro, so that the report reads correctly? Thanks.
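I don't know of a setting for this inside the Alteryx Decision Tree tool itself, but for comparison, in Python's scikit-learn the positive class is chosen explicitly via the `pos_label` argument to the metric functions (the data below is a made-up toy example):

```python
from sklearn.metrics import precision_score, recall_score

# Toy labels for illustration only
y_true = ["No", "Yes", "Yes", "No", "Yes"]
y_pred = ["No", "Yes", "No", "No", "Yes"]

# pos_label picks which class counts as "positive" for the metric
print(precision_score(y_true, y_pred, pos_label="Yes"))  # treats "Yes" as positive
print(precision_score(y_true, y_pred, pos_label="No"))   # treats "No" as positive
```

Swapping `pos_label` changes the reported precision/recall in exactly the way seen in this thread, which is why the choice of positive class matters for reading the report.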