Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Decision Tree Interactive Report

Alteryx Partner

Hi All,

 

I am having trouble reconciling two pieces of information in the interactive report for the decision tree.

 

The Summary page (attached) shows recall = 83.9% and precision = 72.2%.

 

Recall = TP / (TP + FN), and

Precision = TP / (TP + FP).

 

What I can't work out is how that reconciles back to the data provided in the confusion matrix as below (and attached).

 

                     Actual Positive   Actual Negative
Predicted Positive   198 (52.4%)       180 (47.6%)
Predicted Negative   90 (16.1%)        468 (83.9%)

 

Under these headings, Recall would be 198 / (198 + 90) = 68.75% and Precision would be 198 / (198 + 180) = 52.4%.
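As a quick sanity check, here is that calculation as a small Python sketch, reading the cells with the headings exactly as printed (rows = predicted, columns = actual); the variable names are mine, not Alteryx's:

# Cells read with the printed headings: rows = predicted, columns = actual
tp = 198   # predicted positive, actual positive
fp = 180   # predicted positive, actual negative
fn = 90    # predicted negative, actual positive
tn = 468   # predicted negative, actual negative

recall = tp / (tp + fn)      # 198 / 288 = 0.6875  -> 68.75%
precision = tp / (tp + fp)   # 198 / 378 = 0.5238  -> 52.4%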

 

To get the results listed in the summary (recall = 83.9% and precision = 72.2%), the headings would need to change to:

 

                  Predicted Negative   Predicted Positive
Actual Negative   198 (52.4%)          180 (47.6%)
Actual Positive   90 (16.1%)           468 (83.9%)

which would give Recall = 468 / (468 + 90) = 83.9% and Precision = 468 / (468 + 180) = 72.2%.
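The same sketch with the cells reinterpreted that way (rows = actual, columns = predicted) reproduces the summary figures; again, this only illustrates the arithmetic, not what the tool does internally:

# Cells read with rows = actual, columns = predicted
tn = 198   # actual negative, predicted negative
fp = 180   # actual negative, predicted positive
fn = 90    # actual positive, predicted negative
tp = 468   # actual positive, predicted positive

recall = tp / (tp + fn)      # 468 / 558 = 0.8387  -> 83.9%
precision = tp / (tp + fp)   # 468 / 648 = 0.7222  -> 72.2%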

 

Am I misunderstanding something here, or are the labels on the cells the wrong way around? Any help greatly appreciated!

 

 

Alteryx Partner

I'm also seeing the same problem. I use the Model Comparison tool to evaluate performance (a tip from another Alteryx Community user; gotta give credit where it's due). You can download it from the Gallery.

 

One thing to note is that you will need to right-click on your predictive macro (this applies to Linear Regression, Decision Tree, and Logistic Regression) and choose version 1.0 in order to take advantage of the Model Comparison tool. You don't need to do this with Boosted (I haven't tried any others yet).

 

But the differences between the Decision Tree interactive output and the Model Comparison results are striking. In my case, the interactive output showed 81% accuracy while the Model Comparison showed 60%.

 

Hope this helps.

Alteryx Partner

Thanks for the suggestion, I'll give it a whirl! It's pretty concerning that there would be such a large difference when calculating the same thing, or that things are apparently labelled incorrectly.

 

Thanks again!

Meteor

DWIL, 

 

You aren't missing anything; the calculations they are displaying on their summary are completely incorrect, except for Accuracy.

 

They are posting the True Negative % as the Recall, and what appears to be the Specificity as the Precision. Their F-Measure is also incorrect due to them getting both the Recall and Precision wrong. I suggest only looking at the confusion matrix (Misclassifications section) and doing the calculations manually until they address this issue. 
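To illustrate how far that propagates (assuming the standard F1 definition; the numbers come from the matrix posted above, not from Alteryx's backend):

def f_measure(precision, recall):
    # F1: harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# With recall/precision computed from the matrix as printed
print(f_measure(198 / 378, 198 / 288))   # ~0.59

# With the summary's (apparently mislabelled) recall/precision
print(f_measure(468 / 648, 468 / 558))   # ~0.78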

Alteryx Partner

Thank you for your reply, it's really appreciated!

Here's hoping that someone from Alteryx is aware of and fixing the issue, because it's not a minor one (although it should be relatively easy to fix, you'd think).

Thanks again.

Community Content Engineer

Hi @mash - we are researching this issue. Thank you for letting us know.

Alteryx Partner

Hi @CristonS, does that research also extend to the issue I raised (regarding the apparent differences between the Decision Tree interactive report and the confusion matrix)?

Meteor

After some more testing: the Model Comparison tool automatically sets the scoring threshold at 0.5 (for binary classification). Since you didn't specify what you were running, this could have something to do with the accuracy difference you are seeing.
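For anyone following along, the threshold just decides where a scored probability becomes a predicted class before the confusion matrix is built. A generic sketch with made-up scores, not the tool's internals:

scores  = [0.91, 0.42, 0.67, 0.08, 0.55]   # hypothetical P(positive) per record
actuals = [1, 0, 1, 0, 0]

threshold = 0.5   # the Model Comparison default for binary classification
predicted = [1 if s >= threshold else 0 for s in scores]

# Changing the threshold changes the predicted classes, and with them the
# confusion matrix and every metric derived from it (accuracy, recall, etc.).
accuracy = sum(p == a for p, a in zip(predicted, actuals)) / len(actuals)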

Alteryx Partner

I used it for binary classification with the default of 0.5...

When I built the confusion matrix from scratch using the Formula tool and compared it back to the Model Comparison, it was off... but this was also a few weeks back. I want to say that in the past few days the accuracy may have improved. Not sure if an update was made to the macro recently.

Meteor

@mash

 

There is no setting for the threshold in the Model Comparison tool; it just defaults to 0.5.

 

The 'Accuracy_T' and 'Accuracy_F' are just basic percentages of how many of the actual T's and F's you predicted correctly.

The Accuracy itself was correct, but the F-measure and AUC are both incorrect as well. Not sure how the calculation was made on the backend though :/ 
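If Accuracy_T and Accuracy_F are the per-class hit rates described above, they would fall out of the confusion matrix like this (a sketch based on that description, with the matrix read as rows = actual, columns = predicted; not the macro's actual code):

tp, fn = 468, 90    # actual positives: predicted correctly / incorrectly
tn, fp = 198, 180   # actual negatives: predicted correctly / incorrectly

accuracy_t = tp / (tp + fn)                   # 83.9% of actual T's predicted right
accuracy_f = tn / (tn + fp)                   # 52.4% of actual F's predicted right
accuracy   = (tp + tn) / (tp + tn + fp + fn)  # 71.2% overall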

 
