Alteryx Designer Desktop Discussions

vujennyfer · ‎05-25-2021

I am trying to use the Naive Bayes Classifier tool to determine whether a review is "negative" or "positive". I am able to produce probability results for the training dataset, but not for the test set (unseen data). I keep getting the errors below. Attached are the workflow and the dummy data. The Excel sheet has two tabs, one for the training set and one for the test set. Any help is appreciated!!

apathetichell · ‎05-25-2021

Doesn't the Alteryx Naive Bayes tool require at least two predictors? I know your version above seems to work, but this could be what is triggering the error.

[Attached screenshot: 2021-05-25 (4).png]

More evidence - from the tool sample workflow...

The Naive Bayes Classifier tool assumes that all predictors are independent of one another and predicts, based on a sample input, a probability distribution over a set of classes.

To configure the tool, first name the model. Next, select the target and at least 2 predictors. Choose a positive value as a smoothing parameter. The default value is 0. This feature allows you to smooth the data by accounting for class/feature combinations that might either be entirely absent from the training set, or are otherwise under-represented in frequency.
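
To make the smoothing parameter concrete, here is a minimal Python sketch of Laplace (additive) smoothing for one categorical predictor within one class. It is not the Alteryx tool's implementation; the function name `smoothed_likelihoods` and the rating values are made up for illustration. With `alpha=0`, a value never seen in the class gets probability 0; with a positive `alpha`, it gets a small non-zero probability.

```python
from collections import Counter

def smoothed_likelihoods(values, alpha=1.0, categories=None):
    """Estimate P(value | class) from the values seen within one class.

    alpha is the Laplace (additive) smoothing parameter; alpha=0 gives
    the raw relative frequencies.
    """
    counts = Counter(values)
    cats = sorted(categories if categories is not None else counts)
    # Each category receives alpha pseudo-counts on top of its real count.
    total = len(values) + alpha * len(cats)
    return {c: (counts[c] + alpha) / total for c in cats}

# Hypothetical ratings seen among "negative" reviews; rating 3 never occurs.
negative_ratings = [1, 1, 2, 1]
unsmoothed = smoothed_likelihoods(negative_ratings, alpha=0, categories=[1, 2, 3])
smoothed = smoothed_likelihoods(negative_ratings, alpha=1, categories=[1, 2, 3])
```

Because Naive Bayes multiplies the per-predictor likelihoods, a single zero in `unsmoothed` would wipe out the whole class score for any record with that value, which is the situation the smoothing parameter is meant to avoid.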

vujennyfer · ‎05-25-2021

Hi @apathetichell, thanks for your reply. I added another predictor (ratings) to have two total predictors but I am still seeing those errors. I also changed the Laplace smoothing to '1' as a positive value.

apathetichell · ‎05-25-2021

I made a few changes and this seems to work.

I think the test data and the training data need to have the same fields, so I added the "positive/negative" target to the test data as a dummy field. I also made a few other small changes here and there...
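
Outside Alteryx, the same idea (give the test set the same fields as the training set, with a dummy target) can be sketched in plain Python. This is only an illustration of the schema alignment, not what the workflow does internally; the field names and the `"unknown"` placeholder are assumptions, and the thread does not establish which dummy values the tool accepts.

```python
def align_to_training_fields(test_rows, training_fields, placeholder=""):
    """Return test rows with exactly the training fields, in the same order.

    Fields missing from a test row (such as the unknown target) are filled
    with a placeholder; extra fields are dropped.
    """
    return [
        {field: row.get(field, placeholder) for field in training_fields}
        for row in test_rows
    ]

# Hypothetical field names; the actual workbook columns may differ.
training_fields = ["review", "rating", "sentiment"]
test_rows = [{"review": "Arrived broken", "rating": 1}]
aligned = align_to_training_fields(test_rows, training_fields, placeholder="unknown")
```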

I'm not ecstatic with this version so if no one has given you a better answer (as to why this works and some changes don't) - I'll try to get back to this in a bit.

vujennyfer · ‎05-25-2021

THANK YOU SOO MUCH @apathetichell!! This worked perfectly. You're the BEST! 🙂

vujennyfer · ‎06-18-2021

Hi @apathetichell, I'm returning to this problem and wondering if there is a way (maybe not Naive Bayes but another tool) that could help with text classification in Alteryx. For this solution to work, we needed at least 2 fields (review & rating). Is it possible to only utilize one field (review) to truly get the model to learn text classification instead of using the numerical rating?

apathetichell · ‎06-18-2021

Hi! I don't know the text mining tools in Alteryx (they cost extra), but perhaps someone else can talk to you about them. I know R has various NLP, stemming, and document-mapping functions that you could add via the R tool (for free), and so does Python (also free). Python is Python, so if you start a new thread titled "implementing text classification in python", someone will probably respond.
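
As a rough idea of what review-only text classification could look like in Python, here is a minimal bag-of-words multinomial Naive Bayes sketch using only the standard library. It is a sketch under assumptions, not a tested Alteryx Python tool workflow: the function names, the tokenizer, and the four training reviews are all made up for illustration, and a real project would more likely use an established library.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(reviews, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on (review, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter(labels)       # label -> number of reviews
    vocab = set()
    for text, label in zip(reviews, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts,
            "vocab": vocab, "alpha": alpha}

def predict_proba(model, text):
    """Return P(label | text) for each label, using Laplace smoothing."""
    alpha, vocab = model["alpha"], model["vocab"]
    n_docs = sum(model["label_counts"].values())
    log_scores = {}
    for label, n_label in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + alpha * len(vocab)
        score = math.log(n_label / n_docs)  # class prior
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are skipped
                score += math.log((counts[token] + alpha) / denom)
        log_scores[label] = score
    # Normalise in log space to avoid underflow on long reviews.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(exp_scores.values())
    return {k: v / total for k, v in exp_scores.items()}

# Made-up training reviews, for illustration only.
reviews = ["great product love it", "terrible waste of money",
           "love the quality", "broke quickly terrible"]
labels = ["positive", "negative", "positive", "negative"]
model = train(reviews, labels)
probs = predict_proba(model, "love it, great quality")
```

Here each word in the review acts as its own predictor, so the model uses only the review text and no numerical rating.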

vujennyfer · ‎06-18-2021

Thanks for your input @apathetichell! I will try those suggestions 🙂


Predictive Analytics-Naive Bayes-Error
