Predictive Analytics-Naive Bayes-Error
I am trying to use the Naive Bayes Classifier tool to determine whether a review is "negative" or "positive". I can produce probability results for the training dataset, but not for the test set (unseen data). I keep getting the errors below. Attached are the workflow and the dummy data. The Excel sheet has two tabs, one for the training data and one for the test data. Any help is appreciated!!
Doesn't the Alteryx Naive Bayes tool require two predictors? I know your version above seems to work, but this could be triggering the error.
More evidence - from the tool sample workflow...
The Naive Bayes Classifier tool assumes that all predictors are independent of one another and predicts, based on a sample input, a probability distribution over a set of classes.
To configure the tool, first name the model. Next, select the target and at least 2 predictors. Choose a positive value as a smoothing parameter. The default value is 0. This feature allows you to smooth the data by accounting for class/feature combinations that might either be entirely absent from the training set, or are otherwise under-represented in frequency.
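To make the smoothing parameter concrete, here is a minimal Python sketch of additive (Laplace) smoothing for a single categorical predictor. It is not the Alteryx tool's implementation; the function name `smoothed_likelihood` and the toy ratings are made up for illustration.

```python
from collections import Counter

def smoothed_likelihood(values, value, n_categories, alpha=1.0):
    """Estimate P(value | class) with additive smoothing.

    values: feature values observed for one class in the training data.
    n_categories: number of distinct values the feature can take.
    alpha: smoothing parameter (0 means no smoothing).
    """
    counts = Counter(values)
    return (counts[value] + alpha) / (len(values) + alpha * n_categories)

# Hypothetical ratings seen for "negative" reviews; a rating of 5 never appears.
negative_ratings = [1, 1, 2, 1, 3]

unsmoothed = smoothed_likelihood(negative_ratings, 5, n_categories=5, alpha=0)
smoothed = smoothed_likelihood(negative_ratings, 5, n_categories=5, alpha=1)
print(unsmoothed)  # 0.0, which would zero out the whole class probability
print(smoothed)    # 0.1, i.e. (0 + 1) / (5 + 1 * 5)
```

With a smoothing value of 0, any class/feature combination missing from the training set gets probability zero, which is the case the description above says smoothing is meant to handle.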
Hi @apathetichell, thanks for your reply. I added another predictor (ratings) to have two total predictors but I am still seeing those errors. I also changed the Laplace smoothing to '1' as a positive value.
I made a few changes and this seems to work.
I think the test data and the training data need to have the same fields, so I set the "positive/negative" field to a dummy field. I made a few other small changes here and there...
I'm not ecstatic with this version, so if no one has given you a better answer (as to why this works and some changes don't), I'll try to get back to this in a bit.
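As a rough illustration of the "same fields" idea outside Alteryx, the Python sketch below gives each test row the same fields as the training data by filling the missing target with a placeholder. The field names (`review`, `rating`, `sentiment`), the helper `align_to_training_schema`, and the placeholder value are all assumptions; the thread does not confirm which placeholder value the tool accepts.

```python
def align_to_training_schema(test_rows, training_fields, target_field,
                             placeholder="unknown"):
    """Return test rows with exactly the training fields, in the same order."""
    aligned = []
    for row in test_rows:
        new_row = {}
        for field in training_fields:
            if field == target_field:
                # Unseen data has no label yet, so fill in a dummy value.
                new_row[field] = row.get(field, placeholder)
            else:
                # Predictors must be present; a missing one raises KeyError.
                new_row[field] = row[field]
        aligned.append(new_row)
    return aligned

# Hypothetical field names and rows, for illustration only.
training_fields = ["review", "rating", "sentiment"]
test_rows = [
    {"review": "Terrible service", "rating": 1},
    {"review": "Loved it", "rating": 5},
]

aligned_rows = align_to_training_schema(test_rows, training_fields,
                                        target_field="sentiment")
print(aligned_rows[0])
# {'review': 'Terrible service', 'rating': 1, 'sentiment': 'unknown'}
```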
THANK YOU SOO MUCH @apathetichell!! This worked perfectly. You're the BEST! 🙂
Hi @apathetichell, I'm returning to this problem and wondering if there is a way (maybe not Naive Bayes but another tool) that could help with text classification in Alteryx. For this solution to work, we needed at least 2 fields (review & rating). Is it possible to only utilize one field (review) to truly get the model to learn text classification instead of using the numerical rating?
Hi! I don't know about the text mining tools in Alteryx (they cost extra), but perhaps someone else can tell you about them. I know R has various NLP, stemming, and document-mapping functions, which you could add via the R tool (for free), as does Python (also free). Python is widely used, so if you start a new thread titled "implementing text classification in Python", someone will probably respond.
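For a sense of what classifying on the review text alone could look like, here is a minimal bag-of-words Naive Bayes sketch in plain Python (standard library only). It is not an Alteryx feature or the method used in the workflow above; the function names and the four training reviews are invented for illustration, and a real project would more likely use an established NLP library.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(samples, alpha=1.0):
    """samples: list of (review_text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of reviews
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts,
            "vocab": vocab, "alpha": alpha}

def predict_proba(model, text):
    """Return {label: probability} for one review."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    total_docs = sum(model["label_counts"].values())
    log_scores = {}
    for label, doc_count in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / total_docs)  # class prior
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # skip words never seen in training
            # Laplace-smoothed word likelihood for this class.
            score += math.log((counts[token] + alpha)
                              / (total_words + alpha * vocab_size))
        log_scores[label] = score
    # Turn log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {lab: v / norm for lab, v in exp_scores.items()}

# Made-up training reviews; only the text is used as a predictor.
training = [
    ("great product, loved it", "positive"),
    ("excellent quality and great service", "positive"),
    ("terrible product, broke quickly", "negative"),
    ("awful service and terrible quality", "negative"),
]
model = train(training)
probs = predict_proba(model, "great quality, loved the service")
print(max(probs, key=probs.get))  # positive
```

Each word in the review acts as its own predictor here, which is how a single text field can drive the model without a numeric rating.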
Thanks for your input @apathetichell! I will try those suggestions 🙂
