Alteryx Designer Desktop Discussions

SOLVED

Help with Naive Bayes Model for FF

ivesbr
7 - Meteor

Hi:

 

I'm trying to build a predictive model in Alteryx for Fantasy Football.  Initially, I tried using a linear regression model.  But I then realized I needed to take a different approach. 

 

So now I've developed a workflow that feeds in 6 weeks' worth of fantasy football data to train the model and then provides an unseen data set to score as well.

 

I ran into an error at first saying that I had too many players.  So I added a filter to focus on one player to see how the model works. 

 

But now I'm getting an error on the Score tool saying that the execution was halted because ... predProb.naiveBayes(mod.obj,newdata = new.data). 

 

Anyone familiar with the error and how to resolve?  Thanks!  

ChrisHe
Alteryx Alumni (Retired)

 Hey @ivesbr  - could you also pass along the Excel file used in your workflow too? I'll take a look and investigate!

 

-Chris

LukeG
Alteryx Alumni (Retired)

Hey @ivesbr 

 

After briefly looking at your workflow, I notice two things that may be negatively impacting your model.

 

1. The metadata shows that all fields are string values. Most of the fields should be numeric.

 

2. Naive Bayes is a classifier that is meant to predict categorical variables. When predicting fantasy points (numeric), you would be better off with a linear regression, random forest, or any other model that is meant to predict continuous numeric variables.
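
To illustrate the first point, here is a minimal Python sketch (outside Alteryx, standard library only) of turning text fields back into numbers before modeling. The field names and values are hypothetical and are not taken from the attached workflow; inside Designer the same change would normally be made by adjusting the field types rather than with code.

```python
def to_numeric(rows, numeric_fields):
    """Return copies of the rows with the named fields converted from str to float."""
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the original text rows are left untouched
        for field in numeric_fields:
            new_row[field] = float(row[field])
        converted.append(new_row)
    return converted


# Hypothetical rows as they might look when every column is read in as text.
raw_rows = [
    {"Player": "Player A", "Week": "1", "Points": "18.4"},
    {"Player": "Player A", "Week": "2", "Points": "22.1"},
]

typed_rows = to_numeric(raw_rows, ["Week", "Points"])
print(typed_rows[0])
```

With Points stored as a number, it can serve as the target of a regression-style model instead of being treated as a set of text categories.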

 

Let me know if those changes get your model up and running

 

Luke

ivesbr
7 - Meteor

Hi @ChrisHe:

 

Sure thing ... here's the flat file.  Thanks!

 

All the best,

ivesbr
7 - Meteor

Hi @LukeG:

 

Thanks for the quick note back.  I actually posted a workflow to the Alteryx community last week that was based off a linear regression.  But I was advised to try a different model.  

 

So, I tried the Bayesian Model and to your point ... it runs off text (e.g. VString) as opposed to numerical values (e.g. Double).  As a result, I flipped all the values to VString so I could run the model.  

 

Not sure if this context helps, but that's the method to my current madness 🙂

 

All the best,

ChrisHe
Alteryx Alumni (Retired)

Hey @ivesbr ,

 

LukeG is correct that Naive Bayes is generally used in cases where the variable you're predicting is categorical. For example, you might use it as a way to detect if a particular financial transaction is fraudulent or not.

 

When looking to predict numeric values, a linear regression is a good bet, but in many cases I like to give the Boosted Model a try. I've attached a workflow that goes through this as a solution. I've also changed the workflow so that your variables are the right types (i.e., numerics are numerics). This is very important when using predictive models, as the tools will convert text variables to multiple yes/no variables. For example, if you include Team as a predictor, it will create 32 new columns in your underlying data, one for each team. When Carson Wentz enters your model, he'll have a 0 for 31 teams and a 1 for PHI.

 

Because of this, using categorical variables with many (more than 6-7) individual options is not normally very effective. I might use something like conference or division instead of team in a model. Take a look through my model and see what you think! I've chosen the fields I think make the most sense and run them through a Boosted Model. If it's helped, please mark this as a solution so others in the community can find it more easily.
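
The yes/no expansion of a text field described above can be sketched in plain Python. This is only an illustration of the general idea (often called one-hot or dummy encoding), not the code the Alteryx tools run internally; the `one_hot` helper and the short team list are made up for the example.

```python
def one_hot(values, prefix="Team"):
    """Expand a list of category labels into one 0/1 column per distinct label."""
    categories = sorted(set(values))
    encoded = [
        {f"{prefix}_{category}": int(value == category) for category in categories}
        for value in values
    ]
    return categories, encoded


# Hypothetical Team values for five player rows (only four distinct teams here;
# a full league would produce 32 columns).
teams = ["PHI", "DAL", "NYG", "PHI", "WAS"]
categories, encoded = one_hot(teams)

print(categories)
print(encoded[0])  # the row for a PHI player
```

Each row ends up with a single 1 and zeros everywhere else, which is why a field with many distinct values adds a lot of sparse columns to the model.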

 

-Chris

ivesbr
7 - Meteor

Hi @ChrisHe

 

Thanks for sending over the workflow with the boosted model.  It looks similar to my linear regression model from last week (attached).  

 

Was hoping I could ask you one last question since it seems we did something similar. Both your model and mine run the variables through and get a predicted fantasy score for weeks 1 through 6.

 

But what I would like to ultimately do is use the 6 weeks worth of data to forecast a week 7 projection before the games start.  Any thoughts on how to do that with a linear or boosted model?

 

Thanks again for your help!

 

All the best,

ChrisHe
Alteryx Alumni (Retired)

Hey @ivesbr - if you're looking to forecast into the future, then you're going to be better off using our Forecast tools like ARIMA or ETS forecasting. Most regression-style predictive tools aren't really about looking "into the future" timewise. Their goal is to take an instance (like a customer transaction) and predict an unknown variable (like how much they'd spend based on their demographics and other factors).
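
As a rough illustration of what "forecasting the next period" means, here is a small Python sketch of simple exponential smoothing, the most basic member of the ETS family. It is a stand-in written for this example, not the ARIMA or ETS implementation used by the Alteryx tools, and the weekly point totals and the `alpha` value are invented.

```python
def ses_forecast(series, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * observed + (1 - alpha) * level for each later point.
    The final level is the forecast for the next period.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for observed in series[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical fantasy points for one player in weeks 1 through 6.
weekly_points = [10.0, 20.0, 10.0, 20.0, 10.0, 20.0]
week7 = ses_forecast(weekly_points, alpha=0.5)
print(week7)
```

Unlike a regression, nothing here needs week 7's predictor values: the projection comes only from the player's own history, with recent weeks weighted more heavily.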

 

I've attached another workflow that uses our ARIMA forecasting to predict a single player into the future. The covariates section of the ARIMA model allows you to select other fields that should also be taken into account beyond seasonality and trend. You can play with those check boxes to see which fields help make the future prediction more useful.

 

If you'd like to run the model for all players at once I recommend using this TS Model Factory tool that you can find and download at gallery.alteryx.com.  Let me know if this helps you get to a solution!

 

-Chris

ivesbr
7 - Meteor

Awesome ... thank you, @ChrisHe!

 

I'm going to step through the workflow in a little more detail so I may follow up if that's OK?

 

Thanks again!

 

All the best,

ivesbr
7 - Meteor

Hi @ChrisHe:

 

Quick question about the "other options" section in the configuration panel of the ARIMA tool.  It looks like the model is taking the 6 weeks of data from the input file and generating an additional 6 weeks worth of forecasted values. 

 

I assumed that I could configure the number of forecasted periods by adjusting the value under "number of periods to include in the forecast plot." But that doesn't seem to be the case. Any chance you can confirm how one could go about adjusting the number of forecasted outputs? Thanks!

 

All the best, 
