Alteryx Machine Learning Discussions

Gadro · 10-28-2022

Hello everyone! I've recently trained a model and I'm attempting to have it score a portion of data held in reserve. Unfortunately, I'm running into a persistent error. I'm not sure what the source of the error could be. I have added two select tools to hard-force the data types between both the training set and what I'd like to score to make sure they agree. I also ensured none of the numeric variables are floats - they are all either strings or integers.

Does anyone have any idea on how to better debug the issue? The error message I'm getting says the error occurs in the predict tool, but doesn't tell me which column it's attempting to compare and failing to do so. Any ideas?

gyang3 · 10-30-2022

@Gadro without seeing your workflow, it'd be a bit difficult to figure out where in the process the error lies. Any chance you can provide a sample workflow? A few suggestions below:

You may have NaN values: one or more of your string/categorical fields may contain nulls (possibly introduced by forcing the data types with the Select tool). You could use a Formula tool to replace them, and then let the ML tool one-hot encode the categorical variables.
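To illustrate why a null in a string field can produce this kind of message, here is a minimal Python sketch. It assumes the null arrives as a float NaN and that the encoder sorts the category values internally; that is a plausible mechanism, not a confirmed description of what the Alteryx tool does. The `is_null` and `fill_nulls` helpers and the "Missing" placeholder are hypothetical names for illustration only.

```python
import math

def is_null(value):
    """True for None or a float NaN, two common forms a null can take."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def fill_nulls(values, placeholder="Missing"):
    """Replace nulls with a string placeholder so the column is all strings."""
    return [placeholder if is_null(v) else v for v in values]

# A categorical column where one value is a float NaN instead of a string.
column = ["red", "blue", float("nan"), "red"]

# Sorting mixed str/float values fails because '<' is undefined between them.
try:
    sorted(column)
    error = None
except TypeError as exc:
    error = str(exc)

# After filling the null, every value is a string and sorting works.
cleaned = fill_nulls(column)
categories = sorted(set(cleaned))
```

In a workflow, the equivalent step would be the Formula tool replacement mentioned above, applied to both the training data and the data being scored.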

Also make sure the values in each column are consistent with the data type you've set. For example, suppose Column1 is supposed to contain numeric values but was read in as a string because it contained alphanumeric characters, and you then forced the data type to float or double. Even after forcing the type, you may still need to clean the data so that no alphabetic characters remain.
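As a rough way to check a column like Column1 before forcing its type, the sketch below lists the values that cannot be parsed as numbers. It is plain Python for illustration (for example inside a Python tool), and `non_numeric_values` is a hypothetical helper, not part of any Alteryx API.

```python
def non_numeric_values(values):
    """Return (row_index, value) pairs that cannot be parsed as a number."""
    bad = []
    for i, v in enumerate(values):
        try:
            float(v)
        except (TypeError, ValueError):
            # TypeError covers None; ValueError covers text such as "N/A".
            bad.append((i, v))
    return bad

# Sample values as they might look when a numeric field is read in as a string.
column1 = ["12", "7.5", "N/A", "3", "4O", None]
bad_rows = non_numeric_values(column1)

for index, value in bad_rows:
    print(f"row {index}: {value!r} is not numeric")
```

Note that Python's `float` also accepts strings such as "nan" and "inf", so those would need a separate check if they can appear in the data.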

Hope those suggestions help!

shakir_juolay · 04-13-2024

@Gadro

I ran into the same issue. I had built the model with the Assisted Modeling tool. One of the categorical variables had no null values in the training set, so the tool did not define a null-handling strategy for it; the machine learning pipeline only one-hot encoded it. In the test set, however, the same variable did contain nulls, and that triggered the error you describe at the one-hot encoding step. I retrained the model without that variable, and when I used it to score the test set the issue was resolved.
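Since the error message does not name the offending column, one way to find it is to compare the two datasets directly. The sketch below flags columns that have no nulls in the training rows but at least one null in the rows being scored. It assumes the data has been exported as lists of dictionaries; `columns_with_new_nulls` and the sample column names are hypothetical.

```python
import math

def is_null(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def columns_with_new_nulls(train_rows, score_rows):
    """Columns with no nulls in training but at least one null in scoring."""
    flagged = []
    for col in train_rows[0].keys():
        train_has_null = any(is_null(row.get(col)) for row in train_rows)
        score_has_null = any(is_null(row.get(col)) for row in score_rows)
        if score_has_null and not train_has_null:
            flagged.append(col)
    return flagged

# Tiny made-up example: "region" is complete in training but not in scoring.
train = [{"region": "East", "units": 3}, {"region": "West", "units": 5}]
score = [{"region": None, "units": 4}, {"region": "East", "units": 2}]

flagged = columns_with_new_nulls(train, score)
print(flagged)
```

Any column this flags is a candidate for either filling the nulls before scoring or, as described above, dropping the variable and retraining.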

Scoring the model failed with error: '<' not supported between 'str' and 'float'