Alteryx Designer Desktop Discussions

SOLVED

scorer node

lixuanzhang
7 - Meteor

I used the Score tool to predict prices for two houses, but the predicted price for one of them is NULL. Can you take a look at the workflow and see what I did wrong?

Thank you. 

TheOC
15 - Aurora

Hi @lixuanzhang 
That's a great question. I have had a look at your workflow and found a solution.

First of all, thanks for attaching your full workflow, that was incredibly helpful for digging into your problem. I figured it may be useful for you to see my steps of investigation, in the case of any future issues with your models.

 

 

My first step was to open up your workflow and give it a run. As you mentioned, the Score tool predicts a value for one row and produces a null value for the other.

I then had a look at the data used to test the model, specifically the 'metadata' tab of the results window, to ensure the number of columns and the data types were the same:

TheOC_1-1669220808761.png

 


And to see if they match the training data (they do!):

TheOC_2-1669220823207.png

 

The next thing to investigate was the string-based data. Your string data is used as categorical variables, in your case a 'yes or no' (binary) value. The error you faced can appear when you score a record whose categorical value does not exist in your training dataset. Consider a model trained with 'on' and 'off' that is given a new record with the value 'dimmed'. The model was never trained on the value 'dimmed', so it would not be able to produce a result.
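For anyone who wants to run the same kind of check outside Designer, here is a minimal Python sketch of the idea. It is not how the Score tool works internally; it only compares the set of category values in the two datasets. The function name, the column name `light`, and the sample rows are all hypothetical and simply mirror the on/off/dimmed example.

```python
def unseen_levels(train_rows, test_rows, column):
    """Return values of `column` that occur in test_rows but never in train_rows."""
    train_levels = {row[column] for row in train_rows}
    test_levels = {row[column] for row in test_rows}
    return test_levels - train_levels


# Hypothetical data mirroring the on/off/dimmed example above.
train = [{"light": "on"}, {"light": "off"}, {"light": "on"}]
test = [{"light": "off"}, {"light": "dimmed"}]

print(unseen_levels(train, test, "light"))  # {'dimmed'}
```

An empty result means every category in the test data was also seen during training; anything else is a candidate cause for a null prediction.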

 

The quick way to check this is, again, to click between the two data inputs:

TheOC_3-1669221566173.png

(Testing)

TheOC_4-1669221606992.png
(Training)

One thing I noticed here is that in your testing data, 'YES' is capitalised, whereas in your training data it is 'Yes'. Scoring is case-sensitive, so the model does not recognise 'YES' as 'Yes'. If I hadn't spotted this, my next step would have been to use a Summarize tool to 'group by' each value, to see which categories are present in each string column.
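The 'group by' check can also be sketched in plain Python, as a rough stand-in for a Summarize tool that groups by a column and counts records. The column name `garage` and the rows below are hypothetical, since the real field names live in the attached workflow.

```python
from collections import Counter


def level_counts(rows, column):
    """Count how many rows carry each distinct value of `column`."""
    return Counter(row[column] for row in rows)


# Hypothetical rows; only the 'Yes' vs 'YES' mismatch comes from the thread.
train = [{"garage": "Yes"}, {"garage": "No"}, {"garage": "Yes"}]
test = [{"garage": "YES"}, {"garage": "No"}]

print(dict(level_counts(train, "garage")))  # {'Yes': 2, 'No': 1}
print(dict(level_counts(test, "garage")))   # {'YES': 1, 'No': 1}
```

Listing the two sets of counts side by side makes a case mismatch such as 'Yes' versus 'YES' easy to spot.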

 

So in this case, the fix for your issue is to make the two 'Yes' values match. There are multiple ways of doing this, for instance using a Data Cleansing tool on each stream to change all strings to upper case. To quickly test that this was the issue, I used a filter tool to replace 'YES' with 'Yes':

TheOC_5-1669221876280.png


And that worked!

TheOC_6-1669221912736.png



So I removed the filter and applied Data Cleansing tools to tidy up your data. This future-proofs the results in case a 'NO' (rather than 'No') appears in future data.
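As a rough illustration of what that clean-up achieves, the sketch below upper-cases the chosen string columns in both datasets so that 'Yes', 'YES' and 'yes' all become the same category. It is an assumption-laden stand-in, not the Data Cleansing tool itself: the function name, the `garage` and `price` columns, and the extra whitespace trimming are all hypothetical choices made for the example.

```python
def normalise_case(rows, columns):
    """Return copies of rows with the named string columns trimmed and upper-cased."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in columns:
            value = new_row.get(col)
            if isinstance(value, str):  # skip nulls and numeric fields
                new_row[col] = value.strip().upper()
        cleaned.append(new_row)
    return cleaned


# Hypothetical rows; apply the same cleaning to BOTH streams.
train = normalise_case([{"garage": "Yes", "price": 100}, {"garage": "No", "price": 80}], ["garage"])
test = normalise_case([{"garage": "YES"}, {"garage": "no "}], ["garage"])

train_levels = {row["garage"] for row in train}
test_levels = {row["garage"] for row in test}
print(test_levels <= train_levels)  # True
```

The important design point is that the identical transformation runs on the training and scoring data, so the model is trained on the same category labels it will later be asked to score.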

 

I have attached that workflow, please give me a shout if you run into any issues or would like any further explanation.

 

Kind Regards,

Owen


lixuanzhang
7 - Meteor

Thank you very much! That is VERY helpful. 

TheOC
15 - Aurora

No problem! Glad I could help 😁

