It looks like WeekNumber and Season_Number are set as V_Strings, which means R is treating them as categorical variables (rather than continuous or ordinal ones). With the Forest Model, if the scoring stream contains categorical values that were not present in the training stream, those rows cannot be scored and an error is raised. In your example, WeekNumber and Season_Number had 7 categories in your 9 rows that were not present in training.
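If you want to check for this before scoring, the comparison can be sketched outside of Alteryx in a few lines of Python. This is only an illustration, not what the Forest Model does internally: the helper name `unseen_categories` and the sample rows are made up, and only the column names come from your workflow.

```python
def unseen_categories(train_rows, score_rows, columns):
    """Return {column: sorted values found in score_rows but not in train_rows}."""
    unseen = {}
    for col in columns:
        seen_in_training = {row[col] for row in train_rows}
        new_values = {row[col] for row in score_rows} - seen_in_training
        if new_values:
            unseen[col] = sorted(new_values)
    return unseen


# Hypothetical sample data; the values are strings, as with V_String fields.
train = [
    {"WeekNumber": "1", "Season_Number": "1"},
    {"WeekNumber": "2", "Season_Number": "1"},
]
score = [
    {"WeekNumber": "2", "Season_Number": "1"},
    {"WeekNumber": "3", "Season_Number": "2"},
]

result = unseen_categories(train, score, ["WeekNumber", "Season_Number"])
print(result)
```

Any column that shows up in the result has categories at scoring time that the model never saw during training.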
No error is generated if you score your model on the training data (which means no new categories at scoring time) by connecting your Sample 19 rows output to the scoring node.
You can also prevent the error by changing WeekNumber and Season_Number to an integer type, which the Forest Model then treats as ordinal/continuous. That also eliminates the problem of missing categorical values!
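In Alteryx you would make this type change in the workflow itself, but the idea is the same as this small Python sketch. The helper `to_int_columns` and the sample rows are hypothetical, and it assumes every value in those columns parses cleanly as an integer.

```python
def to_int_columns(rows, columns):
    """Return copies of the rows with the named string columns converted to int."""
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        for col in columns:
            new_row[col] = int(row[col])  # raises ValueError on non-numeric text
        converted.append(new_row)
    return converted


# Hypothetical rows with string-typed fields.
rows = [
    {"WeekNumber": "7", "Season_Number": "2"},
    {"WeekNumber": "12", "Season_Number": "3"},
]

numeric_rows = to_int_columns(rows, ["WeekNumber", "Season_Number"])
print(numeric_rows)
```

Once the fields are numeric, a week number that never appeared in training is just another point on a numeric scale rather than an unknown category.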
We have moved away from using a plain forest model when dealing with large numbers of categories and small training sets, and tend to use Boosts, which aren't as finicky about missing categories. Whether that is an acceptable solution in your case, or whether changing your variable types to continuous is acceptable, will depend on the specific model you're trying to build. Hope this helps!