Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Logistic Regression error "Unexpected number of bytes to read."

Meteor

I am encountering an error when scoring the output from stepwise logistic regression: "Unexpected number of bytes to read.  Invalid argument".  The Score tool appears to process the evaluation set from the Create Samples tool without a problem, but crashes on the validation set.  The same error message appears with the Logistic Regression tool, although it does not seem to cause a problem there, since I still get an output model from the Report output of the Stepwise tool.  Any suggestions?  The workflow is shown in the attached document, since I have not been able to figure out how to paste images into these messages.  Thanks.

 

 

 

Moderator

Hi @WonderHog,

 

Could you package up your workflow so we can take a look at your data and workflow?  To do so, go to Options > Export Workflow.

 

Since your input data comes from a database, you would need to save it to another supported file format, such as xlsx or yxdb, and point the Input Data tool in your workflow at that copy before exporting, because database tables are not included in the export.

 

Thanks,

 

As a side note, to add an inline image to a community post, save the image and then click the button shown below.

[Screenshot: 2018-01-08_12-17-08.png]

Jess Silveri
Premium Support Advisor | Alteryx
Meteor

Hi Jessica, thanks for the response.  I think this may be a memory problem, since it crashes after scoring my 60% evaluation set, while it is scoring the 40% validation set.  (There are about 320,000 records.)  If I reduce the size of the sets (to 50%/30%), it runs OK, and it also runs OK when I reduce the number of input predictor variables.  I have only 8 GB of RAM.  Unfortunately, I can't share the data with you since it contains protected health information (PHI).  I have attached the workflow itself.  Thanks again.

Moderator

Hi @WonderHog,

 

 

Can you try a few steps to optimize the workflow and see whether they help?


I would recommend using a Select tool to drop the fields you aren't scoring on.  Additionally, you can try removing the Browse tools from the workflow.  We've got a good general recommendations document here.

 

If you still get the memory error, could you score in batches in a separate workflow?
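
To illustrate what scoring in batches means, here is a minimal Python sketch. It is not how the Alteryx Score tool works internally; the function names, the `size` default, and the coefficient dictionary are all hypothetical. It only shows the idea of holding one batch of records in memory at a time while applying a fitted logistic model.

```python
import math
from itertools import islice

def batches(records, size):
    """Yield successive lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def score(record, intercept, coefs):
    """Logistic probability for one record, computed in a numerically safe way."""
    z = intercept + sum(coefs[name] * record[name] for name in coefs)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score_in_batches(records, intercept, coefs, size=50_000):
    """Yield one list of scores per batch; only one batch is in memory at a time."""
    for chunk in batches(records, size):
        yield [score(rec, intercept, coefs) for rec in chunk]
```

Each list yielded by `score_in_batches` can be written out before the next batch is read, so peak memory depends on the batch size rather than on the full 320,000 records.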

Jess Silveri
Premium Support Advisor | Alteryx
Meteor

Could you have some string variables with too many classes? I got exactly the same error message with Logistic Regression when I set my "salary" variable as a string by mistake. Logistic regression then treats each salary value as a separate class, and the number of classes becomes too large.
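
One way to check for this outside the workflow is to count distinct values per string field. The Python sketch below is only an illustration: the function names, the `max_levels` threshold of 50, and the assumption that each class becomes its own dense 8-byte numeric column are mine, not something confirmed for the Alteryx tools.

```python
from collections import defaultdict

def distinct_counts(records, fields):
    """Count distinct values seen in each named field."""
    seen = defaultdict(set)
    for rec in records:
        for f in fields:
            seen[f].add(rec[f])
    return {f: len(seen[f]) for f in fields}

def flag_high_cardinality(records, string_fields, max_levels=50):
    """Return the string fields with more than `max_levels` distinct values."""
    counts = distinct_counts(records, string_fields)
    return {f: n for f, n in counts.items() if n > max_levels}

def dense_matrix_bytes(n_rows, n_columns, bytes_per_value=8):
    """Rough size of a dense numeric matrix (assumes 8-byte values)."""
    return n_rows * n_columns * bytes_per_value
```

Under those assumptions, a string field with a hypothetical 10,000 distinct salary values across 320,000 records would need `dense_matrix_bytes(320_000, 10_000)` = 25,600,000,000 bytes (about 25.6 GB) for that one field, which is well beyond 8 GB of RAM.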

Meteor

Hi Jessica,

 

I have streamlined things and it seems to be working better now; it only crashes part of the time.  I think I just need to get more RAM.  Thanks.
