I am encountering an error when scoring the output from stepwise logistic regression: "Unexpected number of bytes to read. Invalid argument". The Score tool appears to process the evaluation set from the Create Samples tool without a problem but crashes on the validation set. The same error message appears with the Logistic Regression tool, although it does not seem to cause a problem there, since I still get an output model from the Report output of the Stepwise tool. Any suggestions? The workflow is shown in the attached document, since I have not been able to figure out how to paste images into these messages. Thanks.
Is it possible for you to try packaging your workflow up so we can take a look at your data and workflow? To do so, go to Options > Export Workflow.
Since your input data comes from a database, you will need to save it to another supported file format, such as xlsx or yxdb, and redirect the Input Data tool in your workflow to the copy before exporting, because database tables will not be exported.
As a side note, to add an inline image to a community post, save the image and then click the button below.
Hi Jessica - Thanks for the response. I am thinking this may be a memory problem, since it crashes after scoring my 60% evaluation set but while it is scoring the 40% validation set. (There are about 320,000 records.) If I reduce the size of the sets (to 50%/30%) it runs OK, and likewise when I reduce the number of input predictor variables. I have only 8 GB of RAM. Unfortunately, I can't share the data with you since it contains protected health information (PHI). I have attached the workflow itself. Thanks again.
Can you try a few steps to optimize the workflow and see if they make a difference?
I would recommend dropping the fields you aren't scoring on with a Select tool. Additionally, you can try removing the Browse tools in the workflow. We've got a good general recommendations document here.
If you still get the memory error, could you score in batches in a separate workflow?
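To illustrate the batching idea outside of Alteryx: a minimal Python sketch of scoring fixed-size chunks and stitching the results together, so peak memory stays proportional to the batch size rather than the full ~320,000-row set. `ToyModel` is a hypothetical stand-in for the fitted logistic model, not the actual Score tool.

```python
import numpy as np

class ToyModel:
    """Hypothetical stand-in for a fitted logistic model."""
    def predict_proba(self, X):
        # Sigmoid of a simple linear score over the predictors.
        return 1.0 / (1.0 + np.exp(-X.sum(axis=1)))

def score_in_batches(model, X, batch_size=50_000):
    # Score one chunk at a time instead of the whole array at once,
    # then concatenate the per-chunk results.
    scores = []
    for start in range(0, len(X), batch_size):
        scores.append(model.predict_proba(X[start:start + batch_size]))
    return np.concatenate(scores)

rng = np.random.default_rng(0)
X = rng.normal(size=(320_000, 8))
scored = score_in_batches(ToyModel(), X)
print(scored.shape)  # (320000,)
```

In Alteryx terms, the equivalent would be splitting the validation set into smaller samples, scoring each separately, and unioning the results.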
Could you have a string variable with too many classes? I got exactly the same error message with logistic regression when I set (by mistake) my "salary" variable as a string; logistic regression then treated each salary value as a separate class, and there were far too many of them.
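A small pandas sketch of why this blows up, using a hypothetical "salary" column: as a number it contributes one predictor, but typed as a string, dummy coding produces one column per distinct value, which for a near-continuous field can mean tens of thousands of columns.

```python
import pandas as pd

# Hypothetical data: "salary" is really numeric.
df = pd.DataFrame({"salary": [52000, 61000, 47500, 88000, 52000, 73000]})

# As a numeric predictor: a single column, a single coefficient.
numeric_cols = df[["salary"]].shape[1]

# Mis-typed as a string: dummy coding makes one column per distinct value.
dummy_cols = pd.get_dummies(df["salary"].astype(str)).shape[1]

print(numeric_cols)  # 1
print(dummy_cols)    # 5 (one per distinct salary in this toy sample)
```

With hundreds of thousands of records, a string-typed salary field could easily produce enough dummy columns to exhaust memory, which fits the symptom described above.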