Yes - we've had this particular error many times - and it's always when the machine runs out of memory.
An example: we were running multiple workflows on a machine throughout the day but forgot to close the previous instance - as the machine ran out of memory, we started getting this kind of failure. What we did in that case was write a quick AutoHotkey script to close apps on the machine and kill jobs that were not in a running state (not pretty, I know). We also trimmed the jobs down to work on small subsets - instead of running against all unprocessed rows, each run processes the first 100 000 rows and then stops, so we can run it many times.
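The batching idea is simple enough to sketch. Here's a rough Python illustration (the real jobs were Alteryx workflows; `work` is a placeholder for whatever each run actually does to a batch):

```python
def process_in_batches(rows, work, batch_size=100_000):
    """Run `work` on fixed-size slices of `rows` so no single
    invocation has to hold the full dataset in memory."""
    results = []
    for start in range(0, len(rows), batch_size):
        results.extend(work(rows[start:start + batch_size]))
    return results

# Tiny demo: square each "row" in batches of 3.
out = process_in_batches(list(range(10)), lambda b: [x * x for x in b], batch_size=3)
```

Each run only ever touches `batch_size` rows at a time, which is the same effect as capping the workflow at the first 100 000 unprocessed rows and rerunning.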
I've also seen this when there's an odd unhandled error deep inside a macro - the error report sometimes gets lost. What we did in that case was dig into the sub-macros to find the errors. These ones are very hard to track down; the way we found ours was to cut the job in half and keep halving until we isolated the part that was failing.
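The halving trick is really just a binary search over the job's steps. A hypothetical Python sketch, where `fails` stands in for "run this portion of the job and see whether it errors":

```python
def find_failing_index(steps, fails):
    """Binary-search for the single step that makes a job fail.
    `fails(prefix)` must return True iff running that prefix of the
    job reproduces the error. Halves the search space each pass."""
    lo, hi = 0, len(steps)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(steps[:mid]):   # error already occurs in the first part
            hi = mid
        else:                    # error must be in the later part
            lo = mid
    return lo

# Demo: pretend step 6 is the broken one.
culprit = find_failing_index(list(range(10)), lambda prefix: 6 in prefix)
```

With N steps this takes about log2(N) reruns instead of N, which matters a lot when each rerun is a long workflow.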
That's unusual. We fairly regularly run very large scoring streams on VMs of that size without a problem (2 streams x 10 million records x 40 models).
How many variables are you feeding into your scoring stream? Have you tried decreasing the number of records it scores at one time? We make sure to send only the fields necessary for scoring through the scoring streams to cut down on the amount of data that R has to deal with, which can be a significant performance improvement over sending in all of the data. Not sure if that's what's at play here.
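To make the "only send the fields needed for scoring" point concrete, a minimal Python sketch (the field names here are made up; in Alteryx this is just a Select tool before the scoring stream):

```python
def trim_for_scoring(rows, needed_fields):
    """Keep only the columns the scoring model actually uses, so the
    downstream R process holds far less data per record."""
    return [{field: row[field] for field in needed_fields} for row in rows]

# Demo with assumed fields: drop ids and free text, keep the predictors.
rows = [
    {"id": 1, "age": 34, "income": 52000, "notes": "long free text..."},
    {"id": 2, "age": 41, "income": 61000, "notes": "more free text..."},
]
trimmed = trim_for_scoring(rows, ["age", "income"])
```

Wide free-text or ID columns are pure overhead to a scoring model, so dropping them before the data reaches R shrinks memory use roughly in proportion to the columns removed.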
Has it recurred at all?
EDIT: I see that you're not even getting to scoring, so let me change the question: how many fields are you attempting to model, and what's the size of your training set?
If you've got 32 GB of RAM and 4 GB of dedicated sort/join memory per stream, that should set you up to run 4 streams (per Alteryx's documentation). But if all of the RAM is already in use, you can still run into that issue. I'm not entirely sure how R handles swap space, so it may demand more physical RAM than is available and fail outright, where a normal Alteryx workflow would just run slower.
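As a back-of-envelope check - note the 16 GB reserve below is my own assumption to make the arithmetic land on 4 streams, not a figure from the Alteryx docs; adjust it for what your OS and R processes actually need:

```python
def max_streams(total_ram_gb, sort_join_gb_per_stream, reserved_gb=16):
    """Rough capacity estimate: hold back some RAM for the OS and any
    R processes, then divide what's left by each stream's dedicated
    sort/join allocation. reserved_gb=16 is an assumed value."""
    usable = total_ram_gb - reserved_gb
    return max(0, usable // sort_join_gb_per_stream)

streams = max_streams(32, 4)  # 4 concurrent streams on a 32 GB box
```

If R grabs a large, unpredictable chunk on top of that reserve, the real safe number of concurrent streams will be lower than this estimate.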
R's memory management is not a topic on which I'm particularly competent, and I don't have experience using that many inputs to build a predictive model. If you can predictably generate this error (or get to that point), I would check whether cutting the model inputs in half - either the training-set size or the variable count - stops it from occurring.