Challenge #131: Think Like a CSE... The R Error Message: cannot allocate vector of size...

TheOC
15 - Aurora

This could be a lot of things:

Spoiler
Primarily, it's a sizing issue, and it's running out of memory space.
Granted, it could be solved by adding a ridiculous amount of RAM, but that's not the real solution.

I would suggest an Auto Field tool, or an optimised Select tool, before the tool to ensure correct data sizes. This is because fields read from CSV files always come in as text.

If memory is the constraint, give the workflow more by overriding the memory settings, and remove unnecessary tools.

Create a sample of the data, and run that through the tool instead.
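
To illustrate the two ideas above (right-sizing fields and testing on a sample), here is a minimal Python sketch. It is only an analogy: the column contents, the 1,000-row size and the 10% sample are made-up assumptions, and the byte counts reflect Python's own in-memory representation, not what Alteryx or R would report.

```python
import random
import sys
from array import array

# Hypothetical column as it might arrive from a CSV: every value is text.
text_values = [f"{i}.5" for i in range(1000)]

# The same column stored as fixed-width 8-byte floats.
numeric_values = array("d", (float(v) for v in text_values))

# Approximate footprint: the list itself plus every string object it holds.
text_bytes = sys.getsizeof(text_values) + sum(sys.getsizeof(v) for v in text_values)
numeric_bytes = sys.getsizeof(numeric_values)

print(f"as text:    {text_bytes} bytes")
print(f"as numbers: {numeric_bytes} bytes")

# A reproducible 10% sample to try the downstream step on first.
rng = random.Random(131)  # fixed seed so the sample is repeatable
sample = rng.sample(text_values, k=100)
print(f"sample size: {len(sample)}")
```

The exact numbers will vary by Python version, but the typed column should come out several times smaller than the text one.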


TheOC_0-1633706345417.png

 


Bulien
Qiu
20 - Arcturus
Spoiler
I learned a lot from your replies, and I don't have the tool set.
So I will consider this one done for me.
tammybrown_tds
8 - Asteroid

Done

Watermark
12 - Quasar
Spoiler
MT Solution 131.png

LiuZhang
9 - Comet
Spoiler
131 - 2.png
131.png

Given that the K-Centroids Cluster Analysis tool ran successfully, I assume the issue is not the data type, though I would route the Append Cluster input from the Select tool rather than from the data source.

Without seeing the tool itself, it's hard to know exactly what should be improved first to deal with the memory issue.

Given that the issue is memory, and looking at it from a performance angle, I would first suggest standardising the data to reduce computation. Since it was Append Cluster that hit the issue, the number of clusters is likely too high: when the tool tries to append (join?), it may run into something like the error the Append tool raises when more than 16 records come from the right input.
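
For anyone unsure what standardising means here, a minimal Python sketch follows, assuming it means z-score scaling (subtract the mean, divide by the standard deviation). The function name and sample values are hypothetical, and this is not how the Alteryx tool does it internally.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardise(values: Sequence[float]) -> List[float]:
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = fmean(values)
    sd = pstdev(values)  # population standard deviation
    if sd == 0:
        # A constant column carries no spread, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical field values with mean 5 and standard deviation 2.
scaled = standardise([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scaled)
```

After scaling, every field is on a comparable range, so no single large-valued field dominates the distance calculation.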

TonyAndriani
9 - Comet

Reposting my response under my new user ID.

 

Some thoughts below. I'll append the note in a text file so I have something attached to my response. 

Spoiler
I've been doing some reading on the clustering tools - haven't had to use them yet. As I understand it, the clustering tool defines the clustering model and the append cluster tool assigns elements to the clusters. If that's the case and the tool is running out of memory because we're throwing too much at it, couldn't we just put the append tool into a batch macro and let it process the data in bite-size chunks that fit in memory? Just looked at the spoiler while writing this and sounds like I'm at least close to what the CSE came up with.
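
The batching idea can be sketched outside Alteryx in a few lines of Python. This is a rough sketch under assumptions, not the CSE's solution: the centroids, rows, batch size and function names are all made up for illustration, and it assumes assignment means "nearest centroid by Euclidean distance".

```python
from math import dist
from typing import Iterable, Iterator, List, Sequence

Point = Sequence[float]


def chunks(rows: Iterable[Point], size: int) -> Iterator[List[Point]]:
    """Yield successive lists of at most `size` rows."""
    batch: List[Point] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover rows that did not fill a whole batch
        yield batch


def nearest_centroid(row: Point, centroids: Sequence[Point]) -> int:
    """Return the index of the centroid closest to `row`."""
    return min(range(len(centroids)), key=lambda i: dist(row, centroids[i]))


def assign_in_batches(
    rows: Iterable[Point], centroids: Sequence[Point], batch_size: int = 2
) -> Iterator[int]:
    """Label rows one batch at a time, so only one batch is held at once."""
    for batch in chunks(rows, batch_size):
        for row in batch:
            yield nearest_centroid(row, centroids)


# Hypothetical model output (two centroids) and data to be labelled.
centroids = [(0.0, 0.0), (10.0, 10.0)]
rows = [(1.0, 1.0), (9.0, 8.0), (0.5, -0.2), (11.0, 12.0), (4.0, 4.0)]
labels = list(assign_in_batches(rows, centroids))
print(labels)
```

The point is that the model (the centroids) is small and fixed, so labelling each batch is independent of the others and the batches never all need to sit in memory together.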
binuacs
20 - Arcturus
Spoiler
Never worked with this tool before.
scoles0617
8 - Asteroid

My solution:

mmontgomery
10 - Fireball

Challenge #131

ARussell34
8 - Asteroid

The solution I found was to have the upper connection to the Append Cluster tool come from the Select tool rather than from the input file. That way, the correct sizes, names, types, and columns needed for further analysis are all pulled through the Select tool.