When I investigated why one of my Postgres instances crashed, it turned out the disk space had been eaten by a temp file. This specific instance runs on AWS / S3 / EMR.
The temp files are created in an s3a directory and are fairly large.
This seems to be some kind of caching; I noticed they vary in size depending on the job's data size.
Can they be deleted, and can their creation be prevented in the first place?
Also, this is a 5.1 instance we are just testing (we used the AMI version), in case that makes a difference. Thanks in advance.
Hi @Mad Hatter, this may be related to the temporary directory on the Trifacta node that's used for buffering uploads to S3.
Try modifying the location in /opt/trifacta/conf/trifacta-conf.json:
"filewriter": {
  "max": 16,
  "hadoopConfig": {
    "fs.s3a.buffer.dir": "/tmp",
    "fs.s3a.fast.upload": false
  },
  ...
}
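If you prefer to script the change, here's a minimal sketch that sets both keys and writes the file back. It assumes the standard config path mentioned above; the function name is my own, not part of any Trifacta tooling.

```python
import json

# Path from the post above; adjust for your install.
CONF_PATH = "/opt/trifacta/conf/trifacta-conf.json"

def set_s3a_buffering(conf_path, buffer_dir="/tmp", fast_upload=False):
    """Set the S3A buffer dir and fast-upload flag in trifacta-conf.json."""
    with open(conf_path) as f:
        conf = json.load(f)
    # Create the nested sections if they are missing, then set both keys.
    hadoop = conf.setdefault("filewriter", {}).setdefault("hadoopConfig", {})
    hadoop["fs.s3a.buffer.dir"] = buffer_dir
    hadoop["fs.s3a.fast.upload"] = fast_upload
    with open(conf_path, "w") as f:
        json.dump(conf, f, indent=2)

# Usage (run on the Trifacta node, then restart the service):
# set_s3a_buffering(CONF_PATH)
```

Remember to restart Trifacta afterwards so the new hadoopConfig values take effect.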
For your reference, this is outlined in https://docs.trifacta.com/display/r051/Enable+S3+Access?os_username=tr82r051usr-&os_password=%5E88f2Sl2mG0A2l1239F%5E
In the above, you can disable buffering by setting fs.s3a.fast.upload to false; that should take care of your temp file problem.
Regarding the side effects of setting "fs.s3a.fast.upload": "false": other than the decreased performance, files that Trifacta works on in S3 will be downloaded fully to the edge node.
For example, if jobs write big files, those files get downloaded to the edge node at the publishing step. This in turn can cause the edge node to run out of disk space.
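To guard against that edge-node disk-space risk, a quick free-space check before large jobs can help. This is just a sketch; the helper name and the 10 GB threshold are my own assumptions, not anything Trifacta ships.

```python
import shutil

def check_buffer_space(path="/tmp", min_free_gb=10.0):
    """Warn if the filesystem holding `path` is low on free space.

    Returns the free space in GB so callers can act on it.
    """
    usage = shutil.disk_usage(path)
    free_gb = usage.free / 1e9
    if free_gb < min_free_gb:
        print(f"WARNING: only {free_gb:.1f} GB free under {path}")
    return free_gb

# Usage: run on the edge node before kicking off big publishing jobs.
# check_buffer_space("/tmp")
```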
Thanks @Sebastian Cyris, will go ahead and try it out.
Thanks @Sebastian Cyris, this did the trick.
Didn't really see any performance impact, since pretty much all of the jobs are scheduled.