Hello
I have the following Alteryx workflow with a Batch Macro. The purpose is to generate a rolling sum of the transaction amount over a sliding window with a width of 3 time steps. Here I have only shown the sliding window, without the running-sum part.
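For reference, the intended calculation is equivalent to something like this (a minimal pandas sketch; the column names are illustrative, not the actual field names in my workflow):

```python
import pandas as pd

# Minimal sketch of the intended rolling sum (column names are illustrative).
df = pd.DataFrame({"Amount": [10, 20, 30, 40, 50, 60]})

# Sliding window of width 3; min_periods=1 yields partial sums for the
# first two rows instead of NaN.
df["RollingSum"] = df["Amount"].rolling(window=3, min_periods=1).sum()
print(df)
```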
When I run the workflow and macro below, I receive the error message "Temp Drive is getting full". I have tried configuring Alteryx's Memory Limit (under Options -> Edit User Settings) with a bigger size, but it still did not work.
My machine has 58 GB of free hard-disk space. Apparently, when I run this workflow / macro, it uses up all 58 GB. The dataset has 2M records and is only 51 MB in size. When I work with only the first 1M records, it runs without a problem, so it seems to be a hard-disk space issue. I believe I did not build my macro properly, and that is why it eats up so much disk space. When I stop the workflow, I get my 58 GB of hard-disk space back.
Can anyone help? Thanks
Workflow and macro attached
Thanks a lot
tsnchan
Do you mind exporting your workflow instead?
Please provide data relevant to this use case, and kindly describe your criteria in as much detail as possible. If you have a workflow built halfway, kindly export that as well.
To export a workflow, go to Options > Export Workflow. Kindly do NOT send a "Save As" copy.
Hi @tsnchan
Without seeing the data and its flow through the macro I can only surmise, but I suspect your issue lies with one of the two Append Fields tools.
If you're happy to share the data, we can have a deeper look, but I suspect that's a no 😀. I'd suggest testing just the macro first and seeing what happens with the data.
I suspect that too. The record count must have blown up into the billions and counting.
Thank you all for your replies. I can share the data (it is available in the public domain). The original set is huge (480 MB), so I just sampled the first 100k rows. Would love to hear your thoughts.
The file name is appended with "_sampled", so you will need to change the file name in the Input tool of the workflow. Thanks.
Hi @tsnchan
I'm not getting any issues with this dataset. Can you share where the data is from, if it's publicly available, so I can try with the full dataset?
Additionally, can you provide details on what you're feeding into the control parameter? I'm defaulting it to 11 as per the macro, but want to make sure.
As @caltang suggested, exporting the workflow may be better, as we can then work on the full package for you.
Online Payment Fraud Detection (kaggle.com)
This is the full dataset with 6M rows.
The "Temp Drive is getting full" happened when I tried 2M rows. It works also ok with 1M rows.
Record ID is the control parameter. I default it for 11 in the Macro. We need to run for each Record.
Thanks again
I cannot export the workflow with the data, as the data file is too big; it exceeds the maximum file size allowed here.
We still need the other two tools you're using besides the input. When you export, you can untick the input so the package is more manageable.
Hey @tsnchan
I'm going to hazard a guess that this workflow takes AGES to run? You're essentially appending 2 million records to 2 million records, which creates roughly 4 trillion records overall within the batch macro (36 trillion with the full 6M dataset). That's why you're running out of disk space.
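To put numbers on it (a quick back-of-the-envelope sketch, assuming the Append Fields tools effectively cross-join the input against itself):

```python
# Appending N records to N records behaves like a cross join, so the
# intermediate row count inside the macro grows as N squared.
for n in (1_000_000, 2_000_000, 6_000_000):
    print(f"{n:,} input rows -> {n * n:,} appended rows")
# 1,000,000 input rows -> 1,000,000,000,000 appended rows
# 2,000,000 input rows -> 4,000,000,000,000 appended rows
# 6,000,000 input rows -> 36,000,000,000,000 appended rows
```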
What's the logic behind the macro?