Alteryx Designer Desktop Discussions

MRotter · ‎06-15-2022

Hello,

I have a large dataset (over 5 million observations) and I have already divided the dataset into chunks. Now I want to process one chunk at a time according to the chunk_id. I could use a filter tool, but that's not really an option as I would need 27 filter tools.

Is there any way to update the filter condition and process each chunk individually?

FilipR · ‎06-15-2022

1. Can you attach the pseudo_data as well? Or export your workflow as yxzp?

2. What exactly do you need to do with the data that requires processing one chunk at a time? Is there a different process for chunk 1 than for chunk 2? If the process is the same for all chunks, then why not just sort the data by chunk #?

MRotter · ‎06-15-2022

Hello,

I have attached the file as yxzp. I need to load the chunks into the Python tool. The Python code is ready and works fine. However, there is a problem: the bigger the dataset gets, the longer it takes to load the data into the Python tool.

I just need to give the Filter tool in Alteryx one chunk_id at a time, one iteration after another.

For example:

First iteration: check if chunk_id = 1, then process that chunk.
Second iteration: check if chunk_id = 2, and so on until chunk 27.

Is there any way to update the filter tool?

FilipR · ‎06-15-2022

I can think of two solutions at the moment:

1) do what you did with the Filters, but combine them with a Block Until Done tool.

2) write dynamic Python code that essentially does the same thing (but without having to define the number of chunks up front) - not sure how to write it myself, though, so I would go with option 1.
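For what option 2 might look like, here is a minimal sketch in plain Python. It assumes each row is a dict with a `chunk_id` key; `split_by_chunk`, `process_chunk`, and `process_all` are hypothetical names used only for illustration, and `process_chunk` is a placeholder for whatever the existing Python code does with one chunk. Inside the Alteryx Python tool the rows would normally arrive as a pandas DataFrame rather than a list of dicts, so treat this as an outline of the idea, not a drop-in solution.

```python
from collections import defaultdict


def split_by_chunk(rows, key="chunk_id"):
    """Group rows by chunk id in a single pass.

    The set of chunk ids is discovered from the data, so the number
    of chunks does not have to be known in advance.
    """
    chunks = defaultdict(list)
    for row in rows:
        chunks[row[key]].append(row)
    return chunks


def process_chunk(chunk_id, rows):
    # Hypothetical placeholder: replace with the real per-chunk logic.
    return {"chunk_id": chunk_id, "n_rows": len(rows)}


def process_all(rows):
    """Process every chunk on its own, in ascending chunk_id order."""
    chunks = split_by_chunk(rows)
    return [process_chunk(cid, chunks[cid]) for cid in sorted(chunks)]


# Toy data: 10 rows spread over chunk ids 1, 2 and 3.
rows = [{"chunk_id": 1 + i % 3, "value": i} for i in range(10)]
results = process_all(rows)
```

Because the chunk ids come from the data itself, the same code would work for 27 chunks or any other number.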

Also, Python is not the fastest tool around. Maybe you could rewrite the code entirely in Alteryx instead and get better results?

PS. I couldn't open your yxzp, because I have an older version of Alteryx. :(

CarliE · ‎06-15-2022

@MRotter,

A better way to do this is to use a batch macro that dynamically updates the filter for each chunk (it runs once per chunk). Linked here is an example.

Also, I saw you have a ceiling formula in there that wasn't doing anything. If you want, you can put a fixed number of records into each group by doing

CEIL([chunk_id]/27), which will group the first 27 records into the first iteration.
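To see what that ceiling formula does, here is a small Python sketch of the same arithmetic. It assumes the field passed in is a sequential record number starting at 1; `group_number` is a hypothetical helper name, not an Alteryx function.

```python
import math


def group_number(record_id, group_size=27):
    """Mirror CEIL([field] / 27): map a 1-based record number to its group."""
    return math.ceil(record_id / group_size)


# Records 1..27 land in group 1, records 28..54 in group 2, and so on.
groups = [group_number(i) for i in (1, 27, 28, 54, 55)]
```

Changing `group_size` changes how many records go into each iteration.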

If this helped to solve your issue, please make sure to mark it as a solution.

Thanks

Carli

MRotter · ‎06-21-2022

THX Carli, it helped me to progress.

All the best to both of you.

Update formula tool