I have a large dataset (over 5 million observations) that I have already divided into chunks. Now I want to process one chunk at a time, according to its chunk_id. I could use a Filter tool, but that's not really an option, as I would need 27 of them.
Is there any way to update the filter condition and process each chunk individually?
1. Can you attach the pseudo_data as well? Or export your workflow as a yxzp?
2. What exactly are the processes you need to run, and why do they need to be done one at a time? Is there a different process for chunk 1 than for chunk 2? If the process is the same for all chunks, why not just sort the data by chunk #?
I have attached the file as a yxzp. I need to load the chunks into Python. The Python code is already written and works fine. However, there is a problem: the bigger the dataset gets, the longer it takes to load the data into the Python tool.
I just need to tell the Filter tool in Alteryx which chunk_id to use on each pass:
first iteration: check if chunk_id = 1; second iteration: check if chunk_id = 2; and so on, until chunk_id = 27. Something like the loop sketched below.
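For reference, here is a minimal sketch of that loop in plain pandas. The toy DataFrame and the process_chunk function are stand-ins I made up for illustration, not my actual data or code:

```python
import pandas as pd

# Toy stand-in for the real 5M-row dataset; in the real workflow this
# would come from the Alteryx Python tool's input stream.
df = pd.DataFrame({
    "chunk_id": [1, 1, 2, 3],
    "value": [10, 20, 30, 40],
})

def process_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical placeholder for the existing per-chunk Python logic."""
    return chunk

results = []
for chunk_id in range(1, 28):  # first iteration chunk_id = 1, ..., last chunk_id = 27
    chunk = df[df["chunk_id"] == chunk_id]
    if not chunk.empty:
        results.append(process_chunk(chunk))

output = pd.concat(results, ignore_index=True)
print(output)
```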
1) Do what you did with the Filters, but combine them with a Block Until Done tool.
2) Write dynamic Python code that essentially does the same thing (without needing to define the number of chunks up front). I'm not sure how to write it myself, though, so I would go with option 1. A rough sketch of what it might look like is below.
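For what it's worth, here is an untested sketch of option 2, assuming pandas and a hypothetical process_chunk function standing in for your actual processing:

```python
import pandas as pd

# Toy data; in practice this would be the full incoming dataset.
df = pd.DataFrame({
    "chunk_id": [1, 1, 2, 5],
    "value": [10, 20, 30, 40],
})

def process_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical placeholder for the real processing step."""
    return chunk

# groupby discovers whichever chunk_id values are present, so the number
# of chunks never has to be hard-coded.
output = pd.concat(
    (process_chunk(chunk) for _, chunk in df.groupby("chunk_id")),
    ignore_index=True,
)
print(output)
```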
Also, Python is not the fastest tool around. Maybe you could rewrite the code entirely in Alteryx instead and get better results?
P.S. I couldn't open your yxzp because I have an older version of Alteryx. :(