Alteryx Designer Desktop Discussions

SOLVED

Set up Macro to split the data and process it

Anmol_Mehrotra
6 - Meteoroid

Hi all,

I have the following scenario and want to check how I can build a macro to perform the task:

 

I have a dataset of 300-800k rows (the count varies daily). I want to build a solution that splits this data into chunks of at most 100k rows, since the API can only process up to 100k rows at a time. I also want each chunk to go through 3 different tools before everything is stitched back together and pushed to a single Excel output file.

So I am looking for something dynamic that can divide the data based on the total rows in the input file, pass every set through the same 3 tools to process, and append the results so there is only one output file.
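In Designer this would be a batch macro, but the splitting logic itself can be sketched in plain Python. This is only an illustration of the idea; `process` is a hypothetical stand-in for the 3 tools, and the 100k limit is the one described above.

```python
import math

def split_into_chunks(rows, limit=100_000):
    """Split rows into the fewest chunks of at most `limit` rows each,
    similar to what a Tile tool feeding a batch macro would produce."""
    n_chunks = math.ceil(len(rows) / limit)
    return [rows[i * limit:(i + 1) * limit] for i in range(n_chunks)]

def process(chunk):
    # hypothetical placeholder for the 3 tools applied to each chunk
    return chunk

def run(rows):
    output = []
    for chunk in split_into_chunks(rows):
        output.extend(process(chunk))  # union all chunks into one dataset
    return output
```

With 250 rows and a limit of 100, for example, this yields chunks of 100, 100, and 50 rows, which are then appended back into a single output.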

 

I have not been able to make it work so far. I would appreciate it if anyone has an idea of how this can be achieved.

4 Replies
danilang
19 - Altair

Hi @Anmol_Mehrotra 

 

We need a bit more info here.  Is the 100K limit on the input side or on the output side?

 

If it's on the input side, then you need to build a macro that can read all the data. Build an iterative macro where you pass in the URL of the first API call as its input. The loop exit logic depends on the format of the meta info returned by the API.

 

Most APIs return the total number of pages/rows for your entire dataset, plus the current page/row number, on each page of data. Compare the total pages to engine.iterationnumber + 1 (engine.iterationnumber is zero-based): if engine.iterationnumber + 1 is less than the total pages, modify the URL to get the next page of data and pass it back to the input. If they're equal, remove the URL record from the input, which will cause the iterative macro to stop and return all the data.
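The iterative-macro loop above can be sketched as plain Python, purely to show the exit logic. `fetch_page` is a hypothetical function standing in for the Download tool plus parsing; it's assumed to return the page's rows along with the total page count the API reports.

```python
def fetch_all(fetch_page):
    """Fetch every page of a paged API that reports total pages.
    fetch_page(page) is assumed to return (rows, total_pages)."""
    all_rows = []
    page = 0          # zero-based, like engine.iterationnumber
    total_pages = 1   # unknown until the first response
    while page < total_pages:
        rows, total_pages = fetch_page(page)
        all_rows.extend(rows)
        page += 1     # loop continues while iterationnumber + 1 <= total
    return all_rows
```

The `page < total_pages` check is the code equivalent of removing the URL record once the last page has been read.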

 

Some APIs don't provide the total pages, and instead return no rows when the data has been completely transferred. In this case, use a Count on the number of returned rows: if it's > 0, pass the modified URL back to the input; if it's 0, remove the URL record.
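The zero-rows variant looks like this in sketch form, again with a hypothetical `fetch_page` in place of the real API call:

```python
def fetch_until_empty(fetch_page):
    """Fetch pages until the API signals completion by returning no rows.
    fetch_page(page) is assumed to return the rows for that page."""
    all_rows = []
    page = 0
    while True:
        rows = fetch_page(page)
        if not rows:              # count == 0: stop, i.e. drop the URL record
            break
        all_rows.extend(rows)     # count > 0: keep the loop going
        page += 1
    return all_rows
```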

 

If the API limit is on the output side, you can use a simple batch macro that sends one batch of records at a time, and set up the number of batches in the main workflow.
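The output-side batch macro boils down to this loop, sketched in Python with a hypothetical `send` callable representing one API call per batch:

```python
def send_in_batches(rows, send, limit=100_000):
    """Send rows in batches of at most `limit` per call,
    like a batch macro driven by one control record per batch."""
    for i in range(0, len(rows), limit):
        send(rows[i:i + limit])
```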

 

Dan 

Anmol_Mehrotra
6 - Meteoroid

So the API is one of the tools I need once the data has been pulled. Basically, the initial data comes from an Excel file, and the API pulls in another set of data that needs to be joined to it using a unique key. The data coming from the API can only be pulled 100k rows at a time, which is the core issue.

I hope that makes a bit more sense.

danilang
19 - Altair

Hi @Anmol_Mehrotra 

 

Use an iterative macro to pull your paged API data. Here's an excellent example to get you started. Of course, you'll have to modify it to handle the details of the specific API that you're calling.

 

Dan. 

Anmol_Mehrotra
6 - Meteoroid

Thanks Dan. I will take a look.
