Good morning.
I am trying to pull down data via an API and found that it returns about 100 records at a time. There are 600+ records in my dummy data source. The bottom of the API results provides a 'next page' URL, which I am trying to re-process through the workflow via an iterative macro.
After trying to replicate some examples seen in this community, I have been able to isolate the 'next page' URL at the end of the initial results. However, I am not sure how to send it back to the beginning of an iterative macro. The initial data input is the initial API URL, token and user ID. I believe I just need to replace the URL with the 'next page' URL each time.
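For anyone reading along outside the macro designer, the loop described above can be sketched in Python. This is only an illustration under assumptions: the field names "data" and "next_page" are hypothetical stand-ins for whatever the API actually returns, and fetch_page is a placeholder for the authenticated download step.

```python
from typing import Callable, Optional


def fetch_all(first_url: str,
              fetch_page: Callable[[str], dict],
              max_pages: int = 50) -> list:
    """Follow 'next page' links until none is returned or the cap is hit.

    fetch_page is a hypothetical callable that downloads one page and
    returns it as a dict. "data" and "next_page" are assumed key names.
    """
    records = []
    url: Optional[str] = first_url
    pages = 0
    while url and pages < max_pages:
        page = fetch_page(url)
        records.extend(page.get("data", []))
        # Same idea as the macro's loop output: the next URL replaces
        # the current one for the following iteration.
        url = page.get("next_page")
        pages += 1
    return records
```

In the iterative macro, the equivalent of `url = page.get("next_page")` is the iteration output that feeds the macro input, with the token and user ID carried along unchanged.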
I attached my workflow and macro, and an image of the macro itself is shown below to try to illustrate what I'm doing.
This is my first macro attempt so forgive me if I'm missing something fundamental.
This is exactly right!
The key is to find documentation on the API to understand how pagination works for the specific API you're leveraging. In this case, GitHub looks to be a better source than the Pushshift website itself.
From what I can tell, a date parameter may be the best way to construct the next URL for the iterative macro, but it really depends on what you're looking for. See the example below, where I leveraged the "created_utc" field for pagination. Note that this can result in MANY iterations if you don't cap it at a reasonable number (I cancelled my workflow after 626 iterations!).
This should get you started, and help you understand how to approach the iterative macro. It looks like your original macro was updating the size parameter, but that only defines how many records to return each time (default is 25, max is 1000).
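As a rough sketch of the same idea outside the macro, the date-based pagination with an iteration cap might look like this in Python. The assumptions: BASE_URL is a placeholder, the "before" parameter name and the "data" response key are my reading of the API and should be checked against its documentation, and fetch_page is a hypothetical download function.

```python
from typing import Callable, Optional
from urllib.parse import urlencode

# Placeholder endpoint, not the real API address.
BASE_URL = "https://example.invalid/search"


def paginate_by_time(fetch_page: Callable[[str], dict],
                     start_before: Optional[int] = None,
                     size: int = 100,
                     max_iterations: int = 10) -> list:
    """Page backwards in time using each record's created_utc value."""
    records = []
    before = start_before
    for _ in range(max_iterations):  # hard cap on iterations
        params = {"size": size}  # size only sets records per request
        if before is not None:
            params["before"] = before  # assumed parameter name
        url = BASE_URL + "?" + urlencode(params)
        batch = fetch_page(url).get("data", [])
        if not batch:
            break  # nothing older left, so stop iterating
        records.extend(batch)
        # The oldest timestamp in this batch becomes the next cutoff.
        before = min(r["created_utc"] for r in batch)
    return records
```

The max_iterations argument plays the role of the macro's maximum number of iterations. If several records share the same created_utc value at a page boundary, a strict cutoff like this could skip some of them, so it is worth checking the results for gaps.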
Won't the iteration involve a filter for the records that don't meet a certain condition? How would you work this out when you are taking the query string/body from a field?