Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Retrieving StatsWales Population Projections via API and Download Tool

Highlighted
8 - Asteroid

I am trying to download the Population Projections found on the StatsWales website via their API and having difficulty, so I am wondering if anyone has managed to do this successfully or fancied the challenge.

 

Ideally, I'd like to retrieve the counts with the following dimensions setup, so it is expanded by Age and Gender, for both 2019 and 2020: https://statswales.gov.wales/v/HwIo

 

However, I'd be grateful if I could just get the data into Alteryx first, nicely transposed and ready to work with in any format!

 

The API is open data and doesn't require a key. So far, I have successfully fetched the data using this link http://open.statswales.gov.wales/en-gb/dataset/popu5099 within a Download Tool. I have then used the JSON Parse to split this up but that's as far as I got. I am struggling to now to get any further so that the counts are ready for processing. 

 

As with many APIs, you can add several parameters to the link, to filter the data as you require. Ideally, I'd like to get this setup so the counts are broken down by age and gender. The API supporting documentation can be found here:
https://statswales.gov.wales/Help/Catalogue#APIEndpointsExplained

 

Thanks in advance!

Highlighted
Alteryx
Alteryx

This should get your data in the format that you need for further summarization and processing. Let me know if this works for you! Just a quick JSON Parse, filtering out the metadata row, doing a text to column on the JSON Name column, and then doing a crosstab to put it into a nice table. 

 

download and parse.png

Highlighted
8 - Asteroid

Thanks a lot @BrandonB that's a great start!

 

Any idea why we are just getting a very small sample of the data coming through though? Only looks like data for low ages in 2014. I can see that the final row once tranposed is a "nextlink" which I've tried downloading which I assume returns the next bulk of the data.

 

al_sweets_0-1590252257314.png

 

Any recommendations on how to proceed, might have to setup a macro to read it all in? Or am I going to just have to figure out setting up the parameters in the link? Ideally would like all ages for just 2019 and 2020.

 

Update:

I've managed to filter it to 2019 using the browser by adding the following parameters:

http://open.statswales.gov.wales/en-gb/dataset/popu5099?$filter=Year_Code%20eq%202019 

 

However, when used in Alteryx, it return this error:

error en-US Syntax error: character '%' is not valid at position 9 in 'Year_Code%20eq%202019'.

 

How can it work in the browser but fail in the Download tool?

 

Highlighted
Alteryx
Alteryx

This is likely an API that uses pagination, meaning that it downloads a certain number of records and then provides the link to the next set. The way to do this in Alteryx is to leverage an iterative macro which leverages the url for the next set as what to iterate through on each run. When that passes through, the next data and next link will continue to be passed through until all of the data comes through and there isn’t another next link. I will set up an example for you this next week if nobody else builds this out by then. 

Highlighted
8 - Asteroid

Thanks @BrandonB

 

Obviously the ideal method would be to filter it beforehand to avoid unnecessary calls to the API. Given I only want years 2019 and 2020, it seems undesirable to retrieve data for all the other years up to 2039.

 

Do you have any advice for the syntax error when filtering, which seems to work fine within an internet browser but fails in the Download tool?

 

http://open.statswales.gov.wales/en-gb/dataset/popu5099?$filter=Year_Code%20eq%202019 

 

In Alteryx, it returns this error:

error en-US Syntax error: character '%' is not valid at position 9 in 'Year_Code%20eq%202019'.

Highlighted
13 - Pulsar

I thought exactly the same as @BrandonB. An iterative macro is the way to go.

 

The "main workflow" only needs the starting URL.

 

grossal_0-1590256135460.png

 

And the macro contains the main logic:

 

grossal_1-1590256160151.png

 

The Filter in the End checks the data for the new URL and gives them back to the Input.

 

Workflow / package attached. Let me know if it works.

 

 

Best

Alex

 

Highlighted
8 - Asteroid

Thanks @grossal - this is definitely progress.

 

However, it still only returns a total of 14,001 transposed records for me, stopping after just 14 iterations. It doesn't get further than Age 11 population projections for only year 2014, and I'll need all ages for 2019/2020 so quite a way to go. The last row of the final iteration is entirely null for all fields, with no "next link". 

Highlighted
13 - Pulsar

I checked out our other URL. You need to disable URL encoding in the Download Tool:

 

grossal_0-1590258106303.png

 

I tried it myself and stopped after 30 iterations. Looks like it's working. Could you give it a try?

Highlighted
8 - Asteroid

@grossal Thanks for the tip regarding unselection the encoding option, this now works outside the iterative macro.

 

Using the macro, mine actually got up to iteration 49. It then continues to run indefinitely and throws the following error from iteration 50. 

 

al_sweets_0-1590265957012.png

 

The macro should maybe check for the Null field then so it can dynamically stop in future. As a workaround, I set the macro to limit to 49 iterations and this seems to work perfectly. I've looked at the data and it's exactly what I was expecting.

Labels