Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Concatenate API Data in chunks

fnkankeu
6 - Meteoroid

Hi,

 

I am trying to read from a website using its API. The data come in CSV format and in chunk of 10MB. The first issue is that the data cut off right at that 10 MB and the next chunk will continue when the last chunk cut off. So I need to find a way to concatenate all the chunks together and use the result to do some data manipulation. I want it to be dynamic so I can use the result in the same workflow and it can work with different number of chunk. 

With my experience with Alteryx I don't think it is possible to do. I just wanted to take a chance by asking the question here. 

I added as an attachment the end and the beginning of 2 chunks I have read with the browse tool. 

I need some help.

 

Thanks

10 REPLIES 10
BrandonB
Alteryx
Alteryx

Could you use a summarize tool and concatenate the column without a delimiter? That would turn the entire column into one string which you could then use a text to columns tool to split to rows, use a multi-row tool to create kind of like a record ID that starts from 1 again once it increments up to the total number of desired columns, and then crosstab the data with this record ID as the headers. Then you could use a Dynamic Rename tool afterwards which would set the first row of data which contains your headers to be the actual column headers. 

BrandonB
Alteryx
Alteryx

This example should work for your data. You may need to update the first multi-row tool to a different number than 5 if you have a different number of columns but that is a quick fix. This approach will accommodate any number of chunks. Workflow is attached. 

 

Data in "chunks" as you mentioned:

 

BrandonB_1-1641746055173.png

 

 

 

Solution

 

BrandonB_0-1641745926587.png

 

atcodedog05
22 - Nova
22 - Nova

Hi @fnkankeu 

 

Can you provide this in excel and also can you show us how the 2nd chunk continuing the 1st chunk looks like.


We will be happy to help : )

fnkankeu
6 - Meteoroid

Hi Brandon,

 

Thank you for your reply. It helped me a little bit. I will share with you a picture of a part of the new workflow I did based on your answer. 

Like you can see on my workflow, I need to change the chunk number in the URL of the formula tool to make the call to receive each chunk of data. What I understand of your answer is that after each call, I should directly append the output of each chunk in a file and once the append of all the chunks is done, I read the file to make the data manipulation you explained in your first answer. 

How can I append each chunk in a file since Alteryx only allow the overwriting of an CSV file. 

Assuming I was able to append each chunk, how can I make sure that when I am reading the file, the appending process is over? I want to read the file in the same workflow as the appending process. 

I hope I was clear.

 

Thanks for your help 

OllieClarke
14 - Magnetar
14 - Magnetar

Hi @fnkankeu how do you know how many chunks there are in a file?

If there's a way of knowing before the download tool, then you could use a generate rows tool to make the requisite number of urls which would then pass through the download tool. After this you can do as @BrandonB suggested and concatenate all the downloaddatas (grouped by file name), so you get the full data per file. 

If you can only know how many chunks there are by incrementing the chunk value until no data comes out, then you will probably have to make an iterative macro, so you can poll the API until the data runs out. After this point you can implement the concatenate method as before.

 

Hope that helps,

 

Ollie

OllieClarke
14 - Magnetar
14 - Magnetar

@fnkankeu sorry, I would also add that you should sort your data by file name and chunk value before you concatenate it, as the download tool can output values in a slightly different order to its input (due to different response times along with the ability to have concurrent connections)

fnkankeu
6 - Meteoroid

Hi @ollieClarke,

 

Yes there is a way to know how much chunk I have. You said "then you could use a generate rows tool to make the requisite number of urls which would then pass through the download tool."

Can you give me an example please of how to do that?

 

Thanks

atcodedog05
22 - Nova
22 - Nova

Hi @fnkankeu 

 

As I can see you are mentioning the chunk.

 

atcodedog05_1-1641828211619.png

 

You can generate chunk number using generate rows tool like below and pass them through the download tool at once it can process multiple rows of request URLs like below

 

atcodedog05_2-1641828401761.png

 

Hope this helps : )

 

 

OllieClarke
14 - Magnetar
14 - Magnetar

Hey @fnkankeu sure 🙂

 

OllieClarke_0-1641833009502.png

so here I started with a field telling me how many chunks (Num_Chunks)

OllieClarke_1-1641833054628.png

 

As your chunks start from 0, I've assumed that 5 chunks would be the range 0-4. I set up the generate rows tool to create a new column (Chunk) like this:

OllieClarke_2-1641833111345.png

OllieClarke_3-1641833120720.png

which will generate every number between 0 and Num_Chunks-1

 

We can then create the url with a formula tool as you did

OllieClarke_4-1641833177216.png

 

Hope that helps,

 

Ollie

 

 

Labels