
SOLVED

Extract data from csv file inside a gz file from Download Tool

pperrot1
7 - Meteor

Hi guys,

I need to read and process data from a csv file that is contained inside of a GZIP file that I need to download daily from the url: https://github.com/marcuswac/covid-br-data/raw/master/covid-br-ms-complete.csv.gz

 

I don't need to save the gz file or the csv file. I just need to extract the data inside it and use it within a workflow.

Is there a simple way to combine the Download tool with something that extracts and parses the data from this link inside a workflow?

I cannot figure this out!

 

Please help!

 

BrandonB
Alteryx

This workflow should work for you. I used a batch macro after the Download tool; the macro has a dummy control parameter that forces it to execute second. This process does download the file to your machine, to the path specified in the Text Input tool, but the file is overwritten daily, so it shouldn't take up much space. The packaged workflow is attached, and I have also included the workflow and the macro separately in case you have issues extracting.

 

 

download 1.png

 

 

download 2.png

Maskell_Rascal
13 - Pulsar

Hey @pperrot1 

 

Not sure how familiar you are with Python, but this is exactly the kind of thing I do with that tool. 

 

First input the URL to the Download tool and configure it to download to a Temporary File. 

Maskell_Rascal_0-1604517796523.png

 

I then use a Select tool to select only the path to the temp file and rename the field. 

Maskell_Rascal_1-1604517851729.png

 

It's now ready to be connected to the Python tool, so the path will be read into the code I built.
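The code itself is not reproduced in the text of this post, so here is a minimal sketch of what such a step could look like, not the code from the attached workflow. It assumes the temp file holds plain gzip data with a header row, and it uses only the standard library; the function name `read_gz_csv`, the field name `TempFile`, and the Alteryx wiring in the comments are assumptions for illustration.

```python
import csv
import gzip


def read_gz_csv(path, encoding="utf-8"):
    """Return the rows of a gzip-compressed CSV file as a list of dicts."""
    # gzip.open in text mode ("rt") decompresses while reading and does not
    # look at the file extension, so a temp file without a .gz suffix works.
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        return list(csv.DictReader(handle))


# Inside the Python tool the wiring might look like this (assumed, and the
# field name "TempFile" is hypothetical; use whatever the Select tool emits):
#   from ayx import Alteryx
#   import pandas as pd
#   temp_path = Alteryx.read("#1")["TempFile"].iloc[0]
#   Alteryx.write(pd.DataFrame(read_gz_csv(temp_path)), 1)
```

Every value comes back as a string, so numeric and date fields would still need converting downstream, for example with a Select tool after the Python tool.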

Maskell_Rascal_2-1604517918530.png

 

Final output from Output #1 looks like this:

Maskell_Rascal_3-1604517990448.png

 

I've attached a workflow for you to try out. 

 

If this solves your issue please mark the answer as correct, if not let me know!

 

Thanks!

Phil

 

BrandonB
Alteryx

Awesome approach @Maskell_Rascal! Your solution is nice because it leverages the temp file rather than requiring one to be downloaded. I know that a regular Input Data tool can be used with a .gz file, but I haven't had luck with a Dynamic Input tool on a temp file. Could be something I'm missing in the configuration, but in any case, your solution looks perfect. 
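For anyone who wants to skip the file on disk entirely, here is a rough sketch of decompressing the downloaded bytes in memory. It is an illustration under assumptions, not part of either attached workflow: the function name `parse_gz_csv_bytes` is made up, and how the raw bytes reach Python is left open.

```python
import csv
import gzip
import io


def parse_gz_csv_bytes(payload, encoding="utf-8"):
    """Decompress gzip bytes in memory and parse them as CSV with a header row."""
    # gzip.decompress works on a bytes object, so nothing is written to disk.
    text = gzip.decompress(payload).decode(encoding)
    return list(csv.DictReader(io.StringIO(text, newline="")))


# The payload could come from any HTTP client, for example (assumed usage):
#   from urllib.request import urlopen
#   payload = urlopen(url).read()
#   rows = parse_gz_csv_bytes(payload)
```

The whole file is held in memory twice (compressed and decompressed), so for very large downloads the temp file route may still be the safer choice.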

Maskell_Rascal
13 - Pulsar

Thanks @BrandonB! I like your solution as well! I always forget how many problems can be solved by building out a macro. 

pperrot1
7 - Meteor

@BrandonB @Maskell_Rascal 

Wow, guys!
I'm speechless! To both of you, thank you so much!!

 

You have no idea how long I have been trying to resolve this, or how many workarounds I have tried (downloading the temp file, then opening it up again... it was a mess).

 

Both your solutions worked flawlessly!!!

EW
11 - Bolide

@Maskell_Rascal your Python solution seems to be what I need for a problem I'm facing.  But is there a way to get the Python tool output to include the filename of the downloaded file?  I'm not very familiar with Python.
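One possible way to do this, sketched under assumptions rather than taken from the attached workflow: the temp file usually has a generated name, so the sketch derives the name from the URL instead and adds it to every row as an extra column. The function names and the `SourceFile` column name are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


def tag_rows(rows, source_name, column="SourceFile"):
    """Return copies of the row dicts with the source file name added."""
    return [{**row, column: source_name} for row in rows]


url = "https://github.com/marcuswac/covid-br-data/raw/master/covid-br-ms-complete.csv.gz"
tagged = tag_rows([{"date": "2020-11-04"}], filename_from_url(url))
print(tagged)
```

If the rows are already in a pandas DataFrame inside the Python tool, assigning a constant to a new column (`df["SourceFile"] = name`) would have the same effect.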
