Alteryx Designer Desktop Discussions

SOLVED

Batch Macro for HTML scrape (each instance diff html based on input excel)

Ray_Pospisil

Hi all,

 

I am seeking advice on how to make my batch macro work.

 

I am downloading public information from the companies register in my country.

The URL is always the same except for one variable, the subjektId value, which changes for each company.

Example:

Company A: ......//or.justice.cz/ias/ui/rejstrik-firma.vysledky?subjektId=664507&typ=PLATNY 

Company B: ......//or.justice.cz/ias/ui/rejstrik-firma.vysledky?subjektId=545255&typ=PLATNY 

 

My plan is to build a database by downloading the information for all companies on this website, i.e. for every value of the subjektId figure in the URLs above.
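For illustration, the URL pattern in the two examples above could be generated from a list of subjektId values. This is only a minimal Python sketch, not part of the Alteryx workflow: the function name `company_url` is hypothetical, and the scheme at the start of the address is left out exactly as it is elided in the examples.

```python
from urllib.parse import urlencode

# Address as given in the post; the scheme is elided there, so it is not filled in here.
BASE = "//or.justice.cz/ias/ui/rejstrik-firma.vysledky"


def company_url(subjekt_id: int) -> str:
    """Build the results URL for one company (hypothetical helper)."""
    # Only subjektId varies between companies; typ stays PLATNY.
    query = urlencode({"subjektId": subjekt_id, "typ": "PLATNY"})
    return f"{BASE}?{query}"


if __name__ == "__main__":
    # The two subjektId values quoted in the post.
    for subjekt_id in (664507, 545255):
        print(company_url(subjekt_id))
```

Each generated line would correspond to one row of the input Excel file.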

 

 

How I am going about it:

 

I have one input Excel file containing every variant of this URL, where each row corresponds to one company.

Because a specific parsing process is involved, I need a batch macro that processes one URL at a time and appends each newly scraped result to the output Excel file. (Note: the output is an Excel file for now, for my example with a few companies, but once this works it will be a database.)

 

I built the macro workflow and a standard workflow that uses the macro, but it does not seem to work: it always downloads information about only the first company in the input list.

 

(please find the standard workflow and macro attached)

 

 

 

Can anybody help me to figure out why the BATCH MACRO is not working?

 

Many thanks in advance,

 

Radek

 

 

BrandonB
Alteryx

I think I see your issue. Rather than connecting your Control Parameter and Action tool to a Filter, you should connect them directly to the Text Input at the beginning of the macro. Set up that way, the macro will run the workflow once for every URL that is passed into it.
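As a rough analogy only (this is Python, not how Alteryx is implemented), the intended behaviour is a loop that runs the same scraping steps once per URL and stacks the results. The names `run_batch` and `fake_scrape` are hypothetical, and `fake_scrape` is a stand-in for the real download and parsing steps so the sketch needs no network access.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def run_batch(urls: List[str], scrape_one: Callable[[str], List[Row]]) -> List[Row]:
    """Run scrape_one once per URL and stack all returned rows."""
    combined: List[Row] = []
    for url in urls:
        # Each pass sees a different URL, which is what the control
        # parameter should achieve by updating the Text Input.
        combined.extend(scrape_one(url))
    return combined


def fake_scrape(url: str) -> List[Row]:
    """Stand-in for download + parsing: just echoes the subjektId."""
    subjekt_id = url.split("subjektId=")[1].split("&")[0]
    return [{"url": url, "subjektId": subjekt_id}]


if __name__ == "__main__":
    urls = [
        "//or.justice.cz/ias/ui/rejstrik-firma.vysledky?subjektId=664507&typ=PLATNY",
        "//or.justice.cz/ias/ui/rejstrik-firma.vysledky?subjektId=545255&typ=PLATNY",
    ]
    for row in run_batch(urls, fake_scrape):
        print(row["subjektId"])
```

If the URL inside the loop never changed, every pass would return the first company, which matches the symptom described in the question.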

(screenshot: output.png)

 

Also, at the end of the macro, I would recommend that you either make the Output Data tool dynamically create a new file for each URL that is passed in, or add a Macro Output instead so that the data exits the macro and flows back into your workflow.
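To illustrate the first option, one way to get a distinct file per URL is to derive the file name from the subjektId in that URL. The following is a minimal Python sketch of the idea, not an Alteryx setting: the function name `output_name` and the `company_` prefix and `.csv` extension are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlparse


def output_name(url: str, prefix: str = "company_", ext: str = ".csv") -> str:
    """Derive a per-company file name from the subjektId query parameter."""
    query = parse_qs(urlparse(url).query)
    subjekt_id = query["subjektId"][0]  # raises KeyError if the parameter is missing
    return f"{prefix}{subjekt_id}{ext}"


if __name__ == "__main__":
    url = "//or.justice.cz/ias/ui/rejstrik-firma.vysledky?subjektId=664507&typ=PLATNY"
    print(output_name(url))
```

With the second option no per-URL file is needed, because the rows from every run are returned to the calling workflow and can be written once at the end.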

 

(screenshot: output 2.png)
