Alteryx Designer Desktop Discussions

MRPP1982 · ‎09-28-2023

Dear Community!

I am trying to extract a table from an HTML file (trust me, it is HUGE); a subset of it is attached here.

This form doesn't allow .html files to be added, so I have pasted the code into a Notepad text file. Please open it in a browser.

Any help will be greatly appreciated.

Cheers.

Yoshiro_Fujimori · ‎09-28-2023

Hi @MRPP1982 ,

Could you attach the file?

Yoshiro_Fujimori · ‎09-28-2023

Just as a starting point, I created a simple HTML file containing the table below.

RecordID	Name	Value
1	Apple	2.0
2	Orange	1.5

Though I am not familiar with the HTML format, I guess the basic idea is to extract the <tr> and <td> tags.

So I made the attached workflow.

Workflow
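
For readers who want to see the <tr>/<td> idea outside Alteryx, here is a minimal Python sketch using only the standard library. It is not the attached workflow, just an illustration of the same approach: each <tr> starts a row and each <td> contributes one cell. The class name `TableRowParser` and the `sample_html` string are made up for this example, and the sample assumes a simple table with no nested tables.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the <tr> currently being read
        self._cell = None   # text fragments of the <td> currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td>
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that had no <td> cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical sample mirroring the small table above
sample_html = """
<table>
  <tr><td>RecordID</td><td>Name</td><td>Value</td></tr>
  <tr><td>1</td><td>Apple</td><td>2.0</td></tr>
  <tr><td>2</td><td>Orange</td><td>1.5</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(sample_html)
rows = parser.rows
for row in rows:
    print(row)
```

Every value comes back as a string, so a numeric column such as Value would still need a type conversion afterwards.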

I hope this helps.

MRPP1982 · ‎09-30-2023

Hello @Yoshiro_Fujimori,

Thanks for your response. Please forgive my ignorance of HTML tags; I am not able to modify this workflow and its tools to suit my purpose. I have made a 'short' version of the actual file and am attaching it here. Could you please take another look and help?

P.S.: I have attached a subset of my HTML as a .7z file; please unzip it.

Many thanks in advance!

Cheers!
- Sai

Yoshiro_Fujimori · ‎10-01-2023

Hi @MRPP1982

Attached is the revised version, with an additional flow to handle <th> tags.

You may need to modify the workflow to apply it to the larger data set, but basically you can extract the rows and columns (and headers) by understanding the HTML tags, and then arrange the contents into a table format yourself in Alteryx.
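
As a rough illustration of the header handling (again a Python sketch, not the attached workflow), the snippet below treats a row made up entirely of <th> cells as the column names and turns each following <td> row into a record keyed by those names. The class name `HeaderedTableParser` and the sample markup are hypothetical, and the sketch assumes the header row comes before the data rows and that there are no nested tables or merged cells.

```python
from html.parser import HTMLParser


class HeaderedTableParser(HTMLParser):
    """Use an all-<th> row as column names; map each <td> row to a dict."""

    def __init__(self):
        super().__init__()
        self.headers = []    # column names taken from the <th> row
        self.records = []    # one dict per data row
        self._cells = None   # (tag, text) pairs for the row being read
        self._tag = None     # "th" or "td" while inside a cell
        self._text = []      # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
            self._tag = None
        elif tag in ("th", "td") and self._cells is not None:
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and tag == self._tag:
            self._cells.append((tag, "".join(self._text).strip()))
            self._tag = None
        elif tag == "tr" and self._cells is not None:
            cells, self._cells, self._tag = self._cells, None, None
            if cells and all(kind == "th" for kind, _ in cells):
                # Header row: remember the column names
                self.headers = [text for _, text in cells]
            elif cells:
                # Data row: pair each value with its column name
                # (assumes the header row has already been seen)
                values = [text for _, text in cells]
                self.records.append(dict(zip(self.headers, values)))


# Hypothetical sample: same table as before, but with a <th> header row
sample_html = """
<table>
  <tr><th>RecordID</th><th>Name</th><th>Value</th></tr>
  <tr><td>1</td><td>Apple</td><td>2.0</td></tr>
  <tr><td>2</td><td>Orange</td><td>1.5</td></tr>
</table>
"""

parser = HeaderedTableParser()
parser.feed(sample_html)
headers = parser.headers
records = parser.records
for record in records:
    print(record)
```

A real file with several tables, or with rowspan/colspan attributes, would need extra handling that this sketch leaves out.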

Good luck.

Workflow

Extract Table from HTML File