Alteryx Designer Desktop Discussions


How to extract data from HTML code.

BihaniBhavesh
5 - Atom

Hello, I'm looking for a solution to extract the text strings inside HTML code. Can you please share the code or an Alteryx workflow that produces final output like this:

 

Input : <span style="color: rgb(255, 153, 0); font-size: 14px;"><span style="font-size:20px;color:rgb(255,153,0);">My name is Ram </span></span>

Output : My name is Ram

 

Input : <b style="color: rgb(255, 153, 0);">Please Enter Name </b><div><b style="color: rgb(255, 153, 0);">Ram</b></div>

Output : Please Enter Name Ram

acarter881
12 - Quasar

Hello @BihaniBhavesh.

 

You can parse HTML inside of Designer in multiple ways. One way is to use the RegEx tool (see the attached workflow).
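The attached workflow is not reproduced in this thread, so the following is only a minimal sketch of the same regex idea, written in Python rather than as a RegEx tool configuration. The pattern `<[^>]+>` and the helper name `strip_tags` are assumptions chosen for illustration; they are not taken from the workflow.

```python
import re

def strip_tags(html: str) -> str:
    """Remove anything that looks like an HTML tag and tidy the whitespace."""
    # Replace each tag with a space so text from adjacent elements
    # does not run together (e.g. "Name</b><div><b>Ram").
    text = re.sub(r"<[^>]+>", " ", html)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()

samples = [
    '<span style="color: rgb(255, 153, 0); font-size: 14px;">'
    '<span style="font-size:20px;color:rgb(255,153,0);">My name is Ram </span></span>',
    '<b style="color: rgb(255, 153, 0);">Please Enter Name </b>'
    '<div><b style="color: rgb(255, 153, 0);">Ram</b></div>',
]

for sample in samples:
    print(strip_tags(sample))
```

On the two inputs from the question this prints "My name is Ram" and "Please Enter Name Ram". It is only suitable for simple fragments like these: a ">" inside an attribute value, comments, or script/style blocks would not be handled correctly, and character entities such as &amp; are left undecoded.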

 

[Image: acarter881_0-1685400813275.png]

 

For more complicated HTML, it may be best to use a programming language such as Python. The Python library Beautiful Soup (installed as beautifulsoup4) is well suited to parsing HTML.
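As a rough sketch of the parser-based approach, the example below uses Python's standard-library `html.parser` module as a stand-in, so it runs without installing anything. The class name `TextExtractor` and the function `extract_text` are hypothetical names for illustration. With Beautiful Soup installed, `BeautifulSoup(html, "html.parser").get_text()` would play a similar role, though its whitespace handling differs.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text nodes of a document and ignores the tags."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &amp;
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.chunks.append(data)

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Join text nodes with a space, then normalise whitespace.
    return " ".join(" ".join(parser.chunks).split())

html = (
    '<b style="color: rgb(255, 153, 0);">Please Enter Name </b>'
    '<div><b style="color: rgb(255, 153, 0);">Ram</b></div>'
)
print(extract_text(html))
```

For the second input from the question this prints "Please Enter Name Ram". Joining text nodes with a space is a design choice that keeps words from separate elements apart; the trade-off is that a word split across inline tags (for example `<b>R</b>am`) would come out with a space in it.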

tristank
11 - Bolide

Here is a fun challenge for parsing HTML 

 

You can look at some of the solutions and practice parsing yourself, but generally @acarter881 hit the nail on the head: you will want to look for patterns in the HTML and use RegEx to parse them.

 

Best of luck!

 

Tristan

Raj_007
8 - Asteroid

Thank you so much. I tried this regex with the Tokenize method and got the data in 2 columns, json name and jsonname_value, where I see the column names first, followed by the data that belongs to those columns. I now have to take this to the next level by validating

 
