

How to extract data from HTML code.

BihaniBhavesh
5 - Atom

Hello, I'm looking for a solution to extract the strings inside HTML code. Can you please share code or an Alteryx workflow that produces output like this:

 

Input : <span style="color: rgb(255, 153, 0); font-size: 14px;"><span style="font-size:20px;color:rgb(255,153,0);">My name is Ram </span></span>

Output : My name is Ram

 

Input : <b style="color: rgb(255, 153, 0);">Please Enter Name </b><div><b style="color: rgb(255, 153, 0);">Ram</b></div>

Output : Please Enter Name Ram

acarter881
12 - Quasar

Hello @BihaniBhavesh.

 

You can parse HTML inside of Designer in multiple ways. One way is to use the RegEx tool (see the attached workflow).
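For readers without the attached workflow, here is a minimal Python sketch of the regex idea. The pattern `<[^>]+>` and the helper name `strip_tags` are my own assumptions for illustration, not necessarily what the workflow uses. It replaces every tag with a space, decodes HTML entities, and collapses the leftover whitespace.

```python
import re
from html import unescape

# Assumed pattern: "<", then one or more characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(html_text: str) -> str:
    """Return the visible text of a simple HTML fragment (hypothetical helper)."""
    # Swap each tag for a space so text from neighbouring elements stays separated.
    text = TAG_PATTERN.sub(" ", html_text)
    # Decode entities such as &amp; into plain characters.
    text = unescape(text)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


sample_1 = (
    '<span style="color: rgb(255, 153, 0); font-size: 14px;">'
    '<span style="font-size:20px;color:rgb(255,153,0);">My name is Ram </span></span>'
)
sample_2 = (
    '<b style="color: rgb(255, 153, 0);">Please Enter Name </b>'
    '<div><b style="color: rgb(255, 153, 0);">Ram</b></div>'
)

print(strip_tags(sample_1))  # My name is Ram
print(strip_tags(sample_2))  # Please Enter Name Ram
```

The same pattern should carry over to the RegEx tool in Replace mode with a single space as the replacement text, though I have not checked it against the attached workflow. One trade-off: because each tag becomes a space, a tag in the middle of a word (for example `He<b>llo</b>`) would split that word in two.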

 


 

For more complicated HTML, it may be best to use a programming language such as Python. The Python library Beautiful Soup is great at parsing HTML.
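As a rough illustration of the parser-based approach, here is a sketch that uses Python's built-in `html.parser` module rather than Beautiful Soup, so it runs without installing anything. The class name `TextExtractor` and the function `extract_text` are hypothetical names chosen for this example. Unlike a plain regex, it skips the contents of `script` and `style` elements.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = ("script", "style")

    def __init__(self):
        # convert_charrefs=True decodes entities like &amp; before handle_data is called.
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_text(html_text: str) -> str:
    """Return the visible text of an HTML fragment with whitespace normalised."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    # Join text nodes with spaces, then collapse repeated whitespace.
    return " ".join(" ".join(parser.chunks).split())


html_input = (
    '<b style="color: rgb(255, 153, 0);">Please Enter Name </b>'
    '<div><b style="color: rgb(255, 153, 0);">Ram</b></div>'
)
print(extract_text(html_input))  # Please Enter Name Ram
```

If Beautiful Soup is installed, `BeautifulSoup(html_input, "html.parser").get_text(" ", strip=True)` should give a similar result in one line. Inside Designer, code like this would typically run in the Python tool.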

tristank
11 - Bolide

Here is a fun challenge for parsing HTML 

 

You can look at some of the solutions and practice parsing yourself, but generally @acarter881 hit the nail on the head: you will want to use RegEx to look for patterns in the HTML and parse on those.

 

Best of luck!

 

Tristan

Raj_007
8 - Asteroid

Thank you so much. I tried this regex with the Tokenize method and got the data in 2 columns, json name and jsonname_value,

where I see the column names first, followed by the data that belongs to those columns. I have to take this to the next level by validating 

 
