
Alteryx Designer Desktop Discussions

SOLVED

HTML Data Source

OscarGeorge
7 - Meteor

Hi All,

I'm trying to bring in an ODBC data source that contains emails from a CRM system; however, the body of each email is saved as HTML. How can I read or manipulate this particular field (Description)?

 

Attached is a screenshot to give you an idea of what's happening. Any help is greatly appreciated.

Many thanks


Oscar 

 

 

snip_20200827112109.png

 

3 REPLIES
DavidP
17 - Castor

Hi @OscarGeorge 

 

I would use a Text to Columns tool on [description], set to split to rows on the delimiter \n.

 

Then use this formula in a Formula tool:  REGEX_Replace([description],'<[!fiohldpMNmcsbtua][^>]*>|<\/[^t][^>]*>|<\/title>|<\/table>|&[a-z]+;','')

 

It does a decent job of removing html tags in many cases.

 

Also try using the regex formula first and then do the split to rows - I'm not sure which will work best.
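
For anyone who wants to try the pattern outside Designer first, here is a minimal Python sketch of the same two steps (split on \n, then strip tags). The function names `strip_html` and `description_to_rows` and the sample text are made up for illustration. It passes `re.IGNORECASE` on the assumption that REGEX_Replace ignores case by default; drop that flag if your setup behaves differently.

```python
import re

# Pattern copied from the formula above; only the quoting differs.
# re.IGNORECASE is an assumption about how REGEX_Replace treats case.
TAG_OR_ENTITY = re.compile(
    r"<[!fiohldpMNmcsbtua][^>]*>|<\/[^t][^>]*>|<\/title>|<\/table>|&[a-z]+;",
    re.IGNORECASE,
)


def strip_html(text: str) -> str:
    """Remove whatever the pattern matches (most tags and named entities)."""
    return TAG_OR_ENTITY.sub("", text)


def description_to_rows(description: str) -> list[str]:
    """Split on newlines, strip tags from each line, and drop empty rows."""
    rows = (strip_html(line).strip() for line in description.split("\n"))
    return [row for row in rows if row]


# Hypothetical sample, not taken from the screenshot in the question.
sample = (
    "<html><body>\n"
    "<p>Hi Oscar,</p>\n"
    "<div>Thanks for your email.</div>\n"
    "<br>\n"
    "</body></html>"
)
rows = description_to_rows(sample)
```

Two things to watch for: named entities such as `&nbsp;` are deleted rather than replaced with a space, so neighbouring words can run together, and closing tags that start with "t" other than `</title>` and `</table>` (for example `</td>`) are left in place, so some cleanup afterwards may still be needed.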

ChrisTX
16 - Nebula
OscarGeorge
7 - Meteor

Thanks David, the first method worked quite well. I had to use a filter and manipulate it a little afterwards, but that's definitely a handy formula.
