SOLVED

HTML Data Source

OscarGeorge
7 - Meteor

Hi All,

I'm trying to bring in an ODBC data source that contains emails from a CRM system; however, the body of each email is saved as HTML. How can I read or manipulate this particular field (Description)?

 

Attached is a screenshot to give you an idea of what's happening. Any help is greatly appreciated.

Many thanks


Oscar 

 

 

[Attached screenshot: snip_20200827112109.png]

 

3 REPLIES
DavidP
17 - Castor

Hi @OscarGeorge 

 

I would use a Text to Columns tool on [description], set to split to rows on the delimiter \n.

 

Then use this formula in a Formula tool:  REGEX_Replace([description],'<[!fiohldpMNmcsbtua][^>]*>|<\/[^t][^>]*>|<\/title>|<\/table>|&[a-z]+;','')

 

It does a decent job of removing HTML tags in many cases.

 

Also try applying the regex formula first and then splitting to rows. I'm not sure which order will work best.
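
For anyone who wants to test the pattern outside Designer, here is a rough Python sketch of the same two steps (split on \n, then strip tags with the regex above). The function name strip_html and the sample strings are made up for illustration, and the re.IGNORECASE flag is an assumption meant to mirror REGEX_Replace ignoring case by default; it is not the workflow itself.

```python
import re

# Same pattern as the Formula tool expression above.
# re.IGNORECASE is an assumption to mimic REGEX_Replace's default case handling.
TAG_PATTERN = re.compile(
    r"<[!fiohldpMNmcsbtua][^>]*>|</[^t][^>]*>|</title>|</table>|&[a-z]+;",
    re.IGNORECASE,
)


def strip_html(description: str) -> list[str]:
    """Split the field into rows on newlines, strip tags, drop empty rows."""
    rows = description.split("\n")  # the "split to rows" step
    cleaned = [TAG_PATTERN.sub("", row).strip() for row in rows]
    return [row for row in cleaned if row]  # the follow-up filter step


if __name__ == "__main__":
    sample = "<html><body>\n<p>Hello&nbsp;World</p>\n<div>Bye</div>\n</body></html>"
    print(strip_html(sample))
```

Note that the pattern deliberately skips closing tags that start with "t" other than </title> and </table>, so something like </td> is left in place and may need an extra clean-up step.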

ChrisTX
15 - Aurora
OscarGeorge
7 - Meteor

Thanks David, the first method worked quite well. I had to use a filter and manipulate it a little afterwards, but that's definitely a handy formula.
