Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Regex Remove HTML Containers

7 - Meteor

My input includes HTML containers, and I need to remove them (both the tag characters and the HTML commands) and leave just the actual text values:

 

<div>How do I remove the html tags?</div><br><div>Looking forward to the answer.</div>

 

Desired output: How do I remove the html tags? Looking forward to the answer.

 

Thank you.

Alteryx Certified Partner

@Afammy,

 

I took a stab at this for you using your sample container entry.

 


 

I put a space between the two pieces of text. You may prefer to leave a pipe delimiter behind instead of the space. Please try this and let me know if it helps.

 

Trim(
    Regex_Replace(
        Regex_Replace([HTML_Container], "<.*?>", '|'),
        "[|]{1,}",
        ' '
    )
)

Trim(
    Regex_Replace(
        Regex_Replace([HTML_Container], "<.*?>", '|'),
        "[|]{1,}",
        '|'
    )
)

I've got the two sets of code (space and pipe replacements) above.  

 

Trim() gets rid of any spaces at the front or end of your field.

The inner Regex_Replace uses the lazy wildcard pattern <.*?> to match each tag enclosed by <> and replaces it with a '|' pipe.

The outer Regex_Replace searches for runs of one or more pipes and replaces each run with a single space (first version) or a single pipe (second version).
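
For anyone who wants to check the logic outside Alteryx, the same three steps can be sketched in Python. This is only an illustration under assumptions: the function name strip_html_tags and its delimiter parameter are made up for this example, and Python's re module stands in for Alteryx's Regex_Replace, so edge-case behavior may differ.

```python
import re


def strip_html_tags(text: str, delimiter: str = " ") -> str:
    """Hypothetical helper mirroring the nested formula above."""
    # Step 1 (inner Regex_Replace): swap every <...> tag for a pipe.
    piped = re.sub(r"<.*?>", "|", text)
    # Step 2 (outer Regex_Replace): collapse each run of pipes into one
    # delimiter. The lambda keeps the delimiter literal.
    collapsed = re.sub(r"[|]{1,}", lambda _: delimiter, piped)
    # Step 3 (Trim): drop whitespace at the front and end.
    return collapsed.strip()


sample = (
    "<div>How do I remove the html tags?</div><br>"
    "<div>Looking forward to the answer.</div>"
)

print(strip_html_tags(sample))       # space version
print(strip_html_tags(sample, "|"))  # pipe version
```

Two things to keep in mind: the final trim only strips whitespace, so the pipe version keeps a pipe at each end of the result, and any literal pipe already present in the text is treated as part of a delimiter run.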

 

At least for your test data it seems to work.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
7 - Meteor

Thanks @MarqueeCrew! Worked like a charm. You also gave me a mix of formula and regex to learn and use moving forward. 

5 - Atom

Thanks!  This was just what I needed to solve my problem.

Alteryx Partner

Thank you!! Just what I needed right now.
