Alteryx Designer Desktop Discussions

SOLVED

RegEx text find and replace

Marc_R
5 - Atom

Hello experts. I am struggling to find the correct formula or RegEx expression to remove certain HTML tags from a specific column in my dataset.

 

I have the following tags which I am trying to remove:

  • <p style="position: static;">
  • </p>
  • <span style="color: rgb(34, 34, 34); font-family: Poppins; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; float: none; display: inline !important; position: static;">
  • </span>

The raw text appears as:

  • <p style="position: static;">TextTextText TextTextText.</p>
  • <span style="color: rgb(34, 34, 34); font-family: Poppins; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; float: none; display: inline !important; position: static;">TextTextText Text Text Text</span>

I have attempted to utilize the following Formula with no success (undesired text remains):

  • REGEX_REPLACE([Rule Name], "<pstyle=['\"]?[^'>'\"]*['\"]?>", "")

I have also attempted to utilize the following RegEx expression, and a few others, with no success (undesired text remains):

(<p style="position: static;">\s)

 

I appreciate any help the experts can provide!

3 REPLIES
cjaneczko
13 - Pulsar

Are there tags you need to keep? If you need to remove all HTML tags you can use this as your regex.

 

REGEX_REPLACE([Rule Name],<[^>]+>,"")
Marc_R
5 - Atom

Thank you for the suggestion. No, I do not want to keep any part of the tag - I wish to remove it entirely.

 

I utilized the formula and received an error - "Parse Error at char(25): Malformed Function Call (Expression #1)".

  • Character 25 in the formula is ' ] ' which comes after the output column name, 'Rule Name'
     

     

Marc_R
5 - Atom

I just figured out the issue with the formula. The regular expression pattern (the second argument) was not surrounded by quotation marks. The correct syntax for the formula is:

REGEX_REPLACE([Rule Name],"<[^>]+>","")
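To check the pattern outside Alteryx, here is a minimal sketch using Python's re module rather than the Formula tool. It assumes the pattern <[^>]+> behaves the same way in both engines, which should hold for a pattern this simple. The function name strip_tags and the shortened span style are illustrative only and not part of the thread. The sketch also transcribes the two earlier attempts to show one plausible reason they left the text unchanged: the first has no space between "p" and "style", and the second requires a whitespace character immediately after the opening tag.

```python
import re

# Pattern from the accepted answer: "<", one or more characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(text: str) -> str:
    """Remove every <...> tag and keep the text between the tags (hypothetical helper)."""
    return TAG_PATTERN.sub("", text)


# Sample values modeled on the question; the long span style is abbreviated here.
samples = [
    '<p style="position: static;">TextTextText TextTextText.</p>',
    '<span style="color: rgb(34, 34, 34); position: static;">TextTextText Text Text Text</span>',
]

cleaned = [strip_tags(s) for s in samples]
for value in cleaned:
    print(value)

# The two earlier attempts, transcribed for Python's regex syntax.
first_attempt = re.compile(r"""<pstyle=['"]?[^'>'"]*['"]?>""")  # "pstyle" has no space
second_attempt = re.compile(r'(<p style="position: static;">\s)')  # needs whitespace after the tag

# Neither attempt finds anything in the first sample, so nothing would be replaced.
attempt_matches = [bool(p.search(samples[0])) for p in (first_attempt, second_attempt)]
print(attempt_matches)
```

Note that this pattern removes anything that looks like a tag, so a value containing a literal "<" followed later by ">" would lose the text in between as well.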
