Hello all, I have a simple workflow:
The idea is to replace a string of text in each row of a column and output a new column with the 'cleaned' string.
In this workflow, I want to replace HTML tags such as <p*>, including any code inside them, with NULL (or "").
I have tried both the Regex tool and the Formula tool to remove the tags, but neither works.
1. <p><br><br><ul><li><span>XXXXXXXXXXXXXXXX</li><li><span>FXXXXXXXXXXXXXXXX</li><li><span>XXXXXXXXXXXXXXXX</li><li><span>XXXXXXXXXXXXXXXX</li></ul>
after
1. <p><br><br><ul><li>QED RXXXXXXXXXXXXXXXX</li><li>XXXXXXXXXXXXXXXX</li><li>XXXXXXXXXXXXXXXX</li><li>XXXXXXXXXXXXXXXX</li></ul>
2. before:
<p style="box-sizing: border-box; margin: 0px 0px 5px; padding: 0px; border: 0px; outline: 0px; font-size: 14px; display: inline-block; line-height: 20px; color: rgb(0, 0, 0); width: 580px; font-family: FuturaMD, Arial, Helvetica, sans-serif; background-image: initial; background-attachment: initial; background-size: initial; background-origin: initial; background-clip: initial; background-position: initial; background-repeat: initial;">XXXXXXXXXXXXXXXX<br><br><p style="box-sizing: border-box; margin: 0px 0px 5px; padding: 0px; border: 0px; outline: 0px; font-size: 14px; display: inline-block; line-height: 20px; color: rgb(0, 0, 0); width: 580px; font-family: FuturaMD, Arial, Helvetica, sans-serif; background-image: initial; background-attachment: initial; background-size: initial; background-origin: initial; background-clip: initial; background-position: initial; background-repeat: initial;">XXXXXXXXXXXXXXXX<br><br>
after
2. <p style ="box-sizing: border-box; margin: 0px 0px 5px; padding: 0px; border: 0px; outline: 0px; font-size: 14px; display: inline-block; line-height: 20px; color: rgb(0, 0, 0); width: 580px; font-family: FuturaMD, Arial, Helvetica, sans-serif; background-image: initial; background-attachment: initial; background-size: initial; background-origin: initial; background-clip: initial; background-position: initial; background-repeat: initial;">XXXXXXXXXXXXXXXXXXXX <br><br><p box-sizing: border-box; margin: 0px 0px 5px; padding: 0px; border: 0px; outline: 0px; font-size: 14px; display: inline-block; line-height: 20px; color: rgb(0, 0, 0); width: 580px; font-family: FuturaMD, Arial, Helvetica, sans-serif; background-image: initial; background-attachment: initial; background-size: initial; background-origin: initial; background-clip: initial; background-position: initial; background-repeat: initial;">XXXXXXXXXXXXXXXXXXXX<br><br>
My configuration:
As you can see, the orange and green parts are not removed despite using <p*>.
Any help will be appreciated.
Hi Elizabeth,
You are close to the solution. Your expression "<p*>" asks the regex engine to match "<", then "p" zero or more times, and then ">". That only matches a bare tag such as "<p>", never a "<p" followed by attributes.
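To see this outside the workflow, here is a minimal Python sketch. It assumes the Regex tool treats "*" the same way Python's re module does, and the sample row is a hypothetical, shortened version of the data in the question.

```python
import re

# Hypothetical sample row, shortened from the examples in the question.
row = '<p><br><p style="margin: 0px;">XXXX<br>'

# "<p*>" means: "<", then zero or more "p" characters, then ">".
literal_p = re.compile(r"<p*>")

# Only the bare "<p>" qualifies; the styled tag has other characters
# between "<p" and ">", so it is skipped.
matches = literal_p.findall(row)
print(matches)  # ['<p>']

cleaned = literal_p.sub("", row)
print(cleaned)  # <br><p style="margin: 0px;">XXXX<br>
```

The styled tag survives the replacement, which matches what you are seeing in your output.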
What I think you want as your regex is "<p.*?>". This matches "<", then "p", then zero or more (that is what the "*" means) of ".", which is a wildcard for any character, and then ">". The "?" makes the "*" lazy, so it matches as few characters as possible and stops at the first ">".
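If it helps, the effect of the "?" can be sketched in Python, again assuming the Regex tool handles lazy quantifiers the way Python's re module does. The row below is hypothetical sample data, not your actual column.

```python
import re

# Hypothetical sample row with a styled <p> tag and a bare one.
row = '<p style="margin: 0px;">XXXX<br><br><p>YYYY<br>'

# Lazy: ".*?" stops at the first ">" after "<p", so each tag is
# removed on its own and the text between tags is kept.
lazy = re.sub(r"<p.*?>", "", row)
print(lazy)  # XXXX<br><br>YYYY<br>

# Greedy: ".*" runs to the last ">" in the row, so a single match
# swallows everything from the first "<p" to the end.
greedy = re.sub(r"<p.*>", "", row)
print(repr(greedy))  # ''
```

Without the "?", the replacement would also delete the text you want to keep.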
Best,
Daniel
Thank you so much! I copied the expression directly from an Excel macro, hence the error.
The problem is fixed now.