
Alteryx Designer Desktop Discussions

SOLVED

Replace Function Help

bernardo_roschke
5 - Atom
I'm trying to scrape web data and clean up the HTML. Using the Replace function, I want to isolate the data I need by replacing the surrounding strings with pipes ("|").

My first Replace formula works, but the second does not. Perhaps it is because of special characters; I'm not sure.

Link to module: https://dl.dropboxusercontent.com/u/60455118/BBQ%20Events.yxmd

The data I want to isolate and eventually turn into a table is about three-quarters of the way down in the Download Data field.

Thanks. 
3 REPLIES
ChadM
Alteryx Alumni (Retired)
Hi Bernardo,

The link is not working for me. Can you please post the REPLACE() function you are trying to use, along with an example of the data?

Thanks!

Chad
bernardo_roschke
5 - Atom
Try this link. 

https://www.dropbox.com/s/2odc64x6k7gtibl/BBQ%20Events.yxmd

Here is the Replace formula:

replace([DownloadData],'</td></tr></table>»</td>¶','|')

Thank you.
ChadM
Alteryx Alumni (Retired)
Hi Bernardo,

Because the returned data contains newline characters and a few other things, a RegEx expression is probably your best bet. Try this in a Formula Tool:

REGEX_Replace([DownloadData], '.*?(?:<h1>)(.*?)(?:</td> ).*', '$1')

If you want to also keep the <H1> tag, try this in your Formula Tool expression:

REGEX_Replace([DownloadData], '.*?((?:<h1>).*?)(?:</td> ).*', '$1')
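
To see what the two patterns capture, here is a rough sketch in Python rather than Alteryx. Everything in it is an assumption for illustration: the sample HTML is made up (it is not the real BBQ Events page), `regex_replace` is a hypothetical helper standing in for REGEX_Replace, and `re.DOTALL` is used so that `.` also matches the newlines in the data. Python's engine is not the one Alteryx uses, so treat this only as a way to check the logic. The tag-keeping pattern is written with its capture group closed right after the lazy `.*?`.

```python
import re

# Hypothetical stand-in for the [DownloadData] field; note the space after </td>,
# which both patterns rely on as the end marker.
SAMPLE = (
    "<html>\n<body>\n"
    "<h1>BBQ Events</h1>\n"
    "<table><tr><td>Memphis in May</td> </tr></table>\n"
    "</body></html>\n"
)

# Capture only the text between <h1> and the first "</td> ".
WITHOUT_TAG = r".*?(?:<h1>)(.*?)(?:</td> ).*"
# Same idea, but the capture group also includes the <h1> tag itself.
WITH_TAG = r".*?((?:<h1>).*?)(?:</td> ).*"


def regex_replace(text, pattern, replacement):
    """Hypothetical helper: replace every match of pattern in text.

    re.DOTALL lets "." match newline characters, so the leading ".*?" and
    trailing ".*" can swallow everything outside the captured region.
    """
    return re.sub(pattern, replacement, text, flags=re.DOTALL)


# Python writes the first capture group as \1 where the formula uses $1.
without_tag = regex_replace(SAMPLE, WITHOUT_TAG, r"\1")
with_tag = regex_replace(SAMPLE, WITH_TAG, r"\1")

print(without_tag)
print(with_tag)
```

Because the whole input is matched and replaced by the first capture group, only the region between the two markers is left. If the markers are not found, the input comes back unchanged, which is a quick way to spot a pattern that does not fit the data.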

Huge thanks to Garth Miles for his help with this!

Chad