community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Replace HTML code with ASCII equivalent

Highlighted
Fireball
Fireball
This is not a question but simply sharing an expression that it just took me a few minutes to create and I thought I'd share.

Business Problem:
I often download web data using Alteryx but find that sometimes the text contains HTML codes that look ugly. I needed a way to replace them with the ASCII equivalent.

Example:
Copyright © 2015 needs to read Copyright © 2015

Methodology:
Here is a page (http://www.ascii.cl/htmlcodes.htm) with a full list of the HTML codes. I found that the numbers used in the HTML codes are the same as the ASCII numbers. Therefore all I had to do was search for the prefix &#, ensure that what followed was a number, and then if it was, use that number with the CharFromInt() function to create the code.
One cool trick I found is that the ToNumber() function automatically drops any non-numeric characters from the end of the number. That took care of the trailing semicolon so I didn't have to.

Syntax:
To use the syntax below in the Formula tool, simply replace [data] with [yourfieldname] where yourfieldname is the name of the field you need to modify.

iif(
   ToNumber(GetWord(Substring([data],FindString([data],'&#')+2),0))>0,
   Replace([data],GetWord(Substring([data],FindString([data],'&#')),0),CharFromInt(ToNumber(GetWord(Substring([data],FindString([data],'&#')+2),0)))),
    [data]
)
John Hollingsworth
Clear Channel Outdoor
Alteryx Alumni (Retired)

@Hollingsworth

 

Thanks for sharing, great tip! :-)

Andy Cooper
Senior Solutions Engineer - EMEA
Alteryx
Atom

I think this a better approach:

 

IF
FindString([html], '&#') >= 0 THEN

REGEX_Replace([html], '&#(.*?);',
CharFromInt(REGEX_Replace(
REGEX_Replace([html], '&#(.*?);', '\1'), '[A-Za-z]', '')))

 ELSE [html] ENDIF

 

also here is a related macro attached

 

 

Labels