Identifying hidden special characters in a field
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
I have a requirement of identifying certain special set of characters which are allowed in a field, rest if any other special character comes in the data field should be raised as a defect.
Please see the below set of data:
Name | LENGTH | Result |
Lee | 4 | Error |
Linda | 6 | Error |
Ad | 3 | Error |
Lee | 3 | Pass |
Linda | 5 | Pass |
Ad | 2 | Pass |
My code is
if
REGEX_MATCH(trim([FirstName]),'[A-Za-z0-9\.\·\`\,\,\ \-\\ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸ¡¿çÇŒœßØøÅ寿ÞþÐð]+') = 0
then 'F'
else 'T'
endif
The first three records are being raised a defect(coming from a systemic data), although i am unable to see any special character.
For testing purpose,I manually entered the names and they are getting passed, doing length check on those shows a difference.
Any help in identifying the root cause of this error and identify the special character so that i can pass that as well in the code.
Also any other perspective if i am missing, please let me know.
Thank you.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
To get the ASCII or Unicode value for hidden characters, use a text editor like Notepad++
And/or you can use the Alteryx function CharToInt to get the Unicode number of the hidden character. See Conversion Functions (alteryx.com)
CharToInt(s): Returns the number that matches the input Unicode® character s.
CharToInt("B") returns 66 (U+0042 'Latin Capital Letter B').
CharToInt("©") returns 169 (U+00A9 'copyright sign').
If you post a workflow or other file with hidden characters, I can provide an example.
Chris
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you, Chris.
That helps.
Using notepad++, i was able to identify there were some hidden special characters as below:
'? ' at beginning of name
Is there a way that i can get rid of these special characters ? I tried using the replace or contains function, but it is not able to catch them.
Appreciate your help.
Regardsm
Rohit Jha.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
It may be easier to specify the characters you want to keep, like a-z, A-Z, underscore, dash, 0-9, etc
Try the RegEx tool with Replace mode
Chris
