SOLVED
Finding Non-UTF 8 Characters
Sarath27
8 - Asteroid
08-03-2022
05:24 AM
Hi All,
I want to get a list of the non-UTF-8 characters in my data:
i) [•¡¤¦§¨©ª«¬-®¯±²³¶¹º»¼½¾¿‡•…‰‹›⁰€™■・�]
ii) Any non-English characters
Example
It can be done using REGEX_Match in the Filter tool with the expression below.
REGEX_Match([Field 1],"[^\x00-\x7F]+")
The True output should give the records with non-English characters.
The False output should give the records with only English characters.
But the problem is that the True output gives NULL when I use this regex in the Filter tool. Please kindly advise on this. Thanks.
Labels:
- Data Investigation
- Datasets
3 REPLIES
12 - Quasar
08-03-2022
07:40 AM
Are you trying to get a list of those characters? Or are you trying to filter any records that have those characters?
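To make the two readings concrete, here is a rough sketch in Python rather than Alteryx. The sample records and the function names are made up for illustration; only the character class comes from the original post.

```python
import re

# Anything outside the 7-bit ASCII range (same class as in the question).
NON_ASCII = re.compile(r"[^\x00-\x7F]")

# Hypothetical sample records, not taken from the thread.
records = ["plain text", "price: 10€", "café • menu", "naïve café"]

def distinct_non_ascii(values):
    """Reading 1: list each distinct non-ASCII character, in first-seen order."""
    seen = {}
    for value in values:
        for ch in NON_ASCII.findall(value):
            seen.setdefault(ch, None)
    return list(seen)

def records_with_non_ascii(values):
    """Reading 2: keep only records containing at least one such character."""
    return [v for v in values if NON_ASCII.search(v)]

chars = distinct_non_ascii(records)
flagged = records_with_non_ascii(records)
print(chars)
print(flagged)
```

The first function answers "which characters are in my data?", the second answers "which rows do I need to look at?". Which one is needed changes how the workflow should be built.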
DavidSkaife
14 - Magnetar
08-03-2022
08:41 AM
Try changing the regex to ".*[^\x01-\x7F].*", which seems to work.
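A likely explanation, assuming REGEX_Match has to match the whole field rather than just part of it: the original pattern only succeeds when every character in the field is non-ASCII, while the leading and trailing `.*` let it succeed when such a character appears anywhere. The sketch below imitates that behaviour with Python's `re.fullmatch`. The helper name and sample values are hypothetical, and this is not Alteryx's actual implementation.

```python
import re

ORIGINAL = r"[^\x00-\x7F]+"    # pattern from the question
REVISED = r".*[^\x01-\x7F].*"  # pattern suggested above

def whole_field_match(pattern, text):
    # Assumption: stands in for a REGEX_Match that must cover the entire field.
    return re.fullmatch(pattern, text) is not None

samples = ["hello", "10€", "€€"]  # hypothetical field values

original_hits = [s for s in samples if whole_field_match(ORIGINAL, s)]
revised_hits = [s for s in samples if whole_field_match(REVISED, s)]

print(original_hits)  # ['€€']
print(revised_hits)   # ['10€', '€€']
```

Under that assumption, a mixed value such as "10€" fails the original pattern but passes the revised one, which would explain why the True output looked empty for ordinary mixed text.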
08-03-2022
09:51 AM
Dear David,
Thanks very much. It works!
