Hi All!
I am working on a data set that has Chinese, Japanese, and Korean (CJK) characters.
Basically I have "안녕하세요" as a value in a column.
I need to split this out into one character per column. Every character should be kept, including repeats: in "HeLLo", for example, both L's would each get their own column.
There are no spaces between the characters, so I can't split on spaces.
So far I have tried the regex (.) as a test, and it grabbed "any character" as expected. I wasn't sure whether the match moved on past the character it grabbed, so my next test was "(.)(.)(.)(.)(.)(.)" (12 year old giggle :P). That failed; it didn't even return the very first character.
Any help or ideas?
As a note, this is to try to match different names to each other that are using CJK characters by doing exact match tests on single characters in a variable amount of columns.
Thanks in advance!
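A note on the six-group test above: one likely reason it returned nothing is that "안녕하세요" is only five characters long, and a pattern with six (.) groups needs at least six characters before it can match at all. The sketch below reproduces that behavior with Python's `re` module rather than the Alteryx RegEx tool, so treat it as an illustration under that assumption; the variable names are made up for the example.

```python
import re

name = "안녕하세요"  # five characters

# A single (.) group captures only the first character of the value.
first = re.match(r"(.)", name).group(1)

# Six groups require at least six characters, so a five-character
# value produces no match at all (not even the first character).
six_groups = re.match(r"(.)(.)(.)(.)(.)(.)", name)

# With exactly five groups, each character lands in its own group.
five_groups = re.match(r"(.)(.)(.)(.)(.)", name).groups()

print(first)
print(six_groups)
print(five_groups)
```

A fixed number of groups only works when every value has the same length, which is why a per-character split is a better fit for names of varying length.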
Hi Marquee,
Thanks for the response.
I ran your RegEx and the result was the first character in my name column was retrieved into a new column, but it didn't grab the subsequent characters into their own columns as well.
I think I am going to look into String functions to get this one solved.
Last night I was focused on the finding of duplicate CJK characters. Here is a workflow that will parse each character to a row.
Cheers,
Mark
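For readers without access to the workflow, the same idea (one character per row, or one per column with padding for shorter values) can be sketched in Python. This is not Mark's workflow, only an illustration; `chars_to_rows` and `chars_to_columns` are hypothetical helper names, and the split is per Unicode code point, so it assumes the text uses precomposed characters.

```python
def chars_to_rows(value):
    """Return one (position, character) pair per character, repeats included."""
    return [(pos, ch) for pos, ch in enumerate(value, start=1)]


def chars_to_columns(values):
    """Split each value into one character per column.

    Shorter values are padded with empty strings so every row
    has the same number of columns as the longest value.
    """
    width = max((len(v) for v in values), default=0)
    return [list(v) + [""] * (width - len(v)) for v in values]


rows = chars_to_rows("HeLLo")
columns = chars_to_columns(["안녕", "HeLLo"])

print(rows)
print(columns)
```

Keeping the position alongside each character makes it possible to compare two names character by character at the same position, which is what the exact-match test in the original question needs.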
Thank you very much Mark!!
You're very welcome @BobSnyder85.
I'm glad that my test data didn't get caught in @LeahK's naughty word list.
Cheers,
Mark