Convert Unicode to Language Text

Xena
5 - Atom

Hello Community!

 

I have a CSV file that contains Twitter messages stored in Unicode (UTF-8) format. I am attempting to convert the Unicode to the characters of the original language. I used the ConvertFromCodePage function, entering the following expression in the Multi-Field Formula tool, in an attempt to convert the Unicode to Japanese (the original language of the tweet).

 

ConvertFromCodePage([_CurrentField_], 20936) 

 

An example of the Twitter input data is shown below.

 

I'm at \u6d77\u9bae\u51e6\u3044\u308f\u3044 in \u6211\u5b6b\u5b50\u5e02, \u5343\u8449\u770c https://t.co/yLOdRigcY0

 

Unfortunately, I received the following output after using the ConvertFromCodePage function, which is not really what I was looking for:

 

Im At U6d77U9baeU51e6U3044U308fU3044 In U6211U5b6bU5b50U5e02 U5343U8449U770c HttpsTCoYlodrigcy0

 

I need to obtain the following:

 

I'm at 海鮮処いわい in 我孫子市, 千葉県 https://t.co/yLOdRigcY0

 

I've read the posting about how to bring double-byte characters (DBCs) into Alteryx at the following link: https://community.alteryx.com/t5/tkb/articleprintpage/tkb-id/knowledgebase/article-id/609

I'm new to Alteryx and believe that I am missing something here. I welcome any and all feedback.

2 REPLIES
AdamR_AYX
Alteryx Alumni (Retired)

Hi Xena,

 

Sorry for the very late reply. I know you have already found a solution to your problem, but I just wanted to add some details for any future users who come across the same issue.

 

The actual function calls that you needed here were

 

CharFromInt(HexToNumber())

 

The added complication is that these work on only a single character at a time. That is to say

 

CharFromInt(HexToNumber("6d77")) = 海

 

To apply it to the whole string, we can use a RegEx tool in Parse mode to pull out each escaped code, and then a replace tool to substitute the converted Unicode characters back into the original string.
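
For anyone who wants to check the logic outside Alteryx, here is a minimal Python sketch of the same idea: find each \uXXXX escape with a regular expression, convert the hex digits to a number, and swap in the matching character. This is not the attached workflow. The names ESCAPE and decode_escaped_unicode are made up for illustration, and the sketch assumes every escape is exactly four hex digits and that there are no surrogate pairs (emoji, for example, would need extra handling).

```python
import re

# One escape: a literal backslash, the letter u, then exactly four hex digits.
# (Hypothetical helper names; not part of Alteryx or the attached workflow.)
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_escaped_unicode(text: str) -> str:
    """Replace each backslash-u escape in text with the character it names."""
    # int(hex, 16) plays the role of HexToNumber; chr() plays the role of CharFromInt.
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Raw string so the backslashes stay literal, as they are in the CSV.
raw = (
    r"I'm at \u6d77\u9bae\u51e6\u3044\u308f\u3044 in "
    r"\u6211\u5b6b\u5b50\u5e02, \u5343\u8449\u770c https://t.co/yLOdRigcY0"
)

decoded = decode_escaped_unicode(raw)
print(decoded)
```

Running this on the sample tweet should print the Japanese text from the original question. Anything that is not an escape, such as the URL, passes through unchanged.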

 

ConvertFromEscapedUnicode.png

 

I have attached an example workflow.

 

Adam

 

Adam Riley
https://www.linkedin.com/in/adriley/
Emmanuel_G
13 - Pulsar

@AdamR_AYX Thanks for your answer! 🙂
