Alteryx Designer Desktop Discussions

SOLVED

How to convert broken letters?

knozawa
11 - Bolide

Hello,

 

I am using the Grid database and found multiple broken characters.

Sample Data.png

The data field included "◻◻â" where a "-" was supposed to be, so I used a formula to replace it.  Although the letter "â" was replaced with "-", the two squares were not removed from the data.  It seems like the actual data does not include the two squares, though...

Sample Data 2.png

Does anyone know how to handle this issue?

 

Sincerely,

Kazumi

 

 

 

Aguisande
15 - Aurora

Hi @knozawa

One thing that works for me in these cases is to handle all String fields as WString (use a Select tool to change them).

 

Can you try this and let us know?

Thanks

knozawa
11 - Bolide

@Aguisande,

 

Thank you for your suggestion.  I converted the field from V_String to WString, but the result was the same.  The two squares were still there.

 

I also tried ConvertFromCodePage() and DecomposeUnicodeForMatch(), but neither worked well.  I believe the code is ISO639, but there was no matching option for the ConvertFromCodePage() formula.  After using DecomposeUnicodeForMatch(), some of the letters were converted to "?" marks.

 

Sincerely,

Kazumi

Aguisande
15 - Aurora

Can you share a sample of the data you're using, so I can see the exact case?

 

knozawa
11 - Bolide

@Aguisande,

 

I attached the sample .yxdb data.

Thank you for your help!

 

Sincerely,

Kazumi

knozawa
11 - Bolide

@Aguisande,

 

grid.413735.7 (name: Harvard–MIT Division of Health Sciences and Technology)

 

grid.10067.30 (label: Национальный университет «Львовская политехника»)

 

grid.10211.33 (label: Lüneburg‚)

 

Above are some examples that I would like to fix.

 

Sincerely,

Kazumi

Aguisande
15 - Aurora

Hi,

Reviewing the data you sent, I got these results:

UTF-8.PNG

 

The encoding of the data seems to be UTF-8.
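
To illustrate why UTF-8 data can show up as "â" plus two squares, here is a minimal Python sketch (outside Alteryx). It assumes the file was decoded as Latin-1 (ISO 8859-1); the code page actually applied in the workflow is not confirmed in this thread. The en dash in "Harvard–MIT" is three bytes in UTF-8, and reading each byte as a separate Latin-1 character yields "â" and two invisible control characters, which many fonts draw as squares.

```python
# Sketch: reproduce and undo the "broken letters" effect.
# Assumption: the UTF-8 bytes were decoded as Latin-1 (not confirmed by the thread).

original = "Harvard\u2013MIT"          # U+2013 is the en dash

# What is stored in a UTF-8 file: the en dash becomes b'\xe2\x80\x93'
raw = original.encode("utf-8")

# Decoding those bytes with the wrong code page turns one character into three:
# 'â' (0xE2) followed by two C1 control characters (0x80, 0x93)
garbled = raw.decode("latin-1")

# Replacing only 'â' leaves the two control characters behind
partially_fixed = garbled.replace("\u00e2", "-")

# Reversing the wrong decode recovers the original text
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(repr(partially_fixed))
print(repaired)
```

This would also explain why replacing only "â" left the two squares in place: they are separate characters in the mis-decoded string.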

 

In which format (.csv, Excel) is the original data ?

 

knozawa
11 - Bolide

@Aguisande,

 

The original data is CSV (type: Microsoft Office Excel Comma Separated Values File).  I downloaded the data from the Grid database and combined all 11 data files together.  I have attached the workflow.

 

Thank you very much for your help.

 

Sincerely,

Kazumi

Aguisande
15 - Aurora

Using your workflow, I changed the Encoding of the label.csv file to UTF-8. All accents, foreign-language characters, and symbols are there (see below). I think if you do this for all your inputs, all characters should be right.

 

UTF-8.PNG
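
For anyone checking the same thing outside Alteryx, the effect of the encoding setting can be sketched in Python. This is only an illustration: the column names `grid_id` and `label` and the helper `read_labels` are hypothetical, and the sample row reuses one of the labels quoted earlier in the thread.

```python
import csv
import tempfile
from pathlib import Path


def read_labels(path, encoding):
    """Read a CSV file into a list of dicts using the given text encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.DictReader(f))


# Build a small UTF-8 sample file (hypothetical column names)
expected = "Национальный университет «Львовская политехника»"
sample = Path(tempfile.mkdtemp()) / "label.csv"
sample.write_text(f"grid_id,label\ngrid.10067.30,{expected}\n", encoding="utf-8")

# Reading with the wrong code page produces garbled text
wrong = read_labels(sample, "latin-1")

# Reading with UTF-8 preserves Cyrillic, accents, and symbols
right = read_labels(sample, "utf-8")

print(wrong[0]["label"])
print(right[0]["label"])
```

The point is the same as in the workflow: the fix belongs at the input step, where the bytes are first decoded, rather than in a later replace formula.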

 

knozawa
11 - Bolide

@Aguisande,

 

Thank you very much!

 

Sincerely,

Kazumi
