Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Some Unicode characters are not convertable by Alteryx workflow

kmoon
8 - Asteroid

Hi,

 

Recently I went through an experience where despite using the function TitleCase(DecomposeUnicodeForMatch([_CurrentField_])), some unicode characters are not passed through the workflow and write to database. One of the column field had value of K’collon. Alteryx didn't recognize this character and truncated all the value after this character. I see only the value K in the database table. This special character is typed by Alt+0146. The symbol is well observed in Excel.

 

Can Alteryx provide any mean to recognize this character and pass all the string after this special symbol to database?

 

Your response would be appreciated.

 

Thanks.

21 REPLIES 21
kmoon
8 - Asteroid

Hi @JohnJPS ,

 

I tried doing the same with the following formula of REGEX_REPLACE([INPUTFIELD]," ' ", ' " '). It shows the same thing, keeping only single character.

 

Just FYI....I want to provide some insights about the data sequential.

 

1st row: K’collon

2nd row: C'collon

3rd row: Dcollon

The 1st row has special character typed by Alt+0146.

When it's writing down the values to database, I see only

K

C

D

Whereas it should show the full string values for atleast the 3rd row.

Why the 3rd row is affected?

 

Thanks.

JohnJPS
15 - Aurora

Based on "Dcollon" also failing... out of curiosity, if you simply plug some generic data [Field] = "this", "that", "those" (etc)... all ordinary characters, and send it into the database.... still truncated?  If so, then it could be that the field length in the database is only one character, which would definitely explain all the truncations.

kmoon
8 - Asteroid

Hi @JohnJPS ,

When I am getting rid of the 1st row value, only keeping the 2nd (C'collon) and the 3rd row (Dcollon) values, it writes the 2 values with complete string as shown below:

 

C'collon

Dcollon

 

I guess the special character   is corrupting the whole table column and therefore truncating the values after it appears for all the rows.

FYI....the database table column size is 150. So it should fit in well.

 

Any remedy to handle the special character values so it doesn't corrupt the entire values.

 

Thanks.

JohnJPS
15 - Aurora

Weird.... when inserting to the database, you sure you're inserting "outputField" and not "inputField"?  LOL...  I only ask because outputField simply shouldn't even have the special character at all.

 

And regardless, it's really weird that the result of a bad character in one row would impact every other row being inserted... I wonder what happens if you move the K’collon example to the 2nd row... does the first row then succeed, but everything after K’collon fail?

kmoon
8 - Asteroid

Hi @JohnJPS ,

 

I tried that too! Changing the position of the row. But it shows the same in Oracle database.

 

Thanks.

JohnJPS
15 - Aurora

OK, I've tried another test - I set up a connection to LocalDB with SQL Server, and wrote out my entire data stream (e.g. what I see in the browse tool: original string, charToInt results, etc... see attached workflow) ... everything writes into SQL Server without issue:

 

image.png

 

If you replace my connection here with your Oracle connection, and it works, then the issue is somewhere in your workflow... if this doesn't work with your Oracle database, then the issue has something to do with Oracle and you should contact an Oracle DBA to have them look into it further.

 

Hope that helps!

John

kmoon
8 - Asteroid

Hi @JohnJPS ,

 

Here is what I did to make it work at the database side. 

1. Initially I replaced the special character ( ) with apostrophe ( ' ) using Regex_Replace([Inputfield]," ’ " , " ' ") in the formula tool.

2. Then changing the field type from V_WSTRING to V_STRING as explained in the below link by @MacRo . 

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/A-question-about-field-types-v-string-...

 

As per the explanation, the V_WSTRING requires more bytes per character than V_STRING.

In my database column, the field size was 150.

When the value with special character ( ) was present, it is writing only 1 character for all the values with data type being V_WSTRING. I guess it required more bytes per character than V_STRING.

 

Isn't it weird?

Would appreciate your feedback.

 

Thanks.

JohnJPS
15 - Aurora

I guess it makes sense, if V_WString is basically "double-wide" but the connection to Oracle is unaware, then in Oracle instead of getting...

ABC123

... it sees ...

A0B0C0102030

(every-other character is literally a hexadecimal zero)... and if Oracle regards that hex zero as the end of the string, all you ever get is the first character.  So... makes sense from that angle.

 

Glad to hear switching to V_String solves it!

kmoon
8 - Asteroid

Hi @JohnJPS ,

 

I found an interesting observation. The column type was originally V_String. However, when excel file is uploaded, the inputfield column changes automatically to V_WSTRING. Probably that's the reason when the value is chopped to 1 character.

 

Any thought on keeping the data type same.

 

Would appreciate your feedback as you always do.

 

Thanks.

JohnJPS
15 - Aurora
The select tool will do the job to enforce the V_String type.
Labels