SOLVED

Error: 'unicodeescape' codec can't decode bytes

stanleychen
6 - Meteoroid

When I run the code tables = camelot.read_pdf('C:\Users\stanleychen\Desktop\MR2-PL1.pdf') in the Python tool, I get the following error. What should I do to avoid it? Thank you.

 

  File "<ipython-input-4-d9a6999f73bb>", line 2
    tables = camelot.read_pdf('C:\Users\stanleychen\Desktop\MR2-PL1.pdf')
                             ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

 

3 REPLIES
PaulN
Alteryx Alumni (Retired)

Hi @stanleychen,

 

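The error comes from how Python interprets escape sequences (any sequence starting with '\') in ordinary string literals. In particular, "\U" introduces a 32-bit Unicode escape that must be followed by exactly eight hexadecimal digits, so the "\Users" in your path cannot be decoded.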

 

You should use a raw string (r"xxx") to avoid this behaviour:

 

  tables = camelot.read_pdf(r"C:\Users\stanleychen\Desktop\MR2-PL1.pdf")

 

See https://docs.python.org/3.6/reference/lexical_analysis.html, section 2.4.1, "String and Bytes literals", which covers both escape sequences and raw strings.
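To make the difference concrete, here is a minimal sketch that reproduces the error safely with compile() and then shows that a raw string and a string with doubled backslashes hold the same path. The variable names (bad_source, raw_path, escaped_path, error_message) are made up for illustration; the path is the one from the question.

```python
# The failing statement is kept as *text* (in a raw string) so this script
# itself still parses; compile() then hits the same error as the question.
bad_source = r"path = 'C:\Users\stanleychen\Desktop\MR2-PL1.pdf'"

error_message = None
try:
    compile(bad_source, "<demo>", "exec")
except SyntaxError as exc:
    # The message names the 'unicodeescape' codec, as in the traceback above.
    error_message = str(exc)

# Two equivalent ways to write the path correctly:
raw_path = r"C:\Users\stanleychen\Desktop\MR2-PL1.pdf"          # raw string
escaped_path = "C:\\Users\\stanleychen\\Desktop\\MR2-PL1.pdf"   # doubled backslashes

print(error_message)
print(raw_path == escaped_path)
```

Either spelling can be passed to camelot.read_pdf; the raw string is usually easier to read for Windows paths.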

 

Best,

 

PaulN

warrenfelsh
5 - Atom

Unicode string types are a handy Python feature that lets you decode encoded strings and forget about the encoding until you need to write or transmit the data. Python tries to convert a byte array (a bytes object that it assumes to be a UTF-8-encoded string) to a Unicode string (str). This is a decoding step that follows UTF-8 rules. When it tries this, it encounters a byte sequence that is not allowed in UTF-8-encoded data (here, 0xff at position 0). One simple way to avoid this error is to encode such strings with the encode() function as follows (if a is the string with non-ASCII characters):

 

a.encode('utf-8').strip()

 

Or

 

Use the ISO-8859-1 encoding to solve the issue.
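As a rough illustration of the situation this reply describes, the sketch below decodes a made-up byte string that starts with 0xff. Note that this is a run-time UnicodeDecodeError on bytes, which is raised separately from the SyntaxError in the original question. The sample bytes and variable names are assumptions for the example only.

```python
# Hypothetical sample data: 0xff at position 0 is never valid in UTF-8.
data = b"\xff\xfeabc"

utf8_failed = False
try:
    data.decode("utf-8")
except UnicodeDecodeError:
    utf8_failed = True

# ISO-8859-1 maps every byte value to a character, so decoding cannot fail,
# although the result is only meaningful if the data really uses that encoding.
text = data.decode("iso-8859-1")

print(utf8_failed)
print(repr(text))
```

Because ISO-8859-1 never raises, it can hide a wrong guess about the encoding, so it is worth checking the decoded text.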

david_alvas
5 - Atom

To solve SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape, you just need to put r before your path string, like this: pandas.read_csv(r"C:\Users\ssc\Desktop\account_summery.csv"). Alternatively, use double quotes and forward slashes in the path.

https://exerror.com/syntaxerror-unicode-error-unicodeescape-codec-cant-decode-bytes-in-position-2-3-...
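The forward-slash option can be checked with the standard library's pathlib, as in this small sketch. PureWindowsPath applies Windows path rules on any operating system, and the variable names are illustrative; the file name is taken from the reply as written.

```python
from pathlib import PureWindowsPath

# Same path written with a raw string and with forward slashes.
raw = PureWindowsPath(r"C:\Users\ssc\Desktop\account_summery.csv")
fwd = PureWindowsPath("C:/Users/ssc/Desktop/account_summery.csv")

# Windows treats both separators alike, so the two paths compare equal.
same = raw == fwd
file_name = fwd.name

print(same)
print(file_name)
```

A forward-slash string contains no backslashes, so there is nothing for Python to interpret as an escape sequence.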
