
SOLVED

XML - UTF problem

Tanai_Goncalves
8 - Asteroid

Good evening,

 

I'm trying to read several XML files, but some of them contain the special character '-'.

 

When I remove this '-' from the data, the workflow runs fine.

 

Is there a way to read the XML with the special character left in?

 

The error is "UTFDataFormatException".

 

Kind regards, 

 

Tanai

JoaoLeiteV
10 - Fireball

Hello @Tanai_Goncalves,

 

Just a quick suggestion: reading the file as a CSV instead of XML might solve your issue, though you would probably need to adjust some of the parsing you're doing.

danilang
19 - Altair

Hi @Tanai_Goncalves 

 

When you read the data as a CSV file, you can specify the code page to use.

danilang_0-1626529675617.png

 

If you can find the correct code page, the special characters will be converted. After this, use a Summarize tool to concatenate all the rows into a single cell. Follow this with as many XML Parse tools as required to read out the elements you need.
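
For readers who want to see the same idea outside Alteryx, here is a minimal Python sketch of the three steps (decode with an explicit code page, join the rows, parse the XML). It is an illustration only, not the workflow itself: the function name, the sample bytes, and the choice of `cp1252` are assumptions, and the guess that the troublesome character is a Windows-1252 dash (byte 0x96) is just one plausible cause of the error.

```python
import xml.etree.ElementTree as ET


def parse_xml_with_code_page(raw: bytes, code_page: str) -> ET.Element:
    """Decode raw file bytes with an explicit code page, then parse the XML."""
    # Step 1: decode and split into rows, like a line-based read with a code page.
    lines = raw.decode(code_page).splitlines()
    # Step 2: concatenate all rows into one string (the Summarize step).
    text = "\n".join(lines)
    # Step 3: parse the combined text (the XML Parse step).
    return ET.fromstring(text)


# Hypothetical sample: byte 0x96 is an en dash in Windows-1252,
# but it is not valid on its own in UTF-8.
sample = b"<root>\n<item>2019 \x96 2020</item>\n</root>"
root = parse_xml_with_code_page(sample, "cp1252")
print(root.find("item").text)
```

Decoding the same sample as UTF-8 would raise an error at the 0x96 byte, which is why picking the right code page matters before any XML parsing happens.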

 

Dan 

priya_mohana_dhl
7 - Meteor

Hi,

 

My XML file starts with

<?xml version="1.0"?>

<NSARESPONSE trandate="2022-03-04 16:13:03" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="">

 

The XML file contains '&' and other special characters, and reading it generates an "Unterminated entity reference" error (which goes away after removing the '&').

 

Will <?xml version="1.0" encoding="utf-8"?> help with special characters like '&'?

Thanks.
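
A note on the '&' case: in XML, a literal ampersand has to be written as `&amp;`, and the encoding declaration does not change that rule. If the files cannot be fixed at the source, one possible workaround is to escape bare ampersands before parsing. The Python sketch below is only an illustration under that assumption; the helper name, the regular expression, and the child element names in the sample are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Match '&' that does not start a named, decimal, or hex entity reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(xml_text: str) -> str:
    """Replace unescaped '&' with '&amp;', leaving existing entities alone."""
    return BARE_AMPERSAND.sub("&amp;", xml_text)


# Hypothetical sample with one bare ampersand and one valid entity.
sample = "<NSARESPONSE><name>Smith & Sons</name><note>a &lt; b</note></NSARESPONSE>"
root = ET.fromstring(escape_bare_ampersands(sample))
print(root.find("name").text)
```

The pattern treats any text of the form `&word;` as an existing entity, so data that happens to contain such a sequence literally would be left untouched and may still need manual review.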

priya_mohana_dhl
7 - Meteor

Hi,

 

I tried to open an XML file using the Input Data tool. The input file has <?xml version="1.0"?> as its header.

priya_mohana_dhl_3-1647423732529.png

 

 

I replaced the XML header with the one shown here:

[Screenshot: priya_mohana_dhl_1-1647423377512.png]

Then I saved it as an XML file with this configuration:

[Screenshot: priya_mohana_dhl_2-1647423561572.png]

 

Even after specifying the Code Page as UTF-8, the special characters are not converted. Any help is appreciated.

 

Thanks.
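
One way to narrow down which code page a file actually uses is to test a few candidates against the raw bytes. The Python sketch below is a rough aid under stated assumptions: the function name and the candidate list are hypothetical, and a decode that succeeds only shows the bytes are valid in that encoding, not that the characters come out as intended.

```python
def first_matching_encoding(raw: bytes, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the bytes without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
        return name
    return None


# Hypothetical samples: the same word stored in two different encodings.
print(first_matching_encoding("café".encode("utf-8")))
print(first_matching_encoding("café".encode("cp1252")))
```

If a file saved with a UTF-8 header still fails under a UTF-8 code page, a check like this can show whether the bytes on disk were really written as UTF-8 or only labeled that way.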

 

 

 
