Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Input errors caused by excessive quotes in data

hasi82
6 - Meteoroid

Dear community

 

I am standing in front of a problem which I don't seem to able to fix with the existing tips I found in the forum.

I still feel like this should be a common problem though which should be fixable without editing all the data manually. Maybe someone here is able to help me.

 

My problem is being caused by extra quotes in text fields (or semicolons). These make Alteryx unable to determine the end of a field correctly. As result, the data is moving to other columns in some places.

I hope the following example is understandable.

 

My input in csv format looks like this:

"Product";"Number";"Description";"Amount";"Unit";"Date"
"Product";"1234";"NA";"250";"PCS";"20181212"
"Product";"1234";"Screen 24"";"250";"PCS";"20181212"
"Product";"1234";"Screen 28"";"250";"PCS";"20181212"
"Product";"1234";"NA";"250";"PCS";"20181212"
"Product";"1234";"NA";"250";"PCS";"20181212"
"Product";"1234";"Book "Alteryx for dummies"";"250";"PCS";"20181212"
"Product";"1234";"Game "Tetris"      .";"250";"PCS";"20181212"
"Product";"1234";"NA";"250";"PCS";"20181212"

 

When being imported with delimiter ";" the result looks like this:

 

RowProductNumberDescriptionAmountUnitDate
1Product1234NA250PCS20181212
2Product1234Screen 24;"250PCS20181213[NULL]
3Product1234Screen 28;"250PCS20181213[NULL]
4Product1234NA250PCS20181212
5Product1234NA250PCS20181212
6Product1234Book "Alteryx for dummies";"250PCS20181213[NULL]
7Product1234Game "Tetris"    .250PCS20181218
8Product1234NA250PCS20181212

 

As you can see in rows 2, 3 and 6 the data doesn't end up in the correct column.

For row 7 Alteryx still gives a warning but the data is in the correct column.

 

I have tried some workarounds I found here but sadly didn't get the desired result.

 

Reading in the data as one column with delimiter \0 and then replacing two double quotes ("") with one double quote doesn't work as the data may have trailing whitespaces separating the quotes. If I remove the whitespaces before, blank fields will be represented by two double quotes aswell and also be replaced with a single double quote.

 

I have also tried the different options in the "Ignore Delimiters in" setting. If I choose "None" I will get the error "too many fields in row XXX" as there are also text fields that contain a semicolon in the data.

 

I am kind of lost at moment and thankful for any advice.

Thank you!

11 REPLIES 11
Thableaus
17 - Castor
17 - Castor

Hi @artzee1556 

 

You can use some REGEX.

 

Try REGEX_Replace([Field], "\s{2,}", "|")

 

This will replace sequences of 2 whitespaces or more for a single pipe.

 

Cheers,

ChrisTX
15 - Aurora

Try the workflow I posted on 03-21-2019 with delimiter \t

Labels