Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Correct the data spliting



Is there a way to correct the data splitting caused due to an extra delimiter in the file?

For example: 




"," is extra in the Name - "Susan K" and Day - "Monday"

I can check the number of fields needed to be using the Text to Column tool. By using it I will know that 4 rows contains 4 fields however 2 row (3rd and 5th row) contains 5 fields. 

I can then use the Filter tool to filter out the row containing 5 fields.

But now is there a to correct it in alteryx?




Hi @Akash01 


If you import the data all in one string (\n delimiter), you can then use regex to parse out what you're looking for




From here it's a matter of cleaning up the 1st and 4th fields



Alteryx Certified Partner
Alteryx Certified Partner

Hi @Akash01 


It really depends on how the pattern of your data is structured.

In this case, I found a workaround - (if I consider that extra commas will always be present on words, I can use RegEX to parse the columns how exactly they should be.




In this case I know there will be digits in the second column. So I need everything until the comma before the first digit for the first column.

Then I can Parse based on the comma division.






Again with the same(close enough) answer




The extra delimiter "," can also be between digits or anywhere else and not just words.


For example:


Time,1001,27,Tuesday,Worked in Accounts
Harry,1223,23,Friday, Experience in sales
Susan,K,1211,02,Monday, Sales
Josh,1432,22,Thursday, Engineering
Katrina,2342,12,Mon,day, Team player
Roy,32,11,01,Tuesday, 15 year experience in Sales,Accounts and Audit


There is an extra comma in the last row in the fields "Employee" and Description.

Alteryx Certified Partner
Alteryx Certified Partner

@Akash01 Does the employee Number have always 4 digits?






What I have been able to determine so far about your valid data

1. a string of characters that gets assigned to the name field

2. a sequence of 4 digits that is the employee number

3. a sequence of 2 digits that is the day

4. a string of characters that can be either one of 2 fields, day or description.   Day is standard so you can write a function to parse it out and stick the rest in the description.


Based on this I'd say

1. Load the data in as a single string 

2. use regex to split the string in 2 with everything up to the last digit in the 1st part 

3. Remove all the "," from the 1st part and split it into text, 4 digits and 2 digits

4. Take the second part from step 2 and remove all the commas until you get to the first "y"

5. split all to the 1st "y" to Day and all the rest to Description








No! Any column can have any number of characters. :/


HI @Akash01 


You can assume the day field will always have 2 digits, so use this as a starting point. Edit:  Possibly a rash assumption.  If not, then this coupled with varying number of digits in the Employee field mean that you can't parse it automatically. 


1. the last 2 digits go into the day

2. all the digits before that go into the Employee

3. all the starting text goes into name


Continue on with the remaining steps from my previous post


An other option is based on the idea that most of your data will be correct.  In this case put a recordID on all the rows and then pull out all that have exactly 4 "," and process these automatically.     Send all the others to an error output to be corrected manually and re entered.  Use the recordID to order the combined output