community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Community v19.9

The latest release includes several enhancements designed to improve your Community experience!

Learn More

Data cleansing

Meteor

Hi All,

 

I would need some guidance on the following question of data preparation.

 

I have a text file which is in the below format. How do i need to use the RegEx or anyother method to split them into 6 columns ? The Text files looks like tabbed format but doesnt read out Tab.

 

Example:

 

John,Rambo                   9999111119        100.12    12345 xx1111    123456

 

Thanks in advance.

 

 

Regards,

VR

Highlighted
Quasar

@vrbravo While you could use regex to do this, there's no need. It appears the data is split by spaces, and so a text to columns tool would accomplish the same task using \s as the delimiter. Since you mentioned tabs, you could also include \t in the delimiter as well, just using \s\t.

 

 

Given there are multiple spaces below, I would also check the "ignore empty columns" option.

 

Good luck!

Meteoroid

Attached workflow for your reference. I hope this helps

Meteor

@neilgallen - Thank you for the response. It worked well until there are some names such as megan jr, fox. Then the columns are shifting to next columns. I think the name column is kind of random. Is there any other way to split columns ?

Meteor

Thank you a lot Santhosh, This is a great solution. Could you please help me with the logic you have used ?

Quasar

@vrbravo in that situation using RegEx may be your best option. There are different ways to go about this depending on the complexity of your input data.

 

Can you post a larger example input dataset?

Meteoroid

We used RegEx “Parse” output method. Based on the input pattern, the below expression was built and each expression in bracket () will be separated into new fields and  the non-bracket expression will be ignored in this case space\s+

 

(\w*\W*\w+)\s+(\d+)\s+(\d+\W\d+)\s+(\d+)\s+(\w+)\s+(\d+)

 

  • \w - Matches any letter, digit or underscore. Equivalent to [a-zA-Z0-9_]
  • \s - Matches any space, tab or newline character
  • \W -Matches anything other than a letter, digit or underscore
  • \d -Matches any decimal digit. Equivalent to [0-9]
  • * - Matches zero or more consecutive characters
  • + Matches one or more consecutive `a` characters
Labels