Alteryx Designer Desktop Discussions

JasonDuRandt · ‎10-24-2019

I have a column with a category code followed by a category desc, however some use a " - " to separate them and others simply are separated by a space.

Example:

12345 - Red Apples S
34567 Blue Oranges

I am trying to separate into 2 columns. Category Code and Category Desc.

I am able to separate these using 2 RegEx tools replace and parse. However this make my workflow very slow. Is there a better, perhaps simpler way of doing this that does not impact performance too badly?

rafalolbert · ‎10-24-2019

Hi @JasonDuRandt,

Not sure about performance at scale as i don't have data for this, but you can consider for splitting your data into 2 streams: one containing the hyphen character and one without, you could use 'Text To Columns' and split to 2x columns and cleanup up using 'Data Cleansing'.

Thanks,

Rafal

#Excuse me, do you speak Alteryx?

GiuseppeC · ‎10-24-2019

Hi @JasonDuRandt,

what @rafalolbert suggested or you could avoid the split, just remove the '-' with the Data Cleansing tool and then use Text to Column to separate on space "\s" (plus a following Data Cleansing tool to remove leading spaces)

See attached.

Hope this helps!

Giuseppe

neilgallen · ‎10-24-2019

Limited testing due to only two examples, but you could use one formula tool to create the fields you're looking for, still using RegEx.

Your first created field (Category Code) would be:

REGEX_Replace([Field1],"(\d+)(.*)","$1")

And your second (Category Description) would be:

REGEX_Replace([Field1],"(\d+\s-*\s*)(.*)","$2")

This may be quicker than two separate RegEx tools, but I haven't tested it.

benakesh · ‎10-24-2019

Hi @JasonDuRandt ,

Check this link for regular expression for these 2 examples .

12345 - Red Apples S
34567 Blue Oranges

https://regex101.com/r/beerWL/1

(\d+)\s(?:-\s)?(.*)

1st group is numbers

2nd group is non capturing group which may occur zero or 1 time

3rd group is whatever follows previous group

Alteryx Designer Desktop Discussions

Parsing Text using RegEx