Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex Help

KmbrlyPC
8 - Asteroid

I have some messy data in Excel they I am needing to separate out into columns before I can even do anything with the data. Based on my current knowledge of Alteryx, the RegEx tool would probably be a good way to parse out the data, however I am new to this tool and have trouble with arriving at the correct expressions. This would be a huge win because it would save me a ton of time not having to parse this all out using Excel since the actual data is hundreds of lines. Below is a screenshot of some mock data and the outcome I am needing. An Excel spreadsheet is also attached. Thank you in advance!

RegEx Help.JPG

6 REPLIES 6
apathetichell
18 - Pollux

try this version as a start.

KmbrlyPC
8 - Asteroid

Thank you. This is helpful. One thing I came across is the Rule Code may be more than just 1 listed. Sometimes it's 1, but other times it's 3 or 4. For example "Trees, Grass, Shrubs (1-234.56, 4-321.56)". How would I update the RegEx formula to capture this? This would be the 4th tool in the workflow you created.

 

 

Capture.JPG

 

 

Capture2.JPG

apathetichell
18 - Pollux

 

It would probably be easiest to do a text to columns split to rows (comma delimited)  after you have most of the data set up. this would create two entries  with the rest of the data static. I'd probably do the split after crosstab - as otherwise it'll re concatenate on crosstab assuming the column header is the same.

KmbrlyPC
8 - Asteroid

Could you provide an example of the configuration for the text to columns tool? Attached is the updated mock data that reflects the multiple rule codes Trees, Grass, Shrubs (1-234.56, 4-321.56). Thanks in advance.

KmbrlyPC
8 - Asteroid

Although, if it is possible to get the RegEx formula to work to recognize the multiple Rule Codes then that would be all I need. Everything else is working great to get the data I need. It's just this one piece.

 

Current formula below recognizes "Trees, Grass, Shrubs (1-234.56)"

 

if regex_match([Data],".+\(\d+-\d+\.\d+\)") then [Data] else [Row-1:Title (Sub-Category Description)] endif

 

But I am needing it to be more holistic to recognize this as well "Trees, Grass, Shrubs (1-234.56, 4-321.56)"

apathetichell
18 - Pollux

Try this one...

Labels