Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Struggling with Regex and/or Text-to-Columns

korodidanielp
6 - Meteoroid

Hi Guys,

 

I have a field with very long (sometimes exceeding 255 characters) descriptions, many of them containing IDs like:

 

PL021020-002510

 

These IDs always start with the "PL" prefix, followed by 6 numbers, followed by a hyphen, then by another 6 digits. The delimiters in the very long strings are always spaces. Length is 15 chars but that's not important to stick to 15 because the length can be defined by the next space delimiter.

 

My problem is that the field that contains these long descriptions containing these IDs are very long and the ID can be anywhere within the long strings. In VBA I could successfully create functions that could gather these IDs, regardless the ID's position within the very long string. I did that via utilizing InStr, Len, Left, Right and Mid functions and embedding them into each other many times.

 

However, I can't replicate the same in Alteryx. At a certain part of my WF, I could solve this problem by using Text-to-Columns tool, followed by a Formula tool which checks each of the newly created Text2Col1, Text2Col2, ... Text2Col20 fields and if the Left(...,2) equals "PL" then puts that value into my UniqueID field.

 

The big painpoint is that I need to solve the same problem at another part of my WF, but the raw data processed at that part of the WF consists of strings with much-much more delimiters: if I'd use Text-to-Columns tool, 76 new columns would be needed because of the big number of delimiters in that dataset. (And obviously I don't want to set up formula for each...)

 

I decided to go ahead with RegEx as that seems the most meaningful, but I am unable to make it work and I'm struggling with its very strange syntax. The only time I could make it recognize my pattern was not good enough for me because all I could achive was adding a Match field to the end of my table where it indicates if pattern could be found or not. But this is not what I need, I want to gather the ID itself from the long string and put it to another field, and I couldn't find any way to do that with RegEx. I also tried Formula and embed some RegEx expressions into the functions I used, still no results.

 

Could you please tell me some best practices how to tackle the above problem? (To put it in a nutshell: long string, space delimiters, ID can be anywhere, I need to recognize the ID, for that I also need a RegEx expression that matches things like "PL021020-002510", plus I need some hints how to set up tools and which tools in order to get my ID data extracted.)

 

Apologies for my lengthy question.

 

Thanks in advance!

5 REPLIES 5
atcodedog05
22 - Nova
22 - Nova

Hi @korodidanielp ,

 

I would suggest using Parse option in regex and using expression 

 

.*(PL\d{6}-\d{6}).*

 

If you could provide a sample data i could help you much better.

 

Cheers and Happy Analyzing : )

ChrisTX
15 - Aurora

Sites like https://regex101.com/ are helpful to test your regular expression

 

The Tool Mastery article for RegEx is also a good read: https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Tool-Mastery-RegEx/ta-p/37689

 

Chris

korodidanielp
6 - Meteoroid

Thanks a lot, mate! It works perfectly.

korodidanielp
6 - Meteoroid

Thanks, Chris!

atcodedog05
22 - Nova
22 - Nova

Hi @korodidanielp ,

 

If our replies were helpful in your use-case.

 

Can you please mark it as accept solution.

So that it will be helpful resource for others who are searching for a solution for the same issue.

 

Cheers and Happy Analyzing : )

Labels