Alteryx Designer Desktop Discussions

Alteregoryx · ‎01-07-2020

Hi!

I have a dataset concerning a vast amount of insurance-numbers, all of which I need to parse out and place in separate columns. These numbers are found within a text-field that has very little consistency. However, an insurance-number always has a length of 5-7 digits, thus that's what we are looking for.

Therefore, I applied, Tostring(REGEX_Replace([message],"^.*?(\d{5,7}).*?$","$1")) , in a formula tool. This only did half the job, since one text-field may hold 2 or 3 (max 5) insurance-number, all of which must be parsed out and placed in separate columns. So what I'm looking for is something, be it text-to-colums or parsing, that replicates the statement until there is no more text and NOT until the first match is found. I really hope you are able to help (and probably amaze me)!

Thank you in advance.

FYI:

I have added sample data below that also shows the outcome my formula above resulted in compared to what it should have resulted in.

Text-field	Outcome from formula	Intended outcome1	Intended outcome2	Intended outcome3
Insurance-number: 123456 and this customer needs special attention attention due to second insurance-number: 654321	123456	123456	654321
Insurance-number was granted 15/7: 999333	999333	999333
In. no. 1 (left back door): 224466 and in. no. 2 (right front door): 124578 and in no. 3 (right back door): 334466	224466	224466	124578	334466
This customer didn't provide in no

JosephSerpis · ‎01-07-2020

Hi @Alteregoryx I mocked up a workflow that produces your output let me know what you think?

JustinBabbitt · ‎01-07-2020

@Alteregoryx

This sounded similar to an issue I've run into before.

Instead of using a regex expression in the Formula tool, we used the RegEx tool to Tokenize what we are looking for in the string. This action will separate out what you Tokenize into a specified number of columns.

Justin Babbitt

Alteregoryx · ‎01-07-2020

Thank you very much @JustinBabbitt !

This does everything asked for 🙂

Alteregoryx · ‎01-07-2020

Thanks a lot @JosephSerpis!! I have implemented it into my workflow, it works like a charm!

Out of curiosity, compared to the solution below - any particular reason to why you went for 'split to rows' in regex and working your way back to columns, rather than splitting to columns initially; am I missing something crucial here? 🙂

JosephSerpis · ‎01-08-2020

Hi @Alteregoryx splitting to row is more dynamic as you don't need to specify the amount of rows to split by rather than splitting to columns as you have to set a arbitrary number for the amount of columns you want. However if you data has more numbers in the string than the amount of columns you are splitting you will lose data and have to adjust manually where you won't have that issue with splitting into rows then crosstabing the data as it will adjust to data accordingly.

Alteryx Designer Desktop Discussions

Parsing out numeric values x-amount of times