Alteryx Designer Desktop Discussions

rarauj26 · ‎10-31-2018

Hi Everyone,

Could you please give me some help?

I'm a rookie in RegEx and would like to tokenize the simple structure below.

Surname1, Name1 (Abb1) <userid1@samecompany.com>; Surname2, Name2 (Abb2) <userid2@samecompany.com>; Surname3, Name3 (Abb3) <userid3@samecompany.com>; ...

Regards,

Reinaldo S. Araujo

LordNeilLord · ‎10-31-2018

Hey @rarauj26

Can you give an example of what you want the output to be?

If you want to split into rows, the Text To Columns tool with ; as the delimiter and the "Split to rows" option would work.

Then, if you want to parse out the email address, the RegEx tool with the expression <(.*)> will give you the email address.
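
For anyone following along outside Alteryx, the two steps above can be sketched in Python. This is only an illustration and not the Alteryx workflow itself: the variable names (raw, rows, emails) are made up for the example, and the sample string is the one from the question without the trailing "...".

```python
import re

# Sample input from the original question (three entries separated by ";").
raw = (
    "Surname1, Name1 (Abb1) <userid1@samecompany.com>; "
    "Surname2, Name2 (Abb2) <userid2@samecompany.com>; "
    "Surname3, Name3 (Abb3) <userid3@samecompany.com>"
)

# Step 1: split on ";" (the "split to rows" step) and drop empty pieces.
rows = [part.strip() for part in raw.split(";") if part.strip()]

# Step 2: capture whatever sits between "<" and ">" in each row.
emails = [re.search(r"<(.*)>", row).group(1) for row in rows]

print(emails)
```

Because each row holds exactly one pair of angle brackets after the split, the greedy .* is safe here; on an unsplit string it would run from the first < to the last >.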

rarauj26 · ‎10-31-2018

Hi Neil,

yes!

We want to split into rows, with the fields in columns:

      field1    field2  field3  field4
row1  Surname1  Name1   Abb1    userid1
row2  Surname2  Name2   Abb2    userid2
row3  Surname3  Name3   Abb3    userid3

Regards,

MarqueeCrew · ‎10-31-2018

@rarauj26,

I split the record into rows using the Text To Columns tool, splitting on the ";" character. Then I used a Formula tool with the GetWord() function to pull out each field. Finally, I removed the "noise" characters and trimmed the result.

Notes: This assumes that no field contains spaces. I placed (Abb1) into a userid field instead of an abbreviation field. If names can contain spaces, I would use a regular expression to find them instead.

I hope that this helps you.

Cheers,

Mark

rarauj26 · ‎10-31-2018

Mark,

Thanks!

It does help, but since we're still learning, a RegEx solution is preferred.

Regards,

Thableaus · ‎10-31-2018

Hi @rarauj26!

Here's the solution I thought of.

I'm sure there are many ways to do it.

Cheers,

rarauj26 · ‎10-31-2018

Hi @Thableaus

It looks like it works well!

Is it possible to give a solution with tokenization?

I understood that we don't need to use Text To Columns.

Regards,

jdunkerley79 · ‎10-31-2018

Here's my pure RegEx tool solution.

One RegEx tool splits to rows first, then a second RegEx tool parses each row with:

^\s*(.*?),\s*(.*?)\s*\((.*)\)\s*<([^>]+)>\s*$

For details see: regexr.com/429nr
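
As a rough way to try the expression outside Alteryx, here is a small Python sketch of the same tokenize-then-parse idea. It is an assumption-laden illustration, not the workflow from the post: the token expression [^;]+ and the names raw, tokens and records are invented for the example, and only the parse expression is taken from above.

```python
import re

# Sample input from the original question (three entries separated by ";").
raw = (
    "Surname1, Name1 (Abb1) <userid1@samecompany.com>; "
    "Surname2, Name2 (Abb2) <userid2@samecompany.com>; "
    "Surname3, Name3 (Abb3) <userid3@samecompany.com>"
)

# Tokenize: each token is a run of characters that are not ";".
tokens = re.findall(r"[^;]+", raw)

# Parse: the expression from the post, applied to each token.
pattern = re.compile(r"^\s*(.*?),\s*(.*?)\s*\((.*)\)\s*<([^>]+)>\s*$")

# Keep only tokens that match; each record is
# (surname, name, abbreviation, email).
records = [m.groups() for m in map(pattern.match, tokens) if m]

for record in records:
    print(record)
```

The leading ^\s* and trailing \s*$ absorb the space left after each ";", so no separate trim step is needed.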

rarauj26 · ‎10-31-2018

Hi @jdunkerley79,

That was great!

Could you please also extract the last field, that is, the "userid" before the @?

Note that the company name must be the same in all addresses.
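
One possible way to express this request, sketched in Python rather than in the RegEx tool: the last group of the earlier expression is split at the @ so that the userid and the domain are captured separately. The modified expression, the same-domain check, and all variable names are assumptions made for illustration, not something confirmed in this thread.

```python
import re

# Sample input from the original question (three entries separated by ";").
raw = (
    "Surname1, Name1 (Abb1) <userid1@samecompany.com>; "
    "Surname2, Name2 (Abb2) <userid2@samecompany.com>; "
    "Surname3, Name3 (Abb3) <userid3@samecompany.com>"
)

# Same shape as the earlier expression, but the address is split at "@":
# group 4 is the userid, group 5 is the domain.
pattern = re.compile(
    r"^\s*(.*?),\s*(.*?)\s*\((.*)\)\s*<([^@>]+)@([^>]+)>\s*$"
)

matches = [m for m in map(pattern.match, raw.split(";")) if m]

# Rows with the four requested fields: surname, name, abbreviation, userid.
records = [m.group(1, 2, 3, 4) for m in matches]

# One reading of "the company name must be the same": every domain is equal.
domains = {m.group(5) for m in matches}
same_company = len(domains) == 1

print(records)
print(same_company)
```

If the domain is known in advance, it could instead be written literally into the expression, so that addresses from other domains simply fail to match.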

Regards

Tokenize email address