Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parse Out EMAIL address with Y/N as first character

Nosal25
8 - Asteroid

I have an old system that for some reason developer inserted the Y/N to email authorization into the email address field.  The data looks like

Ymyemaila@home.com;SAM.WASHER@us.gov;heythere@umm.edu

 

Currently, I am trying to parse out using the following in REGEX tool: 

([Y|N])(.*[a-z]+[.][a-z]+[@][a-z]+[.][a-z]+) but it isn't working consistently.  Anyone have any suggestions? 

 

Thank you

5 REPLIES 5
JohnJPS
15 - Aurora
Hi @Nosal25
I believe you could just use TrimLeft of both Y and N, which would leave you with everything else, assuming the rest is a valid email address.

A concern, though: your list shows a few emails that do not start with Y or N... but what if a legitimate email does start with one of them (e.g. Norman.Rockwell@TheNewYorker.com or some such)... you wouldn't want to trim the leading N in that case, and there's unfortunately no definite way to know if the Y or N should really be trimmed, that I'm aware of anyway).
Hope that helps!
John
SeanAdams
17 - Castor
17 - Castor

other thing you could use is a text-to-columns control, or a regex tokenize that splits the text on ; characters - that way you can split out each e-mail independently if the e-mails are all in a single field.    

As @JohnJPS says, you may run the risk of e-mails that begin with the character Y or N genuinely -but if you're guaranteed that there will be an additional Y or N on each e-mail then:

 

- add a record ID to each of your original records

- use a select component to split out just the recordID and the e-mail field (this will give you a second data-set)

- use a regex to tokenize on the ; character into rows (this way you can flexibly deal with any number of e-mails

- then use a formula to create a new column called "Email authorized" 

       - eMailAuthorised - type vString (5) - formula: if (left([email],1)=='Y') then 'Yes' elseif (left[email],1=='N') then 'No' else 'Error' endif.    You could do the same thing with a switch statement

       - you can then, in the same formula tool, reprocess your e-mail field to remove this character with a second formula: Field: EMail.  Formula: substring([Email],1,999)

- now you have a set of e-mails, with the authorization field blown out, per record ID - you can join back if you want; or concatenate this back into a single string using the Summarize tool with the concatenate option

 

Nosal25
8 - Asteroid

This worked perfectly. One question, do I need to add the RECORDID tool?  Even without it, It worked great.  Thank you

Nosal25
8 - Asteroid

Thank you.  That worked well. 

SeanAdams
17 - Castor
17 - Castor

Hey @Nosal25 - you don't have to add the record ID - this just helps you to join back to the original data.

 

Say you had a set that looked like this:

 

NameSurnameEMails
SeanAdamsYsean.adams@domain.com;Nsean.adams@domainhome.com
FredFlintstoneYfred.flintstone@domain.com;Yfred.flintstone@domain2.com
WilmaFlintstoneYwilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.com

 

You could just parse out the e-mails - but then you'd be left with a set of e-mails with no way to tie this back to the original records

So - instead, it's useful to add a record ID before you split stuff out to allow you to join back.

 

RecordIDNameSurnameEMails
1SeanAdamsYsean.adams@domain.com;Nsean.adams@domainhome.com
2FredFlintstoneYfred.flintstone@domain.com;Yfred.flintstone@domain2.com
3WilmaFlintstoneYwilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.com

 

... then split out the e-mails

Version:1.0 StartHTML:000000264 EndHTML:000002643 StartFragment:000001980 EndFragment:000002611 StartSelection:000001983 EndSelection:000002585 SourceURL:https://community.alteryx.com/t5/forums/replypage/board-id/data-preparation-blending/message-id/1002...

 

RecordIDEMails
1Ysean.adams@domain.com;Nsean.adams@domainhome.com
2Yfred.flintstone@domain.com;Yfred.flintstone@domain2.com
3Ywilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.co

 

Then parse the e-mails into rows

RecordIDEMailMailAllowed
1sean.adams@domain.comY
1sean.adams@domainhome.comN
2fred.flintstone@domain.comY
2fred.flintstone@domain2.comY
3Wilma.flintstone@domain.comY
3Wilma.flintstone@domainhome.comY
3wilmaf2000@domain3.coY

 

because you have a record ID that ties back to the original data-set - you can now use it to join these cleaned up rows back to the original table.    

For example - if you put on a filter, followed by a Unique - then you could filter out the first e-mail with a Y value, and then join that back to the original table.

 

Hope this makes sense @Nosal25

Sean

 

 

 

 

 

 

 

Labels