I have an old system that for some reason developer inserted the Y/N to email authorization into the email address field. The data looks like
Ymyemaila@home.com;SAM.WASHER@us.gov;heythere@umm.edu
Currently, I am trying to parse out using the following in REGEX tool:
([Y|N])(.*[a-z]+[.][a-z]+[@][a-z]+[.][a-z]+) but it isn't working consistently. Anyone have any suggestions?
Thank you
Solved! Go to Solution.
other thing you could use is a text-to-columns control, or a regex tokenize that splits the text on ; characters - that way you can split out each e-mail independently if the e-mails are all in a single field.
As @JohnJPS says, you may run the risk of e-mails that begin with the character Y or N genuinely -but if you're guaranteed that there will be an additional Y or N on each e-mail then:
- add a record ID to each of your original records
- use a select component to split out just the recordID and the e-mail field (this will give you a second data-set)
- use a regex to tokenize on the ; character into rows (this way you can flexibly deal with any number of e-mails
- then use a formula to create a new column called "Email authorized"
- eMailAuthorised - type vString (5) - formula: if (left([email],1)=='Y') then 'Yes' elseif (left[email],1=='N') then 'No' else 'Error' endif. You could do the same thing with a switch statement
- you can then, in the same formula tool, reprocess your e-mail field to remove this character with a second formula: Field: EMail. Formula: substring([Email],1,999)
- now you have a set of e-mails, with the authorization field blown out, per record ID - you can join back if you want; or concatenate this back into a single string using the Summarize tool with the concatenate option
This worked perfectly. One question, do I need to add the RECORDID tool? Even without it, It worked great. Thank you
Thank you. That worked well.
Hey @Nosal25 - you don't have to add the record ID - this just helps you to join back to the original data.
Say you had a set that looked like this:
Name | Surname | EMails |
Sean | Adams | Ysean.adams@domain.com;Nsean.adams@domainhome.com |
Fred | Flintstone | Yfred.flintstone@domain.com;Yfred.flintstone@domain2.com |
Wilma | Flintstone | Ywilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.com |
You could just parse out the e-mails - but then you'd be left with a set of e-mails with no way to tie this back to the original records
So - instead, it's useful to add a record ID before you split stuff out to allow you to join back.
RecordID | Name | Surname | EMails |
1 | Sean | Adams | Ysean.adams@domain.com;Nsean.adams@domainhome.com |
2 | Fred | Flintstone | Yfred.flintstone@domain.com;Yfred.flintstone@domain2.com |
3 | Wilma | Flintstone | Ywilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.com |
... then split out the e-mails
Version:1.0 StartHTML:000000264 EndHTML:000002643 StartFragment:000001980 EndFragment:000002611 StartSelection:000001983 EndSelection:000002585 SourceURL:https://community.alteryx.com/t5/forums/replypage/board-id/data-preparation-blending/message-id/1002...
RecordID | EMails |
1 | Ysean.adams@domain.com;Nsean.adams@domainhome.com |
2 | Yfred.flintstone@domain.com;Yfred.flintstone@domain2.com |
3 | Ywilma.flintstone@domain.com;Ywilma.flintstone@domainhome.com;YWilmaF2000@domain3.co |
Then parse the e-mails into rows
RecordID | MailAllowed | |
1 | sean.adams@domain.com | Y |
1 | sean.adams@domainhome.com | N |
2 | fred.flintstone@domain.com | Y |
2 | fred.flintstone@domain2.com | Y |
3 | Wilma.flintstone@domain.com | Y |
3 | Wilma.flintstone@domainhome.com | Y |
3 | wilmaf2000@domain3.co | Y |
because you have a record ID that ties back to the original data-set - you can now use it to join these cleaned up rows back to the original table.
For example - if you put on a filter, followed by a Unique - then you could filter out the first e-mail with a Y value, and then join that back to the original table.
Hope this makes sense @Nosal25
Sean