Alteryx Designer Desktop Discussions

mertmetinn · ‎11-13-2023

Hi,

I have unstructured string data including tons of emails in a single cell. (around 5k row)

Inputs:

for eg, " Daniel. Beaudoin@x.Com'""Andrew.Sharp@x.ComBarbara.Zukor@x.ComBarbara.Zukor@x.Com / "

for eg, " Adam.Altberg@x.Com,Joe.Quinn@x.Com,Joe.Quinn@x.Com; Geoffrey.Cardy@x.Com,Joe.Quinn@x.com "

Note: There is no delimiter here. Separated either by , or ; or / or / / or ( )

Note: Almost every kind of domain is there(.uk,.org,.ca,.com etc.)

Outputs:

for eg, "daniel.beaudoin@x.com, andrew.sharp@x.com, barbara.zukor@x.com"

for eg, "adam.altberg@x.com, joe.quinn@x.com, geoffrey.cardy@x.com "

Summary:

need to clean each single cell and have it ready each unique emails in one cell seperated by comma(,)

Regards,

BrandonB · ‎11-13-2023

What system are you pulling these emails from originally? The only way you will be to successfully parse these out without a delimiter is if you know all of the email domains that are possible.

Scenarios like Andrew.Sharp@x.ComBarbara.Zukor@x.Com can't be parsed out unless you know that the .com is the end of the URL. Parsing and creating a unique list is native functionality in Alteryx, but I'm curious how you differentiate between Andrew.Sharp@x.com Barbara.Zukor@x.com and Andrew.Sharp@x.co mBarbarbara.Zukor@x.com

PangHC · ‎11-13-2023

since the pain point is to separate email. I will only focus on this.

1. remove all delimiter, add if you have more, exclude "."

ReplaceChar([Field1], " ,/():;","")

2. use regex to take each email with tokenize function (by this format, name@domain.xxx)

.+?@.+?\..{3}

.+? > name, any text before "@"

@.+? > domain, any text after "@" before "."

\..{3} > .xxx, 3 characters after "."

then the rest is simple.

mertmetinn · ‎11-14-2023

The problem is that there are no single domains, here are some examples of my domains.

for eg: @npcd.com, @maples.com, @odlumbrown.com, @churchofengland.org, @riyadcapital.com, @raymondjames.ca, @cdam.co.uk

mertmetinn · ‎11-14-2023

Another example:

""derekaguthrie@gmail.com; sarah@sarahmcquaid.com /// Julian Stair - has applied for POA over Derek -->Julian@Julianstair.com // Craig Ward lawyer: craig.ward@craybecklaw.co.uk; // Sophie OConnell: soconnell@wolferstans.com""