Alteryx Designer Desktop Discussions

SOLVED

How to clean long list of email addresses?

msantoso
8 - Asteroid

Hi everyone,

 

I need to review a 30K-record customer file with email addresses. 

 

I can see many repeated mistakes, such as 

- more than one '@' in an address

- 2 email addresses separated with a comma or a semicolon

- no suffix (.com)

etc. 

 

Is there a way to highlight all the addresses that do not follow a specific syntax such as

 

 [any_character_or_digit]@[any_character_or_digit][dot][a max of 3 characters]

 

If anyone has ever done this job, I'd be happy to learn from her/his experience.

Thank you,

Myriam

 

 

12 REPLIES
CharlieS
17 - Castor

@msantoso

 

As you mentioned, the first step is identifying any email address that does not meet that format. I have attached a solution that uses the following RegEx formula to do just that:

 

(\w+\@\w+\.\w{3})
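
For readers who want to try the pattern outside Designer, here is a minimal Python sketch. It assumes the whole field must match the expression (a full match rather than a search), and the sample addresses are made up for illustration; they are not taken from the attached workflow.

```python
import re

# The pattern from the post above, compiled unchanged.
EMAIL_RE = re.compile(r"(\w+\@\w+\.\w{3})")

def is_simple_email(value: str) -> bool:
    """Return True only when the entire value matches word@word.xxx."""
    return EMAIL_RE.fullmatch(value.strip()) is not None

# Hypothetical sample values covering the mistakes listed in the question.
samples = [
    "jane@example.com",              # well formed
    "jane@@example.com",             # more than one '@'
    "a@example.com; b@example.com",  # two addresses in one field
    "jane@example",                  # no suffix
    "jane.doe@example.com",          # valid, but \w does not allow the dot
]

for s in samples:
    flag = "ok" if is_simple_email(s) else "REVIEW"
    print(f"{flag:6} {s}")
```

Note that the last sample is flagged even though it is a valid address, because `\w` covers only letters, digits and underscore, and `\w{3}` requires exactly three characters after the dot.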

 

 

msantoso
8 - Asteroid

Hi CharlieS,

A big thanks from Paris!! :)

 

May I ask a new question?

I need to do a similar job for website addresses.

I also find commas instead of dots, and sometimes no ".com" suffix.

 

The thing is: sometimes the "www" is specified, sometimes not.

Sometimes there is an "http://" prefix. All three forms are correct.

 

Any idea?

Thanks, Myriam

 

 

CharlieS
17 - Castor

Could you provide some sample website data? Attaching a workflow with a Text Input is the best way to share that on the Community.
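
Pending real sample data, the three accepted shapes described above (bare domain, "www." prefix, "http://" prefix) can be sketched in Python as follows. This is only an illustration under assumptions: the pattern, the function name `is_plausible_website` and the test values are hypothetical, and paths or query strings after the domain are deliberately not handled.

```python
import re

WEBSITE_RE = re.compile(
    r"(?:https?://)?"   # optional http:// or https:// prefix
    r"(?:www\.)?"       # optional www. prefix
    r"(?!www\.)"        # the domain itself may not start with "www."
    r"(?:[\w-]+\.)+"    # one or more labels, each followed by a dot
    r"[A-Za-z]{2,}"     # suffix of at least two letters (.com, .fr, ...)
    r"/?",              # optional trailing slash
    re.IGNORECASE,
)

def is_plausible_website(value: str) -> bool:
    """Return True when the entire value looks like a website address."""
    return WEBSITE_RE.fullmatch(value.strip()) is not None

# Hypothetical values: the first three are the accepted forms.
for s in ["example.com", "www.example.com", "http://www.example.com",
          "www,example,com", "www.example", "http://example"]:
    print("ok    " if is_plausible_website(s) else "REVIEW", s)
```

The negative lookahead is there so that "www.example" (no suffix) is not accepted by treating "www" as the domain name and "example" as the suffix.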

msantoso
8 - Asteroid

Yes, sure!!

 

I selected some "standard" ones, and also some with spelling errors.

Thanks again for your help,

Myriam

msantoso
8 - Asteroid

CharlieS

 

Your email RegEx is excluding some structures that are valid. I have listed some of them in the attached file.

Could you please show me how I can add new structures to refine the filtering?

Many thanks,

Myriam
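
One way the pattern could be widened is sketched below in Python. The structures in the attached file are not shown in the thread, so the cases covered here (dots, hyphens and plus signs before the "@", multi-part domains, suffixes longer or shorter than three letters) are assumptions, and `BROADER_EMAIL_RE` and `is_plausible_email` are hypothetical names.

```python
import re

BROADER_EMAIL_RE = re.compile(
    r"[\w.+-]+"        # local part: letters, digits, underscore, dot, plus, hyphen
    r"@"               # exactly one '@'
    r"(?:[\w-]+\.)+"   # one or more domain labels, each followed by a dot
    r"[A-Za-z]{2,}"    # suffix of two or more letters (.fr, .com, .info, ...)
)

def is_plausible_email(value: str) -> bool:
    """Return True when the entire value matches the broader pattern."""
    return BROADER_EMAIL_RE.fullmatch(value.strip()) is not None

# Hypothetical sample values, not taken from the attached file.
for s in ["jane.doe@example.com", "jane-doe+news@mail.example.co.uk",
          "jane@example.fr", "jane@@example.com",
          "a@example.com; b@example.com", "jane@example"]:
    print("ok    " if is_plausible_email(s) else "REVIEW", s)
```

Each new structure is added by extending the relevant character class or quantifier rather than by listing addresses one by one. The pattern is still a plausibility filter, not a full validator: it would, for instance, accept a local part made only of dots.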
