
SOLVED

How to clean long list of email addresses?

msantoso
8 - Asteroid

Hi everyone,

 

I need to review a 30K-record customer file with email addresses. 

 

I can see many repeated mistakes, such as:

- more than one '@' in an address

- 2 email addresses separated with a comma or a semicolon

- no suffix (such as .com)

etc. 

 

Is there a way to highlight all the addresses that do not follow a specific syntax, such as:

 

 [any_character_or_digit]@[any_character_or_digit][dot][a max of 3 characters]

 

If anyone has ever done this job, I'd be happy to learn from her/his experience.

Thank you

Myriam

 

 

12 REPLIES
CharlieS
17 - Castor

Here's an updated RegEx formula to use:

 

([\w.-]+\@[\w.-]+\.\w{2,3})

 

Replacing '\w+' with '[\w.-]+' allows '.' and '-' characters to be included alongside alphanumeric characters and underscores. The '{2,3}' means the last part of the domain (after the final dot) must be 2 or 3 characters long.
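
To see how the pattern behaves outside Alteryx, here is a minimal sketch using Python's `re` module as a stand-in. Alteryx's regex engine may differ in details, and the sample addresses and the helper name `matches_pattern` are made up for illustration.

```python
import re

# Pattern from the reply above, without the capture group and without
# the backslash before '@' (it is not needed in Python).
EMAIL_PATTERN = re.compile(r"[\w.-]+@[\w.-]+\.\w{2,3}")


def matches_pattern(address: str) -> bool:
    """Return True only if the whole string fits the pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


# Illustrative samples covering the mistakes listed in the question.
samples = [
    "name@example.com",   # plain address
    "a@@b.com",           # more than one '@'
    "a@b.com;c@d.com",    # two addresses in one field
    "name@domain",        # no suffix
]

for sample in samples:
    print(sample, matches_pattern(sample))
```

Because `fullmatch` requires the entire field to fit, anything with a second '@', a comma or semicolon, or a missing suffix is flagged rather than partially matched.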

msantoso
8 - Asteroid

Great! Thanks a lot.

msantoso
8 - Asteroid

Hi CharlieS

 

Your Regex formula

([\w.-]+\@[\w.-]+\.\w{2,4})

is working great. It covers 85% of the syntax errors I had in my file. Thank you!!

 

But it rejects some email addresses that are correct:

 

(i) Some emails have two dots or a dash in their domain name, both of which are valid.

Examples: 

j.haber@vbv.bwl.de

office@scheuch.co.at

info@saint-gobain.com

 

(ii) Some emails have been set as FALSE for reasons I do not understand:

info@Kavitation24.de‬‬  (digits in the domain name?)

kita-sonnenschein-nahrstedt@gmx.de

Woelk&Partner@online.de (ampersand is allowed in email syntax)

 

Do you know how I can get them set as correct addresses?

Thank you

Myriam
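
One possible direction for the rejected addresses listed above, as a hedged sketch in Python rather than a confirmed fix for the Alteryx workflow. It rests on two assumptions that are not from the thread: that the local part should also accept '&' and '+', and that some fields may carry stray whitespace or invisible Unicode formatting characters that make an otherwise valid address fail a whole-string match. The names `LOOSE_EMAIL`, `strip_invisible` and `is_plausible_email` are hypothetical.

```python
import re
import unicodedata

# Assumption: a looser pattern than the one quoted in the thread.
# Local part also allows '&' and '+'; the domain is one or more
# dot-separated labels followed by a suffix of at least 2 letters.
LOOSE_EMAIL = re.compile(r"[\w.&+-]+@[\w-]+(\.[\w-]+)*\.[A-Za-z]{2,}")


def strip_invisible(text: str) -> str:
    """Drop Unicode format characters (category Cf) and outer whitespace."""
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return visible.strip()


def is_plausible_email(text: str) -> bool:
    """Return True if the cleaned field fits the looser pattern as a whole."""
    return LOOSE_EMAIL.fullmatch(strip_invisible(text)) is not None


# Addresses quoted in the post above.
for address in [
    "j.haber@vbv.bwl.de",
    "office@scheuch.co.at",
    "info@saint-gobain.com",
    "kita-sonnenschein-nahrstedt@gmx.de",
    "Woelk&Partner@online.de",
]:
    print(address, is_plausible_email(address))
```

This is still only a syntax check: it accepts strings that look like addresses and does not confirm that a mailbox exists.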

 

 
