Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Make a guess based on the most common patterns

Masond3
8 - Asteroid

Hi All, 

 

I have a challenge, that seems easy but has me stumped. 

Aim 

 

I want to try and guess a customer email address based on the most common patterns of email address for customers at that company.,

 

After analyzing email addresses, I've found out that the most common formats are:

 

Possible email structures 
FirstJohn@nike.com
First [1 letter] + LastJSmith@nike.com
First+.+LastJohn.Smith@nike.com
LastSmith@nike.com
FirstLastJohnSmith@nike.com
last + first [1 letter]SmithJ@nike.com
first [1 letter]+.+LastJ.Smith@nike.com
First+Last[1 letter]JohnS@nike.com
Last+FirstSmithJohn@nike.com
Last.FirstSmith.John@nike.com


Assumptions 

Since majority of companies have standard formatted for their email address, i am hoping to find the most common email structure for a given company and then extrapolate based on it.

 

Example 

 

Company : Nike.com

  • 500 Contacts
  • 100 Contacts have no Email
  • 400 Contacts have the following email structures 

 

Number of ContactsPercent Email Structure  
24060%First+.+LastJohn.Smith@nike.com
10025%First [1 letter] + LastJSmith@nike.com
6015%FirstJohn@nike.com

 

Given that 60% of the contacts at Nike have the email structure of "First+.+Last" i would like to then follow a similar format for those 100 Contacts which have no email addresses

 

Current Input

 

ContactidFirst NameLastNameEmailCompany NameCompanyid
111111RichardPiperRichard.Piper@Nike.comNike Inc001f100001InnV5AAJ
222222DanielleCollinsDanielle.Collins@Nike.comNike Inc001f100001InnV5AAJ
333333DaneSmithDane.Smith@Nike.comNike Inc001f100001InnV5AAJ
44444RobertAtleryxRAlteryx@Nike.comNike Inc001f100001InnV5AAJ
55555JohnKing Nike Inc001f100001InnV5AAJ
666666ChrisDannher Nike Inc001f100001InnV5AAJ

 

Expected Outcome

 

ContactidFirst NameLastNameEmailPredicted EmailRationalCompany NameCompanyid
111111RichardPiperRichard.Piper@Nike.com  Nike Inc001f100001InnV5AAJ
222222DanielleCollinsDanielle.Collins@Nike.com  Nike Inc001f100001InnV5AAJ
333333DaneSmithDane.Smith@Nike.com  Nike Inc001f100001InnV5AAJ
44444RobertAtleryxRAlteryx@Nike.com  Nike Inc001f100001InnV5AAJ
55555JohnKing John.King@Nike.comCommon Email Pattern " First+.+Last"Nike Inc001f100001InnV5AAJ
666666ChrisDannaher Chris.Dannaher@Nike.comCommon Email Pattern " First+.+Last"Nike Inc001f100001InnV5AAJ

 

Looking forward to your help & advice

 

Many thanks 

 

Masond3

 

20 REPLIES 20
MarqueeCrew
20 - Arcturus
20 - Arcturus

@Masond3 ,

 

First, why invent the wheel?

https://rocketreach.co/nike-email-format_b5c61e3ff42e0c48

 

Then you can create a dynamic formula based upon the free info.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Masond3
8 - Asteroid

@MarqueeCrew 
So I have looked at Rocket reach before however to onboard that vendor would take time, and the cost also needs to be considered. 

But you’re right, conceptually I am trying to do the same thing! The key thing is speed to market, so trying to find a crude solution in alteryx 

geoff_zath
Alteryx
Alteryx

@Masond3 

Interesting problem. This isn't the most elegant solution, but you can brute force your way through it by matching the email patterns with a Formula tool. Hope that gets you in the right direction!

Masond3
8 - Asteroid

@geoff_zath  - Hi - So your solution is very interesting, after playing around, i have found a flaw in my thinking which in turn has affected the solution that you have provided. 


What i have noticed is that the solution is taking the highest percentage email pattern and applying across all the contacts regardless of companyid

 

When actually, we should find the email pattern for a given companyid, and apply that email pattern for those contacts where the email is blank

 

 

Current Input

 

ContactidFirst NameLastNameEmailCompany NameCompanyid
111111RichardPiperRichard.Piper@Nike.comNike Inc001f100001InnV5AAJ
222222DanielleCollinsDanielle.Collins@Nike.comNike Inc001f100001InnV5AAJ
333333DaneSmithDane.Smith@Nike.comNike Inc001f100001InnV5AAJ
44444RobertAtleryxRAlteryx@Nike.comNike Inc001f100001InnV5AAJ
55555JohnKing Nike Inc001f100001InnV5AAJ
666666ChrisDannher Nike Inc001f100001InnV5AAJ
AAAARichardPiperRichard@Alteryx.comAlteryx11111
bbbbDanielleCollinsDaniel@Alteryx.comAlteryx11111
ccccDaneSmithDane@Alteryx.comAlteryx11111
ddddjohnSmith Alteryx11111

 

Expected Outcome

 

ContactidFirst NameLastNameEmailPredicted EmailRationalCompany NameCompanyid
111111RichardPiperRichard.Piper@Nike.com  Nike Inc001f100001InnV5AAJ
222222DanielleCollinsDanielle.Collins@Nike.com  Nike Inc001f100001InnV5AAJ
333333DaneSmithDane.Smith@Nike.com  Nike Inc001f100001InnV5AAJ
44444RobertAtleryxRAlteryx@Nike.com  Nike Inc001f100001InnV5AAJ
55555JohnKing John.King@Nike.comCommon Email Pattern " First+.+Last"Nike Inc001f100001InnV5AAJ
666666ChrisDannaher Chris.Dannaher@Nike.comCommon Email Pattern " First+.+Last"Nike Inc001f100001InnV5AAJ
AAAARichardPiperRichard@Alteryx.com  Alteryx11111
BBBBDanielleCollinsDaniel@Alteryx.com  Alteryx11111
CCCCDaneSmithDane@Alteryx.com  Alteryx11111
DDDDjohnSmith John@alteryx.comCommon Email Pattern " First"Alteryx11111
geoff_zath
Alteryx
Alteryx

@Masond3 That's a quick fix! See attached.

 

Capture.PNG

Masond3
8 - Asteroid

@geoff_zath I will take a look at your solution tomorrow when back in the office. 

Random question- how do you handle multiple contacts with different email patterns associated to same company that have different email formats 

 

Example - 
If you have three contacts, and two of those three contacts have  email addresses associated to same company and both emails addresses have a different format, meaning that they both have a 50% association to them . For that one contact that has the missing email, what should the email pattern be assigned to it? 

in those scenarios do you think it’s best to flag these exceptions 

caltang
17 - Castor
17 - Castor

Just curious on the use case application of this request. @Masond3 could you share a bit more of what you intend to do with this workflow? Trying to understand it.

 

From the looks of it, this allows you to get the info without RocketReach, right?

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Masond3
8 - Asteroid

@caltang so my use case is to solve two particular problems statements 

1)we have a high level of bounces 

2) we have a high % of contacts that have no email

 

the first scenario wouldn’t be covered in this flow, however the goal would be to find those contacts which have a different email pattern than the rest at a given company and try and use the recomended pattern to change the email and hopefully reduce the bounces 

 

the second use case is we want to grow market share , so predicting an email address could be a great way to do that. 

 

 

caltang
17 - Castor
17 - Castor

Interesting. Thanks for sharing!

 

Though you may hit a lot of blanks and/or wrong PICs with same names. Getting to the decision maker seems to be a better bet if you can find their LinkedIn (if any), no?

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Labels