Predictive/machine learning Model For Emails

salkhafaji
6 - Meteoroid

Hi All,

 

I am relatively new to predictive modeling using Alteryx. I have a data set that contains:

  1. Person's First Name
  2. Person's Middle Initial
  3. Person's Last Name
  4. Person's Email
  5. Name of the company they work for

My goal is to fill in all the records that are missing an email address. I have noticed that companies typically use the same formula, which combines the first and last name in some way, to generate their employees' email addresses (e.g. Jane.Doe@company.com or JDoe@company.com). My task is to create a machine learning model that learns the formula for each company and then fills in the missing emails. Any pointers on how to accomplish this would be greatly appreciated!

BrandonB
Alteryx

This is a bit different from a traditional machine learning problem in that you are looking for patterns in a field rather than trying to predict a specific value.

 

Setting the machine learning model aside for a moment, you could easily fill in emails for a known pattern using a Formula tool. For example, if the pattern was Jane.Doe@company.com and you had another record with John Smith as the first and last name, you could use a Formula tool that says:

 

[First Name]+"."+[Last Name]+"@company.com"

This would generate the email addresses from the information that you have.
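For anyone who prefers to prototype outside the Formula tool, here is a minimal Python sketch of the same known-pattern fill. The function name `build_email` and the default domain are hypothetical, chosen only for illustration; the logic is just the string concatenation from the expression above.

```python
def build_email(first_name: str, last_name: str, domain: str = "company.com") -> str:
    """Apply the First.Last@domain pattern (hypothetical helper, not an Alteryx API)."""
    # Same concatenation as [First Name]+"."+[Last Name]+"@company.com"
    return f"{first_name}.{last_name}@{domain}"


example_email = build_email("John", "Smith")
print(example_email)  # John.Smith@company.com
```

This only works once you already know which pattern a company uses, which is what the rest of the reply addresses.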

 

Now back to the patterns...

 

You could create new columns that have flags for different pattern matches and then see which column of patterns has the most flags after checking all records. For example, you could create a new column called FirstInitialLastName where you say

 

IF Left([First Name], 1)+[Last Name]+"@company.com" = [Email]
THEN 1
ELSE 0
ENDIF

 

This would create a flag with a value of 1 in a new column for all situations where this is true. Then you could have another formula that creates a column called FirstNamePeriodLastName that says

 

IF [First Name]+"."+[Last Name]+"@company.com" = [Email]
THEN 1
ELSE 0
ENDIF
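The two flag columns can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample records are made up, and the dictionary keys `first`, `last`, and `email` are hypothetical stand-ins for the [First Name], [Last Name], and [Email] fields.

```python
# Made-up sample rows standing in for the data set.
records = [
    {"first": "Jane", "last": "Doe", "email": "JDoe@company.com"},
    {"first": "John", "last": "Smith", "email": "JSmith@company.com"},
    {"first": "Ann", "last": "Lee", "email": "Ann.Lee@company.com"},
]


def flag_patterns(rec, domain="company.com"):
    """Return a 1/0 flag per candidate pattern, like the two IF ... ENDIF formulas."""
    first, last, email = rec["first"], rec["last"], rec["email"]
    return {
        # Left([First Name], 1) + [Last Name] + "@company.com" = [Email]
        "FirstInitialLastName": int(f"{first[:1]}{last}@{domain}" == email),
        # [First Name] + "." + [Last Name] + "@company.com" = [Email]
        "FirstNamePeriodLastName": int(f"{first}.{last}@{domain}" == email),
    }


flags = [flag_patterns(r) for r in records]
```

The comparison here is exact, so differences in capitalization would produce a 0. If your emails are stored in mixed case, lowercasing both sides before comparing may be worth considering.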

 

Rather than using a machine learning model, you could just sum up the flags in each of these columns (per company, since each company may use a different pattern). Then you could use another Formula tool to fill in the missing emails based on whichever pattern had the greatest sum.
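Putting the whole idea together, a rough Python sketch of "sum the flags, then fill with the winning pattern" might look like this. Several things here are assumptions rather than part of the reply above: the tallies are kept per company, the domain is taken from the first known email seen for that company, the comparison ignores case, and all names (`PATTERNS`, `fill_missing_emails`, the record keys) are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical candidate patterns: name -> function building the part before "@".
PATTERNS = {
    "FirstInitialLastName": lambda first, last: first[:1] + last,
    "FirstNamePeriodLastName": lambda first, last: first + "." + last,
}


def fill_missing_emails(records):
    """Tally pattern matches per company, then fill blanks with the top pattern."""
    tallies = defaultdict(Counter)  # company -> pattern name -> number of matches
    domains = {}                    # company -> domain from the first known email
    for rec in records:
        email = rec.get("email")
        if not email or "@" not in email:
            continue  # nothing to learn from a missing email
        local, domain = email.rsplit("@", 1)
        domains.setdefault(rec["company"], domain)
        for name, build in PATTERNS.items():
            # Assumption: compare case-insensitively.
            if build(rec["first"], rec["last"]).lower() == local.lower():
                tallies[rec["company"]][name] += 1

    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input rows are left untouched
        company = rec["company"]
        if not rec.get("email") and tallies[company]:
            best, _ = tallies[company].most_common(1)[0]
            local = PATTERNS[best](rec["first"], rec["last"])
            rec["email"] = f"{local}@{domains[company]}"
        filled.append(rec)
    return filled


# Made-up sample data for illustration.
sample = [
    {"first": "Jane", "last": "Doe", "company": "Acme", "email": "JDoe@acme.com"},
    {"first": "John", "last": "Smith", "company": "Acme", "email": "JSmith@acme.com"},
    {"first": "Ann", "last": "Lee", "company": "Acme", "email": None},
    {"first": "Bob", "last": "Ray", "company": "Globex", "email": "Bob.Ray@globex.com"},
    {"first": "Sue", "last": "Kim", "company": "Globex", "email": ""},
]
result = fill_missing_emails(sample)
```

With this sample, Ann Lee would be filled as ALee@acme.com and Sue Kim as Sue.Kim@globex.com. Companies with no matching pattern are left blank rather than guessed, and adding more candidate patterns just means adding entries to `PATTERNS`.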
