Validate email address with Formula tool and output reports as Excel workbooks
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I'm trying to accomplish three things. I'm not very familiar with regular expression, and I continue to run into errors. I have a data table looks something like this:
CODE | ID | First Name | Last Name | |
E202201 | 20220101 | Jason | Sullivan | jsullivan@gmail.com |
E202201 | 20220102 | Joseph | Raven | jrs@HOTMAIL.com |
E202201 | 20220103 | Callan | McDosky | |
E202201 | 20220104 | George | Lee | z-li@school.edu |
(all data presented are fictional)
1) I'm trying to validate if the email addresses are valid. I want to add a column that includes flags indicating whether the address is empty "Not Applicable", valid "Valid" or invalid ("Not Valid").
2) After that I want to output this data to an Excel workbook that should be created at output. I want the workbook's name to be "report-for-CODE.xlsx". In this example, it should be "report-for-E202201.xlsx".
3) I will connect to different data tables stored in Excel workbooks, and I want all the output files to be stored in the same directory.
How do I use a Formula tool to achieve the first two goals?
Solved! Go to Solution.
- Labels:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @45179902
Here is how you can do it.
1. Validate email: You can use the formula with regex_match like below to check email. If an email address is format abc@xyz.com then its valid or else its not.
IF IsEmpty(Trim([Email]," ")) THEN "Not Applicable"
ELSEIF REGEX_Match([Email], "[\w.]+@\u+\.\u+") THEN "Valid"
ELSE "Not Valid" ENDIF
2. Output each code into a separate file: Create a formula like below creating the filename
"report-for-"+[CODE]+".xlsx|Sheet1"
Configure your output tool like below to output each code to a separate file.
Hope this helps : )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @atcodedog05
Thank you for your quick response with thumb up, solutions, and screenshot! Since I'm not familiar with regular expression, I would appreciate it very much if you can break down your regular expression in the solution to explain what each part means!
Also, I noticed that in your Alteryx Designer screenshot, "z-li@school.edu" a valid email address, is rated invalid in your solution. I believe this is because of the regular expression that you're using, but unfortunately, I lack the knowledge to adjust it.
How to edit the solution (regex) so that all valid addresses are detected correctly? For example, email addresses with subdomains (abc@careers.xyz.com) or country codes (abc@commerce.gov.cn), etc.
Regards,
George
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @45179902
I have adjusted the solution. Let me know if any more changes are required.
Workflow:
This article might help you get an better understanding of the Regex https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Tool-Mastery-RegEx/ta-p/37689
Hope this helps : )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@atcodedog05 Thank you for the update and resource! I will check that out. I have another question regarding filename. I just realised that some of the data tables don't have the CODE in a few rows. One of the tables doesn't have a CODE in the first row. The CODE should be the same across the entire table, so CODE value from other rows that have one should work as well.
Does your solution feed only the first row's filename column value into the Output Data tool? If so, how do I adjust your solution if the first row doesn't have CODE, and I need to use CODE from another row? The output file should include only one sheet that includes the email validation result of the entire table.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @45179902
All rows should have the code it groups based on the code and creates a file. If you are saying first always has code and next rows may not. You can fill down code using multi-row formula tool like below.
Input:
CODE | ID | First Name | Last Name | |
E202201 | 20220101 | Jason | Sullivan | jsullivan@gmail.com |
E202201 | 20220102 | Joseph | Raven | jrs@HOTMAIL.com |
20220103 | Callan | McDosky | ||
20220104 | George | Lee | z-li@school.edu | |
E202201 | 20220105 | George | Lee | abc@xyz |
E202202 | 20220101 | Jason | Sullivan | abc@careers.xyz.com |
E202202 | 20220102 | Joseph | Raven | abc@commerce.gov.cn |
Output:
Hope this helps : )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@atcodedog05 Hi, Thank you for your continued support! I understand that there is a multi-row formula that can copy the CODE all the way up or down to the last row of the data table.
The thing is that some of the rows are leaved intentionally blank and should not have a CODE or any value. If I fill down the CODE across the rows, does that mean my output result will be changed?
I want to leave those blank rows as they are in the output file, but I want the filename to be report-for-CODE.xlsx regardless of whether or not there's a CODE in the first row.
My priority is not to help the people who hand over me the data files to fix those, but I want to leave everything untouched and have email validation reports available.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Happy to help : ) @45179902
Happy new year to you and your close ones : ) May it be filled with happiness and prosperity =D
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Happy 2022 ~
