I was able to figure out how to extract emails into an Access database thanks to the clues from this thread. But I don't really know how to start parsing the email content now that I have it in Access. The emails should be in a fairly standard format (they are notifications about the data entered in a web form), so I think RegEx might work, but I don't know enough about RegEx to figure it out. If anyone has any suggestions or would be willing to assist, I can provide a cleansed example of one of the emails.
Hi AG. Sorry for the delay...was gone for Thanksgiving Holiday and just catching up. I was not able to attach a cleansed sample email (Outlook item) file. The email is a notification from another regional office with contact/lead information for someone in our region. I have copied the sample body content below in this thread (although the system removed some "invalid" HTML). The email contains the following information (not always there) that I would want to extract from the content:
- name for form
- URL of related landing page
- Request (type)
- Product of Interest
- name (first and last seem to be together)
- company/entity name
- Address line 1
- Address line 2 (maybe?)
- Zip/postal code
- Country name
- Email address
I'd like to extract these elements into a table format. I don't know if RegEx or some other tool is the right one to use for this. I do have the body content from these emails now in a single column/field in MS Access that I can use as an input to an Alteryx process. Any help would be GREATLY appreciated.
==================== EXAMPLE EMail Copy Below =============================
Lead Notification from EMEA Marcoms
We have received a lead for your region, please see details below. This is all of the information collected regarding this lead.
As always, there are many ways to accomplish things in Alteryx, so here is a humble approach to what you asked.
Below you'll find a sample workflow that parses:
Filled Out Form,
Determines that the URL you need (there are many in every email text) is the one that follows the "Filled Out Form" line, and parses it,
Product of Interest content and
I tried to use several methods, just as a showcase of what you can do. That's why I used RegEx for some text and formulas for others, and included the Comments field (which I assumed would be multiline, with no identifier for the second line), so that all possible use cases are covered in the same workflow.
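For anyone reading without the workflow open, the same logic can be roughly sketched in Python. This is only an illustration under assumptions, not the workflow itself: the sample email in this thread is cut off, so the `Label: value` layout, the label names in `LABELS`, and the sample text are all hypothetical, and `parse_lead_email` is a name made up for the example.

```python
import re

# Hypothetical labels; the real notification emails may word these differently.
LABELS = ["Filled Out Form", "Request", "Product of Interest", "Name",
          "Company", "Email", "Comments"]

URL_RE = re.compile(r"https?://\S+")
LABEL_RE = re.compile(r"^(%s)\s*:\s*(.*)$" % "|".join(map(re.escape, LABELS)))


def parse_lead_email(body):
    """Split the body into lines and collect labeled values into a dict."""
    fields = {}
    current = None     # label of the most recent labeled line
    want_url = False   # becomes True once "Filled Out Form" has been seen
    for raw in body.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LABEL_RE.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2).strip()
            if current == "Filled Out Form":
                want_url = True
            continue
        url = URL_RE.search(line)
        if want_url and url:
            # Keep only the first URL that follows the "Filled Out Form" line.
            fields["URL"] = url.group(0)
            want_url = False
            continue
        if current == "Comments":
            # An unlabeled line right after Comments is treated as a continuation.
            fields["Comments"] = (fields["Comments"] + " " + line).strip()
    return fields


# Made-up sample text, assumed layout.
sample = """Lead Notification from EMEA Marcoms
See http://example.com/unrelated for details.
Filled Out Form: Contact Us
http://example.com/landing/contact-us
Request: Quote
Product of Interest: Widget A
Name: Jane Doe
Email: jane.doe@example.com
Comments: Please call me
after 3 pm.
"""

result = parse_lead_email(sample)
for key, value in result.items():
    print(key, "=>", value)
```

Fields that are missing from an email (here, Company) are simply absent from the result, which matches the "not always there" caveat in the question.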
Hope this helps. Don't hesitate to contact me if you need something.
PS: Thanks @csteele for the alternative approach to what I started doing.
AG - this was great. I think I see how it works now. You took the email content and turned it into multiple "rows" based on carriage returns. Then you used some multi-row formulas and some other parsing steps to strip out the different elements that I was looking for. This is really great, and I think it will get me over the hump on solving this. In fact, another project I was working on late last week may apply to this as well where I did something similar with splitting a comma-separated list of values into multiple fields with a max length each. I used some of these same tools...not the regex, but it looks as if I might be able to figure out how to use that now that the lines/rows from the email are split up. Thanks so much.
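For reference, the other task mentioned above (splitting a comma-separated list into several fields with a maximum length each) might look something like this in Python. It is a minimal sketch under assumptions: `pack_values` is a hypothetical helper, the packing is greedy, and a single value longer than the limit is kept whole rather than cut.

```python
def pack_values(csv_text, max_len):
    """Greedily pack comma-separated items into chunks of at most max_len characters."""
    chunks = []
    current = ""
    for item in (part.strip() for part in csv_text.split(",")):
        if not item:
            continue  # skip empty entries such as "a,,b"
        candidate = item if not current else current + "," + item
        if len(candidate) <= max_len or not current:
            # Fits in the current chunk, or the chunk is empty and the item
            # is oversized on its own (assumption: keep it whole).
            current = candidate
        else:
            chunks.append(current)
            current = item
    if current:
        chunks.append(current)
    return chunks


chunks = pack_values("alpha, beta, gamma, delta", 10)
print(chunks)  # ['alpha,beta', 'gamma', 'delta']
```

Each element of the returned list would then map to one output field.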