Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
StephenW
Alteryx Alumni (Retired)

Like so many tools, the idea for this sprang from a conversation in the Solutions Center, at Inspire 2016.  A user was regularly receiving emails that contained attachments.  He wanted to use them in his analytic workflows and encountered a lot of manual work related to getting the files into an accessible location.  I had a similar situation in a previous job, so it was easy for me to relate to his frustration.  Using tools that the Alteryx team had previously built, like the Google OAuth Macro and Google Sheets Tools, we were able to build a POC during our conversation.  The following walks through how this was accomplished and packaged as the Gmail Attachment Input tool.

 

HighLevelWorkflow.png

 

At a high-level, the tool makes three Gmail API requests that build on the prior requests.  The first request returns a list of email messages that match the search parameters specified.  The second request iterates through the message list and returns details (date, from, subject, attachment name, and attachmentId) for each message.  The final request returns the message attachment data.

 

Prerequisites

To access the account's email, the Gmail Attachment Input tool uses a Client ID, Client Secret, and Refresh Token associated with the Gmail API and account. The Alteryx Knowledge Base article, How to Create Google API Credentials, walks you through how to obtain these. To fit our use case, in Step 6 you'll want to use Gmail API and in Step 16 you'll want to use https://www.googleapis.com/auth/gmail.readonly.  

 

 

UserInputs.PNG













 

 

The initial step of any tool or macro is to gather user inputs.  For this tool, there are only four pieces of input that need to be gathered from the user:  Client ID, Client Secret, and Refresh Token (for authentication) and search parameters.  The search parameters are used to identify which emails we want to download attachments from.  An HTML frontend was used to capture this input from users. Using an HTML frontend allowed me to easily integrate search parameter tips and a link to the Gmail Help page, Search operators you can use with Gmail.  Creating an HTML frontend won't be covered in this blog post but you can find more information on the HTML SDK at Inspiring Ingenuity  (our CTO @Ned 's blog.)

 

 

Step1.PNG

 

 

 

 

 

 

 

 

 

 

 

The process begins with refreshing the Authentication token needed for accessing the Google APIs.  We'll gloss over it here because a previous blog post I wrote, OAuth Macro for Google API Authentication, covers it in detail.

 

 

Step2.PNG

 

 

 

 

 

 

 

 

 

 

 

The first Gmail specific API call made is to get a list of messages that match the search parameters specified.  Using the Download tool, a GET request from https://www.googleapis.com/gmail/v1/users/me/messages is made.

 

The URL is made up of three parts:

 GetMessageList.png

 

 

 

 

 

 

 

 

 

 

 

The request produces a JSON response which is parsed, using the JSON Parse tool, to produce a list of message_id's that match our search parameters:

MessageList.PNG

 

 

Step3.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Using the list of ids from the previous step, an API call is made for each message_id.  Each call is a GET request from https://www.googleapis.com/gmail/v1/users/me/messages with the message_id appended.

 

The URL is made up of two parts:

GetMessages.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The JSON response has many different parts and requires extensive cleanup:

MessageDetails.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Key pieces of data that we want to parse out of the messages response are:  Date, From, Subject, Attachment name, and most importantly attachmentId.  AttachmentId will be used to identify which attachment to download in the next step.

 

 

Step4.PNG

 

 

 

 

 

 

 

 

 

 

 

The last step creates a final API call, using all of the information gathered in the previous steps, that returns the raw attachment data.

 

A GET request is made for each attachmentId from https://www.googleapis.com/gmail/v1/users/me/messages/[messageId]/attachments/[attachmentId]?fields=....

 

The URL is made up of multiple parts:

GetAttachment.png

 

Each attachment data response is a Base64 encoded string:

 

AttachmentBase64.png

 










 

Using two Blob Convert tools, it's easy to convert the response into usable data.  The Base64 encoded string is first converted into a Blob and then converted from a Blob to a UTF-8 string.  The final step is to remove unnecessary fields and output the message details and attachment data using a Macro Output tool.

 

The FINAL PRODUCT

 

FinalOutput.png

 

 





















 

To demonstrate this tool, I configured it to search for all emails from steve.wagner@alteryx.com that have an attachment with a .txt or .csv extension.   The Gmail Attachment Input tool returned four files attached to three emails.  

 

Unfinished Business

A great addition that wasn't implemented in v1.0 of this tool is an automatic parsing/transform of the attachment data.  Right now, the attachment data is returned in a single record, requiring the user to build out the workflow to transform the data into a table.  Also, I haven't been able to explore which attachment file types are supported.

 

Two Challenges

1.  Download the tool and implement automatic parsing/transform of the attachment data based on file type.  Publish your attempt to the Alteryx Gallery and link to it in the comments.

2.  Discover which file types are supported.  Post any new finds in the comments.

 

Click to download the Gmail Attachment Input tool

 

Note that GmailAttachmentInput.yxi requires Alteryx Designer v10.1+

Stephen Wagner
Product Manager, Technology Alliances

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Comments
nnauraiz
5 - Atom

Hi Stephen,

I have been trying to obtain my Refresh Token for this Gmail connection, but when I get to Step 18 here https://community.alteryx.com/t5/Alteryx-Knowledge-Base/How-to-Create-Google-API-Credentials/ta-p/11... - it does not provide me with a refresh token but in fact gives me this:

{
 
"error_description": "Unauthorized",
 
"error": "invalid_client"
}

Please let me know how to proceed. Thank you! 

nnauraiz
5 - Atom

Capture.JPGPlease let me know how to proceed.

StephenW
Alteryx Alumni (Retired)

@nnauraiz

 

Follow these steps to get your Client ID, Client Secret, and Refresh Token:

 

  1. Open https://console.developers.google.com
  2. Login with the Google account associated with the data you would like to analyze
  3. Create a new project by clicking the Select a project dropdown (top-left corner) and selecting Create a project
  4. Enter a Project name of your choosing and click Create
  5. Click ENABLE API
  6. Select Gmail API from the list
  7. Click Credentials
  8. Click Create credentials and select OAuth client ID
  9. Select the Web application radio button, add https://developers.google.com/oauthplayground as an Authorized redirect URI, and click Create
  10. A window will pop-up with your Client ID and Client Secret; save these for future reference and click OK
  11. Go to https://developers.google.com/oauthplayground
  12. Click on the gear icon in the top-right corner of the page, click the checkbox for Use your own OAuth credentials, enter the Client ID and Client Secret from step 10, and click close
  13. Copy/paste https://www.googleapis.com/auth/gmail.readonly into Step 1 Select & authorize APIs window on left side of page and click Authorize APIs
  14. Click Allow
  15. Click Exchange authorization code for tokens and save the Refresh token
  16. Use the Client ID, Client Secret, and Refresh Token with the Gmail Attachment tool
nnauraiz
5 - Atom

Thank you Stephen but I followed all these steps and received the Client ID, Client Secret, and Refresh Token but it is still giving me the attached error.Capture.JPG

 

 

StephenW
Alteryx Alumni (Retired)

@nnauraiz

 

Follow these steps while in the https://developers.google.com/oauthplayground

 

Click the gear icon in the upper right.  Click the User your own OAuth credentials and enter your Client ID and Client Secret.

Gear.png

 

For Step 1, select https://www.googleapis.com/auth/gmail.readonly and https://www.googleapis.com/auth/userinfo.profile and click Authorize APIs

Step1.png

 

For Step 2, click Exchange authorization code for tokens and click Refresh access token

 

For Step 3, update the Request URI to https://www.googleapis.com/oauth2/v2/userinfo and click Send the request.  Copy id from the response.

 

Step3.png

 

Update the Request URI to https://www.googleapis.com/gmail/v1/users/{userId}/profile.  Replace {userId} with the id copied from the previous response.  Click Send the request.

Step3.5.png

 

Being able to successfully perform the above should result in a Client ID, Client Secret, and Refresh Token that will work with the attached Gmail tool.  If this isn't successful please post a screenshot.

BenMoss
ACE Emeritus
ACE Emeritus

 Hi @StephenW


I appear not to be able to use the installer. When I try to run it I get the following error:

 

'Tool Installer Error: Tool Installer Error: Unable to find configuration file.'

 

I've tried it with a number of users though we are all on V11.

 

Have you ran into this issue yet?

 

Ben

StephenW
Alteryx Alumni (Retired)

@BenMoss

 

Version 11 made a change to the structure of yxi's that this tool wasn't updated for.

 

Here is an updated version of the tool that you should be able to install.

Purdue02
6 - Meteoroid

gmail_error.jpg

 

Any ideas why I'm receiving the above error?

StephenW
Alteryx Alumni (Retired)

@Purdue02

 

Here is a screenshot of the section that's causing the error; tool is highlighted:

Tool114.PNG

 

Without seeing your API response, I would guess that your query isn't returning any data.  To view the API response you could use Fiddler.

 

Also, you could try replicating your query in Gmail and see if it returns anything.

Purdue02
6 - Meteoroid

Below are the results from the JSON Parse tool.  It also doesn't appear the name variable is being passed through.  Any ideas?

 

 

NameVariable_NotListed.jpg

 

 

API_Response.jpg

StephenW
Alteryx Alumni (Retired)

@Purdue02

 

The error message makes me think that your Search Parameter includes something not supported.  Either, try without the q parameter or try re-obtaining your API credentials with additional Gmail scopes added (step 1 of my ‎03-16-2017 03:45 PM comment)

Purdue02
6 - Meteoroid

Thanks, that worked.  I do have another question... do you know if it is possible to download a file attachment and save it in a specified directory?

StephenW
Alteryx Alumni (Retired)

@Purdue02

 

You should be able to take the attachment and use an Output tool to save it.

engdell2
8 - Asteroid

Hi Stephen. 

Just wanted to give you a heads up regarding a bug in the tool. Google uses a different base64 encoding than alteryx (see this link) so you need to replace the character '-' with '+' and '_' with '/' before you can base64 decode it, atleast for excel files.

Thank you though for saving me a lot of time with this otherwise excellent tool. 

 

Best regards//Leif

Richard-23
5 - Atom

 

@engdell

 

I'm interested in using this tool to download and use excel (xlsx) files...how did you replace the characters you referenced?

 

Best,

Rich 

engdell2
8 - Asteroid

Hi Rich. 

First I removed all the parsing of the results from the gmail tool (blob and b64 stuff after the dl tool). 

Then I added these steps to my wf:

JSON parse output: DownloadData

Formula: ReplaceChar([JSON_ValueString],'-', '+') and then ReplaceChar([JSON_ValueString],'_', '/')

(I also do a '%temp%\'+[Attachment_Name]  for saving my file where I can pick it up later on)

After that a blobconvert on JSON_ValueString

and lastly a blob output with settings (top to bottom):

"arb-placeholder"

Replacing entire path

Attachment_Name

JSON_ValueString

 

I dl a bunch at the time so I do a block until done on dl and a dynamic input on flow two after that. 

Gmail hackGmail hack

Richard-23
5 - Atom

@engdell,

 

Thank you for the detailed writeup.

 

I'm pretty new to Alteryx and am not sure how to "open" the gmail tool to make the modifications you detailed. Can you either:

  1. Tell me how I can open that package? OR
  2. Send me your version of the tool? OR 
  3. Send me a sanitized sample of your workflow where you ingest the file(s), save them locally, and then process the data?
    • I'd also like to save the files on the hard drive and then do further processing within Alteryx

Best,

Rich

engdell2
8 - Asteroid

Hi Rich. 

It's just a macro but when it is "installed" it is placed in (for a user installation) in you profile-path like:

C:\Users\yourusername\AppData\Roaming\Alteryx\Tools\GmailAttachmentInput\Supporting_Macros it isn't just a right click away. But if you copy both macros to your local folder you're good to go. 

 

Best regards//L

PS. I follow this thread.

Richard-23
5 - Atom

Thanks Leif, this is exactly what I was looking for. Really appreciate it!

 

Best,

Rich

AndrewCrayford
8 - Asteroid

Will this work to download the email and not the attachment?

 

I would like to download the email and import the email into Alteryx so i can get a link from it and maybe other information from the email?

engdell2
8 - Asteroid

No, but it could easily be adapted, it is well documented. 

AndrewCrayford
8 - Asteroid

I am having a bit of trouble when i add my client ID, Client secret and refresh token it comes up with the following error 

 

Error: Gmail Attachment Input (1): Tool #2: Google API: HTTP/1.1 400 Bad Request
Error: Gmail Attachment Input (1): Tool #2: Tool #14: Parse Error at char(1): Unknown variable "token_type" (Expression #1)

 

Was wondering if anyone can help me on this?

StephenW
Alteryx Alumni (Retired)

@AndrewCrayford Sorry for the delay in responding - The error is caused by the Google OAuth tool.  Your credentials were likely not setup or input correctly.

Saranya
6 - Meteoroid

Hi Stephen,

 

I have finished extracting emails from my gmail account by following your blog. Now, my requirement is to fetch entire body of the email and currently I am using "snippet" column as email body column and it is just showing first 200 characters of the email body. Based on my analysis, I have found that "payload_parts_body_data" field has entire body of the email and it is encoded by default.

 

Do you have any idea how to decode "payload_parts_body_data" field to extract entire body of the email in Alteryx or can you please suggest any other possible solutions to extract entire body of the email?

 

gmail body encoded.PNG

 

 

Thanks,

Saranya

AndrewCrayford
8 - Asteroid

 

@StephenW Thanks for the reply and I got it working now, thanks

 
StephenW
Alteryx Alumni (Retired)

@Saranya

 

Use the Blob Convert tool to convert the Base64 encoded boyd into a Blob and then use another Blob Convert tool to convert the Blob into a UTF-8 string.  Here is an example of the flow:

Base64 conversion exampleBase64 conversion example

 

 

Saranya
6 - Meteoroid

@StephenW

 

Thank you very much for the quick reply. I am able to decode now. But still I am not getting the whole body of the email, the values are getting truncated after parsing. Is it possible to extract whole body of the email using Alteryx?

Could you please suggest any other possible solutions to extract entire body of the email? Thanks in advance!json parse truncate character.png

 

Thanks,

Saranya

StephenW
Alteryx Alumni (Retired)

@Saranya  Use a Select tool to increase the size of the JSON_ValueString field.

Saranya
6 - Meteoroid

@StephenW

 

I have already tried and it doesn't help :-(

 

Paulo1300
8 - Asteroid

@StephenW, i think the dropbox link is dead for the v11+ version

Paulo1300
8 - Asteroid

can anyone reupload the v11+ version? 

StephenW
Alteryx Alumni (Retired)

Hi @Paulo1300

 

There is an attachment at the bottom of the post - GmailAttachmentInput.yxi

Paulo1300
8 - Asteroid

Hi @StephenW,

 

Thanks for the response. I assumed that was the old version because I am getting the same error that this person posted. Then you provided a different alteryx file that was v11+ on dropbox

 

Capture.PNG

Paulo1300
8 - Asteroid

@StephenW, can you confirm that is the correct file?

StephenW
Alteryx Alumni (Retired)

@Paulo1300

 

Unfortunately, I can't confirm that.

lmoore
5 - Atom

@StephenW

 

I'd love to try out this tool. Think you can provide another dropbox link? The above link is giving me a 404.

 

Thanks!

 

StephenW
Alteryx Alumni (Retired)

@lmoore 

 

Sorry for the lagged response. Unfortunately, I don't have access to the file anymore so I will have to make some updates.  I'll try and post an updated tool.

StephenW
Alteryx Alumni (Retired)

 

@lmoore @Paulo1300 

 

I've been able to update the tool. You can find it at GmailAttachmentInput.yxi.

Alpan
5 - Atom

Hi Stephen, very useful tool. I installed and ran the tool, but I am getting funny characters in the attachment field. I tried both with a csv and a txt attachment bit the result is the same. Am I not using the correct version of the tool possibly ? Or am I doing something wrong?  A screen shot from the attachment and the wf output is below. Thanks in advance - Alpantestimage.jpg

StephenW
Alteryx Alumni (Retired)

Hi @Alpan 

 

I was able to test with a replicated dataset and my results turned out fine:

Testfile.PNG



Confirm that you're using this version of the tool:  GmailAttachmentInput.yxi.

 

It may be an issue with encoding of the file based on your system settings. I would open the macro and hardcode your credentials/search parameters and run it like a normal workflow. This should allow you to try changing the settings (specifically in the two tools I've circled in Step 4 which converts the attachment back into a data table.)step4.PNG

Alpan
5 - Atom

Hi @StephenW 

I tried all possible combinations where you suggested, but still the same result. I also tried on a second laptop (again with the latest link you provided), same issue.

Both instances are running Alteryx Designer 2019.4 on Win 10 Pro.

Any other suggestions ?

Thanks

Alpan

StephenW
Alteryx Alumni (Retired)

@Alpan Unfortunately, I wouldn't have any other recommendations.

alpanatalay
5 - Atom

OK I finally found the problem. And the solution. The Base 64 string is coming with extra characters in the beginning of the string (77u_) from some reason. So I just removed them and it worked like a charm. Here is the updated version of Step 4.pic.jpg

 
StephenW
Alteryx Alumni (Retired)

@alpanatalay Weird issue, but great to see you were able to get it resolved and working!

Regular
7 - Meteor

Hi,

 

I am getting the following error in the last step after performing all above mentioned steps: Please suggest.

 

 

Capture.PNG

StephenW
Alteryx Alumni (Retired)

Hi @Regular,

 

I'd retry these steps: https://community.alteryx.com/t5/Engine-Works-Blog/Download-Gmail-Attachments-with-Alteryx/bc-p/5051...

 

If it still doesn't work, I'd try adding additional gmail API scopes as they may have changed.

Regular
7 - Meteor

Hi,

I got the below screen. Now where should I get new Clinet ID , Client Secret and Refresh Token to use in the Gmail tool? Please suggest.

Capture.PNG

StephenW
Alteryx Alumni (Retired)

@Regular 

 

Use the same ones that were used to successfully call this API.

Regular
7 - Meteor
Used the same one but not working in the tool. Please suggest.
Regular
7 - Meteor
Also I used the same toll and try to run in but the same kind of error coming. https://gallery.alteryx.com/#!app/Google-API--Retrieve-Access-Token/579fa2b7a248970b446a4a12 .Could you please help.