Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
StephenW
Alteryx Alumni (Retired)

Like so many tools, the idea for this sprang from a conversation in the Solutions Center, at Inspire 2016.  A user was regularly receiving emails that contained attachments.  He wanted to use them in his analytic workflows and encountered a lot of manual work related to getting the files into an accessible location.  I had a similar situation in a previous job, so it was easy for me to relate to his frustration.  Using tools that the Alteryx team had previously built, like the Google OAuth Macro and Google Sheets Tools, we were able to build a POC during our conversation.  The following walks through how this was accomplished and packaged as the Gmail Attachment Input tool.

 

HighLevelWorkflow.png

 

At a high-level, the tool makes three Gmail API requests that build on the prior requests.  The first request returns a list of email messages that match the search parameters specified.  The second request iterates through the message list and returns details (date, from, subject, attachment name, and attachmentId) for each message.  The final request returns the message attachment data.

 

Prerequisites

To access the account's email, the Gmail Attachment Input tool uses a Client ID, Client Secret, and Refresh Token associated with the Gmail API and account. The Alteryx Knowledge Base article, How to Create Google API Credentials, walks you through how to obtain these. To fit our use case, in Step 6 you'll want to use Gmail API and in Step 16 you'll want to use https://www.googleapis.com/auth/gmail.readonly.  

 

 

UserInputs.PNG













 

 

The initial step of any tool or macro is to gather user inputs.  For this tool, there are only four pieces of input that need to be gathered from the user:  Client ID, Client Secret, and Refresh Token (for authentication) and search parameters.  The search parameters are used to identify which emails we want to download attachments from.  An HTML frontend was used to capture this input from users. Using an HTML frontend allowed me to easily integrate search parameter tips and a link to the Gmail Help page, Search operators you can use with Gmail.  Creating an HTML frontend won't be covered in this blog post but you can find more information on the HTML SDK at Inspiring Ingenuity  (our CTO @Ned 's blog.)

 

 

Step1.PNG

 

 

 

 

 

 

 

 

 

 

 

The process begins with refreshing the Authentication token needed for accessing the Google APIs.  We'll gloss over it here because a previous blog post I wrote, OAuth Macro for Google API Authentication, covers it in detail.

 

 

Step2.PNG

 

 

 

 

 

 

 

 

 

 

 

The first Gmail specific API call made is to get a list of messages that match the search parameters specified.  Using the Download tool, a GET request from https://www.googleapis.com/gmail/v1/users/me/messages is made.

 

The URL is made up of three parts:

 GetMessageList.png

 

 

 

 

 

 

 

 

 

 

 

The request produces a JSON response which is parsed, using the JSON Parse tool, to produce a list of message_id's that match our search parameters:

MessageList.PNG

 

 

Step3.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Using the list of ids from the previous step, an API call is made for each message_id.  Each call is a GET request from https://www.googleapis.com/gmail/v1/users/me/messages with the message_id appended.

 

The URL is made up of two parts:

GetMessages.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The JSON response has many different parts and requires extensive cleanup:

MessageDetails.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Key pieces of data that we want to parse out of the messages response are:  Date, From, Subject, Attachment name, and most importantly attachmentId.  AttachmentId will be used to identify which attachment to download in the next step.

 

 

Step4.PNG

 

 

 

 

 

 

 

 

 

 

 

The last step creates a final API call, using all of the information gathered in the previous steps, that returns the raw attachment data.

 

A GET request is made for each attachmentId from https://www.googleapis.com/gmail/v1/users/me/messages/[messageId]/attachments/[attachmentId]?fields=....

 

The URL is made up of multiple parts:

GetAttachment.png

 

Each attachment data response is a Base64 encoded string:

 

AttachmentBase64.png

 










 

Using two Blob Convert tools, it's easy to convert the response into usable data.  The Base64 encoded string is first converted into a Blob and then converted from a Blob to a UTF-8 string.  The final step is to remove unnecessary fields and output the message details and attachment data using a Macro Output tool.

 

The FINAL PRODUCT

 

FinalOutput.png

 

 





















 

To demonstrate this tool, I configured it to search for all emails from steve.wagner@alteryx.com that have an attachment with a .txt or .csv extension.   The Gmail Attachment Input tool returned four files attached to three emails.  

 

Unfinished Business

A great addition that wasn't implemented in v1.0 of this tool is an automatic parsing/transform of the attachment data.  Right now, the attachment data is returned in a single record, requiring the user to build out the workflow to transform the data into a table.  Also, I haven't been able to explore which attachment file types are supported.

 

Two Challenges

1.  Download the tool and implement automatic parsing/transform of the attachment data based on file type.  Publish your attempt to the Alteryx Gallery and link to it in the comments.

2.  Discover which file types are supported.  Post any new finds in the comments.

 

Click to download the Gmail Attachment Input tool

 

Note that GmailAttachmentInput.yxi requires Alteryx Designer v10.1+

Stephen Wagner
Product Manager, Technology Alliances

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Comments