ALTERYX INSPIRE | Join us this May for for a multi-day virtual analytics + data science experience like no other! Register Now

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
StephenW
Alteryx
Alteryx

Like so many tools, the idea for this sprang from a conversation in the Solutions Center, at Inspire 2016.  A user was regularly receiving emails that contained attachments.  He wanted to use them in his analytic workflows and encountered a lot of manual work related to getting the files into an accessible location.  I had a similar situation in a previous job, so it was easy for me to relate to his frustration.  Using tools that the Alteryx team had previously built, like the Google OAuth Macro and Google Sheets Tools, we were able to build a POC during our conversation.  The following walks through how this was accomplished and packaged as the Gmail Attachment Input tool.

 

HighLevelWorkflow.png

 

At a high-level, the tool makes three Gmail API requests that build on the prior requests.  The first request returns a list of email messages that match the search parameters specified.  The second request iterates through the message list and returns details (date, from, subject, attachment name, and attachmentId) for each message.  The final request returns the message attachment data.

 

Prerequisites

To access the account's email, the Gmail Attachment Input tool uses a Client ID, Client Secret, and Refresh Token associated with the Gmail API and account. The Alteryx Knowledge Base article, How to Create Google API Credentials, walks you through how to obtain these. To fit our use case, in Step 6 you'll want to use Gmail API and in Step 16 you'll want to use https://www.googleapis.com/auth/gmail.readonly.  

 

 

UserInputs.PNG













 

 

The initial step of any tool or macro is to gather user inputs.  For this tool, there are only four pieces of input that need to be gathered from the user:  Client ID, Client Secret, and Refresh Token (for authentication) and search parameters.  The search parameters are used to identify which emails we want to download attachments from.  An HTML frontend was used to capture this input from users. Using an HTML frontend allowed me to easily integrate search parameter tips and a link to the Gmail Help page, Search operators you can use with Gmail.  Creating an HTML frontend won't be covered in this blog post but you can find more information on the HTML SDK at Inspiring Ingenuity  (our CTO @Ned 's blog.)

 

 

Step1.PNG

 

 

 

 

 

 

 

 

 

 

 

The process begins with refreshing the Authentication token needed for accessing the Google APIs.  We'll gloss over it here because a previous blog post I wrote, OAuth Macro for Google API Authentication, covers it in detail.

 

 

Step2.PNG

 

 

 

 

 

 

 

 

 

 

 

The first Gmail specific API call made is to get a list of messages that match the search parameters specified.  Using the Download tool, a GET request from https://www.googleapis.com/gmail/v1/users/me/messages is made.

 

The URL is made up of three parts:

 GetMessageList.png

 

 

 

 

 

 

 

 

 

 

 

The request produces a JSON response which is parsed, using the JSON Parse tool, to produce a list of message_id's that match our search parameters:

MessageList.PNG

 

 

Step3.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Using the list of ids from the previous step, an API call is made for each message_id.  Each call is a GET request from https://www.googleapis.com/gmail/v1/users/me/messages with the message_id appended.

 

The URL is made up of two parts:

GetMessages.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The JSON response has many different parts and requires extensive cleanup:

MessageDetails.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Key pieces of data that we want to parse out of the messages response are:  Date, From, Subject, Attachment name, and most importantly attachmentId.  AttachmentId will be used to identify which attachment to download in the next step.

 

 

Step4.PNG

 

 

 

 

 

 

 

 

 

 

 

The last step creates a final API call, using all of the information gathered in the previous steps, that returns the raw attachment data.

 

A GET request is made for each attachmentId from https://www.googleapis.com/gmail/v1/users/me/messages/[messageId]/attachments/[attachmentId]?fields=....

 

The URL is made up of multiple parts:

GetAttachment.png

 

Each attachment data response is a Base64 encoded string:

 

AttachmentBase64.png

 










 

Using two Blob Convert tools, it's easy to convert the response into usable data.  The Base64 encoded string is first converted into a Blob and then converted from a Blob to a UTF-8 string.  The final step is to remove unnecessary fields and output the message details and attachment data using a Macro Output tool.

 

The FINAL PRODUCT

 

FinalOutput.png

 

 





















 

To demonstrate this tool, I configured it to search for all emails from steve.wagner@alteryx.com that have an attachment with a .txt or .csv extension.   The Gmail Attachment Input tool returned four files attached to three emails.  

 

Unfinished Business

A great addition that wasn't implemented in v1.0 of this tool is an automatic parsing/transform of the attachment data.  Right now, the attachment data is returned in a single record, requiring the user to build out the workflow to transform the data into a table.  Also, I haven't been able to explore which attachment file types are supported.

 

Two Challenges

1.  Download the tool and implement automatic parsing/transform of the attachment data based on file type.  Publish your attempt to the Alteryx Gallery and link to it in the comments.

2.  Discover which file types are supported.  Post any new finds in the comments.

 

Click to download the Gmail Attachment Input tool

 

Note that GmailAttachmentInput.yxi requires Alteryx Designer v10.1+

Stephen Wagner
Product Manager, Technology Alliances

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Stephen Wagner is a Product Manager, Technology Alliances with a passion for enabling users to take full advantage of the Alteryx platform. Stephen’s background includes analyzing sales, marketing, and operational data for a Fortune 50 retailer, as well as data visualization development as an Analytics Consultant.

Comments
Regular
7 - Meteor
Still not working. I tried a lot .If anyone can help then it would be
appreciated.
Regular
7 - Meteor

Hi,

I verified the Gmail API and its working fine but why then this tool not working?Any suggestion?Capture.PNG

Regular
7 - Meteor

Hi,

 

It's works for me now.

I used this workflow bit different ways.Basically problem is sometimes download tool throws SSL error due to the proxy setup. But 2018.3 designer works well with download tool.

Thanks a lot.

shreyanshrathod
10 - Fireball

Hi @alpanatalay , @StephenW 

 

I am facing a similar issue with attachment files.

For smaller files, e.g the one @StephenW used for testing, the macro works fine.

 

But for the actual files, I get funny characters in Attachment data column.

I am attaching screenshots for your reference.

 

shreyanshrathod_0-1617879757240.png

 

shreyanshrathod_1-1617879802599.png

 

Any thoughts and solutions would be really helpful.

Thanks in advance.

 

Regards,

Shreyansh Rathod

StephenW
Alteryx
Alteryx

@shreyanshrathod 

 

I'd take a look at this part of the tool:

Using two Blob Convert tools, it's easy to convert the response into usable data.  The Base64 encoded string is first converted into a Blob and then converted from a Blob to a UTF-8 string.  The final step is to remove unnecessary fields and output the message details and attachment data using a Macro Output tool.

 

It's converting the data to a UTF-8 string and your file encodings may be different. Try playing around with adjusting that and it may resolve things. If your attachment is an XLSX file or some other non-text based file, you may have to write out the blob as the file type and read it back into Alteryx.

shreyanshrathod
10 - Fireball

Hi @engdell ,

 

I tried your solution for replacing '-' with '+' and it worked well with '.csv' files.

However, I am getting garbage data for '.xlsx' files.

 

Is there a different base64 encoding for it?

I would love to know if you have successfully parsed '.xlsx' files from gmail attachment.

 

Thanks in advance!!

Regards,

Shreyansh Rathod

engdell
8 - Asteroid

Hi @shreyanshrathod 

You also need to replace '_' with '/' 

I used the flow with xlsx for some years and it worked fine. The solution has since then been replaced so I no longer have access to the content and cannot guarantee that other things have changed (alteryx versions, gmail api specs, xlsx formats). 

//Leif

StephenW
Alteryx
Alteryx

@shreyanshrathod if @engdell 's solution doesn't work, I'd recommend writing out the xlsx attachments as files and then dynamically reading them back in with an Input Data tool.

shreyanshrathod
10 - Fireball

Hi @StephenW ,

Thank you for replying. That was really quick!!

 

However, I didn't get your comment. How do I write out the xlsx attachments as files?

 

Also, @engdell , Thanks for your reply too.

Of course, I have replaced both the characters '-' and '_'.

Worked like magic for '.csv' files.

 

Not sure, what '.xlsx' files are gonna take!!

 

 

shreyanshrathod
10 - Fireball

Hi @StephenW ,

 

I used a Blob Output tool to write out the attachment data.

Coupled with @engdell ' thought of replacing '-' with '+' and '_' with '/', 

I am now able to parse '.csv' and '.xlsx' gmail attachments!!!!!!

 

Thank you so much!!

 

Regards,

Shreyansh Rathod

Preshan
6 - Meteoroid

Hi All, 

 

I have managed to get this wonderful gmail macro set up, however i cannot seem to download xlsm attached files.

 

I want to download them to a fullpath i hardcoded in a formula tool.

 

Any idea how to solve this ? 

 

Preshan_0-1619441395024.png

 

StephenW
Alteryx
Alteryx

@Preshan 

 

I'd give the comments above yours a look. I believe others ran into the same issues and may have found a solution.

Preshan
6 - Meteoroid

thanks @StephenW,

 

I got it to work.

 

Can someone help me with paging for results ? GMAIL API by default only allows 100 items per call, so currently I can only download/ view 100 emails.

 

I need to download 500 emails. 

 

How can i achieve this ? 

 

Regards

StephenW
Alteryx
Alteryx

@Preshan 

 

You'd likely have to add a batch macro within this that looked at pageToken.

shreyanshrathod
10 - Fireball

Hi @Preshan ,

 

I have built an Iterative macro to solve the issue. It works as expected and fetches all emails.

Unfortunately, I can't share the macro on this post.

 

Let me know how can I share the macro.

 

Regards,

Shreyansh Rathod

Preshan
6 - Meteoroid

Hi @shreyanshrathod,

 

Wow that's awesome !!! Would you be able to share with me your batch macro ? 

Thanks so much for assisting me.

 

Regards,

Preshan