Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Salesforce Input Tool & Going Beyond The 2000 Row Limit for Existing Reports

Alteryx Certified Partner

Background

 

The Salesforce Input Tool has gone through some great functional iterations over the past few years. Although the deprecated version is included with every installation of Alteryx Designer (a bit hidden to find -- Right Click any tool on the "Connectors" palette and click "Show Deprecated Tools"), the most updated version requires users to download an install the .yxi in their instance of Alteryx Designer (https://gallery.alteryx.com/#!app/Salesforce-Input-Tool/59b84ccbeffc2a09d4057c43).

1.png

The newest version, developed by Alteryx, leverages the integrated Python SDK, which is seemingly more reliable than the deprecated tool. Depending on your use case you may want to use the deprecated version.

 

2.png

The Problem

 

If you're like most Enterprises leveraging Salesforce, your organization most likely has hundreds if not thousands of existing reports. These reports range from basic to highly detailed depending on the use case. Salesforce reports are extremely easy for business users to construct, and can be shared and exported in various formats. It is very common for a results set of these reports to have thousands of rows. The Salesforce Input Tool as it is today will only export a maximum of 2000 rows for an existing report. In addition your report must be a report that you most recently viewed in the UI of Salesforce.

 

SOQL could have worked if there were not complex joins between tables. There are no current methods for finding the underlying SOQL (Salesforce Query Language) for existing reports.

 

The Plan


That posed an interesting challenge for me. How can we best leverage the Salesforce API (https://developer.salesforce.com/docs/atlas.en-us.api_analytics.meta/api_analytics/sforce_analytics_...) to retrieve all of the data from a report and still allow it to be processed downstream in our workflows?

If you've worked with APIs before you may have encountered a implementation called pagination. This allows for APIs to limit the amount of data that can be retrieved so that source systems are not taxed with large workloads. Instead of giving you 100,000 results in one output, an API will give you lets say 50 pages of 2,000 results per page summing up to 100,000 total results.

 

You could create an iterative macro in this instance that passes the next page of results until you get to your final page. This would allow you to combine all of the results after the last page of results has been requested. I got a great tutorial and overview here https://www.thedataschool.co.uk/laine-caruzca/api-pagination-in-alteryx/

 

I have used iterative macros for sourcing all kinds of data in the past, but never Salesforce. Iterating through hundreds of thousands of results could result in a very long execution time.

 

I then started to scour GitHub for other possible options specifically focusing on Python: https://github.com/search?l=Python&p=1&q=salesforce+report&type=Repositories with the thought that I could leverage the Python tool and pass the report output via a dataframe.

 

Like a lot of Alteryx developers, I had a deadline, and a relentless drive to solve. Could I parse out the data for 8 different reports w/ Python? After hours of Python ups and downs and multiple issues with already developed Salesforce related packages and their limitations, JSON encoding, dataframe manipulation, & super lengthy workflow execution times due to the virtual environment I gave up on Python.  Remaining vigilant, I looked for other options and found https://github.com/ThyWoof/sfdc2csv. Bingo!

 

The Solution

 

  1. Download Git for Windows https://git-scm.com/download/win or https://www.cygwin.com - There are multiple tutorials on how to get this working. Essentially this will allow you to run bash shell scripts via the Windows command prompt.
  2. Download the sfdc2csv script from GitHub https://raw.githubusercontent.com/ThyWoof/sfdc2csv/master/sfdc2csv.sh
  3. https://github.com/ThyWoof/sfdc2csv has more documentation that you need to run your script via the run tool.
  4. Follow the instructions in each individual tool. (I have attached a workflow to this post)

    Screen Shot 2019-12-19 at 11.08.57 AM.png

Future Changes & Backlog

 

  • Refactor the sfdc2csv shell script to be in Windows native .bat format or PowerShell script.

 

 

 

 

Highlighted
Alteryx
Alteryx

@mchamps Thanks for sharing, pretty neat solution!

Digan
Alteryx
Labels