So we’re now downloading all the network-shared documents we want thanks to instructions posted on our Knowledge Base, and we’re on our way to mastering FTP in Alteryx. But what if we want to take it a step further? A lot of our users rely on FTP as a drop zone for datasets that are generated periodically (e.g. weekly, monthly, or quarterly datasets). We should then be able to schedule a workflow to coincide with those updates, automatically select the most recent dataset, crank out all the sweet data blending and analytics we have in our scheduled workflow, and proceed with the rest of our lives, right? Right. We can do just that, and with a little work up front, you can automate your FTP download and analysis to run while you’re enjoying the finer things in life. Here’s how in v10.1:
One of the greatest strengths of modern web APIs is their flexible, developer-friendly nature, which provides numerous options for both the provider and the user. However, this flexibility can make it more intimidating for business users to deal with the various data formats that these APIs provide. The purpose of this article is to familiarize you with the main data formats used by the vast majority of web APIs, and provide the basic knowledge that will allow you to confidently process the data they return into a typical tabular format.
As currently designed, the Amazon S3 Download tool only allows one file, or object, to be read in at a time. This article explains how to create a workflow and batch macro that will read in the list of objects in a bucket and allow you to filter for the file(s) you want using wildcards!
By combining Alteryx and Microsoft Power BI, organizations can streamline and accelerate the process of preparing and analyzing data. This provides a faster way to deliver an end-to-end experience for data access, preparation, analysis, visualization and consumption — delivering deeper business insight faster with a more complete set of data.
Web scraping, the process of extracting information (usually tabulated) from websites, is an extremely useful way to gather web-hosted data that isn’t supplied via APIs. In many cases, if the data you are looking for is stand-alone or captured completely on one page (no need for dynamic API queries), scraping is even faster than developing direct API connections to collect it.
Adobe Analytics lets you collect and visualize data from your websites for improved decision making, yet the format of that data can only take you so far. You can now connect Alteryx to Adobe Analytics and bring that data into Alteryx, where you can manipulate it further to gain additional insights, including with Predictive and Spatial Analytics!
In a previous article, we've shown you how you can upload to FTP using the Download tool. With the release of Alteryx 10.5, the Download tool now supports uploading to SFTP. With this addition, we'll take the opportunity to show you some more examples of uploading data to SFTP/FTP and show you how seamless it can be.
API connections give access to many web-based applications, database systems, or programs by exposing objects or actions to a developer in an abstracted format that can easily be integrated into another program.
Connecting to Salesforce
In order to use the Salesforce Input or Output tools in Alteryx, you must first connect to Salesforce using your Salesforce credentials:
URL - The URL of your Salesforce instance in the format: https://[instance].salesforce.com where [instance] is the domain of your Salesforce environment.
User Name - Salesforce username (often an Email address).
Password - Password associated with Salesforce account.
Security Token - If necessary, you can send a new token to your email by logging in to Salesforce, going to “My Settings”, and selecting “Reset My Token” under the “Personal” tab.
Once you click the ‘connect’ button, Salesforce will authorize your credentials, and you will be able to begin using the tool to query Salesforce data.
Salesforce Input Tool (Querying Salesforce)
The Query Builder has four fields that allow you to select the data you would like to pull from Salesforce simply, without having to write full SOQL (Salesforce Object Query Language) statements:
Table: Select the table you would like to pull fields from. *The list only includes tables whose ‘Queryable’ flag is returned as true by the API call.
Fields (optional): Select the fields in the table you need data from. If no fields are selected, all fields will be pulled.
Record Limit (optional): Set a maximum on the number of records you will pull.
WHERE Clause (optional): Using SOQL syntax, specify the conditions that you require for the data you pull. *You do not need to include ‘WHERE’ in the statement. Ex. AccountID = '2543456'
It is best practice to limit the records you bring in with the Fields, Record Limit, and WHERE Clause arguments in the Salesforce Input tool, instead of bringing in all of the Salesforce data and then filtering it down with tools in Alteryx.
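To illustrate how the four Query Builder fields map onto a SOQL statement, here is a minimal Python sketch. It is not how the Salesforce Input tool is implemented; the function name `build_soql` and the sample table and field names are hypothetical. Because SOQL has no `SELECT *`, this sketch requires an explicit field list rather than mimicking the tool's "all fields" behavior.

```python
from typing import Optional, Sequence


def build_soql(table: str,
               fields: Sequence[str],
               record_limit: Optional[int] = None,
               where: Optional[str] = None) -> str:
    """Assemble a SOQL statement from Query Builder style inputs (illustrative only)."""
    if not fields:
        # SOQL has no "SELECT *", so the caller must name the fields.
        raise ValueError("pass at least one field name")
    query = f"SELECT {', '.join(fields)} FROM {table}"
    if where:
        # The WHERE keyword is added here, so the condition is passed without it.
        query += f" WHERE {where}"
    if record_limit is not None:
        query += f" LIMIT {int(record_limit)}"
    return query


query = build_soql("Contact", ["Id", "LastName"],
                   record_limit=100, where="AccountId = '2543456'")
print(query)
# SELECT Id, LastName FROM Contact WHERE AccountId = '2543456' LIMIT 100
```

Filtering in the query itself, as this shows, keeps the record count down before any data reaches the workflow.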
If you prefer querying Salesforce using SOQL, you can use the Custom Query to write out SOQL statements. For full SOQL syntax, see this link. If you began using the Query Builder and decide to change to the Custom Query, you will be prompted in the Custom Query Builder to pull in the query you began in the Query Builder in full SOQL syntax:
Below the SOQL Query text box is a check box for “Attempt to Parse JSON Response.” With this box checked (the box is checked by default), Alteryx will attempt to parse the response returned from the API call for quick viewing in the results window. If the box is left unchecked, the API response will be returned in one field titled “JSON.” You can parse this response using the JSON Parse tool in Alteryx.
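For readers who want to see what parsing a raw JSON field involves, the following Python sketch flattens a nested JSON document into name/value rows, loosely in the spirit of a JSON parsing step. The sample payload is invented for illustration and is not an actual Salesforce response; the `flatten` helper is hypothetical.

```python
import json
from typing import Any, Iterator, Tuple


def flatten(value: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_name, scalar_value) pairs for every leaf in a JSON document."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, value


# Invented sample payload, standing in for the single "JSON" field.
raw = '{"records": [{"Id": "001A", "Name": "Acme"}, {"Id": "001B", "Name": "Globex"}]}'
rows = list(flatten(json.loads(raw)))
for name, val in rows:
    print(name, val)
```

Each row pairs a dotted path such as `records.0.Name` with its value, which can then be pivoted into a tabular layout.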
Next to the “Attempt to Parse JSON Response” check box is a “Validate” button. Clicking this button prior to running the workflow will submit the query to Salesforce to determine if it is valid. It will also check whether the response from the API can be parsed automatically; if it can, leave the “Attempt to Parse JSON Response” check box selected.
Salesforce Output Tool (Writing to Salesforce)
Connecting to the Salesforce Output tool is identical to connecting to the Salesforce Input tool. URL, User Name, Password, and Security Token are all required credentials to connect to the API.
There are only two options that need to be selected in the Salesforce Output tool. Both are required.
Table: Select a table to write to from the list of tables available
Output Operation: Select the operation for how you will write the data to the table. The three available options are Update, Insert, and Delete.
** If you want to overwrite values in Salesforce with null values, use “#N/A” instead of “null.” You can accomplish this with a replace function in the Formula tool.
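The null replacement described above can be sketched in Python as follows. This is an illustration of the idea rather than an Alteryx Formula expression; the function name `nulls_to_na` and the sample record are hypothetical.

```python
def nulls_to_na(record: dict) -> dict:
    """Return a copy of the record with every null (None) replaced by "#N/A"."""
    return {field: "#N/A" if value is None else value
            for field, value in record.items()}


# Hypothetical record whose Phone value should be cleared in Salesforce.
cleaned = nulls_to_na({"Id": "003A", "Phone": None, "City": "Irvine"})
print(cleaned)
# {'Id': '003A', 'Phone': '#N/A', 'City': 'Irvine'}
```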
Basic Troubleshooting steps:
Error: “Unable to reach SOAP API (Check URL)”
-The first thing to check with this error is that the URL you entered is correct and in the format: https://[instance].salesforce.com.
-Check to see if your username needs to have the domain attached (e.g. JonDoe@alteryx.com).
-If you’ve ensured that the URL is correct, this may be a proxy issue. Alteryx should pick up the proxy settings, but you may need to enable them manually by going to Options->User Settings->Edit User Settings->Advanced. There is an article on the community that walks through this.
-The tool requires that your Salesforce account is API enabled. You may have to work with your Salesforce administrator to grant your account API user permissions.
Error: “The following fields are not updateable members of the target table: (table)”
This error is telling you that your Salesforce administrator has locked the fields from being updated, and you will have to work with them to determine what can be updated.
Error: “INVALID_LOGIN: Invalid username, password, security token; or user locked out.”
After you have confirmed that your credentials are correct and that you are not locked out of Salesforce, check to see if your company uses SSO (single sign-on) for Salesforce. You can check this by seeing if Salesforce requires a password when you log in through a browser. SSO authentication is not supported by the connector and will not work. Custom domains are also not supported by the tool. Check to see if you are using a custom domain to log in to Salesforce. You will know you are using a custom domain if this screen appears when logging in to Salesforce:
Because there are “two pages”/steps for authentication, and the tool can only send one request, this type of authentication is not supported.
*See the “Common Issues” section of this Community article by @JordanB for more common issues and troubleshooting steps.
Salesforce has a maximum length for SOQL statements, which is set at 20,000 characters by default. There is also a maximum length of 4,000 characters for the WHERE clause. Other SOAP API call limits can be found here.
As the API accepts data in batches which have limits, the output tool contains logic to take care of the batching. Please see this link for more details.
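As a rough illustration of what batching means here, the sketch below splits a list of records into fixed-size chunks. The source does not state the batch size the output tool uses, so the default of 200 is an assumption chosen for the example, and the helper name `batches` is hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(records: Sequence[dict], batch_size: int = 200) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_size records (size is an assumption)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# 450 dummy records split into batches of 200, 200, and 50.
records = [{"Id": str(n)} for n in range(450)]
sizes = [len(batch) for batch in batches(records)]
print(sizes)
# [200, 200, 50]
```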
Question: Does Alteryx support web crawling?
Yes. In Alteryx you can look at a web page, find embedded links (e.g. using regular expressions), and add to a queue of "links to visit". Then continue visiting/adding indefinitely, while also extracting various other tidbits of interest from each page visited.
In a Text Input Tool, enter URLs to crawl. Alteryx can take the URLs from a data stream (a database where we have all of the URLs we want to crawl) and iteratively repeat the process of connecting and getting the code beneath that URL:
Use the Download Tool and point it to a web address:
Alteryx returns the whole content available for that URL:
The attached v10.0 workflow allows you to connect to Wikipedia and "crawl" the content of that URL. The content can be saved, parsed, etc. Additional functionality may be added to create a very powerful crawling engine.
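The queue-based crawl described in the answer above (visit a page, extract links with a regular expression, add unseen links to a queue) can be sketched in Python as follows. This is an illustration of the idea, not the attached workflow. The `fetch` callable, the `max_pages` cap, and the in-memory sample site are assumptions introduced so the example runs without network access.

```python
import re
from collections import deque
from typing import Callable, Dict
from urllib.parse import urljoin

# Simple regular expression for href attributes; real HTML may need a proper parser.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def crawl(start_url: str, fetch: Callable[[str], str], max_pages: int = 10) -> Dict[str, str]:
    """Breadth-first crawl: returns {url: page_content} for up to max_pages pages."""
    queue = deque([start_url])      # links still to visit
    seen = {start_url}              # links already queued, to avoid revisiting
    pages: Dict[str, str] = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)   # resolve relative links against the page URL
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical three-page site held in memory instead of fetched over HTTP.
site = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a>',
    "http://example.com/b": '<a href="/">home</a>',
}
pages = crawl("http://example.com/", lambda url: site.get(url, ""))
print(list(pages))
# ['http://example.com/', 'http://example.com/a', 'http://example.com/b']
```

To crawl real pages, `fetch` could wrap an HTTP client call; the `max_pages` cap keeps the crawl from running indefinitely.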
To do your best data blending, you need the flexibility to connect to as many data stores as possible. No puzzle reveals a complete picture without all the pieces in place, and the same adage holds true in analytics. While we’re proud to boast a list of supported input file formats and data platforms that may even be large enough for database storage itself, unfortunately, in the ever expanding world of data you just can’t catch them all. Enter the Download Tool. In addition to FTP access, this tool can web scrape or transfer data via API (check your data source – there’s almost always an API!), giving you access to even the most secluded data stores. With the examples compiled below, and the wealth of data accessible on the web, you can turn nearly any analytical puzzle into the Mona Lisa:
To go along with our example on how to download a file from FTP, we’ve assembled steps in v10.1 below (credentials, server removed) as an example of uploading a file to FTP. In this example (attached) I’ve encoded a string field as a Blob to be posted as a text file. Theoretically, all your fields could be concatenated to a CSV format, or another delimited format, to be converted and posted using the same steps:
My field string to be converted:
1. First identify the field to be converted to Blob in your Blob Convert Tool:
2. Specify in a Formula Tool your FTP URL and filename in the format URL/filename.extension:
3. Have your Download Tool use this field as the URL field in the Basic Tab:
4. In the Payload tab specify the HTTP action PUT and select the option “Take Query String/Body from Field” and specify your Blob field:
5. Specify your credentials in the Connection tab of the Download Tool, leave all other configuration options default:
6. Run the workflow!
After running, you should be able to confirm the successful transfer of your file in the DownloadHeader field returned from the Download Tool (it'll also be hosted on your FTP path):
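For comparison outside Alteryx, the same idea (encode a string as bytes, then store it on the FTP server under a chosen filename) can be sketched with Python's standard `ftplib`. This swaps in a plain FTP STOR command rather than the Download tool's PUT configuration, and the host, credentials, and filename in the commented usage are placeholders, not values from the original example.

```python
import io
from ftplib import FTP


def to_blob(text: str, encoding: str = "utf-8") -> bytes:
    """Encode a string field as bytes, analogous to converting it to a Blob."""
    return text.encode(encoding)


def upload_blob(ftp, filename: str, blob: bytes) -> str:
    """Store the bytes on the server as filename and return the server's reply."""
    return ftp.storbinary(f"STOR {filename}", io.BytesIO(blob))


# Usage against a real server (placeholder host and credentials):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     print(upload_blob(ftp, "report.txt", to_blob("a,b\n1,2\n")))
```

The reply string returned by `storbinary` plays a similar role to the DownloadHeader field: it reports whether the server accepted the transfer.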
Take a look at the results below:
Tab 1 - Basic
URL: The URL for the resource you are trying to access must come from an upstream tool and is the only field required by Alteryx to configure the Download tool. Based on the API you are trying to pull information from (or send data to), other information will be required such as headers and/or a payload.