Question: Does Alteryx support web crawling?
Yes. In Alteryx you can read a web page, find its embedded links (e.g. using regular expressions), and add them to a queue of "links to visit". The workflow can then keep visiting pages and adding new links indefinitely, while also extracting other items of interest from each page visited.
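Outside of Alteryx, the same visit/extract/queue loop can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Alteryx workflow is built: the names `extract_links`, `crawl` and `fetch` are hypothetical, the regular expression is deliberately simplistic, and `max_pages` is an added safety cap so the loop does not run forever.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Simplistic pattern for href="..." attributes (an assumption for this sketch;
# a production crawler would normally use a real HTML parser).
LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(base_url, html):
    """Return absolute URLs for every href found in the page source."""
    return [urljoin(base_url, href) for href in LINK_RE.findall(html)]


def crawl(start_urls, fetch, max_pages=10):
    """Breadth-first crawl: visit a URL, queue its unseen links, repeat.

    `fetch` is any callable that takes a URL and returns the page source
    as a string. `max_pages` caps the number of pages visited.
    """
    queue = deque(start_urls)   # the "links to visit"
    seen = set(start_urls)      # avoids queueing the same link twice
    pages = {}                  # URL -> downloaded source
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing `fetch` in as a parameter keeps the queue logic separate from the download step, so the loop can be tried against a small dictionary of fake pages before pointing it at a real site.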
Enter the URLs to crawl in a Text Input tool. Alternatively, Alteryx can take the URLs from a data stream (for example, a database that holds all of the URLs you want to crawl) and repeat the process of connecting and retrieving the source code behind each URL.
Use the Download tool and point it to the field that contains the web address.
Alteryx returns the whole content available for that URL.
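For readers who want to see the equivalent step in code, here is a minimal Python sketch of "take a list of URLs and return the full content of each". It uses only the standard library. The function names are hypothetical, and the output keys `URL` and `DownloadData` are chosen for illustration only, not taken from the workflow.

```python
from urllib.request import urlopen


def download(url, timeout=10):
    """Fetch one URL and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def download_all(urls):
    """Return one record per input URL, holding the URL and its content."""
    return [{"URL": url, "DownloadData": download(url)} for url in urls]
```

A call such as `download_all(["https://en.wikipedia.org/wiki/Web_crawler"])` would return a single record whose `DownloadData` value is the page's HTML source. Error handling (timeouts, HTTP errors) is left out to keep the sketch short.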
The attached v10.0 workflow connects to Wikipedia and "crawls" the content of that URL. The downloaded content can then be saved, parsed, and so on. Additional functionality may be added to create a very powerful crawling engine.
Tab 1 - Basic
URL: The URL for the resource you are trying to access must come from an upstream tool, and it is the only field Alteryx requires to configure the Download tool. Depending on the API you are pulling information from (or sending data to), other information may be required, such as headers and/or a payload.
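To make the roles of the URL, headers and payload concrete, the sketch below assembles an HTTP request with Python's standard library. It is not Alteryx code. The helper name `build_request`, the example URL and the token are all placeholders, and sending the payload as JSON is an assumption that will not suit every API.

```python
import json
from urllib.request import Request


def build_request(url, headers=None, payload=None):
    """Assemble a request: the URL is mandatory, headers and payload optional."""
    headers = dict(headers or {})
    data = None
    if payload is not None:
        # Assumption: the API accepts a JSON body.
        data = json.dumps(payload).encode("utf-8")
        headers.setdefault("Content-Type", "application/json")
    # With a body attached, urllib sends the request as POST; otherwise as GET.
    return Request(url, data=data, headers=headers)


# Hypothetical endpoint and token, for illustration only.
request = build_request(
    "https://api.example.test/items",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    payload={"query": "alteryx"},
)
```

The request is only built here, not sent. Passing it to `urllib.request.urlopen` would perform the actual call.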
Alteryx provides many built-in data connectors, and there are additional ones available for download from the Gallery. This list currently includes tools such as Google Sheets, Salesforce, Marketo, and others. If you require a connection to an API or web service that Alteryx does not already provide a tool for, this knowledge base article will serve as a reference guide for you to build a connector using Alteryx.
Before reviewing this section, we recommend watching our video on Standard Macros. In this section we will explain how to turn your working connector workflow into an Alteryx macro so it can be used as a tool within other workflows. The Interface tools will let you control which parameters and inputs can be entered into the connector macro. Interface tools take user inputs to update other tools inside the macro.
One of the greatest strengths of modern web APIs is their flexible, developer-friendly nature, which provides numerous options for both the provider and the user. However, this flexibility can make it more intimidating for business users to deal with the various data formats that these APIs provide. The purpose of this article is to familiarize you with the main data formats used by the vast majority of web APIs, and provide the basic knowledge that will allow you to confidently process the data they return into a typical tabular format.
Adobe Analytics lets you collect and visualize data from your websites, which has allowed for improved decision making, yet the format of that data can only take you so far. You can now connect to Adobe Analytics from within Alteryx and bring that data in, allowing you to manipulate it further for additional insights and to apply Predictive and Spatial Analytics.