Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.

Recently, there has been much talk about creating a Culture of Analytics. The goal is to use analytics to make better decisions as a company. Part of this process is designing a system that lets different users collaborate in a single platform, or a Common Language, as we call it. More and more customers are turning to Alteryx as their Common Language for decision making. Dean Stoecker, the CEO of Alteryx, often states that Alteryx is a “code-free and code-friendly platform,” and our customers truly take this to heart, turning Alteryx into the go-to platform where coders and non-coders work together.


What does a Common Language across the enterprise look like? I would like to discuss two distinct styles that I have seen among Alteryx customers:


  1. Language Agnostic: Many customers do not mind whether their users are code-free or code-friendly. Each user has the freedom to decide the best way to solve their use case. Alteryx is the glue that holds everything together, allowing users to collaborate in one platform.
  2. Defined Language: Other customers are code-free enterprises at their core. All analysis is performed using the GUI in Alteryx Designer, making work easy to replicate and understand across their user bases. The code-friendly nature of Alteryx is used to extend the GUI's capabilities, performing advanced tasks such as custom model comparisons in R.


No matter which style our customers choose, they love Alteryx because they are not bound to one language or one way of doing things. Their users can focus on solving actual use cases in a common language and continue to build a culture of analytics in their organization.


The common language of Alteryx allows our customers to continually innovate with new technologies from a single framework. One great use case is reading Parquet files (a compressed, columnar data format) from Hadoop.


Currently, the Input Tool reads only .csv and .avro files from Hadoop. With the Python Tool, users can leverage the code-friendly nature of Alteryx to connect to Hadoop, read the Parquet file into a pandas DataFrame, and pass the data to Alteryx.


David Hare discusses how to use the WebHDFS API to load Parquet files into Alteryx in his blog post “Parquet, will it Alteryx?”. This example will show you how to load Parquet using Spark via the Livy API:
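
As a rough illustration of the Livy approach, here is a minimal sketch. It assumes a Livy server at a hypothetical URL, a hypothetical HDFS path, and a Livy version whose PySpark sessions expose a `spark` session object. The helper names (`http_json`, `parquet_statement`, `wait_for`, `read_parquet_via_livy`) are made up for this example; only the Livy REST endpoints (`/sessions` and `/sessions/{id}/statements`) come from Livy itself.

```python
import json
import time
import urllib.request

LIVY_URL = "http://livy.example.com:8998"            # hypothetical Livy endpoint
PARQUET_PATH = "hdfs:///data/nyc_addresses.parquet"  # hypothetical HDFS path


def http_json(method, url, payload=None):
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def parquet_statement(path, limit=1000):
    """PySpark code for Livy to run: read Parquet, print rows as one JSON array."""
    return (
        f"df = spark.read.parquet({path!r})\n"
        f"print('[' + ','.join(df.limit({int(limit)}).toJSON().collect()) + ']')"
    )


def wait_for(url, done_states, call=http_json, poll_seconds=1.0):
    """Poll a Livy session or statement until its state is in done_states."""
    while True:
        body = call("GET", url)
        if body["state"] in done_states:
            return body
        if body["state"] in ("error", "dead", "killed", "cancelled"):
            raise RuntimeError(f"Livy reported state {body['state']!r} for {url}")
        time.sleep(poll_seconds)


def read_parquet_via_livy(path, livy_url=LIVY_URL, call=http_json, poll_seconds=1.0):
    """Start a PySpark session, run the read statement, return rows as dicts."""
    session = call("POST", f"{livy_url}/sessions", {"kind": "pyspark"})
    session_url = f"{livy_url}/sessions/{session['id']}"
    try:
        wait_for(session_url, ("idle",), call, poll_seconds)
        stmt = call("POST", f"{session_url}/statements",
                    {"code": parquet_statement(path)})
        result = wait_for(f"{session_url}/statements/{stmt['id']}",
                          ("available",), call, poll_seconds)
        output = result["output"]
        if output["status"] != "ok":
            raise RuntimeError(output.get("evalue", "Spark statement failed"))
        # Livy returns whatever the statement printed under "text/plain".
        return json.loads(output["data"]["text/plain"])
    finally:
        call("DELETE", session_url)  # always release the Spark session
```

Inside the Python Tool, the returned rows could then be handed to the workflow with something like `Alteryx.write(pd.DataFrame(read_parquet_via_livy(PARQUET_PATH)), 1)`, after `from ayx import Alteryx` and `import pandas as pd`. The `limit` argument is there because everything comes back through a single printed string, which is only practical for modest result sizes.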




The above code could easily be wrapped up into a custom tool using the Python SDK, allowing code-free users to access the same code.


The resulting data is a list of NYC addresses that we can easily graph in Alteryx:




Now, this data has ~20,000 addresses, too many to get useful information out of one graph. Perhaps we are interested only in addresses in Times Square. Code-friendly users could modify the Spark code, using a Spark SQL LIKE predicate to extract only the Times Square addresses from Hadoop:
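
A minimal sketch of such a filter, assuming the Parquet file has a text column named `address` (a hypothetical name) and that the code runs in a PySpark session exposing `spark`. The function `like_filter_statement` is made up for illustration; it only builds the PySpark source that would be submitted to Spark, for example as a Livy statement.

```python
def like_filter_statement(path, column, pattern, view="addresses"):
    """Build PySpark code that filters a Parquet file with Spark SQL LIKE."""
    # Keep the generated SQL simple and safe: identifiers only, no quoting tricks.
    if not column.isidentifier() or not view.isidentifier():
        raise ValueError("column and view must be plain identifiers")
    if "'" in pattern or "\\" in pattern:
        raise ValueError("pattern must not contain quotes or backslashes")

    # upper() makes the match case-insensitive; % is the LIKE wildcard.
    query = f"SELECT * FROM {view} WHERE upper({column}) LIKE '{pattern}'"
    return (
        f"df = spark.read.parquet({path!r})\n"
        f"df.createOrReplaceTempView({view!r})\n"
        f"result = spark.sql({query!r})\n"
        "print('[' + ','.join(result.toJSON().collect()) + ']')"
    )


# Hypothetical path and column name, for illustration only.
code = like_filter_statement(
    "hdfs:///data/nyc_addresses.parquet", "address", "%TIMES SQ%"
)
print(code)
```

Because the filter runs inside Spark, only the matching rows leave Hadoop, which matters more as the source file grows.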




Code-free users can use the Regex Tool in Designer to graphically perform the same task without having to know Spark structure or syntax:
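
For readers who think in code, the kind of pattern a code-free user might enter in the Regex Tool can be sketched in Python. The pattern and the sample addresses below are made up for illustration and are not taken from the dataset.

```python
import re

# Matches "TIMES SQ" or "TIMES SQUARE" as whole words, in any letter case.
TIMES_SQUARE = re.compile(r"\bTIMES\s+SQ(UARE)?\b", re.IGNORECASE)


def times_square_only(addresses):
    """Keep only the addresses that mention Times Square."""
    return [a for a in addresses if TIMES_SQUARE.search(a)]


# Illustrative sample values, not real rows from the Parquet file.
sample = ["1 Times Square", "1515 Broadway", "4 TIMES SQ", "20 Timestamp Sq"]
print(times_square_only(sample))  # ['1 Times Square', '4 TIMES SQ']
```

The word boundaries keep lookalikes such as "Timestamp Sq" out of the result, which a plain substring match on "TIMES" would let through.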






Either way, the result is the same: a map of addresses in Times Square.




This is the power of using Alteryx as a Common Language. Users can solve problems with whatever technology they see fit, all within one central tool: Alteryx. Customers often turn their work into Analytic Apps for end users, who need no Alteryx or coding knowledge to consume the information.


As you begin building a culture of analytics, look to Alteryx as the common layer for all of your disparate data sources, technologies, and analysis tools. In the Parquet example, companies would typically use one tool to extract the Parquet data, another to prepare it, and a third to visualize it. All of this can now be accomplished in your Common Language, Alteryx.