
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
pmaier1971
Alteryx

by Philipp Maier and Ellen Zhao (@EllenZ


In our new Enterprise Readiness series, we help you navigate features and functionality in Alteryx and better understand how our products fit into your technology ecosystem and meet the needs of your enterprise. In this installment we discuss data governance, or more specifically, how Alteryx’s capabilities can be leveraged to integrate with your data governance platforms and policies.

 

Alteryx empowers end users with self-service analytics and offers powerful tools and built-in capabilities to address the multifaceted challenges of data governance. When handling confidential or private data, robust controls over access to data and analytic assets are needed. A comprehensive data governance platform enables monitoring data quality, avoiding data discrepancies, and ensuring compliance with regulatory standards.

 

To this end, some of our clients are looking to integrate Alteryx into their existing data governance platform. Data governance platforms (also referred to as “Data Lineage Services”) like Collibra, Manta, or Solidatus can enhance the visibility of how users transform and use data in production processes across many systems and technologies, providing a unified view into data flows, data quality, and data transformations.

 

In this post, we will introduce several ways Alteryx can be integrated into a broad range of data governance services. We will start with our on-prem offering; future posts will dive deeper into our cloud offering and upcoming capabilities on our product roadmap.

 

What is Data Governance and why is it important?

 

Data governance plays a vital role in ensuring compliance with regulatory requirements. Numerous new regulations on data and data usage require organizations to maintain accurate and complete records of all personal data they hold and process (see Figure 1 below). This implies that our clients need to understand the types of data they store and how it is used and shared.

 

 

Figure 1: Data Lineage Critical to Meet Regulatory Requirements

 

A successful Data Governance initiative requires a combination of people, process, and technology. Given there are many interpretations of data governance, Figure 2 provides definitions to help segment this space and create a common language. We also highlight areas where Alteryx provides capabilities that can be directly leveraged to enhance data governance, as well as optional ways to integrate into external services.

 


Figure 2: Common elements of data governance, including definitions and a description of Alteryx capabilities

 

Speaking of external services: dedicated data governance platforms (like Collibra, Atlan, or Solidatus) typically provide much, if not all, of the functionality outlined above. Specifically, these dedicated services help monitor data quality and apply automated data quality checks, ensuring data is accurate, consistent, and reliable. The benefits of using a data lineage service include avoiding duplicative data storage, the ability to run impact analysis, and detecting data anomalies. Many data lineage services also contain detailed metadata management functionality, ensuring that data is well documented and capturing the impact of data transformations.

 

Can Alteryx be integrated with a Data Governance Platform?

 

Many of our clients have asked if Alteryx can integrate into their data governance platforms. To understand the various ways Alteryx can be integrated, let’s first review how these platforms typically function. After connecting to data source systems via built-in connectors or APIs, these platforms systematically extract, parse, standardize, and catalog metadata. For lineage information specifically, they retrieve data usage from system logs, parse SQL or Python-based data transformations, or pull the relevant metadata directly from databases.

 

This yields two general ways Alteryx can integrate into these services.

  • First, the data lineage platform can pull the relevant lineage information from the Alteryx database via a connector.
  • Second, Alteryx can gather the relevant information and push it to the lineage service, e.g., through APIs, so the service can ingest it.
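
As a rough illustration of the second (push) approach, the sketch below assembles a small lineage payload and prepares an HTTP POST for it. The endpoint URL, the payload fields, and the function names are all hypothetical and are not taken from Alteryx or from any specific governance platform; the real shape depends on the ingestion API your lineage service documents.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the ingestion URL your lineage service documents.
LINEAGE_API_URL = "https://governance.example.com/api/lineage"


def build_lineage_payload(workflow_name, inputs, outputs):
    """Assemble a minimal, made-up lineage record for one workflow."""
    return {
        "workflow": workflow_name,
        "inputs": sorted(inputs),
        "outputs": sorted(outputs),
    }


def prepare_push(payload, url=LINEAGE_API_URL):
    """Build (but do not send) a JSON POST request carrying the payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_lineage_payload(
    "monthly_sales",
    inputs=["crm.orders", "crm.customers"],
    outputs=["dw.sales_summary"],
)
request = prepare_push(payload)
# urllib.request.urlopen(request) would send it; omitted here because
# the endpoint above is only a placeholder.
```

Keeping payload construction separate from the HTTP call makes it easier to swap in whatever format a given platform expects without touching the transport code.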

 


 

How it can be done: 3 options for our clients

 

Based on this, and depending on the platform and the data needs, we recommend our clients leverage one of the following three ways to integrate Alteryx into any data governance platform (see Figure 3):

 

  1. Integrate: Native connectors already exist for several data governance platforms. This is likely the easiest way to get started.
  2. Import: Alteryx can make existing metadata information available by leveraging the Alteryx Metadata Loaders from Alteryx Connect. This provides very detailed information, but without additional processing, most services will not be able to ingest the data directly.
  3. Build: Lastly, via custom workflows, Alteryx can collect lineage information by parsing workflow XMLs and making it available in any format to ingest. This requires extracting the information from the Alteryx database (MongoDB or SQL), which can be challenging, but the upside is that the data extract can be directly tailored to the needs of the platform.
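
To make the "Build" option more concrete, here is a minimal sketch of parsing a workflow XML into tools and connections. It assumes the workflow XML has already been extracted from the database, and it assumes a simplified layout (`Node` elements with a `ToolID` attribute and a `GuiSettings` `Plugin` attribute, plus `Connection` elements with `Origin` and `Destination` children). The sample document is hand-written for illustration, so verify the element names against your own workflow files.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; real workflow files carry many more elements.
SAMPLE_WORKFLOW = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput.DbFileInput" />
      <Properties><Configuration><File>sales.csv</File></Configuration></Properties>
    </Node>
    <Node ToolID="2">
      <GuiSettings Plugin="AlteryxBasePluginsGui.Filter.Filter" />
    </Node>
    <Node ToolID="3">
      <GuiSettings Plugin="AlteryxBasePluginsGui.DbFileOutput.DbFileOutput" />
      <Properties><Configuration><File>report.csv</File></Configuration></Properties>
    </Node>
  </Nodes>
  <Connections>
    <Connection>
      <Origin ToolID="1" Connection="Output" />
      <Destination ToolID="2" Connection="Input" />
    </Connection>
    <Connection>
      <Origin ToolID="2" Connection="True" />
      <Destination ToolID="3" Connection="Input" />
    </Connection>
  </Connections>
</AlteryxDocument>"""


def extract_lineage(workflow_xml):
    """Return the tools and tool-to-tool edges found in a workflow XML string."""
    root = ET.fromstring(workflow_xml)
    tools = {}
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        plugin = gui.get("Plugin", "") if gui is not None else ""
        file_elem = node.find("./Properties/Configuration/File")
        tools[node.get("ToolID")] = {
            # Keep only the last segment of the plugin name, e.g. "Filter".
            "tool": plugin.rsplit(".", 1)[-1],
            "file": file_elem.text if file_elem is not None else None,
        }
    edges = []
    for conn in root.iter("Connection"):
        origin, dest = conn.find("Origin"), conn.find("Destination")
        if origin is not None and dest is not None:
            edges.append((origin.get("ToolID"), dest.get("ToolID")))
    return {"tools": tools, "edges": edges}


lineage = extract_lineage(SAMPLE_WORKFLOW)
```

The resulting dictionary can then be reshaped into whatever format the target governance platform ingests.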

 


Figure 3: Ways to integrate Alteryx into data governance platforms

 

A general consideration is that Alteryx workflows can be quite complex, so uploading every step or data transformation may not be optimal. Careful planning is needed to ensure that the information sent to the governance platform is useful, relevant, and fit for purpose. Alteryx customers who are interested in a comprehensive data governance solution but need help implementing it can turn to our partner network or Alteryx Professional Services.
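
On the point about workflow complexity: one possible way to avoid uploading every step is to collapse tool-level connections into source-to-target pairs before sending anything. The sketch below does this with a simple graph traversal. The tool names and the `collapse_lineage` function are illustrative assumptions, not an Alteryx feature.

```python
from collections import defaultdict


def collapse_lineage(edges, sources, targets):
    """Return sorted (source, target) pairs where the target is reachable from the source."""
    graph = defaultdict(list)
    for origin, dest in edges:
        graph[origin].append(dest)

    pairs = set()
    for source in sources:
        # Depth-first traversal from each source, skipping already-visited tools.
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            if node in targets and node != source:
                pairs.add((source, node))
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return sorted(pairs)


# Hypothetical tool-level connections within one workflow.
edges = [
    ("input_a", "filter"),
    ("input_b", "join"),
    ("filter", "join"),
    ("join", "summarize"),
    ("summarize", "output"),
]
summary = collapse_lineage(edges, sources={"input_a", "input_b"}, targets={"output"})
```

Here five tool-level connections reduce to two dataset-level relationships, which is often the level of detail a governance platform needs for impact analysis.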

 

Stay tuned and follow our series

 

Want to learn more about optimizing Alteryx for your data governance needs? Want to share your successes or challenges?

 

Leave us a comment!

 

Check out the first blog of the series, “What does it mean to be Enterprise Ready?” Stay tuned for the next installment!

 

 

 

We thank Arezou Seifpour for helpful comments on earlier drafts.

Comments
simonaubert_bd

Hello,

It would be very cool if Alteryx worked with OpenMetadata and Acryl Datahub for integration of both classic server and cloud. These are two open source solutions with a promising future (just look at the demo, sooooo cool). Also, being compatible with openlineage (https://openlineage.io/) would help. Using open standards is the most effective way to do it.

Best regards,

Simon