
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
awrangler
Alteryx Alumni (Retired)

What is Data Wrangling?

 

A data pipeline is a process in which raw data is ingested from various data sources and then moved to a data store, such as a data lake or data warehouse, for analysis. Before data flows into a data repository, it usually undergoes some processing. That processing is called data wrangling: preparing raw data so that it is in a usable form for analysis or further downstream processing. During data wrangling, raw data is typically obtained from multiple sources, such as databases, spreadsheets, or APIs. The data may be incomplete, inconsistent, or contain errors, so it must be cleaned and preprocessed before it can be analyzed effectively.
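To make the idea concrete outside of any particular tool, here is a minimal Python sketch of wrangling records from two sources. Everything in it is an illustrative assumption rather than part of Designer Cloud: the sample rows, the field names (`id`, `name`, `amount`), and the helper names `clean_record` and `wrangle` are all hypothetical. It trims and normalizes names, converts types, drops rows whose amount is missing or malformed, and removes duplicates.

```python
# Hypothetical raw records from two sources (e.g. a spreadsheet and an API).
spreadsheet_rows = [
    {"id": "1", "name": " Alice ", "amount": "10.50"},
    {"id": "2", "name": "BOB", "amount": ""},          # missing amount
]
api_rows = [
    {"id": "2", "name": "bob", "amount": "7"},
    {"id": "3", "name": "carol", "amount": "n/a"},     # unparseable amount
    {"id": "1", "name": "alice", "amount": "10.50"},   # duplicate of id 1
]


def clean_record(row):
    """Return a cleaned copy of row, or None if it cannot be repaired."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # drop rows whose amount is missing or malformed
    return {
        "id": int(row["id"]),
        "name": row["name"].strip().title(),  # trim whitespace, normalize case
        "amount": amount,
    }


def wrangle(*sources):
    """Clean every record from every source and keep the first row per id."""
    cleaned = {}
    for source in sources:
        for row in source:
            record = clean_record(row)
            if record is not None and record["id"] not in cleaned:
                cleaned[record["id"]] = record
    return sorted(cleaned.values(), key=lambda r: r["id"])


result = wrangle(spreadsheet_rows, api_rows)
for record in result:
    print(record)
```

Dropping unrepairable rows is only one possible policy; depending on the analysis, you might instead fill in a default value or flag the row for review.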

 

image001.png

 

Fortunately, here in our Alteryx Analytics Cloud Platform (AACP), we have Designer Cloud Designer Experience and Designer Cloud Trifacta Classic to help us wrangle data! Both make data wrangling easier and are user-friendly, but they differ in UI experience. Designer Cloud Designer Experience is closer to the Designer Desktop UI, while Designer Cloud Trifacta Classic is closer to the Designer Cloud powered by Trifacta UI. On Google Cloud (GCP), Designer Cloud Trifacta Classic is the only version of Designer Cloud available. Before Alteryx acquired Trifacta, Designer Cloud powered by Trifacta was branded as Dataprep by Google.

 

Understanding How Data Wrangling Works in Designer Cloud Trifacta Classic


We have created a Data Wrangling Cheatsheet that you can refer to as you wrangle your data in Designer Cloud Trifacta Classic. The figures below are taken from the Data Wrangling Cheatsheet.
Note: The data wrangling experience may differ between Designer Cloud Trifacta Classic and Designer Cloud Designer Experience, but the two are conceptually similar.

The first page covers how to combine, reshape, manipulate columns & rows, and restructure datasets.

 

cheatsheet page 1.png

Figure 1
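
For readers who think in code, two of the operations named on the first page, combining and reshaping, can be sketched in plain Python. This is only a conceptual illustration and not how Designer Cloud implements them; the sample data and the function names `join_orders` and `pivot` are hypothetical.

```python
from collections import defaultdict

# Hypothetical datasets used only for illustration.
orders = [
    {"customer_id": 1, "quarter": "Q1", "sales": 100},
    {"customer_id": 1, "quarter": "Q2", "sales": 150},
    {"customer_id": 2, "quarter": "Q1", "sales": 80},
]
customers = {1: "Acme", 2: "Globex"}


def join_orders(orders, customers):
    """Combine: attach the customer name to each order (inner join)."""
    return [
        {**order, "customer": customers[order["customer_id"]]}
        for order in orders
        if order["customer_id"] in customers
    ]


def pivot(rows, index, column, value):
    """Reshape: turn long rows into one wide entry per index value."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[index]][row[column]] = row[value]
    return {key: dict(cols) for key, cols in wide.items()}


joined = join_orders(orders, customers)
wide = pivot(joined, index="customer", column="quarter", value="sales")
print(wide)
```

The join adds a column from a second dataset, while the pivot turns one row per customer and quarter into one entry per customer with a value for each quarter.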


The second page covers columnar datasets, data types, and top Trifacta pattern syntax.

 

cheatsheet page 2.png

Figure 2

 

The cheatsheets are also attached to this article in PDF form for your convenience!
