Today, many organisations face the same recurring challenge:
Data is engineered in one place, analysed in another, and the connection between the two is often manual, fragile, or inefficient.
As a result, companies often end up with fragile, manual hand-offs between engineering and analytics, and insights that take longer to reach the business than they should. This article addresses exactly that problem.
By using only Databricks Free Edition and Alteryx One, we demonstrate that anyone can build a fully modern, end-to-end, reproducible analytics pipeline, accessible to both data engineers and business users, without needing a full cloud environment or complex infrastructure.
If your goal is to understand how to connect the Lakehouse world (Databricks) with the no-code analytics world (Alteryx), this article shows the how and the why through a practical example you can reproduce today.
1. Introduction: Why Databricks and Alteryx?
In this article, I’ll walk through a simple yet powerful end-to-end workflow demonstrating how to combine Databricks for scalable data engineering with Alteryx One for intuitive, no-code analytics.
Even with only a Databricks Free Edition workspace and an Alteryx One environment, it's possible to build a pipeline inspired by the modern Medallion Architecture, expose clean Delta tables, and make them instantly consumable through Live Query in Alteryx One.
The goal is not to replicate a full enterprise setup, but to show how both platforms complement each other and accelerate analytics for technical and business users alike.
2. End-to-End Architecture Overview
Here is the architecture we will build:
The key message is:
Databricks handles scalable data preparation; Alteryx unlocks business-ready analytics.
3. Databricks Pipeline: Simple, Reproducible, and Modern
Even with the Free Edition, Databricks provides everything needed to structure a clear data engineering workflow using Delta Lake and notebooks.
Onboarding for the Free Edition is very easy: you can sign up in just a few clicks by searching for “Databricks Free Edition” and opening the official link.
Once you complete the initial steps, you have access to Databricks. Congratulations!
3.1 Bronze – Raw Ingestion
We start by uploading a CSV file into DBFS (or directly from a cloud bucket if preferred).
By clicking “Upload Data,” you can directly add flat files into Databricks. For the purpose of this article, we keep things simple by adding raw data directly into DBFS.
The process is straightforward.
Databricks automatically converts the uploaded file into a Delta table, allowing us to preserve raw data in a single, unified environment.
Opening a notebook, we can now see that our table is available:
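For reference, here is a minimal sketch of that first notebook cell. The catalog, schema, and table name (workspace.default.sales_raw) are assumptions about what the upload wizard created; adjust them to match your own workspace.

```python
# Minimal preview of the uploaded Delta table from a Databricks notebook.
# The table name below is an assumption -- replace it with the name the
# "Upload Data" wizard generated in your workspace.
df_raw = spark.table("workspace.default.sales_raw")

df_raw.printSchema()       # check the column types Databricks inferred
display(df_raw.limit(10))  # preview the first rows in the notebook UI
```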
Databricks also provides serverless compute, meaning you don’t need to configure or manage any clusters to start working with your data. It just works: Databricks handles all the compute in the background.
Our files are now fully available in Databricks.
To simulate a production environment, we now copy our data from Raw to Bronze. Following the Medallion Architecture, the data is structured into three layers: Bronze (raw), Silver (cleaned and standardized), and Gold (analytics-ready).
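As an illustration, this Raw-to-Bronze copy can be a simple CREATE TABLE AS SELECT per table; a minimal sketch is below. The table names (sales_raw, customers_raw, and their Bronze counterparts) are assumptions for this walkthrough.

```python
# Hedged sketch: promote the raw uploads into Bronze tables.
# All table names here are illustrative assumptions.
raw_to_bronze = {
    "workspace.default.sales_raw": "workspace.default.sales_bronze",
    "workspace.default.customers_raw": "workspace.default.customers_bronze",
}

for source, target in raw_to_bronze.items():
    # A straight copy keeps Bronze as an untouched snapshot of the raw data.
    spark.sql(f"CREATE OR REPLACE TABLE {target} AS SELECT * FROM {source}")
```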
Tables are now created and ready for cleansing in the Silver layer.
We are now ready to move to the next stage.
3.2 Silver – Cleaning & Standardization
The Silver layer produces a clean, consistent dataset that enables value creation in the downstream Gold layer. Uncleaned data often contains inconsistent types, missing values, duplicate records, and other quality issues.
To do this, we stay in the same notebook and switch to Python, demonstrating Databricks’ flexibility in letting users work in the language they prefer. We use PySpark and Spark SQL so we can easily manipulate the data directly in the notebook.
In a single step, we can now see clean data in our Silver layer after applying correct data types, recalculating amounts, and adding quality filters.
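As a rough sketch of what that cleaning cell can look like, see below; the column names (order_id, quantity, unit_price, order_date) are hypothetical and should be mapped to your own schema.

```python
from pyspark.sql import functions as F

# Illustrative Silver-layer cleaning; column and table names are assumptions.
sales_bronze = spark.table("workspace.default.sales_bronze")

sales_silver = (
    sales_bronze
    .withColumn("quantity", F.col("quantity").cast("int"))          # enforce types
    .withColumn("unit_price", F.col("unit_price").cast("double"))
    .withColumn("order_date", F.to_date("order_date"))
    .withColumn("amount", F.col("quantity") * F.col("unit_price"))  # recalculate amounts
    .filter(F.col("order_id").isNotNull())                          # basic quality filter
    .dropDuplicates(["order_id"])                                    # remove duplicate records
)

sales_silver.write.mode("overwrite").saveAsTable("workspace.default.sales_silver")
```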
This step can be scheduled directly from the notebook interface, eliminating manual effort and reducing operational toil.
We are now ready to build our Gold layer in Databricks.
3.3 Gold – Analytics-Ready Table
We treat sales_silver as our transactional fact table (each row represents a transaction) and customers_silver as our cleaned customer dimension.
In the Gold layer, we simply run a LEFT JOIN between the two Silver tables to bring them together into a single fact_sales_gold table, ready for downstream analytics. This is the table we expose to Alteryx via Live Query and the one our Alteryx workflow will run on.
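A hedged sketch of that Gold step, assuming a customer_id join key and illustrative customer attributes (customer_name, customer_segment), could look like this:

```python
# Build the analytics-ready Gold table from the two Silver tables.
# The join key and customer attribute columns are illustrative assumptions.
spark.sql("""
    CREATE OR REPLACE TABLE workspace.default.fact_sales_gold AS
    SELECT
        s.*,
        c.customer_name,
        c.customer_segment
    FROM workspace.default.sales_silver AS s
    LEFT JOIN workspace.default.customers_silver AS c
        ON s.customer_id = c.customer_id
""")
```

Keeping the Gold layer as one wide fact table makes it straightforward to consume through Live Query later on.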
4. Connecting Alteryx One to Databricks with Live Query
Alteryx One now allows us to use all Alteryx products in a single, seamless experience, whether in the cloud or on a laptop. We first connect to our Databricks data from Alteryx One: go to the Alteryx One homepage, click on your profile, and navigate to Workspace Admin.
Here we can see the Databricks menu, where we can provision our Databricks workspace:
This information can be found easily in Databricks.
The Service URL is the portion of your Databricks workspace URL up to and including cloud.databricks.com (for example, a hypothetical https://dbc-1234abcd-5678.cloud.databricks.com, where the prefix is specific to your workspace).
We now return to Databricks to generate a Personal Access Token (PAT). Be mindful of the security implications: these tokens should never be shared. Go to your profile, open Settings → Developer, and generate a new token as shown below. Paste this token into Alteryx.
Everything is now set. You just need to fill in the remaining information:
As a final step, in the Data tab, we simply need to add the connection — and we are all set:
Just add a connection name; all other information has already been filled in:
Once connected, Alteryx queries the Delta table live, without moving or duplicating data: a perfect fit for Lakehouse patterns.
This allows data engineers to refine the pipeline in Databricks while analysts explore the same data instantly in Alteryx.
5. No-Code Business Enrichment in Alteryx
We can now select our Gold table and begin working with our data:
When loading our data in Alteryx, nothing is actually imported into the backend. Everything stays in Databricks, keeping costs low and minimizing data movement:
This is where Alteryx shines: turning a curated dataset into actionable business insights — without writing code.
We now create an Alteryx workflow in Designer Cloud. From the homepage, click Create New → Designer Cloud to begin:
We can now add an Input tool and start working with our Gold table from Databricks:
By default, Live Query is enabled, allowing us to use the entire dataset directly in our browser without any replication in the Alteryx infrastructure. This is a major advantage: it enables full pushdown processing and lets users leverage no-code tools without copying data, relying instead on the powerful scaling capabilities of Databricks.
You can verify whether Live Query is enabled by clicking your profile icon (top right), navigating to Workspace Admin → Settings, and checking the Enable Live Query option.
We can now import our Excel file into Alteryx One:
We can now add the remaining tools and easily prep and blend the data, without writing a single line of code:
And the beauty of this approach is that all processing happens in Databricks.
6. Combined Benefits of Databricks + Alteryx
🟧 What Databricks brings
🟦 What Alteryx brings
🟪 Together
Together, they deliver a modern, efficient workflow that bridges engineering and business teams — without unnecessary complexity.
7. Conclusion
This project demonstrates that, even with minimal resources (Databricks Free Edition and an Alteryx One environment), it is entirely possible to build a modern Lakehouse-style pipeline and deliver business-ready insights.
Databricks provides the engine, Alteryx provides the experience, and together they accelerate analytics from raw data to actionable value.