
Writing large tables to Databricks

jheck
6 - Meteoroid

I need some help writing a table to Azure Databricks. I've been able to write smaller tables with the In-DB tools, but there appears to be a 2 GB size limit in Databricks that is blocking my larger writes (I receive "Error from Databricks"). Is there a workaround?

I recently installed version 2023.1.1.5 and started experimenting with the Databricks Delta Lake Bulk Loader (Avro) for writing, but without much luck (possibly because my Shared Key is incorrect). If I can't write directly to Databricks, is there another route from Alteryx for my large tables, such as staging them in Blob Storage? Any help would be much appreciated!

Thanks!
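Edit: for context, here's the kind of notebook step I'm picturing on the Databricks side once the files are staged in Blob Storage. This is just a sketch; the storage account, container, secret scope, and table names below are all placeholders, and it assumes it runs in a Databricks notebook where `spark` and `dbutils` are predefined.

```python
# Hypothetical pickup step after Alteryx writes CSV files to Blob Storage.
# All names below are placeholders.

storage_account = "mystorageacct"   # placeholder storage account
container = "alteryx-staging"       # placeholder container

# Keep the shared key in a Databricks secret scope rather than in code.
access_key = dbutils.secrets.get("my-scope", "blob-key")  # placeholder scope/key

# Let Spark authenticate to the storage account with the shared key.
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.blob.core.windows.net",
    access_key,
)

path = f"wasbs://{container}@{storage_account}.blob.core.windows.net/large_table/"

# Read the staged CSVs and write them out as a managed Delta table.
df = spark.read.csv(path, header=True, inferSchema=True)
df.write.format("delta").mode("overwrite").saveAsTable("my_schema.large_table")
```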

apathetichell
19 - Altair

With blob storage, this really depends on how your Databricks workspace is set up. If a blob storage container is connected to your workspace, you should be able to write to it from Alteryx and have Databricks process the files from there. That part is beyond the scope of the Alteryx -> Databricks connection; it belongs to the storage -> Databricks mapping. I use 2021.4 on AWS Databricks and have hit cluster-timeout issues with larger data (I can't remember the exact size, but the row count was in the 10-million range) that was not staged to S3. I write exclusively using CSV and Data Stream In (rather than Write Data In-DB). See the sketch below for what the pickup on the Databricks side can look like.
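Once the files land in storage the cluster can read, the Databricks-side pickup can be a simple COPY INTO run from a notebook or job; it's idempotent, so reruns only ingest files that haven't been loaded yet. A rough sketch (schema, table, container, and account names are placeholders, and I'm assuming the cluster already has access to the container):

```python
# Runs in a Databricks notebook, where `spark` is predefined.
# All names (schema, table, container, account) are placeholders.

# Create an empty Delta table if it doesn't exist yet; with
# mergeSchema below, COPY INTO infers the schema from the files.
spark.sql("CREATE TABLE IF NOT EXISTS my_schema.large_table")

# Idempotent load: each run picks up only files not yet ingested.
spark.sql("""
    COPY INTO my_schema.large_table
    FROM 'wasbs://alteryx-staging@mystorageacct.blob.core.windows.net/large_table/'
    FILEFORMAT = CSV
    FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
    COPY_OPTIONS ('mergeSchema' = 'true')
""")
```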

BonusCup
11 - Bolide

@jheck I'm working on creating a new table in Databricks using the In-DB tools. Can you share how you made this connection and were able to create the table? TIA
