I need some help writing a table to Azure Databricks. I’ve been able to write smaller tables using the In-DB tool, but there appears to be a 2 GB size limit within Databricks that is preventing me from writing larger tables (I receive an “Error from Databricks” message). Is there a workaround? I recently downloaded version 2023.1.1.5 and started experimenting with the Databricks Delta Lake Bulk Loader (Avro) for writing, but without much luck (this could be because my Shared Key is not correct). If I can’t write directly to Databricks, is there a backdoor I can write my large tables to from Alteryx, maybe Blob Storage or something along those lines? Any help would be much appreciated!
Thanks!
Regarding blob storage: this really depends on how your Databricks environment is set up. If your workspace is connected to blob storage, you should be able to write to that blob storage and have Databricks process the files from there. This is beyond the scope of the Alteryx-to-Databricks connection; it is part of the storage-to-Databricks mapping. I use 2021.4 on AWS Databricks and have had issues with larger data volumes (I can't remember the exact size, but the row count was around 10 million) causing cluster timeouts. That data was not staged to S3. I write exclusively using CSV and the Data Stream In tool (rather than Write Data In-DB).
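For the Azure side, here is a rough sketch of what the storage-to-Databricks step could look like in a notebook. I haven't tested this against your environment: the storage account, container, secret scope, path, and table name below are placeholders, and `spark` / `dbutils` are the objects Databricks provides automatically in a notebook.

```python
# Minimal sketch: read CSV files that Alteryx staged in Azure Blob Storage
# and load them into a Delta table, run from a Databricks notebook.
# Storage account, container, secret scope, path, and table name are placeholders.

storage_account = "mystorageacct"      # hypothetical storage account name
container = "alteryx-staging"          # hypothetical container holding the staged CSVs
account_key = dbutils.secrets.get("my-scope", "storage-account-key")  # hypothetical secret

# Authenticate to the storage account with the Shared Key
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.blob.core.windows.net",
    account_key,
)

# Read the staged CSV files
source_path = f"wasbs://{container}@{storage_account}.blob.core.windows.net/large_table/"
df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv(source_path)
)

# Write the data out as a managed Delta table
df.write.format("delta").mode("overwrite").saveAsTable("my_schema.my_large_table")
```

Pulling the Shared Key from a secret scope avoids hard-coding it in the notebook; if files will keep arriving in the staging container, Databricks features like COPY INTO or Auto Loader are worth a look as alternatives to a one-off read.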
@jheck I'm working on creating a new table in Databricks using the In-DB tools. Can you share how you made this connection and were able to create the table? TIA