
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Appending data using In-DB Connections with DataBricks

Bam98
5 - Atom

Hello,

I have an analytical app that writes to a table in Databricks, and I am trying to append each app run's entries to that table.

I am having a problem with the append portion of the In-DB connection. I have it set to create a temporary table and then "Append Existing", but it keeps overwriting the table instead. How do I get it to append new rows?

5 REPLIES
apathetichell
20 - Arcturus

Haven't looked at this in a few years, but last I checked merge/append was not supported. For multi-million-row datasets I ended up doing a Data Stream Out -> Union -> Data Stream In. For more than ~10 million rows, I'd write a new table to Databricks and then union it into the target in a notebook.
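A minimal sketch of that notebook union step, assuming PySpark in a Databricks notebook; the table names are hypothetical placeholders, not anything from this thread:

```python
# Append the staging table that Alteryx wrote into the existing target,
# then drop the staging table so the next run starts clean.
# `spark` is the SparkSession predefined in Databricks notebooks.
spark.sql("""
    INSERT INTO analytics.app_entries
    SELECT * FROM analytics.app_entries_staging
""")
spark.sql("DROP TABLE IF EXISTS analytics.app_entries_staging")
```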

alexnajm
18 - Pollux

Out of curiosity, have you tried the Output Data tool instead of the In-DB tools? Since you have to stream the data into the database anyway, it may save you a tool.

apathetichell
20 - Arcturus

@alexnajm - I do not believe the Output Data tool works with Databricks - it's not listed here: https://help.alteryx.com/current/en/designer/data-sources/databricks.html#idp394001 - and in my experience it wasn't working. In my flow, once the incremental add was done to the dataset via Data Stream In, we did enrichment against a larger dataset In-DB. If I were building this now, I'd probably script it via the Databricks CLI.
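If you'd rather script the append in Python than through the CLI, one option is the databricks-sql-connector package; a minimal sketch, where the connection details and table names are placeholders rather than values from this thread:

```python
# pip install databricks-sql-connector
from databricks import sql

# All connection values and table names below are placeholders.
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapi-your-token-here",
) as conn:
    with conn.cursor() as cur:
        # INSERT INTO appends rows to the target; it does not overwrite it.
        cur.execute(
            "INSERT INTO analytics.app_entries "
            "SELECT * FROM analytics.app_entries_staging"
        )
```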

 

alexnajm
18 - Pollux

Thank you @apathetichell - I should've double-checked before responding (and after having a bit more caffeine)! I only looked at the generic ODBC options and didn't check Databricks specifically.

apathetichell
20 - Arcturus

@alexnajm - not on you - Databricks is very complex and a beast to set up. If you've been following Snowflake's (generally successful) product push over the past two years (Streamlit, Containers, Python Notebooks, Cortex), it's largely about providing features Databricks already has. For Databricks, the push is to make the platform easier to use and set up, because it's so imposing compared to Snowflake and requires so much more configuration. And of course everyone is pushing Iceberg.
