Hello,
I have an analytic app that writes to a table in Databricks, and I am trying to append the app's entries to that table.
I am having a problem with the append portion of the In-DB connection. I have it set to Temp Table > Append Existing, but it keeps overwriting the table. How do I get it to append new rows?
Haven't looked at this in a few years, but last I looked merge/append was not supported. For multi-million rows I ended up doing a Data Stream Out > Union > Data Stream In. For more than ~10 million rows, I'd write a new table to Databricks and then union it in a notebook.
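Roughly what the notebook side of that looks like - a minimal sketch, assuming the incremental rows have already been written to a staging table from Alteryx (the table names here are just placeholders):

```python
# Databricks notebook cell (Python). Assumes the new rows were landed in a
# staging table by the Alteryx workflow; table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined as `spark` in a notebook

new_rows = spark.table("my_schema.staging_new_rows")

# Append (not overwrite) onto the existing target table
new_rows.write.mode("append").saveAsTable("my_schema.target_table")

# Equivalent SQL form:
# spark.sql("INSERT INTO my_schema.target_table SELECT * FROM my_schema.staging_new_rows")
```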
Out of curiosity, have you tried the Output Data tool instead of the In-DB tools? Since you have to stream the data into the database anyway, it may save you a tool.
@alexnajm - I do not believe Output Data works with Databricks. It's not listed here - https://help.alteryx.com/current/en/designer/data-sources/databricks.html#idp394001 - and in my experience it wasn't working. For my flow, once the incremental add was done to the dataset via Data Stream In, we were doing enrichment against a larger dataset in-DB. I'd probably script it via the Databricks CLI if I were working on this now.
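For reference, a rough scripted sketch of the append - I'm using the Python databricks-sql-connector package here rather than the CLI, and the connection details and table names are all placeholders:

```python
# Hypothetical scripted append via databricks-sql-connector
# (pip install databricks-sql-connector). Hostname, HTTP path, token,
# and table names are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="dapi-XXXX",
) as connection:
    with connection.cursor() as cursor:
        # Plain INSERT ... SELECT keeps the existing rows and appends the new ones
        cursor.execute(
            "INSERT INTO my_schema.target_table "
            "SELECT * FROM my_schema.staging_new_rows"
        )
```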
Thank you @apathetichell - I should've double-checked before responding (and after having a bit more caffeine)! I had just looked at the generic ODBC options and didn't check Databricks specifically.
@alexnajm - not you - Databricks is very complex and a beast to set up. If you've been following Snowflake's (generally successful) product push over the past two years (Streamlit, Containers, Python Notebooks, Cortex), it's to provide features Databricks has. For Databricks, the push is to make it easier to use and set up, because it's so imposing compared to Snowflake and requires so much more configuration. Of course, everyone is pushing Iceberg.