Free Trial

Alteryx Designer Desktop Ideas

Share your Designer Desktop product ideas - we're listening!
Submitting an Idea?

Be sure to review our Idea Submission Guidelines for more information!

Submission Guidelines

Databricks IN-DB support for create or replace table when overwriting an existing table

Currently the Databricks in-database connector allows for the following when writing to the database

  1. Append Existing
  2. Overwrite Table (Drop)
  3. Create New Table
  4. Create Temporary Table

This request is to add a 5th option that would execute

  • Create or Replace Table

Why is this important?

  • Create or Replace is similar to the Overwrite Table (Drop) in that it fully replaces the existing table however, the key differences are
    • Drop table completely removes the table and it's data from Databricks
      • Any users or processes connected to that table live will fail during the writing process
      • No history is maintained on the table, a key feature of the Databricks Delta Lake
    • Create or Replace does not remove the table
      • Any users or processes connected to that table live will not fail as the table is not dropped
      • History is maintained for table versions which is a key feature of Databricks Delta Lake

 

While this request was specific to testing on Azure Databricks the documentation for Azure and AWS for Databricks both recommend using "Replace" instead of "Drop" and "Create" for Delta tables in Databricks. 

 

AStasi_0-1661864644374.pngAStasi_1-1661864772827.png

4 Comments
AlteryxCommunityTeam
Alteryx Community Team
Alteryx Community Team
Status changed to: Accepting Votes
 
simonaubert_bd
13 - Pulsar

Hello  @AStasi 

I think this is a duplicate for https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Ideas/Overwrite-Table-for-output-write-ind...

 

Can you please confirm and add your vote to this idea?

Best regards,

Simon

AStasi
7 - Meteor

Hi @simonaubert_bd - Reading your idea it seems like they are different. This idea is specific to one database (Databricks) and how there is a capability within Databricks to overwrite a table in place without dropping and recreating it.  It enables table history/time travel capabilities that are native in Databricks. They way Alteryx does it currently in-bd is that it will drop the table and recreate it but lose all history in databricks.  Also during that drop/recreate the table becomes temporarily unavailable which can create issues for downstream apps.

simonaubert_bd
13 - Pulsar

@AStasiOk, thanks for the clarification. Same issue but yes, on Databrick, this is even more important since there is this deltalake feature. I guess there is the same issue with Snowflake and transient table but not sure. For the other bases, I had developed an in db macro that create the table toto with anothername (like toto_bis) and then alter with rename the table toto to toto_old and then rename toto_bis to toto, and finally dropt toto_old. This doesn't solve totally the unaivability time but decrease it hugely.

Best regards,

Simon