Currently, the Databricks in-database connector offers the following options when writing to the database:
- Append Existing
- Overwrite Table (Drop)
- Create New Table
- Create Temporary Table
This request is to add a fifth option that would execute a Create or Replace Table statement.
Why is this important?
- Create or Replace is similar to Overwrite Table (Drop) in that it fully replaces the existing table. However, the key differences are:
  - Drop Table completely removes the table and its data from Databricks
    - Any users or processes connected to that table live will fail during the writing process
    - No history is maintained on the table, although table history is a key feature of Databricks Delta Lake
  - Create or Replace does not remove the table
    - Any users or processes connected to that table live will not fail, as the table is not dropped
    - History is maintained for table versions, which is a key feature of Databricks Delta Lake
While this request was specific to testing on Azure Databricks, the Databricks documentation for both Azure and AWS recommends using "Replace" instead of "Drop" and "Create" for Delta tables.
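
To make the difference between the two approaches concrete, here is a minimal sketch of the SQL each write mode might issue. It is an illustration only: the function name `build_write_statements`, the mode names, and the table names are hypothetical, and the statements the connector actually generates may differ.

```python
from typing import List


def build_write_statements(table: str, select_sql: str, mode: str) -> List[str]:
    """Return the SQL statements a writer might issue for a given mode.

    The mode names here are hypothetical labels for this sketch, not
    the connector's real option identifiers.
    """
    if mode == "overwrite_drop":
        # Two statements: the table does not exist between them, so
        # concurrent readers can fail and Delta history is lost.
        return [
            f"DROP TABLE IF EXISTS {table}",
            f"CREATE TABLE {table} USING DELTA AS {select_sql}",
        ]
    if mode == "create_or_replace":
        # One statement: the table is replaced in place and the
        # previous version stays in the Delta table history.
        return [f"CREATE OR REPLACE TABLE {table} USING DELTA AS {select_sql}"]
    raise ValueError(f"Unsupported mode: {mode}")


if __name__ == "__main__":
    # Example table names are assumptions for illustration.
    for mode in ("overwrite_drop", "create_or_replace"):
        print(mode)
        for stmt in build_write_statements(
            "sales.orders", "SELECT * FROM staging.orders", mode
        ):
            print("  " + stmt)
```

After a Create or Replace, running `DESCRIBE HISTORY sales.orders` on a Delta table should still list the earlier versions, whereas after a Drop and Create the history starts over.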