04-16-2021 10:34 AM - edited 10-07-2021 07:30 AM
When using the "Update: Insert if New" or "Update: Error on Update Failure" output options for the Output Data tool when connecting to Hive ODBC, the following error occurs:
Primary Key required for Update option
Hive Primary Key capabilities are only available on 3.x of Hive. No primary key constraint exists on the table using the Update option in the Output Data tool. Only Primary keys will work as a constraint, Unique keys constraints will not qualify. Or due to an underlying bug in Hive, the schema name is not referenced in the SQL query.
No Primary key constraint exists on the table.
An underlying Hive bug is interfering.
A Hive table must have a Primary Key constraint for the Update option to function.
Alternatively, it's possible to add DDL scripts to create a new table or alter an existing table with the Primary Key constrain as follows.
These DDL scripts will require an elevated or DBA permission role or request your DBA for assistance to run these commands. And these commands are meant to be run only once and can be removed from the PreSQL Query section.
Create a new table with a Primary key constraint built-in. Example DDL Script:
CREATE TABLE MyTBL_W_PK (id int, first string, last string, primary key(id) disable novalidate);
Or use the Alter table DDL script within the PreSQL Query section of the Output Data tool for an existing table. Example DDL Script:
ALTER TABLE MyTBL_W_PK ADD CONSTRAINT pk_id PRIMARY KEY (id) DISABLE NOVALIDATE;
CREATE TABLE mytable (id INT, name STRING, PRIMARY KEY(id) DISABLE NOVALIDATE) CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES('transactional'='true'); set hive.support.concurrency=true; set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; INSERT INTO TABLE mytable VALUES (1, 'joe'); UPDATE mytable SET name = ‘fred’ WHERE id = 1;