One of the powerful things about the Alteryx Designer is that you can do most things related to your workflow right from within the workflow itself. One such operation is creating a Primary Key for your database table and namely using the Pre-Create SQL and Post- Create SQL options in the Input Data and Output Data tools to do that.
What Are Primary Keys?
A Primary Key uniquely identifies a record in a database table. The value of a unique identifier, among other benefits, is that it improves database performance and allows updates on the records.
Primary Keys can be made up of one or multiple columns in the table in a database table. However, a table can only have one Primary Key. Primary Keys cannot be null and must be unique and therefore assigning a Primary Key at least consists of two steps: setting the column to Not Null and then setting it to be the Primary key. The values in the target column you’re setting as Primary Key must only have unique values otherwise the database will throw an error. If no column in the table is suitable to be a Primary key, you can use the Alteryx Record ID tool to create such a column.
This article will deal with the case of using one column as the Primary Key. Once you know how to create a one-column Primary Key, you can find many articles online explaining how to create multi-column Primary Keys in SQL.
Creating Primary Keys in Alteryx
All of the following examples assume that you know how to connect to your database.
1. Primary Key for a new table:
When creating a new table in Alteryx and then saving it on the database, the easiest way is to save the table first and then “alter” the table using Post-Create SQL to set the Primary Key.
The Pre/Post-Create SQL statements are dependent on the database you’re using. In this Article you’ll find examples for SQL Server and Oracle but you can get more examples online.
For SQL Server - Expression 1a:
ALTER TABLE ExampleTest1
ALTER COLUMN PrimaryK int NOT NULL;
ALTER TABLE ExampleTest1
ADD PRIMARY KEY (PrimaryK);
For Oracle (10+) – Expression 1b:
ALTER TABLE "ExampleTest1"
MODIFY "PrimaryK" NUMBER NOT NULL;
ALTER TABLE "ExampleTest1"
ADD CONSTRAINT Example_pk PRIMARY KEY ("PrimaryK");
As you’d notice, expressions 1a and 1b are slightly different.
- The words with all capital letters are SQL keywords. These are the commands that the database understands.
- “ExampleTest1” is the name of the table; replace it with the name of your table.
- “PrimaryK” is the name of field that you want to make as Primary Key. In this example it was generated using the Record ID tool.
- In Oracle, Example_pk is the name of the Constraint. You can set it to whatever you like as long as it doesn’t have spaces or special characters and it is unique in the database. For example you can use ExampleTest1_pk.
- The int word in the SQL Server statement and the NUMBER word in the Oracle Statement are the types of the column in SQL Server’s and Oracle’s parlance respectively. If you’re using the Record ID tool similar to the example workflow then you can keep one of these types. Otherwise, you will have to change it to the correct type. You can get more details of the database types here: SQL Server, Oracle.
- Note: for Oracle if you have set the Output Data tool option Table/FieldName SQL Style to Quoted (default) then you must use the quotations around the table and field names; otherwise remove them.
You can confirm that the table now has a Primary Key by using an Input Data tool and checking the Visual Query Builder. The Primary Key will show a key sign next to the field name.
2. Primary Key on an existing table:
If you already have a table in the database to which you want to assign a Primary Key, there are two cases:
- The table already contains a column that can be assigned as the Primary Key:
- The column must only contain unique values and no nulls.
- You can check the column type for the field you want to set as Primary Key using the Visual Query Builder.
- Use the Input Data tool and fill Expression 1a or 1b in the Pre-Create SQL Code In this case the table exists in the database and you only need to assign the primary key so you can use the Pre-Create SQL code. Since the retrieved data is ignored, it’s advisable that you limit the number of rows retrieved to only the first 10 records using these SQL statements:
- For SQL Server: SELECT TOP 10 * FROM ExampleTest2;
- For Oracle: SELECT * FROM “ExampleTest2” WHERE ROWNUM <= 10
2. The table does not have a column/columns suitable to be a Primary Key:
- In this case, you need to bring the data to Alteryx, append a field that can be a Primary Key, write the data out, and then set the Primary Key.
- You can use the Record ID to create the new field for the Primary Key.
- You will have to drop (delete) the table and re-write it to make this change. To make sure that you read all the data from the table you must use the Block Until Done tool right before the Output Data tool.
- Similar to the first example, we will put Expression 1a or 1b in the Post-Create SQL. The difference here is that Output Data tool Output Option is set to Overwrite Table (drop).
These are the most common cases of creating Primary Keys and you can use the same logic to create more complex Primary Keys or indeed move some of the SQL table maintenance right into your workflow using the Pre-Create/Post-Create SQL options.
Side note: if you noticed in my screenshots, I have short names for the database connections (if not, check them out). These are Aliases – a neat way to refer to you database connections. If you’re not using them you should check this article.
Common Errors Related to Primary Keys
Here are the most common errors that might point to issues with primary keys:
- Primary Key required for Update option…
Make sure a primary key is declared on the table.
- Violation of PRIMARY KEY constraint 'PK_TEST'. Cannot insert duplicate key in object 'dbo.TEST'. The duplicate key value is…
You are trying to insert a key that already exists. Make sure you are not inserting duplicates. Is the Primary Key column alphanumeric but not case sensitive? For example SQL server is not case sensitive and EhzA and ehza are considered duplicate.
Note: if the key appears multiple times on the input file and an update option is chosen, the same record will be updated multiple times.
- In regular tools…Cannot insert explicit value for identity column in table 'TEST‘….
The key is set to auto-increment and Alteryx is trying to insert a value in this column. Deselect primary key column before appending to table and let the database create the value.
- Write Data In-DB …. An explicit value for the identity column in table 'Test' can only be specified when a column list is used and IDENTITY_INSERT is ON….
The In-DB tools cannot generate the SQL statement needed to update a table that has a key which is set to auto-increment. Either change the way the key is generated in your table or use the regular tools.
For further information, please contact Alteryx Support and one of us will reach out to you.
You can find the workflows used in this article attached to this post. When you open these workflows you will get errors - that's expected because your connection details are different. You'll need to update the connection details and table/column names before using them. These workflows have been created with Alteryx Designer 10.1 (10.1.7.12188).
Fadi, Henriette, Margarita