
Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Create Database Table Primary Key in Alteryx

fadib
Alteryx Alumni (Retired)

One of the powerful things about Alteryx Designer is that you can do most things related to your workflow right from within the workflow itself. One such operation is creating a Primary Key for a database table, using the Pre-Create SQL and Post-Create SQL options in the Input Data and Output Data tools.

 

What Are Primary Keys?

 

A Primary Key uniquely identifies a record in a database table. The value of a unique identifier, among other benefits, is that it improves database performance and allows updates on the records.

 

A Primary Key can be made up of one or more columns in a database table. However, a table can have only one Primary Key. Primary Key values cannot be null and must be unique, so assigning a Primary Key consists of at least two steps: setting the column to Not Null and then setting it as the Primary Key. The target column must contain only unique values; otherwise the database will throw an error. If no column in the table is suitable to be a Primary Key, you can use the Alteryx Record ID tool to create such a column.
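
Before assigning a Primary Key, it can help to confirm that the candidate column really is unique and free of nulls. The queries below are a minimal sketch, assuming a table named ExampleTest1 and a candidate column named PrimaryK (the names used in the examples later in this article). They use plain SQL that should run on most databases, although quoting rules differ by database.

```sql
-- Values that appear more than once in the candidate column.
-- An empty result means the column has no duplicates.
SELECT PrimaryK, COUNT(*) AS occurrences
FROM ExampleTest1
GROUP BY PrimaryK
HAVING COUNT(*) > 1;

-- Number of rows where the candidate column is null.
-- This count must be 0 before the column can become a Primary Key.
SELECT COUNT(*) AS null_rows
FROM ExampleTest1
WHERE PrimaryK IS NULL;
```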

 

This article will deal with the case of using one column as the Primary Key. Once you know how to create a one-column Primary Key, you can find many articles online explaining how to create multi-column Primary Keys in SQL. 
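
For orientation only, a multi-column Primary Key follows the same pattern with more than one column listed. This sketch uses SQL Server syntax and hypothetical names (OrderLines, OrderID, LineNumber) that do not appear in the example workflows. The comments are there for explanation; you may want to remove them before pasting the statements into a Pre/Post-Create SQL box.

```sql
-- Hypothetical table and columns, SQL Server syntax.
-- Every column that takes part in the key must be NOT NULL first.
ALTER TABLE OrderLines
ALTER COLUMN OrderID int NOT NULL;

ALTER TABLE OrderLines
ALTER COLUMN LineNumber int NOT NULL;

-- The combination (OrderID, LineNumber) must be unique across all rows.
ALTER TABLE OrderLines
ADD PRIMARY KEY (OrderID, LineNumber);
```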

 

Creating Primary Keys in Alteryx

 

All of the following examples assume that you know how to connect to your database.

 

1. Primary Key for a new table:

 

When creating a new table in Alteryx and then saving it on the database, the easiest way is to save the table first and then “alter” the table using Post-Create SQL to set the Primary Key.

Create New Table with Primary Key.png

 

The Pre/Post-Create SQL statements depend on the database you’re using. This article gives examples for SQL Server and Oracle, but you can find more examples online.

 

Create New Table with Primary Key - SQL Server.png

Create New Table with Primary Key - Oracle.png

 

For SQL Server - Expression 1a:

  

ALTER TABLE ExampleTest1
ALTER COLUMN PrimaryK int NOT NULL;

ALTER TABLE ExampleTest1
ADD PRIMARY KEY (PrimaryK);

  

For Oracle (10+) – Expression 1b:

  

ALTER TABLE "ExampleTest1"
MODIFY "PrimaryK" NUMBER NOT NULL;

ALTER TABLE "ExampleTest1"
ADD CONSTRAINT Example_pk PRIMARY KEY ("PrimaryK");

 

As you may notice, expressions 1a and 1b are slightly different.

  • The words with all capital letters are SQL keywords. These are the commands that the database understands.
  • “ExampleTest1” is the name of the table; replace it with the name of your table.
  • “PrimaryK” is the name of the field that you want to make the Primary Key. In this example it was generated using the Record ID tool.

Create PK using RecordID.png

  • In Oracle, Example_pk is the name of the Constraint. You can set it to whatever you like as long as it doesn’t have spaces or special characters and it is unique within the schema. For example, you can use ExampleTest1_pk.
  • The int word in the SQL Server statement and the NUMBER word in the Oracle Statement are the types of the column in SQL Server’s and Oracle’s parlance respectively. If you’re using the Record ID tool similar to the example workflow then you can keep one of these types. Otherwise, you will have to change it to the correct type. You can get more details of the database types here: SQL Server, Oracle.
  • Note: for Oracle if you have set the Output Data tool option Table/FieldName SQL Style to Quoted (default) then you must use the quotations around the table and field names; otherwise remove them.
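
If the key column is not numeric, the type in the statement has to match the column, including its length. The following is a sketch of Expression 1a adapted for a text key, assuming a hypothetical 12-character column named ProductCode. It also names the constraint explicitly, which SQL Server allows in the same way as Oracle.

```sql
-- SQL Server: restate the full type, including the length.
-- In ALTER COLUMN, varchar with no length means varchar(1),
-- which fails with a truncation error on longer values.
ALTER TABLE ExampleTest1
ALTER COLUMN ProductCode varchar(12) NOT NULL;

-- Naming the constraint makes it easier to find or drop later.
ALTER TABLE ExampleTest1
ADD CONSTRAINT ExampleTest1_pk PRIMARY KEY (ProductCode);
```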

You can confirm that the table now has a Primary Key by using an Input Data tool and checking the Visual Query Builder. The Primary Key will show a key sign next to the field name.

Result.png
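
The same check can be made with a query against the database catalog. These are sketches using the standard catalog views of each database. They assume the table is named ExampleTest1 as in the examples above and, for Oracle, that it belongs to the connected user.

```sql
-- SQL Server: list the Primary Key constraint and its column(s).
SELECT kcu.COLUMN_NAME, tc.CONSTRAINT_NAME
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS kcu
  ON kcu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
 AND kcu.TABLE_SCHEMA = tc.TABLE_SCHEMA
 AND kcu.TABLE_NAME = tc.TABLE_NAME
WHERE tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
  AND tc.TABLE_NAME = 'ExampleTest1';
```

```sql
-- Oracle: constraint type 'P' marks a Primary Key.
-- The table name is case sensitive here because it was created quoted.
SELECT cols.column_name, cons.constraint_name
FROM user_constraints cons
JOIN user_cons_columns cols
  ON cols.constraint_name = cons.constraint_name
WHERE cons.constraint_type = 'P'
  AND cons.table_name = 'ExampleTest1';
```

An empty result from either query means no Primary Key is defined on the table.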

 

2. Primary Key on an existing table:

 

If you already have a table in the database to which you want to assign a Primary Key, there are two cases:

  1. The table already contains a column that can be assigned as the Primary Key:

Create Primary Key from Existing Column.png

  • The column must only contain unique values and no nulls.
  • You can check the column type for the field you want to set as Primary Key using the Visual Query Builder.

Column Type.png

  • Use the Input Data tool and enter Expression 1a or 1b in the Pre-Create SQL code. In this case the table already exists in the database and you only need to assign the Primary Key, so the Pre-Create SQL is sufficient. Since the retrieved data is ignored, it’s advisable to limit the number of rows retrieved to the first 10 records using these SQL statements:
    • For SQL Server: SELECT TOP 10 * FROM ExampleTest2;
    • For Oracle: SELECT * FROM "ExampleTest2" WHERE ROWNUM <= 10

Pre-Create SQL - SQL Server.png

Pre-Create SQL - Oracle.png
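
As an alternative to the Visual Query Builder, the column type can also be read from the database catalog, which is useful because the ALTER statement has to restate the column's exact type. This is a sketch, assuming the table is named ExampleTest2 as in the statements above and, for Oracle, that it belongs to the connected user.

```sql
-- SQL Server: type, length/precision and nullability of each column.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH,
       NUMERIC_PRECISION, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'ExampleTest2';
```

```sql
-- Oracle: the same information from the user's own tables.
SELECT column_name, data_type, data_length, data_precision, nullable
FROM user_tab_columns
WHERE table_name = 'ExampleTest2';
```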

 

  2. The table does not have a column/columns suitable to be a Primary Key:

Update Existing Table to Add Primary Key Column.png

  • In this case, you need to bring the data to Alteryx, append a field that can be a Primary Key, write the data out, and then set the Primary Key.
  • You can use the Record ID to create the new field for the Primary Key.
  • You will have to drop (delete) the table and re-write it to make this change. To make sure that you read all the data from the table you must use the Block Until Done tool right before the Output Data tool.
  • Similar to the first example, we will put Expression 1a or 1b in the Post-Create SQL. The difference here is that Output Data tool Output Option is set to Overwrite Table (drop).

Overwrite Table and Add PK - SQL Server.png

Overwrite Table and Add PK - Oracle.png

 

These are the most common cases of creating Primary Keys and you can use the same logic to create more complex Primary Keys or indeed move some of the SQL table maintenance right into your workflow using the Pre-Create/Post-Create SQL options.

 

Side note: you may have noticed in my screenshots that I have short names for the database connections (if not, check them out). These are Aliases – a neat way to refer to your database connections. If you’re not using them, you should check this article.

 

Common Errors Related to Primary Keys

 

Here are the most common errors that might point to issues with primary keys:

 

  • Primary Key required for Update option…

Make sure a primary key is declared on the table.

 

  • Violation of PRIMARY KEY constraint 'PK_TEST'. Cannot insert duplicate key in object 'dbo.TEST'. The duplicate key value is…

You are trying to insert a key that already exists. Make sure you are not inserting duplicates. Is the Primary Key column alphanumeric but not case sensitive? For example, SQL Server is not case sensitive under its default collation, so EhzA and ehza are considered duplicates.

Note: if the key appears multiple times on the input file and an update option is chosen, the same record will be updated multiple times.
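
One way to track down the offending rows is to load the incoming data into a staging table first and compare it with the target. This is only a sketch: the staging table dbo.TEST_staging and the column KeyCol are hypothetical names, and dbo.TEST is taken from the error message above.

```sql
-- Staged keys that already exist in the target table.
-- The comparison uses the column's collation, so with a
-- case-insensitive collation EhzA matches ehza.
SELECT s.KeyCol
FROM dbo.TEST_staging AS s
JOIN dbo.TEST AS t
  ON t.KeyCol = s.KeyCol;

-- Keys that are repeated within the staged data itself.
SELECT KeyCol, COUNT(*) AS occurrences
FROM dbo.TEST_staging
GROUP BY KeyCol
HAVING COUNT(*) > 1;
```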

 

  • In regular tools… Cannot insert explicit value for identity column in table 'TEST'….

The key is set to auto-increment and Alteryx is trying to insert a value into this column. Deselect the Primary Key column before appending to the table and let the database create the value.
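
To find out which column is the auto-increment (identity) column, you can query the SQL Server catalog. This is a sketch that assumes the table is dbo.TEST, as in the error message above.

```sql
-- SQL Server: list identity (auto-increment) columns of the table.
-- An empty result means the table has no identity column.
SELECT name AS identity_column
FROM sys.identity_columns
WHERE object_id = OBJECT_ID('dbo.TEST');
```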

 

  • Write Data In-DB …. An explicit value for the identity column in table 'Test' can only be specified when a column list is used and IDENTITY_INSERT is ON….

The In-DB tools cannot generate the SQL statement needed to update a table that has a key which is set to auto-increment. Either change the way the key is generated in your table or use the regular tools.
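
For context, the error text describes what SQL Server expects when a value is supplied for an identity column. Run outside the In-DB tools (for example in a query editor), such an insert would look roughly like this sketch. The column names Id and Name are hypothetical.

```sql
-- SQL Server: allow explicit values for the identity column.
SET IDENTITY_INSERT dbo.Test ON;

-- An explicit column list is required while IDENTITY_INSERT is ON.
INSERT INTO dbo.Test (Id, Name)
VALUES (1001, 'example');

-- Switch it off again; only one table per session can have it ON.
SET IDENTITY_INSERT dbo.Test OFF;
```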

 

 For further information, please contact Alteryx Support and one of us will reach out to you.

 

You can find the workflows used in this article attached to this post. When you open these workflows you will get errors - that's expected because your connection details are different. You'll need to update the connection details and table/column names before using them. These workflows have been created with Alteryx Designer 10.1 (10.1.7.12188).

 

Fadi, Henriette, Margarita

Attachments
Comments
Treyson
13 - Pulsar

This has saved my life. Thank you for the post!

Suzanne
7 - Meteor

This is an excellent post.  One question:  Is there a way to use the In-Database tools to do the same thing for Oracle?  ie. Alter table Add Constraint PK?

mdgajes1
7 - Meteor

I'm struggling getting this to work. 

 

In my workflow I start by pulling data from 3 tables

  • litigation (primary key - litigation_id)
  • claim (primary key - claim_id)
  • contact (primary key - contact_id)

I need to update a table "summary" in a different database with the results of my sql; however, this "summary" does not have a primary key assigned. I want to use the litigation_id as this is the unique identifier.

 

 

I've spent the last few hours going over this. I put the statement in the PreSQL on the output tool then tried the PostSQL. Neither worked so i switched to InputTool. Neither worked. Tried various other tactics but everything ends up in an error. Example: " Output Data (6) Error running PreSQL on "..... Syntax error, expected something like a name or a Unicode delimited identifier or a 'SET' keyword or a 'CONVERT_TABLE_HEADER' k

 

I'm using the following:

alter table summary
alter column litigation_id int not null;

alter table summary
add primary key (litigation_id);

 

What am i doing wrong?

 

mcbridewilliam
6 - Meteoroid

Please update this string to include a Hadoop example.  Thank you!

prasanna1992
5 - Atom

@fadib  That's a good explanation.

 

I have a use case where I need to create a unique identifier to sum up the price based on product(It's not available in data set).

 

Deep diving into use case:

 

We get data from multiple data sources every night and it is stored in our database as one big table similar to the one below (the Product column is not in the data; it is included here for reference). For each product we have 3 identifiers (Supplies, Grocery, Maintenance), but each data provider will only provide 2 identifiers at a given time.

 

 

If you look at the below example:

  • Africa uses Primary identifier as supplies and secondary as Grocery
  • USA uses Primary Identifier as Grocery and Secondary as Supplies.
  • England uses Primary identifier as Maintenance and Secondary as Grocery.

 

Technically all this comes under one product, so I want to find the sum of the price for product Milkyway. Is there a way I can create a unique identifier which I can use as Primary Key to sum up the values?

 

Note : At any given time we will have 2 identifiers populated. Any idea how we can resolve this using alteryx ?

 

Thanks in advance.

                                                    

 

Product  | Data source | Primary Identifier | Identifier Type | Secondary Identifier | Secondary Identifier Type | Price | Unique Identifier
MilkyWay | Africa      | XYZ123             | supplies        | B123456789           | Grocery                   | 44    | XXXXXXXXX
MilkyWay | England     | ABC12345           | maintenance     | B123456789           | Grocery                   | 12    | XXXXXXXXX
MilkyWay | India       | B123456789         | Grocery         | ABC12345             | maintenance               | 67    | XXXXXXXXX
MilkyWay | USA         | B123456789         | Grocery         | XYZ123               | supplies                  | 14    | XXXXXXXXX
MilkyWay | Spain       | B123456789         | Grocery         | XYZ123               | supplies                  | 15    | XXXXXXXXX
Pepsi    | UAE         | ZZZ123             | Supplies        | C123456789           | Grocery                   | 55    | XXXXXXXXX
Pepsi    | England     | C123456789         | Grocery         | ZZZ123               | Supplies                  | 11    | XXXXXXXXX
Pepsi    | India       | C123456789         | Grocery         | ZZZ123               | Supplies                  | 22    | XXXXXXXXX
Pepsi    | USA         | C123456789         | Grocery         | ZZZ123               | Supplies                  | 33    | XXXXXXXXX
Pepsi    | Spain       | C123456789         | Grocery         | ZZZ123               | Supplies                  | 62    | XXXXXXXXX

 

Expected  Output

 

Data source | Primary Identifier | Identifier Type | Secondary Identifier | Secondary Identifier Type | Price | Unique Identifier
Africa      | XYZ123             | supplies        | B123456789           | Grocery                   | 44    | RANDOM1234
England     | ABC12345           | maintenance     | B123456789           | Grocery                   | 12    | RANDOM1234
India       | B123456789         | Grocery         | ABC12345             | maintenance               | 67    | RANDOM1234
USA         | B123456789         | Grocery         | XYZ123               | supplies                  | 14    | RANDOM1234
Spain       | B123456789         | Grocery         | XYZ123               | supplies                  | 15    | RANDOM1234
UAE         | ZZZ123             | Supplies        | C123456789           | Grocery                   | 55    | RANDOM345
England     | C123456789         | Grocery         | ZZZ123               | Supplies                  | 11    | RANDOM346
India       | C123456789         | Grocery         | ZZZ123               | Supplies                  | 22    | RANDOM347
USA         | C123456789         | Grocery         | ZZZ123               | Supplies                  | 33    | RANDOM348
Spain       | C123456789         | Grocery         | ZZZ123               | Supplies                  | 62    | RANDOM349

mdesmith
5 - Atom

This article was extremely helpful! I do have a question regarding an issue that has come up for me. I had to create my unique column by combining a couple of other columns. This made it necessary to format the column as a v_string value. Now when I try to run the Post-SQL statement, I keep getting errors. I've tried using "char" & "varchar" in the statement, but keep getting the following error: 

 

Error: Output Data (2): Executing PostSQL: "ALTER TABLE "Coverage"
ALTER COLUMN "Key" varchar NOT NULL" : [Microsoft][SQL Server Native Client 11.0][SQL Server]String or binary data would be truncated.[Microsoft][SQL Server Native Client 11.0][SQL Server]The statement has been terminated.

 

Here is the exact statement I am using. The Key column is a maximum of 11 characters.

 

ALTER TABLE "Coverage"
ALTER COLUMN "Key" varchar NOT NULL;

ALTER TABLE "Coverage"
ADD PRIMARY KEY "Key"

 

Any advice?

Treyson
13 - Pulsar

@mdesmith I think that error is because you are trying to put a value in that column that is more than 11 characters. Run a filter at the end of your workflow that looks at the length of the values.

 

Another thing that I would suggest is creating a composite key on the table, that is where your primary key consists of a few fields. I used to do the exact method you described but I changed it once I discovered this: https://www.1keydata.com/blog/composite-key-in-sql.html

 

Essentially, if you are writing your script to create the table, the end will have something like this: Create Primary Key (Column1,Column2,Column3)

Currently looks like this: Create Primary Key (ConcatenatedColumn)

 

Clear as mud right?

RodLight
8 - Asteroid

@mdesmith 

The SQL error "String or binary data would be truncated" indicates the value you are inserting in that field is too long. 

To troubleshoot this, I'd put a Select tool right before the Output tool and explicitly set the "Key" field to a length of 11. My suspicion would be that the process would now run without the SQL error, but you will then get an Alteryx warning on the Select tool that certain records have been truncated within Alteryx. That should identify where the issue is from a data standpoint.

 

Rod

 

mcbridewilliam
6 - Meteoroid

This is a great article.  Is it possible to add additional guidance when writing to Snowflake?

Treyson
13 - Pulsar

@mcbridewilliam it should work the same way in Snowflake. As long as your tables are created with keys. There are also some things around casing in Snowflake that you need to be aware of. Make sure you are matching your casing.

TerriLH
7 - Meteor

This is a great thread. However, my primary key is a UPC code, which is 12 characters. I keep getting the "String or binary data would be truncated in table" error when trying to add this as a primary key in SQL. Isn't there a way to stop truncating the field? I need all 12 characters as a primary key...not 11.

 

Nevermind. I figured it out (add varchar (12) to the sql statement). I'm new to writing Alteryx up to SQL, so it's a daily learning process.

Treyson
13 - Pulsar

@TerriLH we are glad to see you found this out! One of the biggest learning curves for the Alteryx/SQL relationship is the data types.

jasmine9n1
5 - Atom

Thanks for the detailed article!

simonaubert_bd
13 - Pulsar

Hello @fadib 

Yes, it works with in-memory but not in database, which is frankly a shame...