EDIT I did what the top comment suggested. For anyone who wants more details, here are the exact details:
1) I set Alteryx output file format to "sql server bulk loader" as this is faster than regular odbc upload.
2) I set Alteryx output options to "create new table" with a table name of "dbo.temp_"
3) I set pre-sql to:
-- If table exists, delete it
IF OBJECT_ID('reporting.dbo.temp_', 'U') IS NOT NULL
drop table reporting.dbo.temp_;
4) I set post-sql to this, where cworderid is my pk/join column:
-- Set index
CREATE INDEX cworderid on reporting.dbo.temp_(CWOrderID);
-- Update Orders table with temp table
MERGE reporting.dbo.Orders AS TARGET
USING reporting.dbo.temp_ AS SOURCE
ON (TARGET.cworderid = SOURCE.cworderid)
--When records are matched, update the records if there is any change.
--You must add a line for every field you uploaded to your temp table that you want to update in your target table
--> THIS MUST BE MANUALLY CONFIGURED, DEPENDING ON WHAT YOU ARE ADDING
WHEN MATCHED AND TARGET.lastupdatedate <> SOURCE.lastupdatedate OR
TARGET.[updated by user] <> SOURCE.[updated by user]
--> THIS MUST BE MANUALLY CONFIGURED, DEPENDING ON WHAT YOU ARE ADDING
THEN UPDATE SET TARGET.lastupdatedate = SOURCE.lastupdatedate,
TARGET.[updated by user] = SOURCE.[updated by user];
5) This is BLAZINGLY fast compared to using the "update" option in alteryx. It's almost a wonder they have it in there.
I need to update millions of records in sql server from alteryx.
I have a good understanding of how to optimize speed for "Append Existing".. But "Update - warn on update failure" takes FOREVER.
I can insert 1,00,000 x 200 records in about 18 minutes. But updating just 20,000 x 3 records on an indexed column takes about 30 minutes.
My primary key is indexed.
ODBC vs bulk load doesn't seem to make a difference.
The byte sizes on the columns are all small.
Any performance suggestions?
Solved! Go to Solution.
One thing I have tried in the past is put the data into a temporary table and then in the post sql statement run your update statement. This should help with speed. Then once the update happens you can delete the temporary table.
Thanks, Ill try that
My solution is to set up a flow which saves a CSV then uses the Run tool to run a .bat file containing a BCP command. It's lightning fast. Need to use a 'block until done' to make sure the CSV has finished saving before the BCP executes.
The .bat file will contain something like this:
bcp tablename in "filepath.csv" -S servername -d databasename -T -q -c -t ","