Case: I want to prevent reading from a database table that is being written to.
Problem: let’s say I have two workflows. Workflow 1 uses the regular output tool in Alteryx to write 1 million records to a table. Workflow 2 then starts reading from the same table that workflow 1 is writing to and this results in a dirty read – workflow 2 only reads the number of records that were transmitted at the time of reading.
The way I understand the output tool is that it chunks up the writing session into smaller transactions (specified by the user) and finishes once the writing session completes. However, it doesn’t lock the table while writing and therefore dirty reads can happen if workflow 2 reads meanwhile workflow 1 is writing.
The situation occurs when workflows on the Alteryx server runs simultaneously.
Note:
- In many cases workflows are executed in an extract, transform, load order to avoid concurrency problems. However, this is not possible as workflows need to be run at different schedules which have overlapping runtime.
- DB runs on MSSQL
I’ve tried
- Changing the isolation level on the database, but Alteryx ignore these.
- Using locks in a pre-SQL statement, but they release right before the writing sessions start.
- The SQL Server bulk loader protocol, but dirty reads still occur.