Hello,
I need to incrementally refresh an extremely large dataset based on the last time a workflow was kicked off. I am not allowed to store the kickoff time in the same database as the source data, nor alter that data. Is there a way to filter my input so it only takes data newer than the last time the workflow was run?
I have tried using the Action tool to no effect, and I can't use a WHERE clause for an In-DB input.
Hi @jbe0230 ,
can you let us know what the flag is in the database that allows you to recognise whether the data has already been processed? i.e., how do you differentiate between processed and unprocessed records?
M.
@jbe0230
There might be something we can do if there are timestamps.
@mceleavey Essentially, there is no flag. There is an access-date column that we are using, but we have to generate our own flag based on the kickoff time separately, and then filter the access date to be greater than the kickoff time from the previous run. The general outline of what I'm trying to accomplish is below:
1. Generate a new kickoff time and save it for later
2. Grab the existing kickoff time
3. Update the input query to read only data newer than the last run date
4. Load the data to the target
5. Update the stored kickoff time with the value from step 1
I think the biggest problem is that we have to store the kickoff time in a different database than the source data. Hopefully I explained this well enough! Happy to elaborate more if I can.
There are timestamps in the source data that I'm thinking we can leverage against the separately stored kickoff time.
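The five steps above can be sketched in plain SQL against two separate connections. This is a minimal illustration, not an Alteryx workflow: it uses in-memory SQLite stand-ins for both databases, and the table and column names (`events`, `access_date`, `kickoff`) are assumptions, not the actual schema. The same shape applies to any pair of databases where the source is read-only and the control table lives elsewhere.

```python
import sqlite3
from datetime import datetime, timezone

# "Source" database: read-only from our perspective (hypothetical schema).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (id INTEGER, access_date TEXT)")
source.executemany("INSERT INTO events VALUES (?, ?)", [
    (1, "2024-01-01T00:00:00"),  # accessed before the last run
    (2, "2024-06-01T00:00:00"),  # accessed after the last run
])

# Separate "control" database holding only the kickoff time.
control = sqlite3.connect(":memory:")
control.execute("CREATE TABLE kickoff (ts TEXT)")
control.execute("INSERT INTO kickoff VALUES ('2024-03-01T00:00:00')")

# Step 1: generate the new kickoff time and hold it for later.
new_kickoff = datetime.now(timezone.utc).isoformat()

# Step 2: grab the existing kickoff time from the control database.
(last_kickoff,) = control.execute("SELECT MAX(ts) FROM kickoff").fetchone()

# Step 3: filter the source read to rows newer than the last run
# (ISO-8601 strings compare correctly as text).
rows = source.execute(
    "SELECT id, access_date FROM events WHERE access_date > ?",
    (last_kickoff,),
).fetchall()

# Step 4: load the filtered rows to the target (printed here as a stand-in).
print(rows)

# Step 5: persist the kickoff time generated in step 1, only after the
# load succeeds, so a failed run is retried from the same watermark.
control.execute("UPDATE kickoff SET ts = ?", (new_kickoff,))
control.commit()
```

Note that step 1 captures the time before the read and step 5 writes it only after the load, so rows accessed during the run are picked up on the next pass rather than silently skipped.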