Incremental inserts between two databases, how?

Hi,

I have two tables: A on MSSQL and B on Redshift. I need to send data incrementally from A to B, and I can't trust any created_at or updated_at fields.

Because table A has 9M records, I was joining ONLY on the primary key to find out which records are missing from my Redshift table.

How can I set up an Input Data tool with a query like SELECT field1, field2 FROM A WHERE primary_key IN (list), where list is the set of missing IDs that comes out of the join?

Joining on the primary key is easy and fast; joining on the whole table just to select a few records is a waste of CPU.
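To make the key-only comparison concrete, here is a minimal sketch in Python rather than an Alteryx workflow. It assumes both key columns fit in memory; the function name `missing_keys` and the sample IDs are hypothetical, used only for illustration.

```python
def missing_keys(source_keys, target_keys):
    """Return the keys present in the source (A) but absent from the target (B)."""
    # Set difference compares keys only, so no other columns are read or joined.
    return sorted(set(source_keys) - set(target_keys))


# Hypothetical sample: IDs 1 and 3 exist in A but not yet in B.
print(missing_keys([1, 2, 3, 4], [2, 4]))
```

Only the resulting short list of IDs then needs to go back to MSSQL to fetch the full rows.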

Any ideas?

Thanks

Joao

Accepted answers

RodL

I've tried to lay out the process that I'm thinking you need below (of course, it's showing errors since I really don't have anything connected). It's built off of your screenshot. Let me know if it doesn't make sense.

Updates between databases.png

All comments

RodL

I'm not sure I entirely understand your process, but from what I gather, you could stream out the list of missing IDs and feed it into a Dynamic Input tool that holds the SELECT and IN statement for the database you want?
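As a rough illustration of the IN-statement idea, the sketch below builds the query text from a list of missing IDs. It is plain Python, not what the Dynamic Input tool does internally. It assumes integer primary keys, reuses the table and column names from the question, and splits the list into chunks because very long IN lists can be slow or hit database limits. The name `build_in_queries` and the chunk size are hypothetical.

```python
def build_in_queries(ids, chunk_size=1000):
    """Build one SELECT ... IN (...) statement per chunk of missing IDs."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Coerce to int so a non-numeric value raises instead of ending up in the SQL.
    keys = [int(k) for k in ids]
    queries = []
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        id_list = ", ".join(str(k) for k in chunk)
        queries.append(
            f"SELECT field1, field2 FROM A WHERE primary_key IN ({id_list})"
        )
    return queries


# Hypothetical sample: three missing IDs, two per query.
for query in build_in_queries([101, 102, 103], chunk_size=2):
    print(query)
```

Each generated statement selects only the missing rows, so the full 9M-row table is never pulled across.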

joao

The Dynamic Tool only accepts a green connection, so I can't connect the True output of the Filter to the Dynamic Input In-DB.

Also, my data is not In-DB; it is in MSSQL, so I would need a Dynamic Input Data tool, which I can't find.

Any ideas?

Thanks

Joao
