Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Incremental inserts between two databases, how?

joao
8 - Asteroid

Hi,

 

I have two tables A on MSSQL and B on Redshift. I need to incrementally send data from A to B and I can't trust any created_at or updated_at fields. 

 

lixo.jpg

 

Because table A is 9M records I was doing a join ONLY on the primary key to know which records are missing from my Redshift table.

 

How can I do an Input Data with an input like ... SELECT field1,field2 FROM A WHERE primary_key IN list, where list is the output of the join ids (the missing ones)

 

Doing a join primary key is easy and fast, doing a join on the whole table just to select a few records is a waste of CPU.

 

Any ideas?

 

Thanks

Joao 

 

 

 

9 REPLIES 9
RodL
Alteryx Alumni (Retired)

Not sure I'm entirely understanding your process, but from what I gather, you could stream out the list of the missing IDs and feed that into a Dynamic Input tool that has the Select and IN statement from the database you want?

joao
8 - Asteroid

The Dynamic Tool only accepts a green connection .. I can't connect the exit of the filter True to the Dynamic Input In-DB. 

 

Also my data is not In-DB is in MSSQL, so I would need a Dynamic Input Data which I can't find. 

 

Any ideas?

 

Thanks

Joao

RodL
Alteryx Alumni (Retired)

That's why I said you would "stream out" the data (but I wasn't clear on how to do that...you would use the In-DB Data Stream Out tool).

The idea I had was that you could stream out the list, use the IN capabilities of the "regular" Dynamic Input tool to get the records that you wanted, and then put those into a "regular" Output Data tool with output put option in its configuration to "Update; Insert if New".

joao
8 - Asteroid

Hi Rodl,

 

Sorry I got lost ... I got the ids on the browse. Now I need to use them to make a query to MSSQL to get all fields and then Update (or insert if new) on Redshift. 

 

I dont find the In/Out Dynamic Tool, I only have Dynamic for the In-Database.

 

I dont start by selecting all fields because it is a significantly large amount of data.

 

1.jpg

RodL
Alteryx Alumni (Retired)

I've tried to lay out the process that I'm thinking you need below (of course, it's showing errors since I really don't have anything connected). It's built off of your screenshot. Let me know if it doesn't make sense.

 

Updates between databases.png

joao
8 - Asteroid

Thanks Rodl, that was helpful! That worked just one note though. My Query is SELECT * FROM X WHERE primary_key = 'X'.

 

It is executing row by row which is really slow, how do I do : WHERE primary_key IN (SELECT * FROM LIST) ..... how does Alteryx send a list to the Dynamic Input?

 

s.jpg

RodL
Alteryx Alumni (Retired)

You just make the Select statement to be "WHERE primary_key IN ('X', 'Y')" as a template.

It generally will recognize the IN clause and go from there.

 

You might want to check out Help on the Dynamic Input tool as well as my response in this post.

joao
8 - Asteroid

Almost there, I got an error on Output Data that says: "Primary Key Required for Update Option" but if I try to pick one Alteryx says: "Primary Key creation is supported for Create and Overwrite only."

 

Im a bit clueless here.

 s.jpg

RodL
Alteryx Alumni (Retired)

That one I'm not sure about.

You might want to try to create the key on the Redshift side (through some sort of SQL Assistant tool) as opposed to doing it in the Alteryx mapping, and see if Alteryx recognizes it at that point.

If it doesn't, then I would suggest opening a case for that with support@alteryx.com so they can take a look at your setup.

 

Labels
Top Solution Authors