Alteryx Designer Desktop Discussions

MarshallG · ‎11-14-2018

Hi all-

My situation is that I am trying to replicate one pretty long table (~25 million records). The old data doesn't change, however newer data does get updated and new records are created. I'm trying to avoid moving the entire 25 million records every night because the original source DB is SLOW.... So I currently have two staging tables (one with the old data 20+ million records) that isn't being changed and a 'new data' table that is being overwritten nightly with the most recent data. Then, I am Unioning them into the 'production' table.

My question is whether the two workflows below are effectively doing the same thing? In the first, I use InDB tools (Redshift, in this case) to Select from the two tables, union them, and write into the final table. In the second, I use a Pre-SQL statement of something like this in a 'traditional DB input tool:

DROP TABLE IF EXISTS detail;
SELECT query.* INTO detail

FROM (SELECT * FROM staging_only.detail_new
UNION ALL SELECT * FROM staging_only.detail_old) query;

Is there any meaningful difference in how Alteryx and/or Redshift would handle these or any performance benefit to one vs the other (and yes, obviously the traditional Input tool will return 1 row of data but that doesn't meaningfully affect the load time)? Thanks!

cmcclellan · ‎11-14-2018

I don't think there's any difference at all ... the only one I can see would be that in your SQL you are defining the table name, for in-DB the table name is created automatically & randomly.

Alteryx Designer Desktop Discussions

Union Two Tables: InDB vs PreSQL statement

Re: Is there any way the computer vision tools can...

Re: Batch Macro

Re: How to get cell reference address from excel

Re: Replacing Forecast columns with Actual Data

Re: Row creation