Union Two Tables: InDB vs PreSQL statement
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi all-
My situation is that I am trying to replicate one pretty long table (~25 million records). The old data doesn't change, however newer data does get updated and new records are created. I'm trying to avoid moving the entire 25 million records every night because the original source DB is SLOW.... So I currently have two staging tables (one with the old data 20+ million records) that isn't being changed and a 'new data' table that is being overwritten nightly with the most recent data. Then, I am Unioning them into the 'production' table.
My question is whether the two workflows below are effectively doing the same thing? In the first, I use InDB tools (Redshift, in this case) to Select from the two tables, union them, and write into the final table. In the second, I use a Pre-SQL statement of something like this in a 'traditional DB input tool:
DROP TABLE IF EXISTS detail;
SELECT query.* INTO detail
FROM (SELECT * FROM staging_only.detail_new
UNION ALL SELECT * FROM staging_only.detail_old) query;
Is there any meaningful difference in how Alteryx and/or Redshift would handle these or any performance benefit to one vs the other (and yes, obviously the traditional Input tool will return 1 row of data but that doesn't meaningfully affect the load time)? Thanks!
Solved! Go to Solution.
- Labels:
- In Database
- Input
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I don't think there's any difference at all ... the only one I can see would be that in your SQL you are defining the table name, for in-DB the table name is created automatically & randomly.
