Hello folks,
I have a spreadsheet of 5,000 records that I need to join to a database table with 30M+ records. I filtered as much as I could before the data exits the Input tool. The spreadsheet takes 2 seconds to run, but pulling the data I need from the database takes 20 minutes. Isn't there a way to use the 5,000 records in a pre-SQL statement so I don't have to wait this long? I've been looking at macros and pre-SQL statements, but nothing is coming together for me, and I have a ton of these scenarios. If that's not the right approach, what am I missing? Does anyone have some simple examples I can download?
Thanks in advance.
Kristin
Is there a single field used for the join? One way to go about this is to query only the records from the database whose join-field value appears in the spreadsheet: create a list of the distinct values from the spreadsheet and use it to update a WHERE clause in the query. If you have more than 1,000 unique values to put in the WHERE clause, a couple of extra tools can break the query into a few batches for a Dynamic Input tool to execute.
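The batching idea above can be sketched outside Alteryx as well. Here is a minimal Python illustration of splitting the spreadsheet's distinct join values into chunks of at most 1,000 and building one WHERE ... IN query per chunk, the way a Dynamic Input tool would execute them. The table and column names are hypothetical placeholders.

```python
def batched_queries(values, table="big_table", column="customer_id", batch_size=1000):
    """Return one SELECT statement per batch of distinct join values."""
    distinct = sorted(set(values))          # de-duplicate the spreadsheet values
    queries = []
    for i in range(0, len(distinct), batch_size):
        chunk = distinct[i:i + batch_size]
        in_list = ", ".join(str(v) for v in chunk)
        queries.append(f"SELECT * FROM {table} WHERE {column} IN ({in_list})")
    return queries

# e.g. 2,500 distinct values produce 3 queries of at most 1,000 values each
queries = batched_queries(range(2500))
```

Each resulting query stays under the 1,000-value limit that some databases impose on IN lists, and the batch of queries returns only the rows that can actually join.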
The In-Database tools can also be useful here in a similar process. Create the unique list of join-field values from the spreadsheet, then use the Data Stream In tool to include that list in your In-Database process so it returns only the necessary records.
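What Data Stream In does under the hood is roughly: push the small key list into the database as a temporary table, then let the database perform the join so only matching rows come back. A rough analogue of that pattern, sketched with SQLite and illustrative table/column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-in for the large (30M+ row) database table
cur.execute("CREATE TABLE big_table (customer_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO big_table VALUES (?, ?)",
                [(i, i * 1.5) for i in range(10000)])

# "Stream in" the spreadsheet's join keys as a temporary table
spreadsheet_keys = [(42,), (999,), (123,)]
cur.execute("CREATE TEMP TABLE keys (customer_id INTEGER)")
cur.executemany("INSERT INTO keys VALUES (?)", spreadsheet_keys)

# The join runs inside the database; only matching rows are returned
rows = cur.execute(
    "SELECT b.customer_id, b.amount FROM big_table b "
    "JOIN keys k ON b.customer_id = k.customer_id "
    "ORDER BY b.customer_id").fetchall()
# rows == [(42, 63.0), (123, 184.5), (999, 1498.5)]
```

The point is that the filtering happens on the database side, so the 30M rows never travel across the network, only the handful that match the spreadsheet keys.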
Hey Charlie, thanks for the quick reply.
OK, I like the idea of connecting the Data Stream In. I've used the In-DB feature before, but starting with the Connect In-DB tool. I tried to connect the Data Stream In tool to the Connect In-DB tool, but they don't connect. This is where I want the 5,000 records to talk to the database, right?
Yep, I built this while I was waiting for your answer. I used the join and it now runs in 6 minutes. WAY better than before. Thanks!!!