I have a workflow that runs separate query defined in the 2 connect in-db tools. Each query runs on a different server. Then I use the data stream out tool so that I can union those two results. However, the workflow appears to take forever. When running it, it seems like it's pulling a lot more records than it should. In the screenshot, the top flow should pull about 35k records, while the bottom flow pulls about 2m records. Instead, the top flow is pulling 20m+ and then eventually gives me the 35k records I want. The bottom flow is pulling 100m+ (cancelled it after that). Why is it doing this and how can I optimise this? The workflow should eventually union the data and output that as an extract to tableau server as shown in the screenshot.
My first piece of advice is because you are not doing any In-DB transformations, you only need to use a regular Input Data tool instead of the Connect In-DB tool. My second suggestion is to take a look at the drivers that are being used to query the data. It may be worth trying OLEDB/ODBC/OCI and checking to see which runs the fastest. Also, make sure that you take a look at our data sources page for more driver information: https://help.alteryx.com/current/designer/data-sources
Can you elaborate on how the top flow is pulling 20M records but then gives you 35k? I'm not quite sure what you mean here. Have you executed the query independently of Alteryx to validate the number of records that you are expecting? Where are you getting the 20M number from?
Maybe I'm misreading but it streams like 1k a minute. If I run either query in SQL Server, it takes me two minutes tops.
The bottom flow should give me ~2m records, so 1k is far too slow. Please see newly attached.
edit: I also tried input data tool suggestion on a different tab. Getting the same performance issue
The 17MB is referencing the size of the data that has transferred. For Microsoft SQL Server I might suggest trying an OLEDB connection with the SQL Server Native Client 11 option. Sometimes this runs significantly faster relative to ODBC.
I'm trying to do OleDb method, and I've set it up for the top flow. However the bottom flow is failing. I tested connection and succeeded, but when I input the query and test it, I get an error. Error: : \HYT00 = 0 in Query....
Can you just do OLEDB for the top and regular odbc for the bottom? This depends on the type of database that you are trying to connect to
I think I fixed it by using the query in both the pre sql and the sql, It ran and completed in under 10 minutes.
You shouldn't need it in the PreSQL if you have the same query in your main SQL, but maybe something else was going on