Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Write data in-db tool extremely slow?

mst3k
11 - Bolide

I have a workflow here which is pretty simple, I'm joining an input data file to 1 database query, anything that doesn't match is joined to another database query instead, then I bring it back together and write to a table.

 

This is all occurring in-db - nothing is streaming back down to my computer that would make it slow.

I am also restricting to just the top 1000 records in the first query. So only 1000 records pass through this whole workflow.

 

With only 1000 records, the workflow runs in 4 seconds... until I add a write data in-db tool at the end. It's literally just putting those 1000 records into a table, and not even streaming them from my computer since it's all in-db. That extra write in-db tool takes the run time from 4 seconds to 2-3 minutes. 2-3 minutes to just write 1000 rows into a table using records that are all still in-db? My full starting dataset has 1.5 million records. It would take months to run the whole thing at that pace?

 

mst3k_0-1633620348787.png

 

mst3k_0-1633620862852.png

 

2 REPLIES 2
mst3k
11 - Bolide

hmm I think I partially figured it out, without the write in-db tool there, i don't think it's actually executing anything. it realizes there's no "end goal" and just goes through the motions. 

if i use a browse in-db tool instead of a write in db, it still takes 2 minutes, so i assume the query itself is finally executing and taking the extra time.

 

what's the proper way to use in-db tools to look up a list of data against a database? those extra connect in-db tools that it's joining to are not individual tables, they are queries. i want to take my starting records and look up a bunch of things. so i put a simple query in those other connection tools, that joins 4 or 5 tables. then i used the join in-db to connect my table to it.

 

am i not using the indexes by doing it this way? does it try to execute the whole query in the connect-in-db tool first, before doing the join? do i have to put each table i want into a separate connect in-db tool and do all the joins using alteryx join tools rather than just putting a single query into the connect in-db?

if i do this entire exercise in SSMS it only takes a few seconds. 

 

cmcclellan
14 - Magnetar

When you say SSMS I assume you're using SQL Server ? - that's important as well 

 

For every inDB tool there is (usually) a temp table generated, so your first join is a temp table, then the filter is (possibly?) a query on that table, then the Select could be another query/part of the filter query/another temp table (you would need to see what's happening in the log file or in the database to tell the difference).  But you're right, the REAL processing happens when you are writing or browsing the data.

 

BUT having said all that, try it on a million records, I don't think the processing time will be as linear as you think, it might do a million records in 5-6 mins 

Labels
Top Solution Authors