I have a workflow where a Python script calls an API and outputs 800k rows (this part takes 30 minutes), and then writing those rows to the DB (SQL Server, via an Output Tool) takes another hour, so the total runtime is 1.5 hours. Is there a way to make this faster?
What are the data types of the fields you are writing? Try adding an Auto Field tool before the Output tool.
What's the underlying DB that you are using? Have you tried a bulk loader or the Data Stream In tool? There are lots of reasons why this could take a long time, but it's hard to pin down with the information you've given.
As a general rule, if you are not careful with your field sizes, insertion will be slow. This is why the Auto Field tool may help: it sizes each field to the smallest appropriate type and length. If you already know your data sizes, you can use a Select tool instead and skip the Auto Field tool. Also, inserting blob data tends to be slow in my experience. I agree that a bulk load should be a lot faster, as @apathetichell mentioned.
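To illustrate what the Auto Field tool is doing for string fields, here is a minimal sketch (my own helper, not an Alteryx API) that scans sample rows and suggests the smallest VARCHAR size per column, which you could then apply manually in a Select tool:

```python
def suggest_field_sizes(rows, names):
    # Mimic what Auto Field does for strings: find the longest
    # value in each column so the field can be sized to the
    # smallest VARCHAR that fits, instead of an oversized default.
    widths = [0] * len(names)
    for row in rows:
        for i, val in enumerate(row):
            widths[i] = max(widths[i], len(str(val)))
    return {name: f"VARCHAR({w})" for name, w in zip(names, widths)}

sample = [("alice", "NY"), ("bob", "CA"), ("charlotte", "TX")]
print(suggest_field_sizes(sample, ["name", "state"]))
# {'name': 'VARCHAR(9)', 'state': 'VARCHAR(2)'}
```

Scanning all 800k rows this way is cheap compared to an hour-long write, and right-sized fields reduce the bytes SQL Server has to move per row.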
Hello @avanibhatnagar
-check your field types/lengths
-by default Alteryx uses an INSERT command, which isn't fast to say the least; however, you can usually also play with the number of rows inserted per batch.
-on some DBs you can do a "bulk load" using the In-DB tools, which can be much faster (check the write options of your In-DB connection)
-you can also do your own bulk load by generating a zip/gz file and then running a command to load it into the database
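The last option above can be sketched in Python. This is just an illustration under assumed names (SERVERNAME, DBNAME, MyTable are placeholders): stage the rows as a gzip-compressed CSV, then build a `bcp` command to bulk-load the decompressed file on the SQL Server side. The command is printed here rather than executed:

```python
import csv
import gzip

def stage_rows(rows, path):
    # Write the rows to a gzip-compressed CSV staging file
    # (much smaller to move to the DB host than a raw CSV).
    with gzip.open(path, "wt", newline="") as f:
        csv.writer(f).writerows(rows)
    return path

def bcp_command(table, data_file, server, db):
    # Build the bcp bulk-load command for the (decompressed) file:
    # -c = character mode, -t, = comma field terminator,
    # -T = Windows integrated authentication.
    return f"bcp {db}.dbo.{table} in {data_file} -S {server} -T -c -t,"

staged = stage_rows([(1, "alpha"), (2, "beta")], "staging.csv.gz")
print(bcp_command("MyTable", "staging.csv", "SERVERNAME", "DBNAME"))
```

Note that `bcp` reads plain files, so the .gz would be decompressed before loading; the compression only helps with the transfer.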
Best regards,
Simon
Thank you!! It is a lot faster with the bulk load, but now I can't get it to run on gallery. I have never saved a Data Connection with Bulk Load before - what am I doing wrong?
This is what I tried saving on Alteryx Server in Data Connections in Admin:
ssvb:DSN=NAME_OF_DSN;Server=SERVERNAME;Database=DBNAME;UID=username;PWD=__EncPwd1__|||TableName
I tried some variations of this and it does not work. Is there a particular syntax for the Data Connection string in Server Admin?
@avanibhatnagar Well, you have Read and Write tabs: you must select the bulk load option (it can have a different name, such as HDFS for Hive) in the Write tab of your In-DB connection alias window, and use a classic ODBC string, like the one you showed, for the Read tab.
Do not hesitate to show some screenshot.
Best regards,
Simon
These are screenshots of what I have in the Output tool, the error I get on Gallery, how the bulk connection is set up in the Gallery, and the DSN referenced in that connection (the DSN I reference in the bulk connection setup matches exactly the connection I have set up in the Gallery).
Your database connection is not set up on your server. The DSN must be configured in the 64-bit ODBC administrator and the driver must be installed.
The DB is set up on the server. I can connect to it fine with a non-bulk connection; it only fails when I use the bulk connection. I have two different connections: one regular connection to the DB, which works, and another bulk connection to the same DB.