
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Workflow writing 800k rows to DB takes too long

avanibhatnagar
8 - Asteroid

I have a workflow in which a Python script calls an API and outputs 800k rows; that part takes 30 minutes. Writing those rows to the database (SQL Server, via an Output tool) takes another hour, so the total run time is about 1.5 hours. Is there a way to make this faster?

9 REPLIES
TimN
13 - Pulsar

What are the data types of the fields you are writing? Try adding an Auto Field tool before the Output tool.

apathetichell
19 - Altair

What's the underlying DB you are using? Have you tried a bulk loader or the Data Stream In tool? There are lots of reasons this could take a long time, but it's hard to identify one with the information you've given.

TimN
13 - Pulsar

As a general rule, if you are not careful with your field sizes, data insertion will be slow; this is why the Auto Field tool may help. It sizes each field to the smallest appropriate type. If you already know the data sizes, you can use a Select tool instead and skip the Auto Field tool. Inserting blob data will also be slow, in my experience. I agree that a bulk load should be a lot faster, as @apathetichell mentioned.
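The field-sizing point can be illustrated outside Alteryx. Here is a small, hypothetical Python helper (not part of the original workflow) that scans rows and reports the widest string each column actually needs — roughly what the Auto Field tool does before it assigns the smallest appropriate field size:

```python
# Sketch of what the Auto Field tool does for string columns:
# scan all rows and keep the maximum observed width per column,
# so fields can be declared at a small size instead of a huge default.

def max_field_widths(rows):
    """Return {column: longest observed string length} for a list of dicts."""
    widths = {}
    for row in rows:
        for col, value in row.items():
            widths[col] = max(widths.get(col, 0), len(str(value)))
    return widths

sample = [
    {"id": "1", "name": "Alice", "city": "NY"},
    {"id": "42", "name": "Bob", "city": "Los Angeles"},
]
print(max_field_widths(sample))  # {'id': 2, 'name': 5, 'city': 11}
```

Oversized fields (e.g. a default-width string for a two-character code) inflate every row the database has to write, which is why right-sizing them speeds up inserts.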

simonaubert_bd
13 - Pulsar

Hello @avanibhatnagar 
-check your field types/lengths
-by default Alteryx uses an INSERT command, which isn't fast to say the least... however, you can usually also play with the number of rows inserted per batch
-on some databases you can do a "bulk load" with the In-DB tools, which can be much faster (check the write options of your In-DB connection)

-you can also do your own bulk load by generating a zip/gz file and then running a command to load it into the database
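The last suggestion above can be sketched in Python, since the workflow already runs a Python script. This is a minimal, hypothetical example: the CSV-writing part uses only the standard library, while the load step (shown commented out) assumes SQL Server's `bcp` utility, with server, database, and table names as placeholders — check the flags against your own environment before use:

```python
# Sketch of a "write a file, then bulk load it" approach for SQL Server.
# The server/database/table names below are placeholders, and the bcp
# invocation is an assumption to verify against your environment.

import csv
# import subprocess  # needed only if you actually invoke bcp below

def write_csv(rows, path):
    """Write a list of dicts to a comma-delimited file for bulk loading."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writerows(rows)  # no header row: bcp character mode expects data only

rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
write_csv(rows, "staging.csv")

# Hypothetical load step, run from the same script:
# subprocess.run(["bcp", "DBNAME.dbo.TableName", "in", "staging.csv",
#                 "-S", "SERVERNAME", "-T", "-c", "-t,"], check=True)
```

A file-based bulk load avoids the per-row round trips of individual INSERT statements, which is usually where the hour goes on 800k rows.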

Best regards,

Simon

avanibhatnagar
8 - Asteroid

Thank you!! It is a lot faster with the bulk load, but now I can't get it to run on Gallery. I have never saved a Data Connection with bulk load before - what am I doing wrong? 

This is what I tried saving on Alteryx Server in Data Connections in Admin: 
ssvb:DSN=NAME_OF_DSN;Server=SERVERNAME;Database=DBNAME;UID=username;PWD=__EncPwd1__|||TableName

I tried some variations of this and it does not work. Is there a specific syntax for the Data Connection string in Server Admin?

simonaubert_bd
13 - Pulsar

@avanibhatnagar Well, you have Read and Write tabs: you must use the bulk load option (it can have a distinct name, such as HDFS for Hive) in the Write tab of your In-DB connection alias window, and a classic ODBC string like the one you show in the Read tab.
Do not hesitate to share some screenshots.
Best regards,

Simon

avanibhatnagar
8 - Asteroid

@simonaubert_bd 

These are screenshots of what I have in the Output tool, the error I get on Gallery, how I have the bulk connection set up in the Gallery, and how the DSN is referenced in that connection (the DSN I reference in the bulk connection setup is exactly the one my regular Gallery connection uses).

apathetichell
19 - Altair

Your database is not set up on your Server. It must be set up in the 64-bit ODBC administrator, and the driver must be installed.

avanibhatnagar
8 - Asteroid

The DB is set up on the server. I can connect to it just fine with a non-bulk connection; it only fails when I use the bulk connection. I have two different connections: one regular connection to the DB, which works, and another to the same DB for bulk.
