Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Output Data Tool Takes Hours to Complete

cpet13
10 - Fireball

I have a workflow that reads data from one database and writes it to another; it used to run in about 30 minutes. I added another data source, joined the two sources together, and added about three columns from the new source to the original one. After making that change, the workflow now takes over 8 hours to run.

 

When I run performance profiling on the workflow, it shows that the Output Data tool accounts for 99.5% of the run time; all of the other tools finish in a few minutes. In the Output Data tool I have tried both Delete Data & Append and Overwrite Table, but there is very little difference in performance between the two.

 

What is going on? How can I make the output data tool not take so long? Why would the workflow now be taking so long to run?

apathetichell
19 - Altair

There's like a hundred different reasons why this could happen.

 

1) Is AMP on?

2) Are you using In-DB tools or on-canvas tools?

3) Can you toggle off your Output Data tool and add a Count instead? How many records are you uploading (i.e., data quantity)?

4) In Resource Monitor, is the workflow maxing out your RAM?

5) Is there a new column you are adding that contains a character the downstream database doesn't accept (e.g., a space)?

6) Have you made changes to your connections in the 64-bit ODBC administrator?
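On point 5, here's a minimal sketch of the kind of cleanup that sidesteps incompatible characters in column names. The helper and the sample names below are hypothetical, not anything from the original workflow:

```python
import re

# Hypothetical helper: replace characters that some databases reject
# in column names (spaces, punctuation) with underscores, and trim
# any leftover leading/trailing underscores.
def sanitize(name: str) -> str:
    return re.sub(r"\W+", "_", name).strip("_")

cols = ["Order ID", "Unit Price ($)", "region"]
print([sanitize(c) for c in cols])  # ['Order_ID', 'Unit_Price', 'region']
```

In Alteryx terms, the same effect comes from renaming fields in a Select tool before the Output Data tool.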

 

There are probably a few more, but the usual culprit is a non-distinct join accidentally exploding your data size.
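To illustrate that last point, here's a small pandas sketch (the tables and column names are made up): when the join key isn't unique on one side, every match multiplies rows, so the record count going into the output can balloon without any tool "looking" wrong.

```python
import pandas as pd

# Hypothetical example: three orders joined to a lookup table that
# accidentally carries duplicate keys on the right side.
orders = pd.DataFrame({"id": [1, 2, 3], "amount": [10, 20, 30]})
lookup = pd.DataFrame({"id": [1, 1, 2, 2, 3, 3],
                       "region": ["A", "B", "A", "B", "A", "B"]})

# Each order matches two lookup rows, so the join doubles the data.
joined = orders.merge(lookup, on="id")
print(len(orders), len(joined))  # 3 6
```

That's why suggestion 3 (swap the output for a count) is worth doing: it tells you immediately whether the row count changed after the new join.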

cpet13
10 - Fireball

Hi @apathetichell , thank you for your reply.

 

1) AMP is on

2) I am not using In-DB tools

3) I have toggled off the output tool. There are about 1.11 million records, which is what the query was returning before (I made sure that the join wouldn't add duplicate records).

4) While the workflow is running, memory usage jumps from about 85% to 94%.

5) The columns I have added contain either null or decimal values.

6) I have made no changes to the data connection.

cpet13
10 - Fireball

As an update, I think I have found the solution. A few of the fields were being cast to strings of enormous length (677K characters, or something like that). I resized those fields, along with the others, to more sensible lengths, and that cut the run time to about 20 minutes.
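Outside Alteryx, the same sanity check can be sketched in a few lines of pandas (the DataFrame and column names below are invented for illustration): measure the actual maximum string length per column, then size the output fields to roughly that instead of accepting an inflated default.

```python
import pandas as pd

# Hypothetical data: the declared output size for these columns could
# be far larger than what the values actually need.
df = pd.DataFrame({
    "code": ["A1", "B22", "C333"],
    "note": ["ok", "pending", "done"],
})

# Actual max length per string column -> a sensible field size
max_lens = {col: int(df[col].str.len().max()) for col in df.columns}
print(max_lens)  # {'code': 4, 'note': 7}
```

In the workflow itself, a Select tool (or the Auto Field tool mentioned below in this thread) achieves the same thing: shrink each string field to a realistic size before the Output Data tool.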

apathetichell
19 - Altair

Yup - I skipped the usual "throw in an Auto Field tool before the Output Data tool" suggestion, which can also solve some of these issues.
