Alteryx Designer Desktop Discussions

SOLVED

Workflow freezes at last join

robinsonm6
5 - Atom

I have a workflow that processes 2.3 GB of data (3.5M records). When it reaches the last join, which enhances the dataset with a single column, it freezes at roughly 50%-54%.

 

I removed that join from the workflow, and the freeze then happens at the closest join upstream.

 

I have tried replacing the Join tools with new ones, as well as substituting a Find & Replace tool for the join.

 

This workflow worked fine for last month's processing but now refuses to complete. Based on previous runs, the join does not appear to be producing a Cartesian join, so I am at a loss on how to troubleshoot (v2020.2).

 

robinsonm6_1-1610384303751.png

 

5 REPLIES
Emil_Kos
17 - Castor

Hi @robinsonm6,


You are most likely creating tons of duplicates. Use the Sample tool on the key and you will quickly be able to check whether that is the case.


Another very common cause is a null or empty value in the key column.
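
To illustrate why duplicate or blank keys matter, here is a minimal Python sketch (outside Alteryx, not the Join tool's actual implementation) that estimates how many rows a join on a key would produce. The function names and sample keys are hypothetical, and it assumes blank keys match other blank keys.

```python
from collections import Counter

def estimate_join_rows(left_keys, right_keys):
    """Estimate the row count of an inner join on the given key values."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each shared key contributes (left occurrences * right occurrences) rows.
    return sum(n * right_counts[k] for k, n in left_counts.items() if k in right_counts)

def worst_keys(left_keys, right_keys, top=5):
    """Return the keys that contribute the most joined rows."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    fan_out = {k: n * right_counts[k] for k, n in left_counts.items() if k in right_counts}
    return sorted(fan_out.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Hypothetical keys: three blanks on the left, two blanks on the right.
left = ["A", "B", "", "", "", "C"]
right = ["A", "", "", "C", "C"]

print(estimate_join_rows(left, right))  # 9 rows from only 6 left records
print(worst_keys(left, right, top=2))   # [('', 6), ('C', 2)]
```

If the estimate is far larger than the input record count, the keys listed by `worst_keys` are the ones to inspect first.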

 

Please see the post below for a more detailed explanation:

 

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Join-Tool-generates-duplicate-values/m...

robinsonm6
5 - Atom

Thanks for the quick reply. I do handle duplicates: I apply the Unique tool to a UID field after the J output, then union the L and J outputs.

 

The oddest part, as you can see below, is that after I connect another Join tool downstream (so that it is now the last one in the stream), the freeze happens at that join.

Emil_Kos
17 - Castor

Hi @robinsonm6,

 

I have seen several issues like this, and they were always due to duplication in the joins.


If that is the case I don't have any other tip for you.

pedrodrfaria
13 - Pulsar

Hi @robinsonm6 

 

When it comes to the Join tool, these are the usual issues:

 

- Duplicate values: make sure you have unique primary keys.

- Null and blank values in the key, which can create many duplicate matches.

- Join by nature is a slower tool than the Find & Replace. You should always use it if you can when dealing with large datasets.

 

These are some examples of things to watch out for, offered for educational purposes.

 

Pedro.

 

robinsonm6
5 - Atom

After trying and checking all the suggestions, it turns out the problem was the Output tool. Writing 3.5M records to the server was bogging the workflow down, which made it look like a problem with the last join. I ended up writing to a local database and then copying the table using SQL. Thank you for all your assistance!
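
For anyone who wants to see the general pattern, here is a rough Python sketch of "stage locally, then copy with SQL". It is only an illustration: the database used in this thread is not stated, so SQLite from the Python standard library stands in for it, and the table names, columns, and batch size are hypothetical.

```python
import sqlite3

def stage_then_copy(rows, batch_size=50_000):
    """Load rows into a staging table in batches, then copy them with one SQL statement."""
    # An in-memory database stands in for the local staging database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (uid INTEGER PRIMARY KEY, value TEXT)")

    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            # Insert one batch at a time instead of row by row.
            conn.executemany("INSERT INTO staging VALUES (?, ?)", batch)
            batch.clear()
    if batch:
        conn.executemany("INSERT INTO staging VALUES (?, ?)", batch)
    conn.commit()

    # Set-based copy: the database moves the data itself, with no per-row round trips.
    conn.execute("CREATE TABLE target_table AS SELECT * FROM staging")
    conn.commit()
    return conn

conn = stage_then_copy((i, f"row {i}") for i in range(120_000))
count = conn.execute("SELECT COUNT(*) FROM target_table").fetchone()[0]
print(count)
```

The point of the design is that the slow per-record write happens against a fast local target, and the final move is a single statement executed by the database engine.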
