Can anyone explain what's going on here? I have a 24 GB file with 31M lines joining to a 15 MB file with 51k lines. You can see from the stats that the join is blowing up.
Hi @Watermark ,
That's a common issue with a join when both the L and R inputs have duplicate records. There are many posts in the community that address this.
The most common solution is either to put a Unique/Summarize tool before your join or to increase the number of fields you are joining on. Also, if you regularly work with that many records, I'd suggest exploring the Calgary tool palette: it indexes your database and your workflow will run much faster.
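If it helps to see the arithmetic outside of Alteryx, here's a minimal pandas sketch (column names invented) of how duplicate keys multiply rows, and how deduplicating one side before the join fixes it:

```python
import pandas as pd

# Left input: 4 rows sharing one URL; right input: 3 rows sharing the same URL.
left = pd.DataFrame({"URL": ["a.com"] * 4, "left_val": range(4)})
right = pd.DataFrame({"URL": ["a.com"] * 3, "right_val": range(3)})

# An inner join matches every left row with every right row that shares
# the key, so 4 x 3 = 12 output rows -- the "blow up".
joined = left.merge(right, on="URL")
print(len(joined))  # 12

# Deduplicating one side before the join (the Unique/Summarize step)
# restores one output row per left row: 4 x 1 = 4.
fixed = left.merge(right.drop_duplicates("URL"), on="URL")
print(len(fixed))  # 4
```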
Hope that helps
Angelos
Angelos,
It's a simple CSV joining to a spreadsheet, and there's only one field to join on: the URL. I'm going to go look at the two links you posted.
Hi @Watermark,
It's also worth mentioning that empty or null columns will create thousands of duplicates too, so keep that in mind whenever you use the Join tool.
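Here's a small pandas sketch of that effect (values invented): blank keys match every other blank key, so they multiply just like real duplicates:

```python
import pandas as pd

# 1,000 left rows and 50 right rows all carry an empty-string URL.
left = pd.DataFrame({"URL": [""] * 1000, "left_val": range(1000)})
right = pd.DataFrame({"URL": [""] * 50, "right_val": range(50)})

# Every blank key matches every other blank key: 1,000 x 50 = 50,000
# output rows from records that arguably should not join at all.
joined = left.merge(right, on="URL")
print(len(joined))  # 50000

# Filtering out blank keys before the join avoids the blow-up.
clean = left[left["URL"].str.len() > 0].merge(
    right[right["URL"].str.len() > 0], on="URL"
)
print(len(clean))  # 0
```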
Are you expecting there to be only one row per URL?
If that's not the case, then you may need to investigate the data to understand which other data elements are causing the URLs to appear on multiple rows. Perhaps a filter needs to be applied to the data, or you can pare down the number of columns and follow @AngelosPachis's suggestion of using the Summarize tool to remove duplicates (a quick sketch of that check is below).
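As a hypothetical sketch of that investigation step (data invented; in Alteryx you'd do this with a Summarize tool grouping by URL with a Count, then a Filter on Count > 1):

```python
import pandas as pd

# Stand-in for the smaller spreadsheet side (values invented).
lookup = pd.DataFrame({
    "URL": ["a.com", "a.com", "b.com", None, None, "c.com"],
    "category": ["x", "y", "x", "z", "z", "y"],
})

# Rows per URL, nulls included -- anything above 1 will multiply
# matching rows in the join.
counts = lookup["URL"].value_counts(dropna=False)
print(counts[counts > 1])  # a.com: 2, NaN: 2

# Drop blank/null keys, then keep one row per URL (the Unique-tool step).
deduped = lookup.dropna(subset=["URL"]).drop_duplicates(subset="URL")
print(deduped)
```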
Yep, an enormous number of duplicates (not expected, lesson learned), as well as a hefty chunk of nulls (also not expected). Thanks for the help.