Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Find and Replace is more performant than Join Assertion

AnnArborThom
7 - Meteor

I attended an Alteryx Tips and Tricks session and this years Inspire conference and the "ACE" asserted that:

 

  • In some situations a find and replace can be used instead of a join
  • And when this is done it is faster by a factor of a 100

 

I am having trouble understanding what cases a find and replace can be used instead of a join. 

 

The ACE did state this can be done in only certain circumstances. 

 

Can anyone explain the use case in which this is possible? 

3 REPLIES 3
MarqueeCrew
20 - Arcturus
20 - Arcturus

@AnnArborThom,

 

Suppose you have a large file, maybe 10 Million records, maybe 150 Million records.  The weight of each record is a factor of the number of fields and size of each field.  Now suppose that one of those fields is a coded value with domains of [A-Z].  If you wanted to include the description of each value and you were going to join the data to a metadata table, then Alteryx would sort each of the files into domain orders (A-Z).

 

That is going to take much more time than it would to do a quick lookup into a table of 26 values and put the right description onto the records (no sort required).

 

Find Replace allows for roughly 32,000 values to be put into memory for lookup (lookup allows for partial match and case insensitive match too).  You can replace the value or append other values to the existing record.  The use of Find & Replace is going to save you lots of time in this case, not to mention that it will allow all records on output without having to use a union for unmatched records.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Joe_Mako
12 - Quasar

Does page 43 in the attached Tips and Tricks document help?

 

FROM THE TIP MEISTERS:
ACE Marquee Crew’s Recommendations for Join Vs Find and Replace:
Join is AMAZING if Left and Right inputs have the same record
count and order. Suppose that you’ve used a select tool and do
some functions to create new columns of data that you want to join
back to the original data. You can use the join and use the “Join by Record Position”
configuration (radio button). You can’t get any faster than that. Your data is NOT sorted.
If you are joining a Large set of data to a small set of data (Large is large and small is <
32,000 rows) the Find Replace offers you the ability to put the small data into memory
and avoid the expensive sort. Imagine with 138 million records using a TILE tool and
then looking up eight (8) length descriptors from a text input tool. The JOIN version runs
in 3:40 minutes and the FIND REPLACE version runs in 3:09 minutes.
FIND REPLACE also allows for case insensitive and partial matching. If you’re ever
MarqueeCrew
20 - Arcturus
20 - Arcturus

Hey @Joe_Mako!

 

I thought that that tip sounded familiar :)

 

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Labels