Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Alteryx is here to help you solve your biggest data challenges. Read about the new Virtual Solution Center here.
SOLVED

Question on how Find Replace tool works when both "R" and "F" streams have some duplicates

Highlighted
8 - Asteroid

I have two datastreams. First looks like this:

 

Apple

Apple

Peach

Watermelon

Banana

Apple

Apple

 

The second looks like 

 

Apple

Peach

Apple

Banana

Watermelon

 

My point being.. The first dataset is smaller, the second is bigger. I want to append ALL of the items from the first set, to the second. But some of the items in the first set are identical. Will find replace only take the first record of apple from set 1, to append next to both records of apples in set 2? Or once a record has been matched it looks for the next record to match, even if the first record could theoretically match too? The behavior I need is the latter.

 

Basically even though there are "duplicates" in set 1 and set 2, I want the find/replace to treat each repeated record as unique. I know a simple solution would be to append a unique identifier to each record that is repeated (in each stream) before the "find replace", but I'm not sure how. Any ideas? Do I simply use the unique tool and then add an incremental number to "D" output of the tool and then append them again? So it would look like Apple Apple1 Apple2 Apple3?

 

Basically the join/replace tool I'm looking for should make the result should look like this:

 

Apple            Apple (1st apple from Set 1)

Peach          Peach (Peach from Set 1)

Apple          Apple (2nd apple from Set 1)

Banana        Banana (Banana from Set 1)

Watermelon   Watermelon (Watermelon from Set 1)

 

The last 2 apples from Set 1 should remain unmatched.

Highlighted
Alteryx Certified Partner

This one..

 

Will find replace only take the first record of apple from set 1, to append next to both records of apples in set 2

 

Imagine F&R as a vlookup

Highlighted
8 - Asteroid

@LordNeilLord Thanks. So the solution is to in fact add a unique identifier to each repeating record before going into Find Replace.

Highlighted
7 - Meteor

If I understand the result you are trying to get correctly, then a join may be what you are after.

 

For each Apple in your second dataset you would end up with 4 records when you join on Apple. If you then need to replace a different value in the dataset, you can either do that in the join tool through the select part or use a formula afterwards.

Highlighted
8 - Asteroid

@Lbunce The result needs to look like this:

 

Apple            Apple (1st apple from Set 1)

Peach          Peach (Peach from Set 1)

Apple          Apple (2nd apple from Set 1)

Banana        Banana (Banana from Set 1)

Watermelon   Watermelon (Watermelon from Set 1)

 

The last 2 apples from Set 1 should remain unmatched.

7 - Meteor

In that case you need to use identifiers as you mentioned before. I don't think just adding ID's is enough as I assume you would want this to scale when you add more data.

 

I've attached a workflow which I believe will solve the problem, it sorts both datasets into the order in which the item like Apple appears and then adds a number to say which occurrence of that word it is and joins based on that, after the join you can remove fields that aren't needed and change things but it gives you the result you need. This should also work when you add more rows into either dataset.

 

FInd_Replace_Example.PNG

Highlighted
17 - Castor
17 - Castor

hi @AkisM 

 

You might want to try something like this

 

w.png

 

Inputs 1 and 2 contain the data from your first post.  For both of these, add a record ID(InputxID) and then sort by item name and InputxID.  The Multi-row tool adds an ItemOrder based on record position within each group, i.e. Apple 1, Apple 2, Apple x,  Banana 1, Banana 2, etc.  Join on Item name and ItemOrder, so Apple 1(R) is joined with Apple 1(L).  Sort by Input 2, giving you

 

r.png

 

The 1st 3 columns are only the records from Input 1 that match the corresponding records in input 2 by position.

 

Edit: @Lbunce Hey look.  Twins

 

Dan

 

 

Labels