This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
My point being.. The first dataset is smaller, the second is bigger. I want to append ALL of the items from the first set, to the second. But some of the items in the first set are identical. Will find replace only take the first record of apple from set 1, to append next to both records of apples in set 2? Or once a record has been matched it looks for the next record to match, even if the first record could theoretically match too? The behavior I need is the latter.
Basically even though there are "duplicates" in set 1 and set 2, I want the find/replace to treat each repeated record as unique. I know a simple solution would be to append a unique identifier to each record that is repeated (in each stream) before the "find replace", but I'm not sure how. Any ideas? Do I simply use the unique tool and then add an incremental number to "D" output of the tool and then append them again? So it would look like Apple Apple1 Apple2 Apple3?
Basically the join/replace tool I'm looking for should make the result should look like this:
Apple Apple (1st apple from Set 1)
Peach Peach (Peach from Set 1)
Apple Apple (2nd apple from Set 1)
Banana Banana (Banana from Set 1)
Watermelon Watermelon (Watermelon from Set 1)
The last 2 apples from Set 1 should remain unmatched.
If I understand the result you are trying to get correctly, then a join may be what you are after.
For each Apple in your second dataset you would end up with 4 records when you join on Apple. If you then need to replace a different value in the dataset, you can either do that in the join tool through the select part or use a formula afterwards.
In that case you need to use identifiers as you mentioned before. I don't think just adding ID's is enough as I assume you would want this to scale when you add more data.
I've attached a workflow which I believe will solve the problem, it sorts both datasets into the order in which the item like Apple appears and then adds a number to say which occurrence of that word it is and joins based on that, after the join you can remove fields that aren't needed and change things but it gives you the result you need. This should also work when you add more rows into either dataset.
Inputs 1 and 2 contain the data from your first post. For both of these, add a record ID(InputxID) and then sort by item name and InputxID. The Multi-row tool adds an ItemOrder based on record position within each group, i.e. Apple 1, Apple 2, Apple x, Banana 1, Banana 2, etc. Join on Item name and ItemOrder, so Apple 1(R) is joined with Apple 1(L). Sort by Input 2, giving you
The 1st 3 columns are only the records from Input 1 that match the corresponding records in input 2 by position.