I'm working with a finite, known set of individual data strings similar to this:
| AAA |
| AAA AA |
| BBB |
| CCCCC |
| DDD DD |
The strings will come in groups into a single cell and can look like the below:
| Key | String 1 | String 2 | Match / No Match | Duplicate String |
| Johnson | AAA; AAA AA; BBB; CCCCC; DDD DD | AAA; AAA AA; BBB; CCCCC; DDD DD | Match | No |
| Brown | DDD DD; AAA AA; BBB | BBB; AAA AA; DDD DD | Match | No |
| Smith | AAA; CCCCC; DDD DD | AAA; DDD DD | No match | No |
| Jones | CCCCC | CCCCC; AAA | No match | No |
| Topeka | CCCCC; AAA AA; DDD DD | AAA; AAA AA; BBB; CCCCC; DDD DD | No match | No |
| Bob | BBB; CCCCC; BBB | BBB; CCCCC | No match | Yes |
The strings always come separated with a semi colon and space, and the last string will not have a semi colon.
I would like to know a way to either match strings if they are the same but out of order as with Brown, or reorganize the strings to a particular order (say alphabetically) so that Brown's String 1 & String 2 would match using a Join tool.
I'd also like to ID whether a cell has a duplicate string in it as with Bob.
Any ideas?
Hi @Rob48,
I came up with a workflow that does what you are asking.
Hope this helps! @me if you have any questions :)
Hi, @Rob48
FYI.
IIF(REGEX_CountMatches(',' + [String_1], '(,[^,]+)\1,') || REGEX_CountMatches(',' + [String_2], '(,[^,]+)\1,'), 'Y', Null())
Output:
| Key | Tile_Num | String_1 | String_2 | Match | Duplicate |
| Johnson | 1 | AAA,AAA AA,BBB,CCCCC,DDD DD | AAA,AAA AA,BBB,CCCCC,DDD DD | Y | |
| Brown | 2 | AAA AA,BBB,DDD DD | AAA AA,BBB,DDD DD | Y | |
| Smith | 3 | AAA,CCCCC,DDD DD | AAA,DDD DD | ||
| Jones | 4 | CCCCC | AAA,CCCCC | ||
| Topeka | 5 | AAA AA,CCCCC,DDD DD | AAA,AAA AA,BBB,CCCCC,DDD DD | ||
| Bob | 6 | BBB,BBB,CCCCC | BBB,CCCCC | Y |
@Rob48 - Not to take anything away from @flying008 who's answer works just fine, I was curious enough about your ask and used it to practice. I ended up with a solution that does not require Regex, if that's more your speed. Note that Regex certainly cuts down on the number of tools though!
Here's another option to what's already been provided:
@jmgross72's works great too, it just uses the Data Cleanse Pro tool that not everyone may have access to, depending on their version.
Hope this is helpful, -Jay
