Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

String/Name match formula/algorithm | Finding substring in a string

Miglani
7 - Meteor

Hi all,

I'm trying to build a string/name match formula/algorithm that gives me a match between strings such as these:

 

(Example for reference)

 

String 1 : A | MndALv5 | D | EFJFJ

String 2: MndALv5C

 

MndALv5 is a subset of MndALv5C, therefore it should give me a match. But it doesn't with this formula that I tried:

 

 

 

 

IF contains([String1],[String2]) OR contains([String2],[String1])
 THEN "Match"
 ELSE "No Match"
ENDIF

 

 

 

 

NOTE1: Fuzzy match is not an option because the population of strings is too big to consider making adjustments to fuzzy logic

 

NOTE2: Doing split to rows by "|" and then checking although is possible but some strings have 50-100 pipes per user and there are 1000 of users in the file, so this approach compromises on the speed of execution.

Looking forward to your response(s).

Thanks.

7 REPLIES 7
Aidan_K
11 - Bolide

Hi @Miglani ,

are you trying to find matches for "MndALv5" in a column of data with various strings?

If so, this would work.

(where the data is in field "String")

--

IF contains([String],"MndALv5")
THEN "Match"
ELSE "No Match"
ENDIF

--

 

An example is attached for clarity.

Regards,

Aidan

Miglani
7 - Meteor

Thanks for the reply.

It's not exactly substring because there is an extra "C" to it which is where the challenge stems from. Otherwise, it would have been straight forward.

But yes, I am searching the string is a column of strings where the column has values separated by pipes.

So unfortunately, this doesn't quite work because the Alteryx "Contains" only search from start and not from the middle.

Aidan_K
11 - Bolide

the workflow should work for the scenario described. 

Here is a new version with "MndALv5C" and "MndALv5c" and they are both flagged as match to "MndALv5" as required.

 

hopefully this is what you need 

Miglani
7 - Meteor

Enter the additional "C" in the input(second argument in contains) and remove it from the strings in the input column.
It will show as "No Match" when it should show as "Match".

Miglani_0-1611059151028.png

 

Aidan_K
11 - Bolide

So your actually looking to match both records?

I would use nested IF:-

IF Contains([String], "MndALv5")

THEN "match"

ELSEIF Contains([String], "MndALv5c")

THEN "Match"

ELSE "no match"

ENDIF

 

Attached workflow updated

Miglani
7 - Meteor

I just took that as an example (for reference).
My actual data has like 20,000+ more records.

This is not really what I'm looking for.

 

I'll try to explain it again:
I have 2 columns with a lot of values.


Column1               Column2
MndALv5C           A |  MndALv5 | B | C
No role                 General|CCP Agent|CCP Agent|General|Report Access


I want to match values of column 1 & column 2.
For example: Here in the first row, the result should show "match" and in second row it should show "no match".

Notice the extra C, which is why contains doesn't work. I already put the contains formula with my question.

Aidan_K
11 - Bolide

can column 1 record have any number of preceeding or following characters like

 

Column1                        Column2
MndALv5C                     A |  MndALv5 | B | C
ABMndALv5C                A |  MndALv5 | B | C
MndALv5DEF                A |  MndALv5 | B | C

abMndALv5CDEFGC    A |  MndALv5 | B | C

 

and can column 2 have the same ?

 

Column1                        Column2
MndALv5C                     A |  MndALv5de | B | C
ABMndALv5C                A |  BMndALv | B | C
MndALv5DEF                A |  MndALv5hbhj | B | C

abMndALv5CDEFGC    A |  abMndAL | B | C

 

if this is the case I would try to get the minimum number of characters required to make a successful "match" and go from there or "fuzzy match" but you've said this wont work?

 

You might be left with your mentioned option to parse the data on "|" and match that way, even if it may take longer, it will be more accurate

------------------------

edit -

 

Have a look at this post from which may help with the fuzzy match option.

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Quick-Question-String-Input-on-Fuzzy-M...

 

I did a quick test of Ben's workflow and I think you could use it as a starting point?

see attached.

Labels