Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

How to find a word to word matching percentage in 2 strings?

Nikita912
5 - Atom

I have 2 columns with 2 strings as below:

 

Col1                                  Col2                                       Match %

I am fine                            I am fine                                100

Great Wall of China          The great wall of China          80

 

In the first case, all the words match and therefore the %age is 100 however, in the 2nd case 4 out of 5 words match and the %age is 80. Col 1 can be a substring of Col2 and vice versa and there can be cases where strings are not sub string of each other/do not contain each other. The match is case insensitive.

 

Thanks in Advance!

3 REPLIES 3
binuacs
20 - Arcturus

@Nikita912  one way of doing this

 

The logic is to transpose the Col1 and Col2 then split it into words and find the length of each words. Group based on record ID and find the sum of the word length and calculate the percentage.

 

Note: The percentage I used the Round([field],10), so that it will round the result to nearest 10

 

binuacs_0-1649934948133.png

 

binuacs
20 - Arcturus

@Nikita912 Just noticed that the requirement is exact word match not length match. My solution only looking for the word length match,  I guess which is not what you are  looking for?

Nikita912
5 - Atom

I've a huge dataset and applying this logic might be a little challenging, is there any other way of doing it?

Labels