Alteryx Designer Desktop Discussions

Fuzzy Match Score significant digits

sbrasfield1
5 - Atom

Greetings, how does one set the number of decimal places on a fuzzy match score? The match score comes back as a whole number, but I need more granularity: there are some "ties" that surely cannot be true mathematical ties, and I would like to know which one is really 60.4 and which is 60.6, for example. I've tried the Select tool after the Fuzzy Match tool, but the score only reads as 60.000. Thanks in advance!

3 REPLIES
alexnajm
17 - Castor

I don't think this is possible natively within Alteryx. In my experience, the match score only comes across as a whole number.

What level of precision are you hoping to get between 60.4 and 60.6?

sbrasfield1
5 - Atom

That's unfortunate, and surprising to be honest. Are you aware of any tricks to get around this limitation?

In my use case I need to compare ICD-10 diagnosis codes. There is a workflow in which specialists recommend exchanging one code for another, and I want Fuzzy Match to predict the replacement. It is working very well, except that I can't prevent certain duplicates.

For example, Alteryx says that both K21.9 and G62.9 are a 60% match for K66.0, so my results match both of them to K66.0 instead of only one (ideally K21.9), with G62.9 falling below the threshold and remaining unmatched. I am thinking that they can't both be exactly 60%, so if one of them is 60.x and the other 60.y, then hopefully I could get the resolution I'm looking for.
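One way to check whether the two scores can actually differ is to recompute them outside the tool, unrounded. The sketch below is a plain textbook Jaro similarity in Python. It is an assumption that the Fuzzy Match tool here is configured with a Jaro-style match function (the thread does not say which one is in use), and `jaro_similarity` is a hypothetical helper, not Alteryx's internal code.

```python
def jaro_similarity(a: str, b: str) -> float:
    """Textbook Jaro similarity in [0, 1]. Not Alteryx's own implementation."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters only count as matching if they sit within this distance.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    a_seq = [ch for ch, m in zip(a, a_matched) if m]
    b_seq = [ch for ch, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


for code in ("K21.9", "G62.9"):
    print(code, "vs K66.0:", f"{jaro_similarity(code, 'K66.0'):.6f}")
```

Under this formula both pairs share exactly two characters with K66.0 in matching positions (one letter or digit plus the dot), so both come out at 0.6. With five-character codes the score can only take a handful of distinct values, so if the tool does use something like Jaro, the tie may be genuine rather than a rounding artifact.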

alexnajm
17 - Castor

No tricks that I am aware of, unless you somehow rebuild the entire Fuzzy Match logic in a macro. Editing the XML doesn't seem to be an option either.

I would look at alternative ways to decide which record to keep when there are duplicates. The +/-0.5 difference that the rounding hides is probably less useful than deciding logic based on your use case!
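As a rough illustration of that kind of deciding logic, here is a minimal Python sketch that keeps one candidate per target code when the whole-number scores tie. The ranking rule is an assumption made up for this example (prefer the candidate that shares the longest leading prefix with the target, such as the same first letter), and `shared_prefix_len` and `pick_best` are hypothetical names. Whether that rule is clinically sensible is a judgment for the use case.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that two codes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_best(target: str, candidates: list[tuple[str, int]]) -> str:
    """Return the single candidate code to keep for `target`.

    Assumed ranking, adjust as needed:
      1. higher fuzzy match score,
      2. longer shared leading prefix with the target code,
      3. alphabetical order, so the result is deterministic.
    """
    def rank(item: tuple[str, int]) -> tuple[int, int, str]:
        code, score = item
        return (-score, -shared_prefix_len(code, target), code)

    return min(candidates, key=rank)[0]


# Candidate matches per target, with the whole-number scores from the thread.
matches = {"K66.0": [("G62.9", 60), ("K21.9", 60)]}
for target, candidates in matches.items():
    print(target, "->", pick_best(target, candidates))
```

With the scores tied at 60, the prefix rule keeps K21.9 for K66.0 because both start with K, and G62.9 is left unmatched. In a workflow, the same idea could live in a Python tool or be approximated with a Formula, Sort, and Unique combination after the Fuzzy Match.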
