I'm running a large workflow to compare two customer data sets and found that obvious matches are missed. When I matched those records separately, they did match. Apparently, enabling the AMP engine affects how data is fuzzy matched. The only difference between the records is the ZIP+4 value (US addresses).
Here is the data set (anonymized for this forum):
AHN | NAME | SOURCE | ROW_ID | FULL_ADDRESS |
ABCDEF0SA | MICHAL OSUCH DVM | 1 | 1-1 | 546 N EASTERN AVE, LAS VEGAS, 89101-3481, CLARK, NV |
ABCDEF0SA | MICHAL OSUCH DVM | 2 | 2-1 | 546 N EASTERN AVE, LAS VEGAS, 89101-3486, CLARK, NV |
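To confirm outside Alteryx that the two rows above really differ only in the ZIP+4 suffix, a minimal Python sketch like the following can help. It is not how the Fuzzy Match tool works internally; the helper name `strip_zip4` and the regular expression are assumptions made for illustration only.

```python
import re

# The two FULL_ADDRESS values from the rows above.
ADDR_1 = "546 N EASTERN AVE, LAS VEGAS, 89101-3481, CLARK, NV"
ADDR_2 = "546 N EASTERN AVE, LAS VEGAS, 89101-3486, CLARK, NV"

# Hypothetical helper: reduce a US ZIP+4 (12345-6789) to the 5-digit ZIP.
ZIP4_PATTERN = re.compile(r"\b(\d{5})-\d{4}\b")

def strip_zip4(address: str) -> str:
    """Return the address with any ZIP+4 suffix removed."""
    return ZIP4_PATTERN.sub(r"\1", address)

def differing_positions(a: str, b: str) -> list:
    """List the character positions where two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

if __name__ == "__main__":
    print("Raw addresses equal:       ", ADDR_1 == ADDR_2)
    print("Equal after stripping +4:  ", strip_zip4(ADDR_1) == strip_zip4(ADDR_2))
    print("Differing character offsets:", differing_positions(ADDR_1, ADDR_2))
```

If the addresses compare equal after the suffix is stripped, the ZIP+4 value is the only difference between the two records.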
The workflow is as simple as this:
The Fuzzy Match configuration is as follows:
Here's the funny part.
AMP disabled - 0 matches
AMP enabled - match found
Has anyone else experienced this?
thanks,
Michał
Thank you for using the AMP Engine and for providing feedback.
I encourage you to continue to report any use case issues that you find with running workflows with AMP Engine enabled. We worked hard to identify differences from the original Engine as well as provide guidance on how to better optimize workflows to run with AMP.
That is correct: results can differ between the original Engine and the AMP Engine with Fuzzy Matching.
https://help.alteryx.com/current/designer/alteryx-engine-and-amp-main-differences
Fuzzy Match
Fuzzy Match may have different results between the original engine and AMP. AMP records are matched using an alternative method. The order of match might be different and the output may be in reverse order as well.
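Because the match order and output order can differ between engines, a row-by-row comparison of the two outputs can be misleading. One possible approach is to export the matched ROW_ID pairs from each run and compare them as unordered sets. The sketch below assumes the pairs have already been exported; the function names and the second pair in the sample data are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]

def normalize_pairs(pairs: Iterable[Pair]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same match and ignore output order."""
    return {tuple(sorted(pair)) for pair in pairs}

def diff_matches(original: Iterable[Pair], amp: Iterable[Pair]) -> Dict[str, Set[Pair]]:
    """Split the matches into those found by one engine only and by both."""
    orig_set = normalize_pairs(original)
    amp_set = normalize_pairs(amp)
    return {
        "only_original": orig_set - amp_set,
        "only_amp": amp_set - orig_set,
        "both": orig_set & amp_set,
    }

# Sample data: "1-1"/"2-1" are the ROW_IDs from the thread;
# "1-7"/"2-9" are made-up IDs used only to show a difference.
original_pairs = [("1-1", "2-1")]
amp_pairs = [("2-1", "1-1"), ("1-7", "2-9")]

result = diff_matches(original_pairs, amp_pairs)

if __name__ == "__main__":
    for key, pairs in result.items():
        print(key, sorted(pairs))
```

With this comparison, a pair reported in reverse order by one engine counts as the same match, so only genuine differences remain in `only_original` and `only_amp`.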
In addition, I wanted to share some helpful links to other available documentation about AMP Engine:
https://help.alteryx.com/current/designer/alteryx-amp-engine
https://help.alteryx.com/current/designer/AMP-Memory-Use
https://help.alteryx.com/current/designer/tool-use-amp
https://community.alteryx.com/t5/Engine-Works/AMPlify-your-Workflows/ba-p/617590
AlterEverything Podcast: https://community.alteryx.com/t5/Alter-Everything-Podcast/66-The-Alteryx-AMP-Engine-Explained/ba-p/5...
Hi @TonyaS
I'd consider it a "feature" if the match rate merely differed, but not when one engine reports a 96% match and the other reports no match at all. So this is something for Alteryx to work on.
When I enabled AMP for the full workflow, one of the large yxdb source files loaded no records and gave no error message. The output is just empty.
Michal
I did a bit more investigation here, and the more I try to use AMP, the more questionable it gets.
An example is two records with exactly the same name and two different addresses:
14 SUNDEW RD, SAVANNAH, 31411, GA and 14 ROBIN WAY, BEAUFORT, 29907, SC
The Fuzzy Match settings are "company name" and "address".
With the legacy engine the two records do not get matched. With AMP they get a 100% match, but only when Fuzzy Match runs over some 250k other records. When I filter the data set down to just those 2 records, they do not get matched with AMP. It looks as if the results get completely mixed up with large data sets.
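A quick independent check can help flag pairs like this one in a large output. The sketch below uses Python's standard `difflib.SequenceMatcher` as a rough stand-in similarity measure; it is not the algorithm Alteryx uses, it looks at the address only, and the function name and thresholds (`min_reported`, `floor`) are arbitrary assumptions.

```python
from difflib import SequenceMatcher

# The two addresses from the example above.
ADDR_A = "14 SUNDEW RD, SAVANNAH, 31411, GA"
ADDR_B = "14 ROBIN WAY, BEAUFORT, 29907, SC"

def similarity(a: str, b: str) -> float:
    """Rough character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()

def is_suspicious(reported_score: float, addr_a: str, addr_b: str,
                  min_reported: float = 90.0, floor: float = 0.6) -> bool:
    """Flag a pair whose reported match score is high although the
    addresses look clearly different under an independent measure."""
    return reported_score >= min_reported and similarity(addr_a, addr_b) < floor

if __name__ == "__main__":
    print("Independent address similarity:", round(similarity(ADDR_A, ADDR_B), 3))
    print("Suspicious at reported 100%:  ", is_suspicious(100.0, ADDR_A, ADDR_B))
```

Running such a check over the full match output would show whether the mismatched pair is an isolated case or one of many.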
Michał