Hi There,
I have two files:
1. User List File: This contains a list of users, their subsciption plan names, plan types, purchase dates and the price they paid.
2. Plan List File: This contains details of all plans, including the plan type, plan name, price and the date range during which the price was valid.
WHAT I WANT TO DO:
I want to check if the price a user paid (from the User List File) matches the correct price from the Plan List File, based on:
CHALLENGES FACING:
The plan names in both files are not exactly same. Some names are slightly different and some only share keywords. I need a way to fuzzy match or find partial matches between plan names
I need to check if the user's purchase date falls within the correct date range in the Plan List for that plan
There are two plan types and the same plan name in each type can have different prices for the same date range. So I need to ensure the plan type is matched exactly when comparing.
I have attached a workflow with sample data for your reference. I really appreciate the effort.
Thanks in advance.
Not a complete solution - but without using Fuzzy Match, I was able to use the conditions you mentioned to get 3 of the 4 rows mapped.
I would say that to do this is more art than science. If you can somehow clean your reference list even further as a one time exercise, then the mapping organization becomes easier to manage. Perhaps you can even consider LLMs to help clean it for you further.
Otherwise if you map date first, you might get a lot of false starts, which then leads to erroneous mapping.
That's all I can at this point because the last scenario I cannot account for is having totally unique keys in the same plan name like:
It gets messy because both Unlimited & Entertainment can be seen to map with PlanID 4 as it has both items, but because it has to be contained, and counted against, this also means that there can be multiple scenarios where the matches can be of same count, then filtering it out will lead you to having more rows than you started.
Since I don't have full access to your data, I don't know if this is truly the case.
Hi @caltang,
Thanks for your prompt response.
As you rightly pointed out, it would indeed have been ideal to clean the Plan List File. However, due to the limitations of the tools and softwares approved within our organization, this is currently the best we can accomplish.
I checked the workflow, and if I understand currectly, the current logic works for the entries with identical prices in both User List and Plan List files. However, in practice, there may be discrepancies between the prices in these two files.
Apologies if my earlier message was unclear. My primary objective is to compare the prices in both files and calculate any differences.
I believe using a fuzzy match tool might be a more suitable approach for this task, though I trust your judgment on what would work best.
Once again, appreciate your support and assistance.
User | Count |
---|---|
107 | |
82 | |
70 | |
54 | |
40 |