Hi there community!
I have some financial data that I am trying to analyse, unfortunately I can't seem to find a way to combine the join and find replace tools in a way that will accomplish what I am trying to do. My current solution (see screenshot below) is very messy and manual whenever it needs to be updated.
My actual data I can't upload, but I have created a small sample dataset using different data to show what I am trying to achieve.
Data table;
| City | Shop Name |
| Berlin | KaufCentrum |
| Berlin | Max Werk NER |
| Berlin | EDEKA Markt |
| Berlin | LIDL |
| London | Billy Corner Store |
| London | Borough Market |
| London | Tescos Supermarket |
| London | Leasure Centre |
Reference Table;
| City | Text in Shop Name find | Shop Category |
| Berlin | Kauf | General |
| Berlin | ner | Hardware |
| Berlin | Markt | Supermarket |
| Berlin | Supermarket | Supermarket |
| London | Supermarket | Supermarket |
| London | Mark | Market |
| London | Corner | Convinence |
Desired Results Table ;
| City | Shop Name | Shop Category |
| Berlin | KaufCentrum | General |
| Berlin | Max Werk NER | Hardware |
| Berlin | EDEKA Markt | Supermarket |
| Berlin | LIDL | - |
| London | Billy Corner Store | Convinence |
| London | Borough Market | Market |
| London | Tescos Supermarket | Supermarket |
| London | Leasure Centre | - |
So what I want is to enrich the data table using my reference table, I want to use a join on the [City] to [City] and then simultaneously a findstring to match some amount of the [Shop Name] to the [Text in Shop Name find], returning the [Shop Category]. My actual data is ofcourse much much larger in scale and complexity than this so the solution needs to scale to ~100k Shop Names rows in the data table and ~1k [Text in Shop Name find] rows in the reference table.
The problems with my current approach is that I can either join on the City or do the findstring, but doing both at once I can't find a way, my current solution splits the data by city using 20+ filters so that I can then apply the findstring, but this is horrible to maintain.
There might be details I am missing out here, but I hope I have been clear in what I need to achieve, many thanks to anyone who helps 🙂
