I have two files that I want to join based on a common field called District. The problem is that these files are prepared manually and the formats can be different. I'm looking to resolve this problem so that I can join the files successfully. It would be great if you can provide a solution based on the following sample data:
18 Nanoi Road |
NO 18 NANOI RD. |
18th NANOI RD. |
18 NANOI ROAD |
No. 18 NANOI Rd. |
The task is basically unifying these variants so that the fields can be matched and the two filed be joined together. I recommend "18 NANOI RD" as the standard version.
ALTERNATIVELY, a RegEx or Contains formula that can help me identify if the letters are shared in these rows to determine if they're the same District would be great as well.
Solved! Go to Solution.
Hi @45179902, the solution to this will vary a lot since this is manual field and anything could be inserted in there. You can come up with a solution that will solve most of the issues, but all it is kind hard. Here is one way of correcting the issues you mentioned on your sample data. I used the Find and Replace tool so you could add more lines there of things that need to be replaced. For the TH removal, I had to use the replace formula as there was no spece in between the TH and the number.
Hi @45179902
You can achieve this using fuzzy match. Fuzzy match matches nearly common names.
Workflow:
Hope this helps : )