Address matching for data extraction
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello all,
I have a challenge. I have two separate tables, the first table called housing data contains all home addresses, the home id, the mesh block and the postcode while the second table contains only the home address. The second table contains the addresses of all homes that have enjoyed some form of benefits, but we do not know their home id.
Now, I want to derive the home id of the second table. Please note that the addresses in Table 2 may not necessarily be very clean and this is where I struggle to know how to pattern-match them to give accurate results.
I have included a sample of how the solution should look in Table 3. Also, should the tables be joined?
I will appreciate any help.
Table 1: Home addresses and home id
Home id | Home address | Mesh block | Postcode |
0001 | 17 Randolf Road, FRANKLIN, London 2684 | 0012 | 2684 |
0004 | 10 Music Avenue, MANUREWA, Highbrow 2102 | 0013 | 2102 |
0007 | 16 Sheep Road, IBIZZA, Otago 2102 | 0012 | 2102 |
0009 | 17 Sea Avenue, FORTHWITH, Auckland 1100 | 0013 | 1100 |
0010 | 12 Abbey Road, Mailand, Otago 3400 | 0107 | 3400 |
Table 2: Home addresses
Home_address | Mesh block | Postcode |
17 Randolf Road, FRANKLIN, 2684 | 0012 | 2684 |
10 Music Avenue, Highbrow 2102 | 0013 | 2102 |
16 Sheep Road, IBIZZA, Otago 2102 | 0012 | 2102 |
17 Sea Avenue, FORTHWITH, Auckland 1100 | 0013 | 1100 |
Table 3: Output
Home_id | Home_address | Mesh block | Postcode |
0001 | 17 Randolf Road, FRANKLIN, London 2684 | 0012 | 2684 |
0004 | 10 Music Avenue, MANUREWA, Highbrow 2102 | 0013 | 2102 |
0007 | 16 Sheep Road, IBIZZA, Otago 2102 | 0012 | 2102 |
0009 | 17 Sea Avenue, FORTHWITH, Auckland 1100 | 0013 | 1100 |
Solved! Go to Solution.
- Labels:
- Expression
- Fuzzy Match
- Parse
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Something like this:
It will match based on the values that match its parameters - you need a few fields to be unique for this to work.
Alternatively, you can look into Fuzzy Matching - though it's more of an art and not a science. Therefore, you may never get 100% of your results.
Hope this helps!
Best regards,
Calvin Tang
https://www.linkedin.com/in/calvintangkw/
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@DMI10000 Another option using fuzzy match
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi guys,
Thanks for the reply. After a stress test, the second one works with the Fuzzy match tool.
Much appreciated guys.
