Parse address fields
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
Need some help parsing address fields. I almost got it right but just few rows that are not as expected. The address fields in the excel file does not follow one pattern so when I use the regex, the result is not as expected for a few. The output includes Street No, Street Address,Suite,City,State and Zip. In some cases the suite column is correct with digits but in some cases it is appended with City and the word Suite/Ste needs to be in the Suite column but it is with the address. Also, when there is no suite number then city is populated instead of showing it as blank or null. Not sure how to handle these. The reg expression I have used is as follows: ((?:^\d+)|(?:\w+))\s([^\d]+)\s(.*)\s([^\d]+),\s(\u{2})\s(\d{5}$). I have attached snapshot which has the actual address field(in green) and the alteryx output fields after parsing it.
Solved! Go to Solution.
- Labels:
- Parse
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Could you also attach an example of the output you're looking to achieve please? Plus a couple of other examples of ones you're parsing correctly
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @aparna0208 ,
Considering that the only options is Suite, Ste or nothing, the regex can be like this.
((?:^\d+)|(?:\w+))\s([^\d]+)\s((?:Suite|Ste)* \d*)\s([^\d]+),\s(\u{2})\s(\d{5}$)
Let me know if that works for you, and if yes, please mark the answer as correct.
Thanks,
Fernando V.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Jonathan-Sherman please see attached. Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Could you attach as an excel file please?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@fmvizcaino thank you for your inputs. I tried this and it worked but the only issue is if there is no suite number in the address field then all the fields are returned as null(Street no,street address,suite,city,state,zip)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @aparna0208 ,
I got your shared examples and develop the following regex. The idea was not to interfere a lot with what you have already developed.
((?:^\d+)|(?:\w+))\s([^\d]+)\s((?:Suite|Ste)*\s{0,1}\d*)\s{0,1}([^\d]+),\s(\u{2})\s(\d{5}$)
See if this works for all your cases.
Thanks,
Fernando V.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@fmvizcaino output is just perfect except for 2 rows which is null. I am wondering if it's because of the zip code which is more than 5 digits. Attached are the 2 addresses I am having issues with. thank you so much for your inputs on this:)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @aparna0208 ,
You are correct!
For that, you have two options:
1. We can ignore that and erase from the address:
((?:^\d+)|(?:\w+))\s([^\d]+)\s((?:Suite|Ste)*\s{0,1}\d*)\s{0,1}([^\d]+),\s(\u{2})\s(\d{5})(?:.*)
2. Include the dash and the 4 digits in your zip code
((?:^\d+)|(?:\w+))\s([^\d]+)\s((?:Suite|Ste)*\s{0,1}\d*)\s{0,1}([^\d]+),\s(\u{2})\s(\d{5}[^a-zA-Z]*)(?:.*)
Let me know if it works for you.
Best,
Fernando V.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Yes, it worked. Sorry for too many questions on this but I am new to regex and still experimenting with it. Also, I tried another option and it worked. Just wanted to share. For zip code, I replaced the last part d{5} with s(.+) and it worked.
((?:^\d+)|(?:\w+))\s([^\d]+)\s((?:Suite|Ste)*\s{0,1}\d*)\s{0,1}([^\d]+),\s(\u{2})\s(.+)$
thank you once again @fmvizcaino for all your help:)
