I am trying to parse out the address data into Address, City, State, Zip columns. I have been trying to get RegEx to work but i am confuse how to do this when there not in any uniform order. Is there a simpler way then RegEx, if not how would this be done in RegEx.
Wayne Vebeen Michigan United States |
kathy Charry One Melvin Avenue Annapolis, Maryland 21401 United States |
Cornerstone Architects 7000 Bee Caves Rd. Suite 200 Austin, Texas 78746 United States |
iugene sakai 1000 South Winchester Blvd. San Jose, California 95128 United States |
Jeff Pelletier 115 15th Ave E. Suite 100 Seattle, Washington 98112 United States |
James Krisp 16 Washington Avenue Millbrook, New York 12545 United States |
300 S. Orange Avenue Sarasota, Florida 34236 United States |
Nashville, Tennessee 37210 United States |
2429 Medinah Dr. Evergreen, Colorado 80439 United States |
111 W El Prado Dr San Antonio, Texas 78212 United States |
Solved! Go to Solution.
If you don't want to use regex to parse out your data then you can use the text to columns tool.
It looks like your data always is structured as follows:
[Address], [State] [Zip] [Country]
Therefore if you use two text to column tools with the first set to split on on the comma ',' and then second to split on the space '\s' delimiter you should get the desired result.
Edit: I just realised you also want to get the City and the above solution leaves the city in with the address field. The tricky bit here is you have some cities which are two words. Can the data be exported in a way where you add an additional delimiter?
Otherwise my other solution would be to use lists of US cities and use the find and replace tool to search the string and append the city if there is a match. See attached.
Another option is the CASS address parsing tool, but I don't have this dataset but someone else who does can comment on whether this is also an option.