I came across a problem today while building a workflow using the Geocoder where I get a large number of failed geocoding attempts because the addresses being passed through have proper building names in the address field (e.g., 100 N Tryon Street Ste XXX Bank of America Corporate Ctr). The Orig_Cass_Errorstring shows up as "Unknown Street". I don't want to create a specific rule for just BoA Corp Ctr as there are far too many examples like it for that to be a viable solution.
Has anyone ever run into this problem and/or found a solution?
Thanks in advance!
David
Solved! Go to Solution.
Another issue is when only the Building Name is passed through the source file; no street address, just "Bank of America Corporate Center".
I think I will have to manually update / create rules. I have been using a fuzzy match up front as well - even a waterfalled FM with decreasing match level requirements - the problem is I am working with such a large data set the 'leftovers' are still quite large.
I wonder if you could build a process that might use the list of Street Suffix Abbreviations (as found in http://pe.usps.gov/text/pub28/28apc_002.htm) to identify where the "end" of the address might be.
What I'm thinking is to use a Find Replace tool with that list and replace anything there that matches with some unique indicator. You could then use RegEx to parse out everything before/after that indicator (assuming that most of what would be after would be building information), and then join back to your original list to get back the original Street Suffix?
Not sure how effective it would be, but if you are dealing with a large data set, it might get you a large part of the way...you could even possibly build out your rules from the "after" component of the RegEx process.
Just a thought...
Rod, thank you very much for the feedback! I created two copies of master_raw_address, used find and replace to add a pipe delimeter after the street suffixes and abreviations, then used a waterfall-logic of geocoders. 1st run uses original master_raw_address, failures move to second tier which runs address copy 1 through (has everything after 1st pipe delimiter/suffix stripped off), failures move to third tier which runs address copy 2 (everything after 2nd pipe delimiter stripped). This fixed ~70% of the errors!
Could you possibly show me how you did that? I'm basically trying to do the exact same thing right now! haha