is there a way to CORRECT the US Address (ZipCode in particular)
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I am running my fairly large dataset (7M customers) of US addresses through a CASS process (US Geocoder tool), and see some (about 1%) addresses not getting geocoded. Upon further investigation I found that the ZipCode in them is usually incorrect (00713 instead of 00714, or 00713 instead of 10713).
I can flag these as Invalid Address, but is there an option to correct the ZipCode since the other fields (StreetAddress, City, State) are valid?
Thank you in advance!
Solved! Go to Solution.
- Labels:
- CASS
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Unless there's a consistent logical way to implement some sort of corrective measure(s), this may be difficult. For example, how can we tell that 00713 needs to be corrected to 00714 or 10713?
One thought is to use the Allocate Input tool and get the US Census Data from the Alteryx Downloads site to get a list of all valid zip codes and compare it against your data to clear up any mis-entered zip code values.
I forget if the Address tools will work without a Zip Code field, but worth a shot!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I've had success using Google's address validation API. You can pass addresses through to it and it will tell you if they're valid, and if not provide possible alternatives/official USPS addresses. The first few thousand calls are free, but after that there's a nominal cost.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you, Alex. I agree with you. The ZipCode errors are random, and I can't overwrite the incorrect ones with any logic. 00173 could be listed instead of 10173, 20173, 00177 etc. To make it worse, 00173 could be a valid ZipCode in itself, but not a correct ZipCode for a given address..... In other instances, the incorrect ZipCode is also an invalid ZipCode.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Nice @Luke_C !
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you, Luke.
I will keep this solution in mind if there are no other ways to do it in Alteryx.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @BaDaBoom
I should clarify that I use the API within alteryx using the download tools - so I pass the addresses in, get google's results and keep processing the data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@BaDaBoom are you using a zipcode as a number at any point? a zipcode is not a number - it's a string. If you store it as a number you are doing a disservice to the people of New England with their 0 leading zipcodes....
I'd recommend doing a state search for error checking. Pretty easy to see if the state is CT/MA or whatever (gets a 0) or NY (gets a 1) --- but try to diagnose the error by unit testing one entry where you see the problem and trace it back in your workflow. Why is it being handled wrong? Knowing zipcodes - this probably relates to a string-> number conversion issue.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@apathetichell - perhaps 00713 is not a god example, cause it's not a missing zero or conversion issue, it's an issue if incorrect ZipCode.
Appreciate the response though.
