Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Geocoding Best Practices for users without a CASS tool

Alteryx Partner

Greetings from Brazil.

 
I have found some issues when trying to geocode some Brazilian addresses with TomTom GPS Data.
 
The Brazilian TomTom GPS pack does not contain a CASS tool, so none of the tips I could find here in the Community apply, not only for me, but for everyone outside the US or Canada...
 
I was able to find ONE useful tip (at http://community.alteryx.com/t5/Alteryx-Knowledge-Base/Geocoding-Tips/ta-p/1219) that does not involve a CASS tool: sorting my address list by ZIP code before submitting it to the geocoder tool...
 
But some pretty basic questions remain, I guess not only for me but also for other users who live outside the USA or Canada and cannot benefit from a CASS tool...
 
1) As in the US, here in Brazil we can identify a state by its "full" name or by its two-letter code. Which one is better to use in geocoding?
 
2) Likewise, as in the US, we can use a "compact" ZIP code (a 5-digit number) or the "full" version, composed of 8 digits. Which one is better?
 
3) In Portuguese we describe addresses by first giving the street name and then the street number ("Avenida Engenheiro Luis Carlos Berrini, 1426", for instance). Should I prep the data to invert this (whenever it is possible), as I would describe it in English (that case would become "1426, Avenida Engenheiro Luis Carlos Berrini")?
 
4) Which cleaning steps should be applied to the data prior to submitting it to the geocoder? I usually trim excess whitespace, and sometimes I apply the "UpperCase" function to standardize the data in capital letters. Should I strip out punctuation and accents? For instance, I work in "São Paulo". Should I write it as "Sao Paulo", without the tilde over the first "a" (which is very common in Portuguese words)?
 
5) Does any of this really interfere with the geocoder's ability to get lat/lon for an address?
 
6) In a broader sense, there are a lot of abbreviations in Brazilian addresses, and I guess this happens all over the world. As just one example, "R." means "Rua" ("Street"). Is there a way of assessing, even approximately, the quality of an address component (such as the district) in order to decide whether or not to include it when submitting data to the geocoder tool?
 
If someone is able to shed some light on these questions (at least the first 5...), I would be really thankful.
 
Thank you for your attention.
 
Bruno@Brazil
Quasar

Perhaps the Google Maps geocoding API would work best.  You could use the Download tool in Alteryx to call the API.  I tested it with your street-before-number Portuguese format and it seems to work fine:

https://www.google.com.br/maps/place/Av.+Engenheiro+Luís+Carlos+Berrini,+1426

 

 

I did a blog post on doing this: https://jdunkerley.co.uk/2016/07/21/geocoding-and-finding-nearest-station-with-google-web-services/

 

And put a macro in the gallery here:  https://gallery.alteryx.com/#!app/Google-Geocoder/57b5c8c83df7da17848eec2e

 

The API is limited to 2,500 requests per day, though.
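
For anyone who would rather see the request outside of a macro, here is a minimal Python sketch of the same idea. It assumes the public Google Geocoding web service endpoint (`https://maps.googleapis.com/maps/api/geocode/json`) with its `address`, `key`, and `region` parameters. `YOUR_API_KEY`, `build_geocode_url`, and `parse_lat_lng` are placeholder names chosen for illustration, and the sample response is hand-written in the documented shape, not real API output.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Google Geocoding web service.
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key, region="br"):
    """Return a request URL for one address (hypothetical helper)."""
    params = {"address": address, "key": api_key, "region": region}
    # urlencode percent-encodes accents and turns spaces into '+'.
    return GEOCODE_ENDPOINT + "?" + urlencode(params)


def parse_lat_lng(response_text):
    """Return (lat, lng) from a JSON response, or None if nothing matched."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


url = build_geocode_url(
    "Av. Engenheiro Luís Carlos Berrini, 1426, São Paulo", "YOUR_API_KEY"
)

# Hand-written sample in the documented response shape (not real output).
sample = (
    '{"status": "OK", "results": '
    '[{"geometry": {"location": {"lat": -23.6, "lng": -46.69}}}]}'
)
coords = parse_lat_lng(sample)
```

The URL is what would go into the Download tool (or any HTTP client), and the parsing step mirrors what a JSON Parse step would extract. Given the daily request limit, it is worth sending each distinct address only once.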

Moderator

Hi Bruno,

 

I received some feedback from one of our associates here at Alteryx and he offered the following response:

It seems that the geocoding works much better when using a single field address rather than the multiple field address option.  With the attached spreadsheet of addresses for example, I was unable to geocode rows 1-6 by specifying multiple fields in the geocoder interface.  I then created the last 5 rows by combining the previous cell data.  Using the single field option, 3 records geocoded!  Pay special attention to that format – the tool seems to like the ‘street, number, city’ format.  It may be worth exploring a little more – this could be one of those ‘less is more’ situations.
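
As a rough illustration of the 'street, number, city' single-field format described above, here is a small Python sketch. The field names (`street`, `number`, `city`) and the helper `to_single_field` are assumptions made for the example and are not part of the Alteryx geocoder; in Designer the same concatenation could be done in a Formula tool.

```python
def to_single_field(street, number, city):
    """Join address parts as 'street, number, city', skipping empty parts."""
    parts = [street, number, city]
    cleaned = [str(p).strip() for p in parts if p is not None and str(p).strip()]
    return ", ".join(cleaned)


single = to_single_field(
    "Avenida Engenheiro Luis Carlos Berrini", 1426, "São Paulo"
)
# A record with no street number still yields a usable string.
no_number = to_single_field("Rua Augusta", None, "São Paulo")
```

Skipping empty parts avoids stray separators such as "Rua Augusta, , São Paulo", which might otherwise confuse the single-field parser.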

 

Attached is the file he used for testing.

 

Here are my best guesses/estimates regarding your specific questions:

  1. I don't believe there will be any difference with respect to full or abbreviated state names
  2. I would imagine that 8-digit postal codes would get you more accurate results
  3. In very limited testing it didn't appear to matter if the street number was at the end or the beginning
  4. I tested some addresses with and without the different accents and didn't get any different results
  5. It wouldn't appear to

What I would strongly stress is that you run some tests with your own data, testing things like the scenarios you have mentioned. Also, as you mentioned, cleaning up the data as much as possible can go a long way. 
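
One way to set up such a test is to generate a few variants of each address and geocode them side by side. The Python sketch below is only an illustration: `normalize_whitespace`, `strip_accents`, and `address_variants` are hypothetical helper names, and the choice of variants is an assumption about what is worth comparing.

```python
import unicodedata


def normalize_whitespace(text):
    """Trim the ends and collapse runs of whitespace to a single space."""
    return " ".join(text.split())


def strip_accents(text):
    """Remove combining marks, e.g. 'São Paulo' becomes 'Sao Paulo'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def address_variants(address):
    """Return labelled variants of one address for side-by-side geocoding."""
    base = normalize_whitespace(address)
    return {
        "trimmed": base,
        "upper": base.upper(),
        "no_accents": strip_accents(base),
        "upper_no_accents": strip_accents(base).upper(),
    }


variants = address_variants("  Rua  da Consolação, 100,  São Paulo ")
```

Running each variant through the geocoder and comparing match rates on your own data shows whether accents or capitalization make any difference for your data pack.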

 

I hope some of this has been helpful.

 

Thank you, Bruno.

 

Dan Chapman
Program Manager, Customer Support
Alteryx Partner

All,

 

I wish to thank you for your support on this. I'm trying the single-field address option, as suggested by DanC.

 

I still have a lot of tests to do. This topic still has some open questions, most of which we will have to answer ourselves through trial and error...

 

I'll also try to use external geocoders, as suggested by some of you.

 

Once again, thanks for being involved.

Asteroid

Hello sir!

 

Just an fyi that your macro appears to be failing to load...looking forward to checking it out. Always appreciate your fantastic contributions! 

I have updated the macro in the gallery. It now will report errors and has the Field Map switched on.

https://gallery.alteryx.com/#!app/Google-Maps-Geocode/57aade953df7da17848b5dd9

 

It is attached to this post as well.

Alteryx Partner

So, this is my favourite tool I have found thus far. I just have one small problem: it doesn't add the fields to my existing flow.  It is a 'dead end' as far as the data goes.

 

Is there a way to modify this to add columns to the data so I can get the details of the sites I've encoded? I just need to pass-through the other data that go into the tool! Any tips would be great; I'm sure it's not that tough to do.

Alteryx Partner

I am re-joining the data on the address string I created, but that seems an awkward step ripe for record explosion. Nevertheless, since I found a solution to my own problem, I wanted to post it so that others can do the same. 
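
For anyone worried about duplicated rows on the re-join, one common approach is to geocode each distinct address string once and then look results up by that string, so every input record matches at most one result. A hedged Python sketch of the idea follows. The record layout, `attach_coordinates`, `fake_geocode`, and the coordinates are all made up for illustration and stand in for whatever the geocoder macro returns.

```python
def attach_coordinates(records, geocode, key="address"):
    """Geocode each distinct address once, then copy results onto every record."""
    lookup = {}
    for rec in records:
        addr = rec[key]
        if addr not in lookup:
            lookup[addr] = geocode(addr)
    out = []
    for rec in records:
        lat_lng = lookup[rec[key]]
        lat, lng = lat_lng if lat_lng else (None, None)
        # Keep every original field and add the two new columns.
        out.append({**rec, "lat": lat, "lng": lng})
    return out


def fake_geocode(address):
    # Stand-in for the real geocoder call; coordinates are invented.
    table = {"Rua A, 10, São Paulo": (-23.5, -46.6)}
    return table.get(address)


sites = [
    {"site_id": 1, "address": "Rua A, 10, São Paulo"},
    {"site_id": 2, "address": "Rua A, 10, São Paulo"},
    {"site_id": 3, "address": "Rua B, 20, São Paulo"},
]
enriched = attach_coordinates(sites, fake_geocode)
```

The output has exactly one row per input row. In Designer, a Unique tool on the address string before the Join should serve the same purpose.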


Cheers,

 

-tdm
