I have local files (on my C: drive) that contain address information (see the example below). I want to compare them against 4-5 files in Hadoop that also contain address information, using a fuzzy matching algorithm to see whether I can find matches. I also need to automate this process.
Any help is much appreciated.
Example format:
123 Main Street, Chicago,IL,60609
3244 King Road Apt 3, New York,NY, 22123
6543 Down Road, Houston,TX,77300
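
To make the question concrete, here is a minimal sketch of the kind of fuzzy matching I have in mind. It assumes Python and uses only `difflib` from the standard library. It also assumes the Hadoop-side addresses have already been read into a plain list of strings (pulling them out of HDFS is not shown). The function names `normalize`, `similarity`, and `best_match` and the 0.85 threshold are placeholders of my own, not part of any existing tool.

```python
import re
from difflib import SequenceMatcher


def normalize(address):
    # Lowercase, trim spaces around commas, and collapse repeated whitespace
    # so that formatting differences do not affect the score.
    address = address.lower().strip()
    address = re.sub(r"\s*,\s*", ",", address)
    return re.sub(r"\s+", " ", address)


def similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the normalized strings are identical.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(query, candidates, threshold=0.85):
    # Return (candidate, score) for the closest candidate, or None when
    # nothing reaches the threshold (0.85 is an arbitrary assumption).
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(query, candidate)
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None


if __name__ == "__main__":
    # Stand-in for addresses loaded from the Hadoop files.
    hadoop_addresses = [
        "123 Main Street, Chicago,IL,60609",
        "6543 Down Road, Houston,TX,77300",
    ]
    print(best_match("123 Main St, Chicago, IL, 60609", hadoop_addresses))
```

This compares every local address with every Hadoop address, which is only practical for small files. For larger data I would presumably need to group records first (for example by ZIP code) and run the comparison inside each group.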