Hi Pro User,
I have a list of cities with names that need to be corrected - extraneous characters after names, wrong spelling, etc. - based on a list the Lookup_Table. Below is a file with two tabs - the Data tab that shows the different stages of cleansing and the Lookup_Table tab that provides the correct info.
In the Data tab, Step 0 is a list of data (a-c city names). Step 1 peels out those items that have the 5 digit zip code. Step 2 removes everything starting with a character after a name. Step 3 looks up the Lookup_Table to find the correct spelling of the city names. Step 4 makes an additional correction for those with incorrect city names.
Here are the steps:
STEP 0 | STEP 1 | STEP 2 | STEP 3 | STEP 4 | ||||
ORIG DATASET | REMOVE ZIPCODE | POSTAL | REMOVE CHARS | POSTAL | TABLE LOOKUP | POSTAL | CORRECTION | POSTAL |
***WILL CALL FOR JOBSITE | ***WILL CALL FOR JOBSITE | TBD | ||||||
0000 | 0000 | 0000 | #N/A | TBD | ||||
29 PALMS, CA | 29 PALMS, CA | 29 PALMS | #N/A | TBD | ||||
93664 | 93664 | 93664 | 93664 | 93664 | #N/A | 93664 | TBD | 93664 |
ADELANTO | ADELANTO | ADELANTO | ADELANTO | ADELANTO | ||||
ADELANTO, CA | ADELANTO, CA | ADELANTO | ADELANTO | ADELANTO | ||||
AGOURA | AGOURA | AGOURA | #N/A | AGOURA HILLS | ||||
AGOURA HILLS | AGOURA HILLS | AGOURA HILLS | AGOURA HILLS | AGOURA HILLS | ||||
AGOURA HILLS, CALIFORNIA | AGOURA HILLS, CALIFORNIA | AGOURA HILLS | AGOURA HILLS | AGOURA HILLS | ||||
AGUANA | AGUANA | AGUANA | #N/A | TBD | ||||
AHAMBRA | AHAMBRA | AHAMBRA | #N/A | ALHAMBRA | ||||
ALHAMBRA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA CA | ALHAMBRA CA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA CA. | ALHAMBRA CA. | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA, CA | ALHAMBRA, CA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA, CA 91803 | ALHAMBRA, CA 91803 | 91803 | ALHAMBRA | 91803 | ALHAMBRA | 91803 | ALHAMBRA | 91803 |
ALHAMBRA, CA 91806 | ALHAMBRA, CA 91806 | 91806 | ALHAMBRA | 91806 | ALHAMBRA | 91806 | ALHAMBRA | 91806 |
ALHAMBRA, CA. | ALHAMBRA, CA. | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA, CALIFORNIA | ALHAMBRA, CALIFORNIA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALHAMBRA,CA | ALHAMBRA,CA | ALHAMBRA | ALHAMBRA | ALHAMBRA | ||||
ALISO VIEJO, CA | ALISO VIEJO, CA | ALISO VIEJO | ALISO VIEJO | ALHAMBRA | ||||
ALTADENA | ALTADENA | ALTADENA | ALTADENA | ALTADENA | ||||
ANAHEIM | ANAHEIM | ANAHEIM | ANAHEIM | ANAHEIM | ||||
ANAHEIM, CA | ANAHEIM, CA | ANAHEIM | ANAHEIM | ANAHEIM | ||||
ANYWHERE, CA | ANYWHERE, CA | ANYWHERE | #N/A | TBD | ||||
ANZA | ANZA | ANZA | ANZA | ANZA | ||||
APPLE VALLEY | APPLE VALLEY | APPLE VALLEY | APPLE VALLEY | APPLE VALLEY | ||||
APPLE VALLEY, CA | APPLE VALLEY, CA | APPLE VALLEY | APPLE VALLEY | APPLE VALLEY | ||||
ARCADIA | ARCADIA | ARCADIA | ARCADIA | ARCADIA | ||||
AS REQUESTED | AS REQUESTED | AS REQUESTED | #N/A | TBD | ||||
AUBERRY | AUBERRY | AUBERRY | AUBERRY | AUBERRY | ||||
AVALON | AVALON | AVALON | AVALON | AVALON | ||||
AVALON, CA | AVALON, CA | AVALON | AVALON | AVALON | ||||
AVALON, SANTA CATALINA IS | AVALON, SANTA CATALINA IS | AVALON | AVALON | AVALON | ||||
AZUSA | AZUSA | AZUSA | AZUSA | AZUSA | ||||
BAKERSFEILD | BAKERSFEILD | BAKERSFEILD | #N/A | BAKERSFIELD | ||||
BAKERSFIELD | BAKERSFIELD | BAKERSFIELD | BAKERSFIELD | BAKERSFIELD | ||||
BAKERSFIELD, CA | BAKERSFIELD, CA | BAKERSFIELD | BAKERSFIELD | BAKERSFIELD | ||||
BAKERSFIELD, CA. | BAKERSFIELD, CA. | BAKERSFIELD | BAKERSFIELD | BAKERSFIELD | ||||
BAKERSFIELS, CA | BAKERSFIELS, CA | BAKERSFIELS | #N/A | BAKERSFIELD | ||||
BAKERSFILED | BAKERSFILED | BAKERSFILED | #N/A | BAKERSFIELD |
I think I can do steps 1-3, but then, I'm not clear on how to approach steps 3-4. Could someone please guide me on how to approach this problem? Appreciate it.
Thanks,
kwl
Solved! Go to Solution.
Thank you @binuacs !