Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to clean the city and geographical attribute | Replace + Join

rively90
8 - Asteroid

Hi,

I have 2 questions

 

1. I'm wondering if there is any tool in Alteryx that detect the common name for the city. Maybe using AI, fuzzy matching, geocoding APIs or something? I don't know.

For example,

Correct: Los Angeles

Incorrect: Las Angelas

*Las Angelas is a typo. But, the tool will group them together to the correct one.


Second example:

Correct: Los Angeles

Incorrect: Los Angeles,CA

*I don't need to see ,CA there. Just the name of the city.

 

Third Example:

Correct: Edmonton

Incorrect: Edmonton Alberta

 *I don't need to see Alberta there. Just the name of the city.

 

Currently, the Route , Route Departure, Route Arrival, and Via are to be corrected.

 

There are a lot of typos in my data and just wondering the quickest way to handle this and almost impossible to me to correct one by one.

 

2.

Context: I have 2 files: The Airlines Route Semi Clean will be the main file, which include: Route, Route Departure, Route Arrival, and Via.

 

There are a lot of fields that using 3 digits of airport code. For example: LAX, SFO, DEN, etc...

I have another file called "Airport Information" that contains 3 digits airport code and city located.

 

and I was just wondering if there is any way that we can replace those 3 digit codes with the city.

For example: LAX is replaced to Los Angeles

 

I'm assuming we need to do the JOIN, BUT the issue is that join will remove the city name that is already there in those 4 fields.

 

My current idea is: First, we need to make sure that Alteryx recognizes the 3 digits code, which will be the primary key for the joining. In my another file also has the 3 digits airport code. Then, those 3 digits code in the primary file will be replaced to the city from the secondary file. Is it possible?

 

I attached the excel files here. Could you please send the workflow here please?  Thank you!

 

8 REPLIES 8
cmcclellan
13 - Pulsar

When I saw your post on the Tableau forums this morning I almost replied and said "use Alteryx"

 

The challenge is always the "free format text" that you're left to deal with. 

 

1) The Fuzzy Match Tool https://help.alteryx.com/current/en/designer/tools/join/fuzzy-match-tool.html could solve some/all of your typo's but don't expect it to remove Alberta

 

2) Yes, you could join (or maybe create a macro) to make this work.

 

Either way, I think you should be prepared for records that just don't match at all - and you will have to fix them by modifying the workflow every time an unmatched value is found.

rively90
8 - Asteroid

Hi, I run fuzzy match tool and it's been running for 2 hours and still keep going. What did I do wrong? what's the proper setting?

rively90
8 - Asteroid

@Qiu  Hi Qiu, do you know what's the best way to handle these cases? thank you!

Qiu
20 - Arcturus
20 - Arcturus

@rively90 
For 1st question, it is always tricky and I dont really like "Fuzzy Match" 😂

Refer this post, and maybe we can have a "correct list" of cities to perform Fuzzy Match?

https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Discussions/How-to-parse-data-using-RegEx-...

2nd question, I believe you are looking Find and Replace tool.

0323-rively90.png

rively90
8 - Asteroid

Thank you!

rively90
8 - Asteroid

@Qiu  Hi, thank you! What tool should you recommend using instead of fuzzy matching? It runs forever to my machine.

rively90
8 - Asteroid

Hi @Qiu 

I noticed there are still codes in the Route and Via columns. I've updated the secondary file for the airport attached below. How to fix this? thank you!

 

please use 'iata_code' for the code and municipality for the city

rively90
8 - Asteroid

@Qiu  Is this correct?

 

 

Labels