community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Upgrade Alteryx Designer in 10 Steps

Debating whether or not to upgrade to the latest version of Alteryx Designer?

LEARN MORE

Cleansing / Mapping using Soundex ?

Asteroid

JHi,

 

this is is my first look at Alteryx ! I have a large csv file of over a million lines. One of the fields is an item ID which has very poor quality with differing issues :

 

  • leading zeros
  • special characters within the item ID
  • item description added to the item ID
  • unit of measure added to the item ID

 

Can anyone help on creating a final clean ID field ?  

 

The first issue is I don’t know which is the correct ID as I’m getting data from multiple sources and on a weekly basis so a master lookup will need to be constantly updated. Is it possible to group them using Soundex to start with to see the most likely matches? 

 

any help advice is really appreciated!

 

many thanks

Bolide
For the special characters and whitespace you can use the data cleansing tool.

For the unit of measure use the find/replace tool (most likely using a list of units in a text input) or formula tool to convert them to blanks "".

Finally, if the item description follows a specific format you could use regex in the formula tool to replace it with blanks "". If that's not right could you please attach an example of what the item description is?
Asteroid

Hi,

 

Thanks for your reply. 

 

An example of the itemID is

 

  • Speedy Supplies
  • SpeedySuppliers
  • Speedy. 
  • Speedy Supplies 10Pack
  • Speedy Supplies Pack10

 

Ideally I would need to map all these to 'Speedy Supplies'  - at the moment I'm doing this manually in Excel which is taking an age!  

 

Hope you are able to help??

 

 

 

 

Labels