How to Fix a Column of Data

Question

Afternoon,

I am still fairly new to Alteryx but need some help as I have two issues that are kind of the same - I think.

My data set is my company's wireless data which means there is a phone number, device ID, user name, device type, device model, etc.  The usual data points one would expect when working with wireless data.

Issue #1

My issue is that I need to correct the phone numbers that are showing up as they are showing up as a number such as:

Raw DataCorrected Data2,065,551,212206-555-12128,175,551,212817-555-12129,405,551,212940-555-1212

Issue #2

Some device types are encoded differently and I need to normalize the column of data.

Raw DataCorrected DataiPhone xRiPhone XRiPhone X RiPhone XRiPhone XR

iPhone XRiPhone RXiPhone XR

Can you please give me some advice on how I might be able to do this?

Thanks in advance!

Scott

jrgo · Accepted Answer

Hi @ScottC1971

I can help on the first item. The formula expression below does correct the raw data you listed in your sample

REGEX_REPLACE(
  REGEX_REPLACE([Raw Data], '\D', '')
  , '^(\d{3})(\d{3})(\d{4})$'
  , '$1-$2-$3'
)

The inner REGEX removes all characters that aren't numeric and the outer REGEX looks for and create 3 groups which are then included in the replacement value adding the dashes between each group.

This will NOT work if the raw data contains more or less than 10 numbers.

For the second issue is not as easy of a solution since I assume there's likely a wide range of strings that may have slight variations. You can create an initial correction table with common errors and then use a Find/Replace tool to update those. Find/Replace can be set to be case insensitive and/or be set to match all or part of the field.

For the remaining values that weren't corrected, Fuzzy match is probably going to be your best bet.

Hope this helps!

Jimmy