Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Simple Dollar Value Data cleaning

jhan
6 - Meteoroid

Hello

 

I am a complete novice first time user of Alteryx - and I am having some difficulty cleaning data:

For example, I have a set of dollar values in a variety of format found in a single column:

 

535,00 $
1 695,00 $
$895.00

700.00 -
$1,049.00

1,230

(800.00)

2100

 

I would like to transform them all into and put it in a single new column:

 

535.00

1695.00

895.00

-700.00

1049.00

1230.00

-800.00

2100.00

 

Any help would be appreciated.

 

Thank you,

8 REPLIES 8
estherb47
15 - Aurora
15 - Aurora

Hi @jhan ,

 

Try the ReplaceChar function in a formula tool

ReplaceChar([Field],"$)", "")

That should remove all $, and ).

You can use the Replace function to replace ( with -

And finally, the Data Cleansing tool can remove all of the extra spaces.

Cheers!

Esther

WHOOPS! Missed a few of the cleanup steps (moving the ending "-" and removing commas)

jhan
6 - Meteoroid

Thanks, @EstherB47

 

I think the biggest problem that I am having is to build a formula that can distinguish the comma that is ahead of the decimals versus comma that is denoting 1,000's.

 

 

estherb47
15 - Aurora
15 - Aurora

@jhan any time!

 

Yes, that's definitely a challenging piece of the problem. I took a few steps to clean, mostly in the Formula tool. 

Step 1: change the opening ( to a -. Replace([ToBeCleaned],"(","-")


Step 2: fix the numbers where the - follows the number itself: 

IF EndsWith([CleanedNumbers], "-") THEN "-"+[CleanedNumbers] ELSE [CleanedNumbers] ENDIF

 

Step 3: Remove all decimal indicators and everything that follows. Plan is to use a Select to convert into a number, and we can use a fixed decimal to make it 2 places. RegEx replace takes care of this. We take everything that precedes the comma or period followed by 2 zeros (or replace 00 with \d{2} if it could be any number) and then everything thereafter (-, $, spaces):
REGEX_Replace([CleanedNumbers], "(.+?)((?:\.|\,)00.*)","$1")

Step 4: Final formula step is to remove the $, commas, and ): ReplaceChar([CleanedNumbers], "$, )", "")

A select tool converts the text field into a number with 2 decimals (Fixed decimal, 10.2 for size)

Please do let me know if this meets your needs. Fun challenge!
image.png
Cheers!

Esther

jhan
6 - Meteoroid

So close!

 

Some strange things are happening:

$1,009.00    becomes 1

and $2,000.00 becomes 2

image.pngimage.png

MarqueeCrew
20 - Arcturus
20 - Arcturus

My buddy @Treyson built a macro to help you. 

 

https://gallery.alteryx.com/#!app/Number-Cleansing-Macro/5b6df1870462d71090803ac9

 

cheers,

 

mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
estherb47
15 - Aurora
15 - Aurora

Ah, we are so close! I forgot that 2 zeros can occur in the number itself. 


You can fix 2 of the formulas, and it should now work. changes are in blue

First, change the second formula to IF EndsWith([CleanedNumbers], "-") THEN "-"+REGEX_Replace([CleanedNumbers], "(.+?)(-)","$1") ELSE [CleanedNumbers]
ENDIF

Next, change the third formula to: 
REGEX_Replace([CleanedNumbers], "(.+?)((?:\.|\,)\d{2}[^\d*])","$1") This adjusts so it will only remove the numbers that are actually decimals. We're saying a comma, followed by two digits, followed by anything that's not another digit should be removed.

jhan
6 - Meteoroid

Thank you Esther. 

 

It has worked beautifully!

estherb47
15 - Aurora
15 - Aurora

@jhan 

 

So happy!!!

 

FYI, here's a better version. Can capture more than zeros in the decimal places

Labels