Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Simple Dollar Value Data cleaning

jhan
6 - Meteoroid

Hello

 

I am a complete novice first time user of Alteryx - and I am having some difficulty cleaning data:

For example, I have a set of dollar values in a variety of format found in a single column:

 

535,00 $
1 695,00 $
$895.00

700.00 -
$1,049.00

1,230

(800.00)

2100

 

I would like to transform them all into and put it in a single new column:

 

535.00

1695.00

895.00

-700.00

1049.00

1230.00

-800.00

2100.00

 

Any help would be appreciated.

 

Thank you,

8 REPLIES 8
estherb47
15 - Aurora
15 - Aurora

Hi @jhan ,

 

Try the ReplaceChar function in a formula tool

ReplaceChar([Field],"$)", "")

That should remove all $, and ).

You can use the Replace function to replace ( with -

And finally, the Data Cleansing tool can remove all of the extra spaces.

Cheers!

Esther

WHOOPS! Missed a few of the cleanup steps (moving the ending "-" and removing commas)

jhan
6 - Meteoroid

Thanks, @EstherB47

 

I think the biggest problem that I am having is to build a formula that can distinguish the comma that is ahead of the decimals versus comma that is denoting 1,000's.

 

 

estherb47
15 - Aurora
15 - Aurora

@jhan any time!

 

Yes, that's definitely a challenging piece of the problem. I took a few steps to clean, mostly in the Formula tool. 

Step 1: change the opening ( to a -. Replace([ToBeCleaned],"(","-")


Step 2: fix the numbers where the - follows the number itself: 

IF EndsWith([CleanedNumbers], "-") THEN "-"+[CleanedNumbers] ELSE [CleanedNumbers] ENDIF

 

Step 3: Remove all decimal indicators and everything that follows. Plan is to use a Select to convert into a number, and we can use a fixed decimal to make it 2 places. RegEx replace takes care of this. We take everything that precedes the comma or period followed by 2 zeros (or replace 00 with \d{2} if it could be any number) and then everything thereafter (-, $, spaces):
REGEX_Replace([CleanedNumbers], "(.+?)((?:\.|\,)00.*)","$1")

Step 4: Final formula step is to remove the $, commas, and ): ReplaceChar([CleanedNumbers], "$, )", "")

A select tool converts the text field into a number with 2 decimals (Fixed decimal, 10.2 for size)

Please do let me know if this meets your needs. Fun challenge!
image.png
Cheers!

Esther

jhan
6 - Meteoroid

So close!

 

Some strange things are happening:

$1,009.00    becomes 1

and $2,000.00 becomes 2

image.pngimage.png

MarqueeCrew
20 - Arcturus
20 - Arcturus

My buddy @Treyson built a macro to help you. 

 

https://gallery.alteryx.com/#!app/Number-Cleansing-Macro/5b6df1870462d71090803ac9

 

cheers,

 

mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
estherb47
15 - Aurora
15 - Aurora

Ah, we are so close! I forgot that 2 zeros can occur in the number itself. 


You can fix 2 of the formulas, and it should now work. changes are in blue

First, change the second formula to IF EndsWith([CleanedNumbers], "-") THEN "-"+REGEX_Replace([CleanedNumbers], "(.+?)(-)","$1") ELSE [CleanedNumbers]
ENDIF

Next, change the third formula to: 
REGEX_Replace([CleanedNumbers], "(.+?)((?:\.|\,)\d{2}[^\d*])","$1") This adjusts so it will only remove the numbers that are actually decimals. We're saying a comma, followed by two digits, followed by anything that's not another digit should be removed.

jhan
6 - Meteoroid

Thank you Esther. 

 

It has worked beautifully!

estherb47
15 - Aurora
15 - Aurora

@jhan 

 

So happy!!!

 

FYI, here's a better version. Can capture more than zeros in the decimal places

Labels