Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help Needed with Number Formatting and/or Data Cleansing

Chad_Cox
7 - Meteor

I need confirmation on an issue that I'm running into with a csv file that has commas in some of the number fields.  I've been trying to figure out the correct Alteryx number format to use but I think the issue is the data with commas included and that what I actually need to do is remove the commas.

 

You can see in the test file that both Unit Price and Extended Price have commas, while Invoice Total does not.  Excel's pivot table option handles this easily, but I have a lot of data that I want to process and therefore Alteryx is where I want to accomplish this.  In the side-by-side comparison you can see where Alteryx is dropping the data that occurs to the right of the comma but this doesn't happen in Excel.

 

To summarize, is it bad data that needs to be cleansed (and if so what is the best way to remove the commas but not loose the decimals) or am I not selecting the correct Alteryx number format?

4 REPLIES 4
cjaneczko
13 - Pulsar

This is one way.

 

 

 

tonumber(REGEX_Replace([_CurrentField_],",",""))

 

 

image.png 583256d3-10af-4954-a3ae-39f43a1dcf68.png

 

image.png

DavidSkaife
13 - Pulsar

Hi @Chad_Cox 

 

Alteryx is reading those fields in as string, and when you convert Extended Price to data type Double it's transforming the value there which is resulting in the issue you're seeing.

 

You have two choices (others may supply others):

 

  • change data type to Float instead of Double
  • Use a Formula tool to remove the comma from the field(s) before performing any calculations
 

image.png

Chad_Cox
7 - Meteor

The Replace Character suggestion worked for me.  Thanks!

caltang
17 - Castor
17 - Castor

You can change the comma out for both fields with a Multi-Field Formula tool.

 

IF CONTAINS(ToString([_CurrentField_]),",")
THEN ToNumber(REPLACE(ToString([_CurrentField_]),",",""))
ELSE ToNumber([_CurrentField_])
ENDIF

 

Then do your calculations! :)

 

Hope this helps.

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Labels