Help Needed with Number Formatting and/or Data Cleansing
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I need confirmation on an issue that I'm running into with a csv file that has commas in some of the number fields. I've been trying to figure out the correct Alteryx number format to use but I think the issue is the data with commas included and that what I actually need to do is remove the commas.
You can see in the test file that both Unit Price and Extended Price have commas, while Invoice Total does not. Excel's pivot table option handles this easily, but I have a lot of data that I want to process and therefore Alteryx is where I want to accomplish this. In the side-by-side comparison you can see where Alteryx is dropping the data that occurs to the right of the comma but this doesn't happen in Excel.
To summarize, is it bad data that needs to be cleansed (and if so what is the best way to remove the commas but not loose the decimals) or am I not selecting the correct Alteryx number format?
Solved! Go to Solution.
- Labels:
- Tips and Tricks
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
This is one way.
 
tonumber(REGEX_Replace([_CurrentField_],",",""))
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Chad_Cox
Alteryx is reading those fields in as string, and when you convert Extended Price to data type Double it's transforming the value there which is resulting in the issue you're seeing.
You have two choices (others may supply others):
- change data type to Float instead of Double
- Use a Formula tool to remove the comma from the field(s) before performing any calculations
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The Replace Character suggestion worked for me. Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
You can change the comma out for both fields with a Multi-Field Formula tool.
IF CONTAINS(ToString([_CurrentField_]),",")
THEN ToNumber(REPLACE(ToString([_CurrentField_]),",",""))
ELSE ToNumber([_CurrentField_])
ENDIF
Then do your calculations! :)
Hope this helps.
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
