Hi,
I am trying to build a logic to parse the amounts which are present in the source data in the below formats.
Format-1 : Comma is used as thousand separator and Dot is used as decimal separator. e.g. 1,234.456
Format-2 : Dot is used as thousand separator and Comma is used as decimal separator. e.g. 1.234,456
Approach : The amounts are read as string and then regex match is used to identify the format. After format is identified, the commas are removed and decimal point is put in the appropriate place. For e.g. if Input amount is 1.234,456, output is 1234.456.
Challenge: The logic breaks when the input amount is e.g. 123.345. Because "." could be decimal or thousand separator.
It results in ambiguity when pattern is used to match the amount format.
Please let me know if you have any idea. Waiting for your answer.
Thanks,
Mohith
Hi @MohithRai
Looking at the challenge i don't think there is an easy solution for this 😅 data lacks consistency
Hi @MohithRai ,
You can try to create a Macro similar to the Currency Converter in our Public Gallery. See below
https://gallery.alteryx.com/#!app/Currency-Formatter/58e3bb54f499c713d41b3706
Thanks,
Mike
are there always 3 units after the decimal point? If not - this is rough and really a ML exercise...
Hi @apathetichell ,
Amounts can be a whole number or can contain decimal. It can have any number of units after decimal point.
Hi All,
I was trying with the below workaround and it pretty much generates the expected output. It is a bit inefficient ,but accurate as of today🙂.
Assumption:
At least one of the amount value in the source data is present in the format that contains both 'thousand separator' and 'decimal separator' i.e. 1,234.56(Format-1) or 1.234,56(Format -2).
Approach: Before parsing the amounts,
Output:-
Scenario-1:
Scenario-2:
This approach fails when the source data does not have any amount in the format as per our assumption. This is a rare scenario, when all the amounts are in hundreds or less than that.
Please share your thoughts on this approach.
Thanks @MichaelSu .
Checking on this, I will let you know if this works.
problem case is where an item is xxx.xxx/xxx,xxx and the item could be either hundreds/decimals or hundred thousands and there is no way to determine what it is... Is it possible that there is something in one of the other columns which could point to a language or other regional difference which could be used to create your logic for a format code? Perhaps there is a region data, or company names ending in "S.A." or something of that nature? If you have transaction country listings perhaps it's worthwhile to match it to a list of which countries use decimal points vs decimal commas. The swapping is fairly straight forward to implement - its the creation of a dynamic format code which is hard - especially not knowing what else is in the data.
User | Count |
---|---|
18 | |
16 | |
14 | |
8 | |
7 |