Hello,
I am working on extracting data from PDFs. Since most of the PDFs are handwritten, the extraction is not always accurate. The actual issue is that the opening parenthesis "(" is extracted as "c" in 5 out of 10 cases.
Below are the conditions:
1. If the country is China or Hong Kong, the last digit should be enclosed in brackets "()".
2. For countries other than China or Hong Kong, there will be no changes.
3. If the letter "c" exists in the last few characters of the string, it should be replaced by the bracket "(".
Can anyone help with this?
Tax | Country | Updated Tax |
R2456c1) | China | R12456(1) |
1234567 | Australia | 1234567 |
S234567c5) | Hong Kong | S234567(5) |
12345 | USA | 12345 |
A12456c1) | China | A12456(1) |
Do you need more logic for #1? As far as the "c" is concerned:
REGEX_Replace([Tax], "(.*\d)(c)(\d.*)", "$1\($3")
in a Formula Tool should get the job done.
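For anyone who wants to check the pattern outside Alteryx, here is a minimal Python sketch of the same idea. It is only an illustration, not the Formula Tool expression itself: the function name `fix_bracket` is hypothetical, and Python writes the replacement groups as `\1` and `\2` instead of `$1` and `$3`.

```python
import re

def fix_bracket(tax: str) -> str:
    """Replace a "c" that sits between two digits with "("."""
    # (.*\d) is greedy, so if several candidates exist the last
    # digit-c-digit position in the string is the one replaced.
    return re.sub(r"(.*\d)c(\d.*)", r"\1(\2", tax)

# Sample values taken from the table in the question.
for value in ["S234567c5)", "A12456c1)", "1234567"]:
    print(value, "->", fix_bracket(value))
```

Values without a "c" between two digits pass through unchanged, so this version does not need the country check to leave rows such as 1234567 alone.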
Parsing data from PDFs can be very tricky, I believe.
It would be better if you could provide a bigger dataset, so we can look for more variants.
Based on your current criteria, we can first filter for "China" and "Hong Kong", then use the function "ReverseText" so that the first occurrence of "c" in the reversed string (that is, the last one in the original) can be replaced.
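The reverse-then-replace idea can be sketched in Python as follows. This is an assumption-laden illustration rather than an Alteryx workflow: the function name `fix_last_c` and the `BRACKET_COUNTRIES` set are hypothetical, and the country is stripped of surrounding spaces before the comparison.

```python
BRACKET_COUNTRIES = {"China", "Hong Kong"}  # hypothetical name

def fix_last_c(tax: str, country: str) -> str:
    """Replace the last "c" in tax with "(" for the listed countries."""
    if country.strip() not in BRACKET_COUNTRIES:
        return tax  # other countries are left unchanged
    reversed_tax = tax[::-1]
    # After reversing, the last "c" of the original comes first,
    # so replacing a single occurrence is enough.
    fixed = reversed_tax.replace("c", "(", 1)
    return fixed[::-1]

print(fix_last_c("S234567c5)", "Hong Kong"))
print(fix_last_c("1234567", "Australia"))
```

Note that this replaces the last "c" wherever it is, so a legitimate letter "c" in a value from China or Hong Kong would also be changed. A bigger sample would show whether that case occurs.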
Hi, @Saravanan13
Maybe more data is needed to cover your situation.
IIF([Country] in ('China', 'Hong Kong'), REGEX_Replace([Tax], '(c)(\d+?.?)$', '\($2'), [Tax])
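As a rough cross-check of the expression above, here is a Python sketch of the same logic: apply the replacement only for China or Hong Kong, and only when the "c" is followed by digits near the end of the string. The name `updated_tax` is hypothetical, and the pattern is a direct transcription, not a tested Alteryx formula.

```python
import re

def updated_tax(tax: str, country: str) -> str:
    """Fix a trailing "c<digits>)" only for China or Hong Kong."""
    if country in ("China", "Hong Kong"):
        # "c", then one or more digits, then at most one more
        # character (the closing bracket), anchored at the end.
        return re.sub(r"c(\d+?.?)$", r"(\1", tax)
    return tax

rows = [("A12456c1)", "China"), ("S234567c5)", "Hong Kong"), ("12345", "USA")]
for tax, country in rows:
    print(tax, country, "->", updated_tax(tax, country))
```

Anchoring on the end of the string keeps a "c" earlier in the value untouched, which matches condition 3 in the question.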
Thank you so much. It worked.