Alteryx Designer Desktop Discussions

SOLVED

RegEx for flexible string

daophuongtrinh
8 - Asteroid

Hello everyone,

 

I've recently started learning the RegEx tool, and I'm having difficulty finding an expression for the input data in the example below:

 

value amount(S) 12.000 USD dsgho

value amount(S) 14.050.476,2364 USD grkjsfeetuee

abc zyx ghk value amount(S) 499 13.478.400.000 USD gheiurh

 

The problem is that I need to parse the number in front of "USD", i.e. 12.000; 14.050.476,2364; and 13.478.400.000 (not "499"). The numbers contain both "." and ",", so I can't find a full expression for this.

 

So far I've only come up with this:

.*\(\w+\)\s which matches the leading text up to and including "(S)" and the space after it.

 

Please let me know if you have any idea for this expression.

 

Thank you very much,

Trinh

 

 

8 REPLIES
DavidP
17 - Castor

If you can bank on the fact that the bit preceding your number sequence always ends in amount(S) and the bit after the numbers always starts with USD, you can do this

 

DavidP_0-1582546146300.png

 

daophuongtrinh
8 - Asteroid

Thank you, but there is one more issue, in the 3rd row:

 

abc zyx ghk value amount(S) 499 13.478.400.000 USD gheiurh

 

I only want to parse 13.478.400.000, not the number "499". Do you have a solution for this?

 

Thanks much

Trinh

DavidP
17 - Castor

Sorry, I thought you wanted that included. 

 

Try this

 

DavidP_0-1582547048279.png

 

afv2688
16 - Nebula

Hello @daophuongtrinh,

 

How about this?

 

(.*\s)(.*)(\sUSD.*)
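
As a rough check outside Alteryx, here is a minimal Python sketch of how the three groups in this expression split the sample rows from the question. It uses Python's `re` module, which is an assumption on my part: the Alteryx RegEx tool may behave differently depending on the output method selected, and that is not tested here. The helper name `split_row` is made up for illustration.

```python
import re

# Expression quoted from the reply above; sample rows copied from the question.
THREE_GROUPS = re.compile(r"(.*\s)(.*)(\sUSD.*)")

SAMPLE_ROWS = [
    "value amount(S) 12.000 USD dsgho",
    "value amount(S) 14.050.476,2364 USD grkjsfeetuee",
    "abc zyx ghk value amount(S) 499 13.478.400.000 USD gheiurh",
]


def split_row(text):
    """Return the three captured groups, or None when the row does not match."""
    match = THREE_GROUPS.match(text)
    return match.groups() if match else None


for row in SAMPLE_ROWS:
    # The second element of each tuple is the token right before " USD".
    print(split_row(row))
```

The first group is greedy, so it swallows everything up to the last whitespace that still leaves " USD" available for the third group. That is why the "499" ends up in the first group rather than the second.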

 

Untitled.png

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Regards

DiegoParker
10 - Fireball

Hi @daophuongtrinh 

 

You can use the following expression: ([^\s]+)\sUSD.+$
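
For anyone who wants to try this expression outside Alteryx, the sketch below runs it against the three sample rows from the question using Python's `re` module. The choice of Python and the helper name `amount_before_usd` are assumptions for illustration only, not part of the Alteryx workflow.

```python
import re

# Expression quoted from the reply above: one run of non-whitespace
# characters, a single whitespace, then "USD" and at least one more character.
NUMBER_BEFORE_USD = re.compile(r"([^\s]+)\sUSD.+$")

SAMPLE_ROWS = [
    "value amount(S) 12.000 USD dsgho",
    "value amount(S) 14.050.476,2364 USD grkjsfeetuee",
    "abc zyx ghk value amount(S) 499 13.478.400.000 USD gheiurh",
]


def amount_before_usd(text):
    """Return the token directly before ' USD', or None if nothing matches."""
    match = NUMBER_BEFORE_USD.search(text)
    return match.group(1) if match else None


for row in SAMPLE_ROWS:
    print(amount_before_usd(row))
```

Because `[^\s]+` cannot cross a space, only the last token before " USD" is captured, so "499" is left out. Note that `.+` requires at least one character after "USD", so in this sketch a row that ends right at "USD" would not match.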

 

 

DiegoParker_0-1582562548245.png

 

 

Hope this answers your question. Could I ask you to mark it as the solution? This will help other users find it and mark the thread as solved. Many thanks!

 

Best,
Diego

 

daophuongtrinh
8 - Asteroid

Hi David, afv2688 and Diego,

 

Thank you very much. The expressions from David and Diego worked very well, but for some reason afv2688's expression didn't work.

 

Please find my full input data here. Since it's in a foreign language, let me explain more about the data:

 

- I need to parse the number in front of "VND" together with the word "VND"; the output should look like the attached Excel file. The first two rows have the structure: [string(S)] [the number I need] VND [string]. The other rows have this structure: [string(S)] [other number] [the number I need] VND [string]

- Besides, the number I need contains "." and sometimes "," as well; for example, it can be 12.000.000 or 12.000.000,1234

 

So the proper expression is .*\s(.+\sVND).* or ([^\s]+\sVND).+$
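
To illustrate, here is a small Python sketch comparing the two expressions on the two row structures described above. The attached data is not reproduced here, so the rows in `HYPOTHETICAL_ROWS` are made up in English to mimic those structures; Python's `re` module and the function name `amount_with_currency` are also assumptions for illustration.

```python
import re

# Both expressions are quoted from the post; only the capture group is used.
GREEDY_PREFIX = re.compile(r".*\s(.+\sVND).*")
LAST_TOKEN = re.compile(r"([^\s]+\sVND).+$")

# Made-up rows (not the real attachment) following the two described layouts.
HYPOTHETICAL_ROWS = [
    "payment amount(S) 12.000.000 VND note",
    "abc payment amount(S) 499 12.000.000,1234 VND note",
]


def amount_with_currency(text):
    """Return the capture of each expression as a (first, second) tuple."""
    first = GREEDY_PREFIX.match(text)
    second = LAST_TOKEN.search(text)
    return (
        first.group(1) if first else None,
        second.group(1) if second else None,
    )


for row in HYPOTHETICAL_ROWS:
    print(amount_with_currency(row))
```

On these made-up rows both expressions capture the same "number VND" pair and skip the extra number in front of it.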

 

Thank you very much for your kind support.

 

Best regards

Trinh

fmvizcaino
17 - Castor

Hi @daophuongtrinh ,

 

Would you be able to share your full dataset? Your yxzp package isn't working.

 

Best,

Fernando V.

daophuongtrinh
8 - Asteroid

Hi @fmvizcaino 

 

Please find the input data and the yxmd with the proper expression for this case in the attachment.

 

Best regards

Trinh
