Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex Parsng field with Different Structure

R_L
7 - Meteor

Hello,

 

I have a single field of data (let's called it Format) which comes in two format:

 

Format

USD123456.00

200115USD1234567.89

 

Essentially, for the above two formats, I want them to be parsed into three columns:

i.e for USD123456.00 this would be:

  • USD would fall into a Currency Field
  • 123456.00 would fall into an Value Field.

And for 200115USD1234567.89

  • 200115 would fall into field called Date
  • USD would fall into a field called Currency
  • 1234567.89 would fall into a field called Value Field

 

Simply, put there should be three columns that should only be outputted.

 

Example:

 

DateCurrencyValue
 USD123456.00
200115USD1234567.89

 

But also worth bearing in mind is that, sometimes a data-set will only have the Format field in 200115USD1234567.89 only, but should be made to be prepared for the possibility of USD123456.00.

 

 

I've taken a look at this example: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Parse-name-out-with-2-different-format...

But it appears to only work when outputting into 2 columns?

 

Thanks for any help!

 

 

 

3 REPLIES 3
fmvizcaino
17 - Castor
17 - Castor

Hi @R_L ,

 

You can use the parse method of the regex tool with the following expression:

(\d*)([^\d]*)(.*)

 

Each group of parenthesis means a group

\d* = digit zero or more times

[^\d]* = 'everything but a digit' zero or more times

.* = anything zero or more times

 

To learn more about regex, I suggest you https://regex101.com/

 

Example attached as well.

Best,

Fernando Vizcaino

R_L
7 - Meteor

Thanks @fmvizcaino 

 

I did the following just now, which worked - but your solution is faster!

IF
REGEX_Match([Format], "(\d{6})(\w{3})(.+)") THEN REGEX_Replace([Format], "(\d{6})(\w{3})(.+)", '$1|$2|$3')

elseif
REGEX_Match([Format], "(\d{0})(\w{3})(.+)") THEN REGEX_Replace([Format], "(\d{0})(\w{3})(.+)", '$1|$2|$3')

 

Else [Format]
ENDIF

fmvizcaino
17 - Castor
17 - Castor

Hi @R_L ,

 

That's great and it seems like you are pretty great in regex already!!

 

Using your expression, you almost got the most simplified solution there.

You only need to insert one thing and that will consider both possibilities.

(\d{0,6})(\w{3})(.+)

 

Best,

Fernando Vizcaino

 

Labels