In case you missed the announcement: Alteryx One is here, and so is the Spring Release! Learn more about these new and exciting releases here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Regex Special Criteria Extract

jake8281
8 - Asteroid

I am trying to have 3 Columns built using Regex.  but currently I can only have 2 at  a time in one regex 

 

 

as shown below 

Spoiler
Currency
Capture.PNGCapture.PNG
value & %

Capture.PNGCapture.PNG

 How Can I have all the 3 column together without having having the issue  of the  percentage column disappear when I add currency.  

Capture.PNG

 

 

Spoiler
Second Issue --->   How can I prevent Regex from extracting anything that is not currency in currency column 

Capture.PNG

 

5 REPLIES 5
GMG0241
8 - Asteroid

Hi Jake,

 

I'm hoping I've solved both problems for you, but I'm not certain as I've made some assumptions about the data.

 

Firstly, use the following regex in your tool: (\d+%)|((?:\d+,)?\d+\.\d{2}\s.*?)\s And change the name value to joint currency (or something else, or leave it the same, it's up to you). Next, add a split to columns tool on Joint Currency and select 2 as the number of columns. Finally use a select to create more sensible names.

 

Secondly, my assumptions are as follows:

1.  the percentage appears in place of the currency information (i.e. you never have both a percentage and currency information)

2. the currency information is always NUMBER [SPACE] [CURRENCY CODE] (i.e. currency code appears AFTER the number)

 

Please see the attached workflow with my suggested configurations and let me know if the workflow doesn't solve the problem

 

jake8281
8 - Asteroid

For the first issue my code is working. I am just trying to make it more efficient by having only one regex tool instead of 2. I need help Combining the first 2 ( %, $) --->  (\d+%)|(\w\d,\d+.\d+|\s\d,\d+.\d+)  and Currency column --> \d(\s\u+) to have 3 columns pop up without messing the data so I just need a way to add  (\d+%)|(\w\d,\d+.\d+|\s\d,\d+.\d+) |\d(\s\u+) without currency picking up wrong criteria  and without giving  null for  45% in % column

Spoiler
 
Capture.PNG

 

 

 

How did you solve the second issue ? 

 

can you break down explanation for this 

(\d+%)|((?:\d+,)?\d+\.\d{2}\s.*?)\s

flying008
15 - Aurora

Hi, @jake8281 

 

FYI.

 

1-  Assuming that your [Value] field column always ends with . 00
2- You must unchecked  'Case Insensitive.'

Spoiler
(?<=,\s)(\d+%)?(\d[\d,]*?.00)?\s([A-Z]{2,})?

录制_2025_03_29_10_33_12_543.gif

 

******

If can help you get your want, please mark it as a solution and give a like or more share.

 

 

 

jerry_Tomson
5 - Atom

 

The image shows a Regular Expression (Regex) configuration for data extraction. The Regex pattern being used is:

 

matlab
CopyEdit
(\d+%)|(\d{1,3}(,\d{3})*(\.\d+)?)(\s+)(\w+)
 

Breakdown of the Regex Pattern:

  1. (\d+%) → Captures percentage values (e.g., "45%").

  2. (\d{1,3}(,\d{3})*(\.\d+)?) → Captures numeric values, allowing:

    • Thousands separator (e.g., "10,216.00").

    • Decimal values (e.g., "400.50").

  3. (\s+) → Captures whitespace between the number and the currency.

  4. (\w+) → Captures the currency (e.g., "USD", "Bcus").

Extracted Fields in the Output:

  • % → Percentage values.

  • value → The extracted numerical values.

  • currency → The extracted currency text.

Observations from Output Table:

  • The regex correctly extracts:

    • "45%" from "Bcus - Susie1 Bcus Plank 45% Alllllal".

    • "400.50 USD" from "Bcus - automobil Bcus, 400.50 USD Alllsali".

    • "10,216.00 USD" from "Bcus - submit Bcas, 10,216.00 USD Adddad: dfkl...".

Possible Enhancements:

  1. Allow Spaces in Currency (if needed):

    • Modify (\w+) to ([A-Za-z\s]+) to accommodate multi-word currencies like "Hong Kong Dollar".

  2. Handle Edge Cases Like Negative Values:

    • Update numeric pattern to (-?\d{1,3}(,\d{3})*(\.\d+)?) to capture negative numbers.

 

GMG0241
8 - Asteroid

Hi Jake,

 

My approach makes 2 columns to ensure that 45% doesn't give null, and then I split the 2nd column by a space to create the 3rd column. 

 

The Regex does the following:

(\d+%)|((?:\d+,)?\d+\.\d{2}\s.*?)\s

 

(\d+%) - 1st capture group is capturing the percentage

| - or

((?:\d+,)?\d+\.\d{2}\s.*?)\s - 2nd capture group, (?:\d+,)?\d+\.\d{2} has a non-capture group and the regex checks if the currency value has a comma in it, \s.*?)\s is matching everything in between the space after the currency and the next space that appears after that, so the 2nd capture group captures the currency value, a space, and the information after the space and before the next space.

Labels
Top Solution Authors