Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parsing with regex

jenner85
8 - Asteroid

Hi - 

I want to get rid of all of the numbers and spaces after these labels - I tried removing numbers with data cleansing tool and was left with the commas in the numbers, and I can't remove punctuation because I need the dashes in the words:

 

SALES                                          2,015                 341,912.81           60,758               9,911,261.62
SALES CLEARING ADJUSTMENTS                         0                       0.00               56                  22,213.72-
FINANCE CHARGE DEBIT ADJ-CLEARING                  0                       0.00                0                       0.00

 

Thanks,

Jen

9 REPLIES 9
atcodedog05
22 - Nova
22 - Nova

Hi @jenner85 

 

I really wanted to used regex on this but there is a issue for some reason. Here is a workaround. Basically i am finding double spaces and replaicing them ! by using them as reference i am keeping the first column.

atcodedog05_0-1608228838308.png

I hope this is what you were looking for


Hope this helps 🙂 Feel to ask if you have any questions

 

leon2020
7 - Meteor

Not used Regex, but I would use a formula tool here... first looking for the first double-space in the string, then using the LEFT function to read that number of characters.

 

leon2020_0-1608229261677.png

 

 

atcodedog05
22 - Nova
22 - Nova

Interesting approach @leon2020 🙂

jenner85
8 - Asteroid

thanks everyone! This worked for me also: 

 

^(\D+) \s\s

atcodedog05
22 - Nova
22 - Nova

Happy to help 🙂 @jenner85 

 

Cheers and Happy Analyzing 😀

 

Feel free to reach out if you face any issues 🙂

estherb47
15 - Aurora
15 - Aurora

@jenner85 

Great work in finding a solution on your own. I was just writing that suggestion myself - am a big fan of \D in regex to find non-numbers in a group. Then the double space defines where you cut off the information.

 

Cheers!

Esther

atcodedog05
22 - Nova
22 - Nova

I am still familiarizing myself with \D 😅

djaipras
7 - Meteor

That looks like it targets the character strings at the start of the string instead of removing digits and punctuation.

 

If wanting to remove digits and punctuation (assuming that the string might not always start with a non-digit), this could also work using the formula tool:

 

Trim(REGEX_Replace([Text], "\d+|\,+|\.+|\-$", ""))

 

[Text] would be the input string

 

It replaces the following with an empty string

1. any number of digits  = \d+

2. any number of commas = \,+

3. any number of periods = \.+

4. any dash at the end of a string = \-$

 

The the Trim function removes the trailing and leading spaces without touching the space between character strings

JesseMoyer
5 - Atom

This should work using the formula tool.

 

REGEX_Replace(uppercase([Text]), "\>\s+\d.*", "")

 

  • \> - End of any word
  • \s+ - Followed by one or more spaces
  • \d.+ - Followed by a digit and one or more of any character
Labels