Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Extract Number and String from a String

ali52786
5 - Atom

Hi

 

I really need help with figuring out how to extract a specific string and number from a description. The format of the text is as follows: GOLDEN TEMPLE 9KG. This is the similar structure for other string descriptions. I am looking to extract 9 and KG in 2 separate columns. How do I go about doing this? In addition, one other complexity the number can be either one digit or 2 digits and there can be a space between the 9 and KG in some cases as well

7 REPLIES 7
atcodedog05
22 - Nova
22 - Nova

Hi @ali52786 

 

It should be possible with regex

 

Can you please provide some more examples.

atcodedog05
22 - Nova
22 - Nova

Hi @ali52786 

 

Here is what you can do. You can use Regex tool parse mode to catch the values.

 

Workflow:

atcodedog05_0-1611505196797.png

 

Regex explanation

(\d+) catch number

\s? space optional in between

(\w+) catch word after it

 

Hope this helps 🙂 Feel to ask if you have any questions

 

ali52786
5 - Atom

Hi @atcodedog05 

 

This is great and works as expected. However as I am going through my data there are instances of when the string looks as follows: MEZBAN 20PCS 1600GMS. In this case I am looking to extract the 1600gms which is the weight of the item (the weight in my product descriptions are generally in kg, lb, gm, gms). How would I have to modify the above in order to pick the weight out when preceded by the above abbreviations?

 

Thanks for your help and quick response

ali52786
5 - Atom

Hey @atcodedog05 

 

In addition the weight can also be as 82.5GMS in the description. When this happens, the current formula shows it as 8 in one column and 2 in another column. In this case I would want to show 82.5 and GMS in 2 separate columns

 

Thanks,

mhey01
7 - Meteor

Hey @

 

Is this the output you're expecting?

 

regex_weight.png

 

I've added a small piece of RegEx to what @atcodedog05 supplied:

 

(\d+)\s?(kg)|(\d+.?\d+)(gms)

 

@atcodedog05 has already explained the first half, here's an explanation for the second:

 

| -> or

\d -> capture digit

+ -> one or more repetitions

. -> capture decimal for gms

? -> makes decimal optional

gms -> explicit statement to find string 'gms'

 

You'll notice that the output supplied specifically returns the pieces of regex enclosed in brackets. Brackets are used as groups in regex, and when used in Alteryx, this provides your output.

 

If you'd like to look more into regex, do a Google search for 'regexone' (I would supply the URL but it won't allow me to submit).

 

Hope this helps! 🙂

BretCarr
10 - Fireball

It would really help if you could give us a sample set of data to work with as opposed to spoon feeding us new situations! 😂haha

 

The regex expression pattern I came up with should handle your various examples. You may want to add any and all weight descriptions to my OR list (it's the one with the pipes).

 

 

 

 

\s(?<weight>[\d\.]*)\s?(?<unit>(?:KGS|KG|LBS|LB|GMS|GM))

 

 

 

 

As I've mentioned before, I really like labeling my parsing sections so you can see where each extraction or parse grouping nice and organized.

atcodedog05
22 - Nova
22 - Nova

Interesting approaches by @mhey01 and @BretCarr 🙂.

Hi @mhey01 I would request you to also provide the workflow. So that you can help the users much more in understanding the solution 😀

 

Here is my take on the regex. Updation to my previous solution.

 

Workflow:

atcodedog05_0-1611561179179.png

 

Regex explanation

([\d\.]+) catch number including decimal point

\s? space optional in between

(kg|gms|gm|lb)catch words kg,gms,gm,lb

 

Hope this helps 🙂 Feel to ask if you have any questions

Labels