Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

To Extract between words

jerometyl
7 - Meteor

Dear Experts,

 

I would like to extract "PR8470" and "PR8475" and "PR8474"  from the below string. 

I have multiple row of data 

 

OUR REF : PR8470YOUR REF : NST000ECI21000206 <(>&<)> NST000ECI21000213PO AMENDED as at 23 July 2021 1.

OUR REF : PR8475YOUR REF : T/CG/10/FY19PO TO BE READ WITH 

OUR REF : PR8474YOUR REF : NST000ECI21000207OPTIONAL AWARD TO BE EXERCISED IF REQUIRED 

7 REPLIES 7
DataNath
17 - Castor

Hey @jerometyl, will the rows always be in this format? At least the beginning, up until the part you want to extract, and with a 6-character reference? If so, you can use something like this but if you have any problems when applying this to a larger data set, let us know and we can revisit:

 

:\s(.{6})

 

DataNath_0-1660211438947.png

AuraRautoiu
5 - Atom

Hi,

 

If you will have the same format always, you can use also the below. If changes, you can change the formula at which character should start and how many characters to extract.

AuraRautoiu_0-1660216254450.pngAuraRautoiu_1-1660216270283.png

AuraRautoiu_2-1660216292874.png

 

MilindG
12 - Quasar
Spoiler
MilindG_0-1660220257614.png

 

jerometyl
7 - Meteor

Hi DataNath, some time it is 6 characters, sometimes 7. Basically, behind will end with Your Ref: I need it to be after Our Ref: and before your ref: 

DataNath
17 - Castor

@jerometyl in that case, the following should work. I've updated the expression and shown an example of a reference containing loads of characters just to show it'll pick up anything between the Our Ref: and Your Ref:

 

:\s(.+)YOUR REF

 

DataNath_0-1660221614683.png

MarqueeCrew
20 - Arcturus
20 - Arcturus

@jerometyl ,

 

I believe that you have structured data when I see the examples.  You start with a FIXED string.  Follow it with a VARIABLE string. Follow it with a known (FIXED) string and follow that with "stuff".  Here's what I see:

 

[STUFF to IGNORE][Good STUFF}[STUFF to IGNORE]

 

The variability of "Good Stuff" makes the challenge more interesting.  I like where @AuraRautoiu was going with the Substring function.  That function allows you to START at a fixed position and he/she then applied a FIXED length of 6.  Heck, you could replace "YOUR  REF:" (note the 2 spaces between YOUR and REF) with a PIPE (|) and this problem is easy to solve with a text to columns tool.  But back to the string function ...

 

Substring([Field1],10,FindString([Field1], "YOUR REF :") -10)

 

The findstring() function allows you to search for "YOUR  REF :" and see it in position 16 most of the time.  That means that you want what STARTS in position 10 and ends 6 characters later (16 minus 10).  When the target is 7 digits, YOUR will appear in position 17.  We know that 17 minus 10 equals 7 and so you will get your longer/shorter values with this expression.

 

A regular expression, RegEx, will work for you as @DataNath provides.  I've answered many RegEx posts and know it fairly well but in your case I wouldn't choose this as a solution path.  I wonder how @Emil_Kos would solve the challenge.  I've tagged him and we'll see how he goes about this.  I don't disagree with DataNath.  If the data was more irregular or if you wanted to learn RegEx you could explore something like:

 

Regex_Replace([Field1],".*?(PR\d{4,})\D.*",'$1')

 

Look for ANYTHING until you find the FIRST appearance of "PR" followed by 4 or more digits (\d).  Grab that PR9999....9 as a group.  When you see your first non-digit (\D), ignore the rest. 

 

In that expression, the result doesn't depend on a fixed input.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Emil_Kos
17 - Castor
17 - Castor

Hi @MarqueeCrew,

 

Thank you for flagging me. The answer always depends on the data. If the amount of data is relatively small, I will go with using find and replace tool. I like having all the fields I am searching for in the input tool and in my experience having a workflow that is easy to maintain is almost always my number one priority. 

 

I am assuming that @jerometyl is interested in only those particular REF numbers regex might be quite tricky. If the data structure is always the same, it is very possible that I would propose the same approach as you but not only because of the speed but because it would be easier to maintain by the end user. 

 

If it isn't the same I would go with writing IF contains formulas with several IF function. 

 

Labels