Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extract data/words (varying positions) from long text strings based on other texts

Vv_Falherity
5 - Atom

Hello! 

 

I must be extremely obtuse because I have spent far too long trying to work this out (what seemed like quite a simple issue, or so I thought!) using solutions posted for similar queries and regex resources, to no avail 😞 So I would really appreciate any help I can get, please and thank you!!

 

Essentially this came about because I parsed some PDFs to data to be consumed, so everything has ended up in one string. The stage I'm at now is trying to extract certain words from 1 string following some specific patterns within that string. I think this table below describes my problems better than another paragraph: 

Problem 1
Description: Extract the 2 words that come after "Actual Requirement Amount" (this text never changes) and the 2 words following it (these do change). Note: positions and other terms within the string change and vary, but the values of interest I wish to extract are always 2 words away from "Actual Requirement Amount" in every string.
String text: B00B6957764XZ221 Project Tee Actual Requirement Amount as at 12/12/2022 99,509.09 5% buffer | GPX | 1
Desired result in this example data: 12/12/2022 99,509.09

Problem 2
Description: Extract the 1 word before "Total Final Value". Note: the position within the string changes, but the amount of interest is always the word immediately before the "Total Final Value" text.
String text: IDer: GB00B6957764(6,262),GB00B6957764(8,770),GB6957764313(202,300) £338,643.20 £338,643.20 Total Final Value GP7H1
Desired result in this example data: £338,643.20

 

As noted above, there are patterns to these strings with respect to the desired data I wish to extract - thus I have looked into regex (formula, tools, everything). However, for the life of me, I cannot work it out!

 

I can upload examples of data like the above in other forms (a worksheet or anything else) if that would make it any easier for you to help me. Otherwise I would be extremely grateful for any ideas, many thanks!

3 REPLIES
vsoni
Alteryx

Regex is your friend here; you'll want to parse the string by identifying the bits before/after the types of values you want.

 

The attached example will give you what you want.
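For readers who cannot open the attachment, here is a minimal sketch of the kind of regex that fits the two examples in the question. It is written in Python purely for illustration and is not the attached workflow; the names `REQ`, `TOTAL`, `s1` and `s2` are made up for this sketch, and it assumes that "words" are separated by whitespace. The same patterns should be usable in a RegEx tool set to Parse, though that has not been checked against the real data.

```python
import re

# Problem 1: after the fixed phrase, skip the two changing words ("as at")
# and capture the next two words (the date and the amount).
REQ = re.compile(r"Actual Requirement Amount\s+\S+\s+\S+\s+(\S+)\s+(\S+)")

# Problem 2: capture the single word immediately before the fixed phrase.
TOTAL = re.compile(r"(\S+)\s+Total Final Value")

# Sample strings copied from the question.
s1 = ("B00B6957764XZ221 Project Tee Actual Requirement Amount as at "
      "12/12/2022 99,509.09 5% buffer | GPX | 1")
s2 = ("IDer: GB00B6957764(6,262),GB00B6957764(8,770),GB6957764313(202,300) "
      "£338,643.20 £338,643.20 Total Final Value GP7H1")

m1 = REQ.search(s1)
date_and_amount = " ".join(m1.groups()) if m1 else None

m2 = TOTAL.search(s2)
final_value = m2.group(1) if m2 else None

print(date_and_amount)  # 12/12/2022 99,509.09
print(final_value)      # £338,643.20
```

Using `\S+` for a "word" keeps punctuation such as `/`, `,` and `£` inside the captured value, which matters for dates and amounts like these.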

PhilipMannering
16 - Nebula

@vsoni's solution looks good for these examples. A more generic solution might be useful. See attached.
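As a rough illustration of what a more generic approach could look like (this is not the attached workflow, whose contents are not shown here), the anchor phrase and the word counts can be turned into parameters. The helper names `words_after` and `words_before` and their arguments are hypothetical, and the sketch assumes whitespace-separated words.

```python
import re

def words_after(text, anchor, skip=0, take=1):
    """Return `take` words that start `skip` words after `anchor` (hypothetical helper)."""
    # re.escape keeps any special characters in the anchor literal.
    pattern = re.escape(anchor) + r"(?:\s+\S+){%d}((?:\s+\S+){%d})" % (skip, take)
    m = re.search(pattern, text)
    return m.group(1).split() if m else []

def words_before(text, anchor, take=1):
    """Return the `take` words immediately before `anchor` (hypothetical helper)."""
    pattern = r"((?:\S+\s+){%d})" % take + re.escape(anchor)
    m = re.search(pattern, text)
    return m.group(1).split() if m else []

# Sample strings copied from the question.
s1 = ("B00B6957764XZ221 Project Tee Actual Requirement Amount as at "
      "12/12/2022 99,509.09 5% buffer | GPX | 1")
s2 = ("IDer: GB00B6957764(6,262),GB00B6957764(8,770),GB6957764313(202,300) "
      "£338,643.20 £338,643.20 Total Final Value GP7H1")

print(words_after(s1, "Actual Requirement Amount", skip=2, take=2))
print(words_before(s2, "Total Final Value", take=1))
```

Both helpers return an empty list when the anchor is missing, so rows without the phrase can be filtered out rather than raising an error.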

 

Vv_Falherity
5 - Atom

Thank you both!! This community is amazing.  

 

@vsoni Thank you for posting this solution - it is allowing me to close out this immediate issue that has been keeping me from completing a workflow/project (that I've been sitting on for far too long now), so it's been an immense help.

 

@PhilipMannering Thanks so much for adding to the solution - I've been trying to look at different ways of addressing the general data problems that I encounter, so this will be a great resource over the next couple of days as I run into different but generally similar data issues. Really appreciate it!
