Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Find a certain substring in a string and return the substring that comes before or after i

apaicanada
7 - Meteor

Hi all,

 

I am trying to separate out the data in a PDF-to-text output and it seems the best way to do it so far is to use an IF CONTAINS formula and then return the nth string in the line (will be different for each text).

 

This is my dataset:

 

EMPLOYEE1 154812 REGULAR 3750 36750 77406 758575 GOVT PEN 7615 41534 EI CONT 2508 13794

 

So two questions, 

 

Is there a formula to say: if the string contains "EMPLOYEE1" then return the substring right after "EMPLOYEE1" (which would be 154812)?

 

AND

 

Is there a formula to say: if the string contains "EMPLOYEE1" then return the second substring right after "EMPLOYEE1" (which would be REGULAR)?

 

Thanks so much, any help at all is appreciated!

7 REPLIES 7
AndrewS
11 - Bolide

Hi @apaicanada ,

 

You could use a Getword formula.

 

IF (GetWord([Field1], 0) = "EMPLOYEE1" )then GetWord([Field1],1) else null() endif

 

Alternatively, a text to column using a \s delimiter.

 

 

apaicanada
7 - Meteor

Thanks for the reply! The problem is however I'm not sure if a certain word will be at that exact spot everytime in the string. For example, REGULAR may not always be number 2 in the string. Do you have any other ideas that could identify just the word and return the word before it?

AndrewS
11 - Bolide

How about this - 

Getword([Field1],(FindString([Field1], "EMPLOYEE1") + 1))

Edit - this doesn't work

AndrewS
11 - Bolide

This is getting away from a formula, but you could send the data to rows and use a multi row?

Thableaus
17 - Castor
17 - Castor

Hi @apaicanada 

 

I might have overcomplicated this, but you could try:

 

IF Contains([Field1], "EMPLOYEE1") THEN
GetWord(
Right(Field1,
Length([Field1]) -
FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")), 1)

ELSE Null() ENDIF

 

I'm gonna try to open this up for you:

Replace([Field1], "EMPLOYEE1", "|") - Replace the word for a single character "|", which is easier to be identified by FindString function.

 

FindString(Replace([Field1], "EMPLOYEE1", "|"), "|") - Returns the position of this "|" character.

 

Length(Field1) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")  - Returns the number of characters we want to bring with the Right function.

 

Right(Field1, Length([Field1]) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")) - Return the "n" characters to the right of your whole string. This isolates the string so we can start with "EMPLOYEE1" being our first string always.

 

GetWord(Right(Field1, Length([Field1]) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")), 1)

 

Get our 2nd word in the string, the 1st after "EMPLOYEE1". If you want the 3rd, just add 1.

 

Cheers,

apaicanada
7 - Meteor

This is a great solution - it's getting very close to solving my problem.

 

I wonder, however, what if the base word was REGULAR instead of EMPLOYEE1 (since the name of the employees will always be changing)? Is there a way of getting 154812 using REGULAR as the source?

 

 

Thanks so much,

Akarsh

apaicanada
7 - Meteor

Nevermind, I got it! Just had to change the Right to Left.

 

Thank you so much! This is a superb solution. 

 

IF Contains([Data], "REGULAR") THEN
GetWord(
Left([Data],
Length([Data]) -
FindString(Replace([Data], "REGULAR", "|"), "|")),1 )

 


ELSE Null() ENDIF

Labels