Alteryx Designer Desktop Discussions

apaicanada · ‎09-10-2019

Hi all,

I am trying to separate out the data in a PDF-to-text output and it seems the best way to do it so far is to use an IF CONTAINS formula and then return the nth string in the line (will be different for each text).

This is my dataset:

EMPLOYEE1 154812 REGULAR 3750 36750 77406 758575 GOVT PEN 7615 41534 EI CONT 2508 13794

So two questions,

Is there a formula to say: if the string contains "EMPLOYEE1" then return the substring right after "EMPLOYEE1" (which would be 154812)?

AND

Is there a formula to say: if the string contains "EMPLOYEE1" then return the second substring right after "EMPLOYEE1" (which would be REGULAR)?

Thanks so much, any help at all is appreciated!

AndrewS · ‎09-10-2019

Hi @apaicanada ,

You could use a Getword formula.

IF (GetWord([Field1], 0) = "EMPLOYEE1" )then GetWord([Field1],1) else null() endif

Alternatively, a text to column using a \s delimiter.

apaicanada · ‎09-10-2019

Thanks for the reply! The problem is however I'm not sure if a certain word will be at that exact spot everytime in the string. For example, REGULAR may not always be number 2 in the string. Do you have any other ideas that could identify just the word and return the word before it?

AndrewS · ‎09-10-2019

How about this -

Getword([Field1],(FindString([Field1], "EMPLOYEE1") + 1))

Edit - this doesn't work

AndrewS · ‎09-10-2019

This is getting away from a formula, but you could send the data to rows and use a multi row?

Thableaus · ‎09-10-2019

Hi @apaicanada

I might have overcomplicated this, but you could try:

IF Contains([Field1], "EMPLOYEE1") THEN
GetWord(
Right(Field1,
Length([Field1]) -
FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")), 1)

ELSE Null() ENDIF

I'm gonna try to open this up for you:

Replace([Field1], "EMPLOYEE1", "|") - Replace the word for a single character "|", which is easier to be identified by FindString function.

FindString(Replace([Field1], "EMPLOYEE1", "|"), "|") - Returns the position of this "|" character.

Length(Field1) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|") - Returns the number of characters we want to bring with the Right function.

Right(Field1, Length([Field1]) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")) - Return the "n" characters to the right of your whole string. This isolates the string so we can start with "EMPLOYEE1" being our first string always.

GetWord(Right(Field1, Length([Field1]) - FindString(Replace([Field1], "EMPLOYEE1", "|"), "|")), 1)

Get our 2nd word in the string, the 1st after "EMPLOYEE1". If you want the 3rd, just add 1.

Cheers,

apaicanada · ‎09-10-2019

This is a great solution - it's getting very close to solving my problem.

I wonder, however, what if the base word was REGULAR instead of EMPLOYEE1 (since the name of the employees will always be changing)? Is there a way of getting 154812 using REGULAR as the source?

Thanks so much,

Akarsh

apaicanada · ‎09-10-2019

Nevermind, I got it! Just had to change the Right to Left.

Thank you so much! This is a superb solution.

IF Contains([Data], "REGULAR") THEN
GetWord(
Left([Data],
Length([Data]) -
FindString(Replace([Data], "REGULAR", "|"), "|")),1 )

ELSE Null() ENDIF

Alteryx Designer Desktop Discussions

Find a certain substring in a string and return the substring that comes before or after i