Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

parse string using other strings

Mikis
8 - Asteroid

Hello,

 

I have a list of strings with an underscore seperator like the following:

ftThreefinalgrps_brd4_co10
ftif5_co10
ftqn5_bra1

I also have a list of "identifiers", that tell me what each part of the string means.

please note that the number of identifiers and their name is variable

br

ft

co

 

The goal is to parse the string using the identifiers.

Using my first variable, I want to find everything after "br" and before the underscore (if there is any), in this case "d4".

Then I want everything after "ft" and before the underscore, being "Threefinalgrps"

Finally, I want everything after co and before the underscore, so "10"

 

My ideal final result would be:

variableftbrco
ftThreefinalgrps_brd4_co10Threefinalgrpsd410
ftif5_co10if5 10
ftqn5_bra1qn5a1 

 

Does anyone have a suggestion on how to start?

 

Thanks!

Mikis

14 REPLIES 14
ponraj
13 - Pulsar

Here is the sample workflow for your case. Hope this is helpful. 

 

Workflow and resultsWorkflow and results

CharlieS
17 - Castor
17 - Castor

I've attached an example solution that's a little different from the elegant route @ponraj took. We both relied on the '_' splitting each identifier, but I used the list of strings and their solution parsed all strings as identifiers.

jdunkerley79
ACE Emeritus
ACE Emeritus

2018-11-28_15-43-55.png

 

I would jump to Regex:

REGEX_Replace(
   [variables],
   ".*?(" + [Field1] + "([^_]+))?.*?",
   "$2")

But it is pretty hard so an alternative is:

IIF(Contains([variables], [Field1]),
    Substring([variables], FindString([variables], [Field1])+length([Field1]),1000), 
    NULL())

This will trim the string down and then

IIF(Contains([Match], "_"), Left([Match], findstring([Match],"_")-1), [Match])

To chop off at the first _

 

Sample of both attached.

Mikis
8 - Asteroid

Woah, some excellent answers.

 

They all worked, but I think I'm going with @jdunkerley79's suggestion because it doesn't rely on the underscore splitting.

The problem is that I have absolutely no idea how many underscores the field will have, and not having to make the assumption sounds lovely.

 

It'll take some time before I understand why that regex thing works, but it works great!

 

What do you mean however with "its pretty hard"? Do you mean it's a difficult formula? Because you already wrote it, copying it is easy :)

 

Cheers!

Mikis

Mikis
8 - Asteroid

@jdunkerley79 there's still a small issue in the script.

 

If a string by coincidence contains one of the codes, then it would be split incorrectly.

 

for example the following string

ftThrebrefinalgrps_brd4_co10

Then the "br" would split "efinalgrps"

 

So I would actually want to split on "_br" unless it's the first 2 characters.

Does that make sense?

 

I changed your code to

REGEX_Replace(
   [variables],
   ".*?(" + "_" + [Field1] + "([^_]+))?.*?",
   "$2")

But I would want to make an exception if it's the first 2 letters :)

 

I hope I explained this properly...

 

Regards,

Mikis

Mikis
8 - Asteroid

Got it :)

It's not very graceful, but it works.

(I renamed "field1" to "dimensions")

i added a helper column "before_dimension" to see what comes before the field

TRIM(Left([variables], FindString([variables], [dimensions])))

And then I check: "before_dimension" should either be null (if it's the first word) or the last symbol should be an "_"

if Length([before_dimension]) = 0 then [Match] elseif Right([before_dimension], 1) = "_" then [Match] else null() endif

correctedMatch.PNG

jdunkerley79
ACE Emeritus
ACE Emeritus

Sorry to be slow replying:

REGEX_Replace(
   [variables],
   ".*?((?=^|_)" + [dimensions] + "([^_]+))?.*?",
   "$2")

Should fix the issue. It does a look behind to check at a _ or the start of a string

 

Mikis
8 - Asteroid

[I said it worked perfectly, but I found an issue later]

Mikis
8 - Asteroid

@jdunkerley79

Hm, now I have a new issue.

It no longer finds anything after the underscore.

i don't want to be lazy and find the solution myself, but **bleep** regular expressions are complicated :O

 

missing.PNG

Labels