Alteryx Designer Desktop Discussions

hellyars · ‎08-24-2018

Trying to parse an outlined document.

The bulk of my rows include [Orig] between the outline and the text I want to parse. I can parse this with a Regex Tool using (\d.+) (\[Orig\] .+) (thanks to the Community). This is NOT my question.

042-1 [Orig] General

My problem is outlined below. Pun totally intended.

There are a few "reject" rows that lack the [Orig].

233-4 Inversion Requirement
626-2.1.1 Seating Characteristics Blah Blah

How can I insert a "[Add]" to match the standard configuration and enable it to be parsed with a similar Regex Tool (\d.+) (\[Add\] .+)????

The outline is 1-N. So, it's not just a matter of inserting it X positions from the left.

Thanks

neilgallen · ‎08-24-2018

I might think about this differently. Rather than specifically calling out "Orig" in the regex pattern, I would ask about the spacing.

Are the outline values always consecutive characters with no spacing? If so, then your delimiter isn't the [Orig], it's the space between them.

Your regex string could be ^.+\s.+

Then you wouldn't need the configuration for the [Orig]

hellyars · ‎08-24-2018

@neilgallen

So, I tried with marked groups. I want the outline in Col 1 and the Text that either begins or does not begin with [Orig] in Col 2.

( ^.+)\s(.+)

No joy. It did not throw an error, just [Null]s.

I also tried a variation of the expression I am currently using (.+) (\[Orig\] .+), without the "Orig" reference.

(.+) \s (.+)

No joy, more [Null]s.

neilgallen · ‎08-24-2018

apologies, as sometimes regex is difficult to do off the top of my head without a few testing runs.

([^\s]+)\s(.*$)

should work. This would parse the full outline to begin the string, and everything AFTER the first space, regardless of the presence of [Orig]
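As a rough check of the idea outside Alteryx, here is a minimal sketch in Python's re module (an assumption for illustration: the Regex Tool's engine and configuration may behave differently). The helper name split_outline is hypothetical; the sample rows come from the first post.

```python
import re

# Pattern from the post above: group 1 is the leading run of non-whitespace
# (the outline), group 2 is everything after the first whitespace character.
PATTERN = re.compile(r"([^\s]+)\s(.*$)")

def split_outline(row):
    """Return (outline, text), or None when the row has no whitespace to split on."""
    m = PATTERN.match(row)
    return (m.group(1), m.group(2)) if m else None

rows = [
    "042-1 [Orig] General",
    "233-4 Inversion Requirement",
    "626-2.1.1 Seating Characteristics Blah Blah",
]
for row in rows:
    print(split_outline(row))
```

Because the split happens at the first whitespace character, rows with and without [Orig] go through the same pattern.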

hellyars · ‎08-24-2018

@neilgallen

([^\s]+)\s(.*$)

Sorry, still no joy. All [Nulls] and I tried a few mods.

Here is real data.

075-1.9 [Orig] All stainless steel fasteners including bolts, threaded nuts, holes and inserts shall be lubricated with a thin coat of anti-seize lubricant, Tef-Gel or equal, prior to assembly.

043-10.1.1 Dimensions given in inches and fractions: blah blah. More blah blah blah.

neilgallen · ‎08-24-2018

see attached. parsing using all non-whitespace seems to get you there.
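For the original question of inserting "[Add]" into the rows that lack [Orig], the same non-whitespace idea can be sketched in Python (again an assumption: this is not the attached workflow, and the name add_tag is hypothetical). The lookahead skips rows whose text already starts with [Orig].

```python
import re

# Leading outline (run of non-whitespace) followed by whitespace, matched only
# when what comes next is neither more whitespace nor the literal "[Orig]".
MISSING_TAG = re.compile(r"^(\S+)\s+(?!\s|\[Orig\])")

def add_tag(row, tag="[Add]"):
    """Insert `tag` after the outline number if the row has no [Orig] marker."""
    return MISSING_TAG.sub(lambda m: f"{m.group(1)} {tag} ", row, count=1)

rows = [
    "042-1 [Orig] General",
    "233-4 Inversion Requirement",
    "626-2.1.1 Seating Characteristics Blah Blah",
]
for row in rows:
    print(add_tag(row))
```

Rows rewritten this way have the same shape as the [Orig] rows, so a pattern such as (\d.+) (\[Add\] .+) can parse them.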

hellyars · ‎08-24-2018

It works. But, what is the significance of "\S" vs. "\s"?

I can decipher...

^ start at the beginning of line

\S ???

+ one or more

\s space - (the space between the outline # and what follows)

. any single character (this is the start of the text)

* zero or more (characters)

$ until the end

neilgallen · ‎08-24-2018

\S is non-whitespace: it matches any single character that is not whitespace. Essentially the opposite of \s.
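A small sketch of the two classes side by side, using Python's re module as a stand-in (an assumption; the sample string is a shortened version of the real data row above):

```python
import re

sample = "075-1.9 [Orig] All stainless"

# \s matches one whitespace character (space, tab, newline, ...).
whitespace_hits = re.findall(r"\s", sample)

# \S matches one character that is NOT whitespace; \S+ grabs a whole run.
non_whitespace_runs = re.findall(r"\S+", sample)

# [^\s] is the longhand spelling of \S, so both give the same runs.
longhand_runs = re.findall(r"[^\s]+", sample)

print(whitespace_hits)
print(non_whitespace_runs)
print(longhand_runs == non_whitespace_runs)
```

So ^\S+ stops at the first space, which is why it captures the whole outline number no matter how many levels it has.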


How to insert text after a variable outline # and before text?