Alteryx Designer Desktop Discussions

SOLVED

Regex parsing help needed with special characters "<" and ">"

sdodero
6 - Meteoroid

Hello Community,

 

I am new to Alteryx and I am having an issue parsing a column that contains the special characters "<" and ">". I am able to parse the other special characters in the column, but I am stuck on these two.

 

Here's an example of a value in that column:

[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D

 

I am using the Parse function with these expressions

Column A: \[(.*)\]

Column B: \((.*)\)

Column C: \[x60](.*)\[x62]

But so far no luck: the field for column C comes back empty.

 

Kenda
16 - Nebula

Hey @sdodero!

 

This may not be the most efficient way to accomplish what you're looking for, but the below expression worked for me in a Formula tool:

regex_replace(Replace(Replace([Field1], "<", ":"),">",":"),".*:(.*):.*","$1")

Basically, it replaces each < and > symbol with a colon, then uses RegEx to extract the data for column C (that is, the text between the two colons).
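
For anyone who wants to check the logic outside Alteryx, here is a minimal Python sketch of the same two-step idea. It is an analogue, not Alteryx syntax: `extract_column_c` is a hypothetical helper name, Python's `re` module stands in for the Alteryx RegEx engine, and the sketch assumes the data itself never contains a colon.

```python
import re

def extract_column_c(value: str) -> str:
    """Swap < and > for colons, then keep only the text between the colons."""
    swapped = value.replace("<", ":").replace(">", ":")
    # Same pattern as the Formula tool expression; \1 is Python's spelling of $1.
    return re.sub(r".*:(.*):.*", r"\1", swapped)

sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
print(extract_column_c(sample))
```

If a value could contain a colon of its own, a rarer placeholder character would be a safer choice than ":".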

 

Hope this helps!

Claje
14 - Magnetar

Hi,

I think that the RegEx engine only treats the < and > characters as metacharacters when they are prefixed with a \, so they should be left unescaped here.

Therefore, I think the following works for column C:

 

<(.*)>

You should be able to use the following to capture A, B, and C all together:

 

\[(.*)\]\((.*)\)<(.*)>
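
As a quick sanity check of the combined pattern, here is a small Python sketch run against the sample value from the question. Python's `re` module is only a stand-in for the Alteryx engine, and `parse_abc` is a hypothetical helper name, so treat it as an illustration rather than a guarantee of Alteryx behavior.

```python
import re

# The combined pattern, with < and > left unescaped.
PATTERN = re.compile(r"\[(.*)\]\((.*)\)<(.*)>")

def parse_abc(value: str):
    """Return the (A, B, C) captures as a tuple, or None if the value does not match."""
    match = PATTERN.search(value)
    return match.groups() if match else None

sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
print(parse_abc(sample))
```

Note that (.*) is greedy, so a value with more than one ], ), or > would capture up to the last one; a lazy (.*?) may be worth considering in that case.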
CharlieS
17 - Castor

@Kenda @Claje Nice work!

 

Another way to think about this is to focus on the column values rather than on the delimiter characters. If you assume that "-" and "_" are part of the data, this RegEx can tokenize your example. This might be a good option if you're more confident about the contents of the column values than about the delimiters.

 

[\w-]+
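
To illustrate what tokenizing with this pattern produces, here is a short Python sketch using `re.findall` as a rough stand-in for the Tokenize method. The helper name `tokenize` is hypothetical, and the sketch assumes every value consists only of word characters and hyphens.

```python
import re

def tokenize(value: str) -> list[str]:
    """Return every run of word characters and hyphens, in order of appearance."""
    return re.findall(r"[\w-]+", value)

sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
for token in tokenize(sample):
    print(token)
```

Because the delimiters are simply skipped, this also picks up the column D value after the pipe.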

 

 

sdodero
6 - Meteoroid

Thanks all for your prompt responses. The tricky part here is that the text between each pair of special characters contains data to be used in a separate column. However, Alteryx interprets \< and \> as the start and end of a word rather than as literal characters.

 

So for column C, given

<abc123>

if I use the regular expression

\<(.*)\>

the result is Null.


What is most frustrating is that when I use the ASCII values for < and >

\x60(.*)\x62

I still get Null values for column C.
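
One possible explanation for the Null here: 60 and 62 are the decimal ASCII codes for < and >, while \x escapes expect hexadecimal, where the codes are 3C and 3E. The Python sketch below checks this. It uses Python's `re` module, so it only suggests, and does not prove, how the Alteryx engine treats the same escapes; `between` is a hypothetical helper name.

```python
import re

# Decimal vs hexadecimal code points for the two delimiters.
print(ord("<"), hex(ord("<")))   # decimal 60, hexadecimal 0x3c
print(ord(">"), hex(ord(">")))   # decimal 62, hexadecimal 0x3e
print(repr(chr(0x60)), repr(chr(0x62)))  # what \x60 and \x62 actually denote

def between(pattern: str, value: str):
    """Return the first capture group of pattern in value, or None if no match."""
    match = re.search(pattern, value)
    return match.group(1) if match else None

print(between(r"\x60(.*)\x62", "<abc123>"))  # decimal codes used as hex
print(between(r"\x3C(.*)\x3E", "<abc123>"))  # hexadecimal codes
```

In other words, \x60(.*)\x62 looks for text between a backtick and the letter b, which the sample value does not contain.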

 

 

Kenda
16 - Nebula

@sdodero Did you try the formula I provided? It is working with some sample data I created. Because the < and > symbols are changed to : before the RegEx runs, the pattern only has to match a literal colon, and it keeps only what's inside.

 

See the attached workflow where I separate each of your fields into distinct columns.

 


Claje
14 - Magnetar

I've attached an example using the regex I created, which is able to isolate the value in <abc123> for column C.

sdodero
6 - Meteoroid

Thanks a lot, it worked!
