Alteryx Designer Desktop Discussions

SOLVED

Regex parsing help needed with Special Characters "<" and, ">"

sdodero
6 - Meteoroid

Hello Community,

 

I am new to Alteryx and am having trouble parsing a column that contains the special characters "<" and ">". I can already parse the other special characters in the column, but I am stuck on these two.

 

Here's an example of the data:

[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D

 

I am using the Parse function with these expressions:

Column A: \[(.*)\]

Column B: \((.*)\)

Column C: \[x60](.*)\[x62]

But so far no luck: the field for column C comes back empty.

 

Kenda
16 - Nebula

Hey @sdodero!

 

This may not be the most efficient way to accomplish what you're looking for, but the below expression worked for me in a Formula tool:

regex_replace(Replace(Replace([Field1], "<", ":"),">",":"),".*:(.*):.*","$1")

Basically, it replaces each < and > symbol with a colon, then uses RegEx to capture the data for column C (that is, the data between the two colons).
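
For readers outside Alteryx, the same replace-then-extract idea can be sketched in Python. This is only an illustration under assumptions: Python's re module stands in for Alteryx's RegEx engine, the function name extract_column_c is hypothetical, and the approach assumes the data itself never contains a colon.

```python
import re


def extract_column_c(value: str) -> str:
    """Swap < and > for colons, then keep only the text between the colons."""
    # Mirrors Replace(Replace([Field1], "<", ":"), ">", ":") from the formula above.
    swapped = value.replace("<", ":").replace(">", ":")
    # Mirrors regex_replace(..., ".*:(.*):.*", "$1"); Python writes $1 as \1.
    return re.sub(r".*:(.*):.*", r"\1", swapped)


sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
print(extract_column_c(sample))  # data-for-column_C
```

If a value could legitimately contain a colon, a different placeholder character would be needed, since the pattern relies on there being exactly two colons.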

 

Hope this helps!

Claje
14 - Magnetar

Hi,

I think RegEx only treats the < and > characters as metacharacters when they are prefixed with a \. Left unescaped, they should match literally.

Therefore, I think the following works for column C:

 

<(.*)>

You should be able to use the following to get A, B, and C all together:

 

\[(.*)\]\((.*)\)<(.*)>
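
As a rough cross-check outside Alteryx, the combined pattern can be tried in Python, where unescaped < and > are also plain literals. This is a sketch under assumptions: Python's re module is not the engine Alteryx uses, and the helper name parse_columns is hypothetical.

```python
import re
from typing import Optional, Tuple

# Same pattern as above: brackets and parentheses escaped, < and > left bare.
PATTERN = re.compile(r"\[(.*)\]\((.*)\)<(.*)>")


def parse_columns(value: str) -> Optional[Tuple[str, str, str]]:
    """Return the (A, B, C) captures, or None when the pattern does not match."""
    match = PATTERN.search(value)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
print(parse_columns(sample))
# ('data-for-column_A', 'data-for-column_B', 'data-for-column_C')
```

The greedy (.*) groups behave here because each delimiter appears only once in the sample; values that repeat a delimiter might need a narrower pattern.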
CharlieS
17 - Castor

@Kenda @Claje Nice work!

 

Another way to think about this: rather than focusing on the delimiter characters, focus on the column values. If you assume that "-" and "_" are part of the data, this RegEx can tokenize your example. This might be a good option if you're more confident in the contents of the column values than in the delimiters.

 

[\w-]+
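
To illustrate the tokenizing idea outside Alteryx, here is a small Python sketch. It assumes Python's re.findall behaves like the Tokenize method for this pattern, and the function name tokenize is hypothetical.

```python
import re
from typing import List


def tokenize(value: str) -> List[str]:
    """Return every run of word characters and hyphens, ignoring the delimiters."""
    # \w covers letters, digits, and underscore; the trailing - adds the hyphen.
    return re.findall(r"[\w-]+", value)


sample = "[data-for-column_A](data-for-column_B)<data-for-column_C>|data-for-column_D"
print(tokenize(sample))
# ['data-for-column_A', 'data-for-column_B', 'data-for-column_C', 'data-for-column_D']
```

This also picks up the column D value after the pipe, since every delimiter falls outside the character class.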

 

 

sdodero
6 - Meteoroid

Thanks all for your prompt responses. The tricky part here is that the data between each pair of special characters needs to go into a separate column. However, Alteryx interprets \< and \> as word-boundary markers rather than as the literal characters.

 

So for column C, given

<abc123>

if I use the regular expression

\<(.*)\>

the result is Null.


What is most frustrating is that when I use the ASCII values for < and >

\x60(.*)\x62

I still get Null values for column C.
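
One possible reason the \x60 and \x62 escapes above find nothing: 60 and 62 are the decimal ASCII codes for < and >, while \x escapes take hexadecimal digits, where the codes are 3C and 3E. The Python sketch below only demonstrates that arithmetic; whether Alteryx's engine honors \xhh escapes in the same way is an assumption not confirmed in this thread.

```python
import re

sample = "<abc123>"

# \x escapes take hexadecimal digits, so \x60 and \x62 are not "<" and ">".
decimal_codes = (ord("<"), ord(">"))        # (60, 62)
hex_codes = (hex(ord("<")), hex(ord(">")))  # ('0x3c', '0x3e')
what_x60_x62_mean = ("\x60", "\x62")        # ('`', 'b')

wrong = re.search(r"\x60(.*)\x62", sample)  # looks for a backtick ... b
right = re.search(r"\x3C(.*)\x3E", sample)  # looks for < ... >

print(wrong)           # None
print(right.group(1))  # abc123
```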

 

 

Kenda
16 - Nebula

@sdodero Did you try the formula I provided? It works with some sample data I created. Once the < and > symbols are changed to :, the RegEx treats the colon as the delimiter and keeps only what's inside.

 

See the attached workflow where I separate each of your fields into distinct columns.

 

Claje
14 - Magnetar

I've attached an example using the regex I created, which is able to isolate the value in <abc123> for column C.

sdodero
6 - Meteoroid

Thanks a lot, it worked!!
