Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extracting data within a string field split into different columns

Lauren_Holmes
8 - Asteroid

Hello,

I am trying to extract some data within a couple of string fields. I have used text to columns, however i am not sure if there is a better way to do this? I have two columns TEXT 1 and TEXT 2 these both contain codes within the hashtags. I would like to extract these codes so the outputs are like columns "PARSE TEXT 1" "PARSE TEXT 2". I would then like to extract from TEXT 1 the date into it's own column.

 

I would be very grateful for any ideas on the best way to use this - i did wonder whether RegEx would work?

 

Example below.

 

TEXT 1TEXT 2PARSE TEXT 1PARSE TEXT 2DATE EXTRACT TEXT 1
LH1234 - 12 Jun 2021 - #ABC#notified customer #JKL#ABCJKL12 JUN 2021
LH1234 - 10 Jun 2021 - #DEF#notified customer #MNO#DEFMNO10 JUN 2021
LH1234 - 01 Jun 2021 - #GHI#notified customer #PQR#GHOPQR01 JUN 2021
11 REPLIES 11
messi007
15 - Aurora
15 - Aurora

@Lauren_Holmes,

 

You can do this using regEx 🙂

Please see below,

messi007_0-1625063299830.png

Attached the workflow,

 

Hope this helps!

 

Regards,

apathetichell
18 - Pollux

I kept in the "#"s in the original - here's a corrected version.

Jonathan-Sherman
15 - Aurora
15 - Aurora

Hi @Lauren_Holmes,

 

You could use three Regex_Replace() functions to solve this, or use a regular Regex tool either method would work! I've gone for the former:

 

Formula 1: Extracting ABC

REGEX_Replace([TEXT 1], '.*?#(.*?)#.*', '$1')

 

Formula 2: Extracting JKL

REGEX_Replace([TEXT 2], '.*?#(.*?)#.*', '$1')

 

Formula 3: Extracting Date

TRIM(REGEX_Replace([TEXT 1], '.*?-(.*?)-.*', '$1'))

 

I've attached my workflow for you to download if needed!

 

Kind regards,

Jonathan

DawnDuong
13 - Pulsar
13 - Pulsar

hi @Lauren_Holmes 

Your hunch is right, Regex can solve it. Here is one possible way.
Dawn.

KarolinaRoza
11 - Bolide

hi @Lauren_Holmes 

 

You can use Regex.

Please find my solution:

 

KarolinaRoza_0-1625063698252.png

 

 

cheers,

Karolina

Lauren_Holmes
8 - Asteroid

Hi @KarolinaRoza do you have any suggestions say when a row has more than one code?

Currently it is keeping the last # and the first # for example output ABC# #DEF however i would ideally have below:

 

TEXT 1IDEAL OUTPUT
LH1234 - 30 JUN 2021 - #ABC# #DEF#ABC DEF
LH1234 - 30 JUN 2021 - #ABC# #DEF# #GHI#ABC DEF GHI
DawnDuong
13 - Pulsar
13 - Pulsar

hi @Lauren_Holmes 

You can use the Regex tool, set on tokenise (i.e. it picks up everything that matches a pattern) and input to rows, follow by summarize manipulation. Something like the below is one possible way.

Cheers,
Dawn.

RegexTokenise.PNG

 

apathetichell
18 - Pollux

In a formula tool you can use:

replace(REGEX_Replace([TEXT 1],".*?(\#.*\#).*","$1"),"#","")

 

That's a fairly different problem than the initial query.

Lauren_Holmes
8 - Asteroid

Thank you so much :)!

 

My workflow oddly adds a comma in-between each code where as yours does not so I will need to investigate why that is! 

Labels