
SOLVED

Extract date from text

emil
7 - Meteor

Hi,

 

I am trying to extract text from a string and am not sure what the best way is here.

 

The input is:

 

manhatan road left turn
manhatan road right
manhatan road

 

and the output should be (two columns):

 

manhatan road | left turn
manhatan road | right
manhatan road |

 

Basically, I need to put the text that the rows have in common into one column and the extra text into a second one.

 

Thanks for your inputs! 

5 REPLIES
BenMoss
ACE Emeritus
Hi!

In order to provide you with a robust solution, could you share more examples of what the text may look like?

Ben
emil
7 - Meteor

Hi Ben,

 

thank you for the reply. Here is a more specific set of data:

 

Input:

 

ISOPROPANOL INDUSTRIAL
ISOPROPANOL KML
ISOPROPANOL PH
ISOPROPANOL PH KML
ISOPROPANOL PURE
ISOPROPANOL PURE PREMIUM
ISOBUTANOL
ISOBUTANOL   SYN
ISOBUTANOL KML
ISOBUTANOL NO
TERTIARY BUTANOL
TERTIARY BUTANOL SOL
DIACETONE ALCOHOL BE
DIACETONE ALCOHOL BE   NS

 

Output:

 

ISOPROPANOL | INDUSTRIAL
ISOPROPANOL | KML
ISOPROPANOL | PH
ISOPROPANOL | PH KML
ISOPROPANOL | PURE
ISOPROPANOL | PURE PREMIUM
ISOBUTANOL |
ISOBUTANOL | SYN
ISOBUTANOL | KML
ISOBUTANOL | NO
TERTIARY BUTANOL |
TERTIARY BUTANOL | SOL
DIACETONE ALCOHOL | BE
DIACETONE ALCOHOL | BE   NS

 

Hope this helps.

 

Many thanks,

Emil

BenMoss
ACE Emeritus
This is a difficult challenge, I feel, but I am happy for someone to prove me wrong.

Take your last example and the logic of identifying the shared words and splitting off whatever follows. The rows match all the way to BE, so how would the logic know that BE should be parsed out to the second field, even though it is a repeated element and so should, by that rule, stay in the first?
jdunkerley79
ACE Emeritus

That's a really interesting problem...

 

I have a simplified solution, but it can't cope with the BE issue @BenMoss pointed out. Basically, it identifies the longest block of common words from the start of the values in the list and keeps that.

 

You could add a filter to exclude two-letter words or something like that, but without knowing more about the data it is hard to do.

 

I have attached a sample that gets the start of the common text. It is possibly over-engineered, but I couldn't think of an easier way to do it.
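
For readers without the attachment, the idea of keeping the longest block of common leading words can be sketched roughly as follows. This is a Python illustration, not the attached Alteryx workflow, whose contents are not shown in this thread. It assumes that rows belong together when they share the same first word, and the function names and the optional `min_word_length` parameter (one possible reading of the "exclude two-letter words" filter) are hypothetical.

```python
from collections import defaultdict


def common_word_prefix(rows, min_word_length=0):
    """Return the leading words shared by every row in `rows`.

    Hypothetical filter: the prefix stops before the first shared word
    that is shorter than `min_word_length`.
    """
    split_rows = [row.split() for row in rows]
    prefix = []
    # zip stops at the shortest row, so the prefix never exceeds it.
    for words in zip(*split_rows):
        if len(set(words)) != 1 or len(words[0]) < min_word_length:
            break
        prefix.append(words[0])
    return prefix


def split_common_prefix(values, min_word_length=0):
    """Split each value into (common leading words, remainder).

    Assumption: rows are grouped by their first word.
    """
    groups = defaultdict(list)
    for value in values:
        words = value.split()
        if words:
            groups[words[0]].append(value)

    # Compute the shared prefix once per group.
    prefixes = {
        key: common_word_prefix(rows, min_word_length)
        for key, rows in groups.items()
    }

    result = []
    for value in values:
        words = value.split()
        if not words:
            result.append(("", ""))
            continue
        prefix = prefixes[words[0]]
        result.append((" ".join(prefix), " ".join(words[len(prefix):])))
    return result


if __name__ == "__main__":
    sample = ["TERTIARY BUTANOL", "TERTIARY BUTANOL SOL",
              "DIACETONE ALCOHOL BE", "DIACETONE ALCOHOL BE   NS"]
    for first, second in split_common_prefix(sample):
        print(f"{first} | {second}")
```

With the default settings, the two DIACETONE rows keep BE in the first column, which is exactly the limitation discussed above. Calling `split_common_prefix(sample, min_word_length=3)` moves BE into the second column, but the same filter would misfire on data where a short word genuinely belongs to the common part.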

 

emil
7 - Meteor

Thank you for the feedback. It does help, but as you said, it does not solve the entire problem. I will keep digging into it.
