Hi,
I am trying to extract text from a string and not sure what is the best way here.
The input is :
| manhatan road left turn | 
| manhatan road right | 
| manhatan road | 
and the output should be.
| manhatan road | left turn | 
| manhatan road | right | 
| manhatan road | 
Basically I need to separate identical text in a column and the extra in a second one.
Thanks for your inputs!
Solved! Go to Solution.
Hi Ben,
thank you for the reply. Here is a more specific set of data:
Input:
| ISOPROPANOL INDUSTRIAL | 
| ISOPROPANOL KML | 
| ISOPROPANOL PH | 
| ISOPROPANOL PH KML | 
| ISOPROPANOL PURE | 
| ISOPROPANOL PURE PREMIUM | 
| ISOBUTANOL | 
| ISOBUTANOL SYN | 
| ISOBUTANOL KML | 
| ISOBUTANOL NO | 
| TERTIARY BUTANOL | 
| TERTIARY BUTANOL SOL | 
| DIACETONE ALCOHOL BE | 
| DIACETONE ALCOHOL BE NS | 
Output:
| ISOPROPANOL | INDUSTRIAL | 
| ISOPROPANOL | KML | 
| ISOPROPANOL | PH | 
| ISOPROPANOL | PH KML | 
| ISOPROPANOL | PURE | 
| ISOPROPANOL | PURE PREMIUM | 
| ISOBUTANOL | |
| ISOBUTANOL | SYN | 
| ISOBUTANOL | KML | 
| ISOBUTANOL | NO | 
| TERTIARY BUTANOL | |
| TERTIARY BUTANOL | SOL | 
| DIACETONE ALCOHOL | BE | 
| DIACETONE ALCOHOL | BE NS | 
Hope this helps.
Many thanks,
Emil
That's a really interesting problem...
Have a simplified solution but it can't cope with the BE issue @BenMoss pointed out. Basically identifies the longest block of common words from the start in the list and keeps that.
You could add a filter to exclude 2 letter words or something like that but without knowing more, hard to do
Have attached a sample get start of common text - possibly over engineered couldnt think of easier way to do.
Thank you for the feedback. I does help but as you said it does not solve the entire problem. I will keep digging into it.
