Hi,
I am trying to extract text from a string and not sure what is the best way here.
The input is :
manhatan road left turn |
manhatan road right |
manhatan road |
and the output should be.
manhatan road | left turn |
manhatan road | right |
manhatan road |
Basically I need to separate identical text in a column and the extra in a second one.
Thanks for your inputs!
Solved! Go to Solution.
Hi Ben,
thank you for the reply. Here is a more specific set of data:
Input:
ISOPROPANOL INDUSTRIAL |
ISOPROPANOL KML |
ISOPROPANOL PH |
ISOPROPANOL PH KML |
ISOPROPANOL PURE |
ISOPROPANOL PURE PREMIUM |
ISOBUTANOL |
ISOBUTANOL SYN |
ISOBUTANOL KML |
ISOBUTANOL NO |
TERTIARY BUTANOL |
TERTIARY BUTANOL SOL |
DIACETONE ALCOHOL BE |
DIACETONE ALCOHOL BE NS |
Output:
ISOPROPANOL | INDUSTRIAL |
ISOPROPANOL | KML |
ISOPROPANOL | PH |
ISOPROPANOL | PH KML |
ISOPROPANOL | PURE |
ISOPROPANOL | PURE PREMIUM |
ISOBUTANOL | |
ISOBUTANOL | SYN |
ISOBUTANOL | KML |
ISOBUTANOL | NO |
TERTIARY BUTANOL | |
TERTIARY BUTANOL | SOL |
DIACETONE ALCOHOL | BE |
DIACETONE ALCOHOL | BE NS |
Hope this helps.
Many thanks,
Emil
That's a really interesting problem...
Have a simplified solution but it can't cope with the BE issue @BenMoss pointed out. Basically identifies the longest block of common words from the start in the list and keeps that.
You could add a filter to exclude 2 letter words or something like that but without knowing more, hard to do
Have attached a sample get start of common text - possibly over engineered couldnt think of easier way to do.
Thank you for the feedback. I does help but as you said it does not solve the entire problem. I will keep digging into it.