Hi,
I am trying to extract text from a string and not sure what is the best way here.
The input is :
| manhatan road left turn |
| manhatan road right |
| manhatan road |
and the output should be.
| manhatan road | left turn |
| manhatan road | right |
| manhatan road |
Basically I need to separate identical text in a column and the extra in a second one.
Thanks for your inputs!
Solved! Go to Solution.
Hi Ben,
thank you for the reply. Here is a more specific set of data:
Input:
| ISOPROPANOL INDUSTRIAL |
| ISOPROPANOL KML |
| ISOPROPANOL PH |
| ISOPROPANOL PH KML |
| ISOPROPANOL PURE |
| ISOPROPANOL PURE PREMIUM |
| ISOBUTANOL |
| ISOBUTANOL SYN |
| ISOBUTANOL KML |
| ISOBUTANOL NO |
| TERTIARY BUTANOL |
| TERTIARY BUTANOL SOL |
| DIACETONE ALCOHOL BE |
| DIACETONE ALCOHOL BE NS |
Output:
| ISOPROPANOL | INDUSTRIAL |
| ISOPROPANOL | KML |
| ISOPROPANOL | PH |
| ISOPROPANOL | PH KML |
| ISOPROPANOL | PURE |
| ISOPROPANOL | PURE PREMIUM |
| ISOBUTANOL | |
| ISOBUTANOL | SYN |
| ISOBUTANOL | KML |
| ISOBUTANOL | NO |
| TERTIARY BUTANOL | |
| TERTIARY BUTANOL | SOL |
| DIACETONE ALCOHOL | BE |
| DIACETONE ALCOHOL | BE NS |
Hope this helps.
Many thanks,
Emil
That's a really interesting problem...
Have a simplified solution but it can't cope with the BE issue @BenMoss pointed out. Basically identifies the longest block of common words from the start in the list and keeps that.
You could add a filter to exclude 2 letter words or something like that but without knowing more, hard to do
Have attached a sample get start of common text - possibly over engineered couldnt think of easier way to do.
Thank you for the feedback. I does help but as you said it does not solve the entire problem. I will keep digging into it.
