Hi everyone,
I hope all of you are having a good start of the year.
I was wondering that this might be actually simple but I don't know how to implement that. Here is the example dataset:
-----------------------------
ADV
Bravo
Bravo Tech
Gamma group
Gamma Industries Co
------------------------------
Now there are only 3 companies in this field (ADV, Bravo and Gamma). Idea is to merge this field into three rows only (3 companies).
- I know atleast for the first 4 rows. I could do something like string matching and if string of row matches row+1 (something like fuzzy match with a threshold or matching the first word without spaces), then the following row+1 can be replaced with the row string. Any idea how to do this?
- For the last two rows, I am not sure if that is even possible to handle such cases. I tried removing words such as "group" "co" and many more but this takes forever and lot of visual inspection of field which has more than 10,000 companies.
A help will be much appreciated.