Hi Community,
I'm trying to identify phrases from a predefined list within a free-text field. See the example data set below.
ID | Response |
1 | Acid tests to determine pH level |
2 | Advertising opportunities to capitalize on market |
3 | Capital Expenditures to promote growth |
4 | Projections for upcoming year |
5 | Marketing tasks |
Basically, I would like to take the "response" field above, and identify/match phrases from a running list of common phrases/words (ideally located in another file). For example, the running list would look like the below:
Common Phrases |
Acid Test |
Advertising |
Capital Expenditure |
Projection |
Ideally, my final output would look like the following, so that I can summarize how many times each phrase occurred in the dataset:
ID | Response | Identified Phrases/Words |
1 | Acid tests to determine pH level | Acid Test |
2 | Advertising opportunities to capitalize on market | Advertising |
3 | Capital Expenditures to promote growth | Capital Expenditure |
4 | Projections for upcoming year | Projection |
5 | Marketing tasks | Null |
I've tried several tools (e.g. find & replace), but most seem to work only on exact matches, and I need some level of non-exact matching to account for singular vs. plural words and similar variations.
Let me know if you can help!
This post may get you started: "join files by partial field similarities".
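If it helps to see the idea outside of any particular tool, here is a rough Python sketch using the sample data from the question. It is not the method from the linked post. It assumes that stripping a trailing "s" is good enough to treat singular and plural as the same word, which is a crude rule that will miss irregular plurals. The names `normalize` and `find_phrases` are made up for illustration, and in practice the phrase list would be read from your separate file.

```python
import re
from collections import Counter

def normalize(text):
    """Lowercase, split into words, and drop a trailing 's' (crude singularization)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]

def find_phrases(response, phrases):
    """Return phrases whose normalized words occur as a consecutive run in the response."""
    resp = normalize(response)
    found = []
    for phrase in phrases:
        target = normalize(phrase)
        n = len(target)
        # Slide a window of n words across the response and compare.
        if n and any(resp[i:i + n] == target for i in range(len(resp) - n + 1)):
            found.append(phrase)
    return found

# Sample data copied from the question; the phrase list would normally come from a file.
responses = {
    1: "Acid tests to determine pH level",
    2: "Advertising opportunities to capitalize on market",
    3: "Capital Expenditures to promote growth",
    4: "Projections for upcoming year",
    5: "Marketing tasks",
}
phrases = ["Acid Test", "Advertising", "Capital Expenditure", "Projection"]

matches = {rid: find_phrases(text, phrases) for rid, text in responses.items()}
counts = Counter(p for found in matches.values() for p in found)

for rid, text in responses.items():
    print(rid, "|", text, "|", ", ".join(matches[rid]) or "Null")
print(dict(counts))
```

Because both the response and the phrase go through the same normalization, "Acid tests" and "Acid Test" compare equal, while "Marketing tasks" matches nothing and comes out as Null. For messier text (typos, irregular plurals), a fuzzy string comparison would probably be a better fit than this suffix rule.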
I know I'm very late to the party here, but did you ever find a clever way to do this? 🙂