I need to search text for matches against a list of key words.
Each row of text contains 3 text fields. Text_1 is searched for hits. If a hit is registered, the row receives a score of 1 and the entire row is passed to an output union tool (with no need to search Text_2 or Text_3). Entries that fail to generate a hit in the Text_1 filter are passed to the Text_2 filter, where Text_2 is searched for keyword hits... At the end, I union the results of the 3 filter passes, and Score tells me which filter generated the hit.
As currently configured, Score tells me which Text field generated the hit, but I do not know which keyword or keywords generated the hit.
How could you generate an output that tells you which keywords generated the hit and, as an extra bonus, a field that includes the 5 words before and 5 words after the keyword?
The sample text below contains the keywords PLANES and unmanned air systems. Other keywords might include tacos, Cervelo, or EW. A text field may generate zero hits; it may generate multiple separate hits; or it may generate multiple hits of the same keyword.
A generic field...
TEXT FIELD |
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod PLANES incididunt ut labore et dolore magna aliqua. Id porta nibh venenatis cras sed felis eget. Amet est placerat in egestas. Sed augue lacus viverra vitae congue. At lectus urna duis convallis convallis. Neque unmanned air systems uisque egestas diam in. Ornare lectus sit amet est placerat in. Elementum nibh tellus molestie nunc non blandit massa. Dictum non consectetur a erat nam at. Cras tincidunt lobortis feugiat vivamus at augue eget arcu. Egestas integer eget aliquet nibh praesent tristique magna sit amet. Elit ullamcorper dignissim cras tincidunt lobortis feugiat vivamus at augue. Molestie a planes at erat pellentesque adipiscing. Sagittis nisl starship kloi rhoncus urna neque viverra justo. Mi proin sed libero enim sed faucibus turpis. |
A real-world example where the keyword might be hypersonic:
Similarly, advancements in material design, processing and manufacturing are enabling novel material architectures that can further enhance performance and resilience in structures such as leading edges, windows and apertures, propulsion systems, and space structures. Exemplar areas of research within the Materials for Extreme Environments thrust include the following: 1) high temperature materials for hypersonic platforms; 2) high temperature window and aperture materials; 3) radiation and/or electromagnetic pulse (EMP) hardened electronics for space platforms; and 4) coatings for platform survivability in corrosive environments. |
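For anyone who wants to prototype the logic outside the workflow, here is a minimal Python sketch of the "which keyword hit, plus 5 words either side" idea. It is not the workflow from this thread: the function name `find_keyword_hits`, the `window` parameter, and the output field names are all made up for illustration, and it assumes case-insensitive, whole-word matching is what you want.

```python
import re

def find_keyword_hits(text, keywords, window=5):
    """Return one record per keyword occurrence, with surrounding words.

    Hypothetical helper: the name, parameters, and output keys are
    illustrative assumptions, not part of any tool in this thread.
    """
    hits = []
    for kw in keywords:
        # Whole-word, case-insensitive match. Joining on \s+ lets a
        # multi-word keyword match across any run of whitespace.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in kw.split()) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Words before/after are split on whitespace, so punctuation
            # stays attached to its word.
            before = text[:m.start()].split()[-window:] if window else []
            after = text[m.end():].split()[:window]
            hits.append({
                "keyword": kw,            # keyword from the list
                "matched": m.group(0),    # text as it appears in the field
                "context": " ".join(before + [m.group(0)] + after),
            })
    return hits
```

A field with no keywords returns an empty list, and a keyword that appears several times (PLANES and planes in the generic sample) returns one record per occurrence. For short acronyms such as EW you may prefer case-sensitive matching, which would mean dropping `re.IGNORECASE` for those keywords.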
@danilang That's the challenge, especially with S&T text. Words such as advanced, material, information, communications, etc. are used left and right, so it's difficult to discern communications from secure communications; intelligence from artificial intelligence; information from information assurance; or even electronic from electronic warfare.
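One possible way to handle the overlap problem, sketched in Python under the assumption that the longest matching phrase should win: sort the keyword list by length and let a single regex alternation try the longer phrases first. The function name `longest_phrase_hits` is hypothetical, and this is only one heuristic, not something confirmed by anyone in the thread.

```python
import re

def longest_phrase_hits(text, keywords):
    """Return matched keywords in order of appearance, preferring the
    longest phrase when one keyword contains another.

    Hypothetical helper written for illustration only.
    """
    if not keywords:
        return []
    # Longer phrases first: regex alternation takes the first alternative
    # that matches, so "secure communications" beats "communications".
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(
        r"\s+".join(re.escape(w) for w in kw.split()) for kw in ordered
    )
    pattern = re.compile(r"\b(?:" + alternation + r")\b", flags=re.IGNORECASE)
    # finditer returns non-overlapping matches, so the shorter keyword is
    # not reported again inside a longer phrase that already matched.
    return [m.group(0).lower() for m in pattern.finditer(text)]
```

With the keywords communications, secure communications, intelligence, and artificial intelligence, a sentence containing "secure communications" reports only the longer phrase, while a standalone "communications" elsewhere in the same field is still reported on its own.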
Hey Kelly - on highlighting keywords: one of my text boxes has multiple keywords. Is there any way to highlight all of the keywords identified?
P.S. Thanks to everyone on this thread. My team used to review about 20k line items using an Excel formula to identify keywords:
=COUNTIF(TextH:H, C3) where C3 would be the keyword e.g. *God*