Text analytics - extracting user-defined text from a PDF

I'm analysing multiple PDF files, scraping through them to find specific words. The goal of the project is to count how many times each specific word or phrase is mentioned in a PDF. For example, we are looking at reports about mine closure, and we want to check whether the following terms are mentioned in each PDF, and how many times: "mine closure", "incident", "impact". If we can get the frequency of these terms, we can then rate each report on risk.
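
To make the goal concrete, here is a minimal sketch in Python of the counting I have in mind. It is not an Alteryx workflow: it assumes the text has already been extracted from the PDF into a plain string, and the names `count_terms`, `TERMS` and `sample` are made up for illustration. Matching is case-insensitive and on whole words, so "impacts" would not count as "impact".

```python
import re

# Terms taken from the example above; all names here are illustrative only.
TERMS = ["mine closure", "incident", "impact"]

def count_terms(text, terms=TERMS):
    """Count case-insensitive, whole-word matches of each term in text."""
    counts = {}
    for term in terms:
        # Let any run of whitespace (including a line break) separate
        # the words of a multi-word phrase such as "mine closure".
        words = [re.escape(w) for w in term.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Assumed stand-in for text already extracted from one PDF report.
sample = (
    "Mine closure plan: one incident reported. Impact of mine\n"
    "closure is low; impacts monitored."
)
print(count_terms(sample))
```

Running this over the text of each report would give one row of counts per report, which could then feed the risk rating.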


I believe I would start with the PDF Input tool. After that, should I look at the Intelligence Suite or at RegEx? Thanks