Alteryx Analytics Hub

Text analytics - extracting user-defined text from a PDF

I'm analysing multiple PDF files, scanning through them to find specific words. The goal is to count how many times each specific word or phrase is mentioned in a PDF. For example, we are looking at reports about mine closure, and we want to check whether the following terms are mentioned and how many times: "mine closure", "incident", "impact". If we can get the frequency of each term per report, then we can rate each report on risk.


I believe I would start with the PDF Input tool. After that, should I look at Intelligence Suite or RegEx? Thanks
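
To make the counting step concrete, here is a rough sketch of the logic in plain Python rather than as an Alteryx workflow. It assumes the text has already been extracted from each PDF into a string; the function name `count_terms` and the sample text are made up for illustration, and the term list is just the three examples from above.

```python
import re
from collections import Counter

# Example terms from the post; a real term list would be user-defined.
TERMS = ["mine closure", "incident", "impact"]

def count_terms(text, terms=TERMS):
    """Count case-insensitive, whole-word matches of each term in text.

    Hypothetical helper: assumes `text` was already extracted from a PDF.
    """
    counts = Counter()
    # Collapse runs of whitespace so a phrase split across a line break
    # ("mine\nclosure") still matches as "mine closure".
    normalized = re.sub(r"\s+", " ", text)
    for term in terms:
        # \b anchors mean plurals such as "incidents" are NOT counted;
        # drop the trailing \b if variants should be included.
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, normalized, flags=re.IGNORECASE))
    return counts

# Made-up sample text standing in for one extracted report.
sample = "The mine\nclosure plan lists one incident. Impact was low; no further incidents."
result = count_terms(sample)
print(dict(result))
```

Running this once per report gives one row of counts per PDF, which could then feed whatever risk-rating rule is chosen. Whether to count plurals and other word variants is a design decision that will change the numbers, so it is worth settling before comparing reports.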