Alteryx Designer Desktop Discussions

MichaelOC · 04-14-2021

I'm analysing multiple PDF files, scraping them for specific words. The goal is to count how many times a specific word or phrase is mentioned in each PDF. For example, we are looking at reports about mine closure, and we want to check whether the following terms are mentioned and how many times: "mine closure", "incident", "impact". With the frequency of these terms for each report, we can then rate each report on risk.

I believe I would start with PDF Input. After that, should I look at Intelligence Suite or RegEx? Thanks

RishiK · 04-15-2021

@MichaelOC The Intelligence Suite could help you here.

Have a look into the Topic Modeling features:

https://help.alteryx.com/current/designer/topic-modeling

https://community.alteryx.com/t5/Data-Science/Getting-to-the-Point-with-Topic-Modeling-Part-1-What-i...
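
For plain term counts, the regex route raised in the question may be enough on its own, since topic modeling groups documents by theme rather than counting fixed terms. As a rough illustration outside Alteryx, here is a minimal Python sketch. It assumes the PDF text has already been extracted to a string; `count_terms` and the sample text are hypothetical, made up for this example, and are not part of any Alteryx tool.

```python
import re
from collections import Counter


def count_terms(text, terms):
    """Count case-insensitive, whole-word occurrences of each term in text.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    # Collapse runs of whitespace so a phrase split across a line break
    # (common in extracted PDF text) still matches.
    normalized = re.sub(r"\s+", " ", text)
    for term in terms:
        # \b keeps "impact" from matching inside "impacts" or "impactful".
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, normalized, flags=re.IGNORECASE))
    return counts


# Made-up sample standing in for text extracted from one report.
sample = (
    "The mine closure plan lists one incident. "
    "Impact of the Mine\nClosure: low impact."
)
result = count_terms(sample, ["mine closure", "incident", "impact"])
print(dict(result))
```

Running this per report would give one row of counts per PDF, which could then feed whatever risk rating is applied. Whether plurals or variants such as "impacts" should count is a design choice; this sketch matches exact whole words only.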

Text analytics - abstracting a user defined text from a pdf