Hi all -
I'm trying to create a workflow that analyzes PDFs and returns key topics around certain words. I'm working with about 200 PDFs, so manual analysis isn't feasible. For example, when the word "revenue" appears in a sentence, I want to find out which other words are frequently used in that same sentence.
I've already tried the topic modeling tool, but I had a hard time grasping what each topic represented, and it didn't give me the level of insight I'm seeking.
Best,
Zach
If you're mainly looking for words, you can do the analysis based on frequency. This will work for VERY general topics and keywords like the ones you outlined. The topic modeling tool is the better choice, though, if you're analyzing actual topics, since it takes into account how often words appear together (which is what forms the topics).
It sounds like you have access to the Intelligence Suite. I'd start by using the text pre-processing tool to remove punctuation and stop words and to convert each word to its root form. From there, you can split each word onto a separate row using the Text To Columns tool with a space as the delimiter. Once you've removed all of the "noise" from your data, you can group by word and take a count. You may be able to decipher topics based on frequency.
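If it helps to see the same idea outside of a workflow, here is a minimal Python sketch of the sentence-level version of that count. It is only an illustration, not what the attached example does: it assumes the text has already been extracted from the PDFs, the function name `cooccurring_words` and the tiny stop-word list are made up for this example, and it skips the root-word step entirely.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "is", "was"}

def cooccurring_words(text, keyword):
    """Count the words that share a sentence with `keyword` (hypothetical helper)."""
    counts = Counter()
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # Lowercase and drop punctuation by keeping only runs of letters.
        words = re.findall(r"[a-z']+", sentence.lower())
        if keyword in words:
            # Count every other non-stop word in the same sentence.
            counts.update(w for w in words if w != keyword and w not in STOP_WORDS)
    return counts

sample = (
    "Revenue grew in the third quarter. "
    "Costs rose, but revenue grew faster. "
    "The board met on Tuesday."
)
top = cooccurring_words(sample, "revenue")
print(top.most_common(5))
```

Sentences that never mention the keyword are ignored, so the counts reflect only the words that actually appear alongside it. The sentence splitter here is deliberately simple and will trip on abbreviations such as "Inc." or "U.S.", so treat the results as a rough first pass.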
See attached for an example. Hope this helps!
Thank you so much for your help!