Alteryx Designer Desktop Discussions

Can you use text mining to compare text between two separate documents for similarities?

lmack
5 - Atom

I am assessing cyber security requirements across two documents: one is an internal assessment, and the other is an assessment hosted by a third party. I have been told that the topics and information in each document are the same, but at first glance that doesn't appear to be true. So I am trying to find an efficient way to compare the two documents and validate that claim (i.e., see just how much overlap there is in the requirements/questions within each document). One of the documents is in Word and the other is a PDF, but either could easily be converted. I am wondering whether I could use text mining to compare the documents for overlap, or whether there is a better solution.

1 REPLY
adamweaver39
9 - Comet

Hey @lmack, I'm not sure if there is a better alternative, but you could definitely use text mining for this!

Including a pretty simple Alteryx demo vid in case it helps.

https://www.youtube.com/watch?v=40iYJe_zd2A 
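
If it helps to see the idea outside of the Text Mining tools, here is a rough Python sketch (something you could adapt in a Python tool) of one common way to score overlap: split each document into individual requirements, turn each into word counts, and use cosine similarity to find the closest match on the other side. This is only an illustration, not what the Intelligence Suite does internally. The function names, the sample requirements, and the 0.5 threshold are all made up for the example, and it assumes the Word and PDF text has already been extracted into one requirement per list item.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two texts, based on raw word counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no overlap with anything
    return dot / (norm_a * norm_b)


def best_matches(internal, third_party, threshold=0.5):
    """For each internal requirement, find the closest third-party one.

    Returns (requirement, match_or_None, score) tuples. The threshold is an
    arbitrary cut-off for this sketch and would need tuning on real data.
    """
    results = []
    for req in internal:
        best = max(third_party,
                   key=lambda other: cosine_similarity(req, other),
                   default=None)
        score = cosine_similarity(req, best) if best is not None else 0.0
        match = best if score >= threshold else None
        results.append((req, match, round(score, 2)))
    return results


# Hypothetical sample requirements, invented purely for illustration.
internal = [
    "Passwords must be rotated every 90 days",
    "All laptops must use full disk encryption",
]
third_party = [
    "Full disk encryption is required on all laptops",
    "Vendors must carry cyber insurance",
]

results = best_matches(internal, third_party)
for req, match, score in results:
    print(f"{score:>5}  {req!r} -> {match!r}")
```

Requirements that come back with no match are the ones to review by hand. Plain word counts will miss paraphrases that use different vocabulary, so treat the scores as a first pass rather than a final answer.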
