I am comparing cybersecurity requirements across two documents: one is an internal assessment, and the other is an assessment hosted by a third party. I have been told that the topics and information covered in each document are the same, but at first glance that doesn't appear to be true. So I am trying to find an efficient way to compare the two documents and check that claim (i.e., to see how much overlap there really is between the requirements/questions in each document). One of the documents is in Word and the other is a PDF, but either one could easily be converted. I am wondering whether I could use text mining to compare the documents for overlap, or whether there is a better solution.
Hey @lmack, I'm not sure whether there is a better alternative, but you could definitely use text mining for this!
I'm including a pretty simple Alteryx demo video in case it helps.
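If it helps to see the idea outside of Alteryx, here is a rough Python sketch of one way to measure overlap. It is not what the demo video does, just a minimal illustration under some assumptions: the text has already been extracted from the Word and PDF files, each requirement/question is its own string, and the function names (`tokenize`, `cosine_similarity`, `best_matches`) and the `0.5` threshold are made up for this example. It scores each internal requirement against every third-party one using bag-of-words cosine similarity and flags the ones with no close match.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matches(internal, third_party, threshold=0.5):
    """For each internal requirement, find the most similar third-party one.

    Returns a list of (requirement, best_match, score, is_match) tuples.
    The threshold is an arbitrary starting point (an assumption), not a
    recommended value; tune it against a sample you have checked by hand.
    """
    tp_vectors = [Counter(tokenize(t)) for t in third_party]
    results = []
    for req in internal:
        vec = Counter(tokenize(req))
        scores = [cosine_similarity(vec, tp) for tp in tp_vectors]
        if scores:
            best = max(range(len(scores)), key=scores.__getitem__)
            match, score = third_party[best], scores[best]
        else:
            match, score = None, 0.0
        results.append((req, match, score, score >= threshold))
    return results


if __name__ == "__main__":
    # Hypothetical sample data, only to show the call pattern.
    internal = [
        "Passwords must be at least 12 characters long",
        "All laptops must use full disk encryption",
    ]
    third_party = [
        "Are passwords required to be at least 12 characters long?",
        "Is multi-factor authentication enabled for remote access?",
    ]
    for req, match, score, is_match in best_matches(internal, third_party):
        status = "covered" if is_match else "NO CLOSE MATCH"
        print(f"{status}: {req!r} (best score {score:.2f})")
```

The share of internal requirements flagged as covered gives a rough overlap figure, and running it in the other direction (third-party against internal) shows what the third-party assessment asks that the internal one does not. Plain word overlap will miss requirements that are worded very differently, so treat low scores as items to review by hand rather than as proof of a gap.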