Rather specific use case, looking for thoughts.
Could Alteryx be used to read multiple PDF files and essentially try to ferret out key contract terms to populate a contract register? Things like stated price increases, charge exemptions and agreed pricing.
I would imagine a combination of PDF tools and RegEx....
I'm looking for someone who has done it, or tried it, to get a sense of whether it's possible and what challenges came up when trying to parse the data.
Thanks!
Hi @ChrisMelck,
Yes, this is possible. As you say, once the .pdf document has been converted, it takes some regex components to bring it into a data structure.
My colleague Ollie Clarke has created a macro that inputs the data, and his documentation includes an example of the parsing component. As I'm sure you appreciate, though, the parsing is unique to every use case, so it would be extremely complex to build a general mechanism that did that part!
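To give a feel for what the parsing step might look like, here is a minimal sketch in Python rather than Alteryx. It is not the macro's logic: the sample text, the field names and the three patterns are all made up for illustration, and it assumes the PDF has already been converted to plain text. Real contracts will need patterns tuned to their own wording.

```python
import re

# Hypothetical contract text, standing in for the output of a PDF-to-text step.
sample_text = """
Agreed price: GBP 12,500.00 per annum.
Prices shall increase by 3.5% on each anniversary of the Commencement Date.
The Customer is exempt from delivery charges.
"""

# Illustrative patterns only (assumed wording, not taken from any real contract).
PATTERNS = {
    # Currency code or symbol followed by an amount, e.g. "GBP 12,500.00"
    "agreed_price": re.compile(
        r"agreed price:?\s*([A-Z]{3}|[£$€])\s*([\d,]+(?:\.\d{2})?)",
        re.IGNORECASE,
    ),
    # Percentage following "increase by" or "increase of", e.g. "3.5"
    "price_increase_pct": re.compile(
        r"increase\s+(?:by|of)\s+(\d+(?:\.\d+)?)\s*%",
        re.IGNORECASE,
    ),
    # The kind of charge named in "exempt from <kind> charges"
    "charge_exemption": re.compile(
        r"exempt from\s+([a-z ]+?)\s+charges",
        re.IGNORECASE,
    ),
}


def extract_terms(text):
    """Return one contract-register row: field name -> first match, or None."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        # Join capture groups so multi-group patterns give a single string.
        row[field] = " ".join(match.groups()) if match else None
    return row


if __name__ == "__main__":
    print(extract_terms(sample_text))
```

Running `extract_terms` over the text of each file would give one register row per contract. A field that comes back as `None` is a useful flag that the contract words that term differently and the pattern needs extending.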
Here is a link to the macro: https://gallery.alteryx.com/#!app/PDF-Input/5b685aff0462d710907f7a3b
And the documentation is linked in the description.
You'll need to have the predictive toolkit installed, as the macro uses the R tool, and you'll then need to install a certain package. This process is also detailed on the documentation page.
Ben