Hello
Can Alteryx parse a PDF file using regex? If so, is there a way to parse the attached file into the format below?
Item Sequence | Deposit Date | Check # | Deposit Amount | Check amount | Return Reason | Disposition |
1234 | 2021-07-07 | 12345 | 2800000 | $280.00 | Stop Payment | X |
Many thanks
hey @anom23
I'm not sure about regex, but you can parse a PDF using the Text Mining tools, specifically the Image to Text tool in the Intelligence Suite (https://help.alteryx.com/20213/designer/image-to-text).
Cheers,
TheOC
Hi @anom23
You can use R, Python, or the Intelligence Suite to parse a PDF. For example, a short Python script can extract the text from your PDF.
This requires installing the Python module pdfplumber.
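A minimal sketch of what the pdfplumber extraction step could look like, assuming the PDF has a real text layer (not a scanned image). The function names and the file path in the usage comment are placeholders for illustration, not the contents of the attached solution.

```python
from typing import List


def lines_from_text(text: str) -> List[str]:
    """Split extracted page text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_pdf_lines(pdf_path: str) -> List[str]:
    """Return every non-empty text line in the PDF, page by page."""
    # Third-party module; must be installed first (pip install pdfplumber).
    import pdfplumber

    lines: List[str] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() may return None or "" for a page with no text layer
            text = page.extract_text() or ""
            lines.extend(lines_from_text(text))
    return lines


# Usage (placeholder path, not the file attached to this thread):
# for line in extract_pdf_lines("returns.pdf"):
#     print(line)
```

Inside the Alteryx Python tool, the lines could be put in a one-column pandas DataFrame and sent downstream with Alteryx.write(df, 1), so the remaining parsing happens in regular Alteryx tools.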
The rest of the data manipulation would then be easiest in Alteryx.
Solution attached.
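On the original regex question: once the text is out of the PDF, a regex can split each row into the requested columns. The sketch below assumes each row arrives as one whitespace-separated line, which may not match the real layout of the PDF. The pattern, group names, and sample line are illustrative only.

```python
import re
from typing import Dict, Optional

# Hypothetical row layout: the seven fields separated by whitespace.
ROW = re.compile(
    r"^(?P<item_sequence>\d+)\s+"
    r"(?P<deposit_date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<check_number>\d+)\s+"
    r"(?P<deposit_amount>[\d,.$]+)\s+"
    r"(?P<check_amount>\$?[\d,]+\.\d{2})\s+"
    r"(?P<return_reason>.+?)\s+"      # free text, may contain spaces
    r"(?P<disposition>\S+)$"          # last token on the line
)


def parse_row(line: str) -> Optional[Dict[str, str]]:
    """Return the seven fields as a dict, or None if the line is not a data row."""
    match = ROW.match(line.strip())
    return match.groupdict() if match else None


# Sample line built from the example row in the question.
row = parse_row("1234 2021-07-07 12345 2800000 $280.00 Stop Payment X")
```

Lines that do not match, such as the header row, return None and can be filtered out.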
Thank you, this might work. We don't have R or the Intelligence Suite, only Python. I will confirm with our IT group whether we can download pdfplumber.