Hi,
I am trying to convert a PDF to Excel using the PDF Input tool, but the data comes out scattered across the output. Is there any option to convert a PDF to Excel or text using OCR in Alteryx?
Edit: Exactly what issues are you having? Is the tool or the data misbehaving?
The data is misbehaving. I want to keep my input dynamic, and I want to create a separate Excel file for each PDF. Some of the data, e.g. the date the PDF was generated, is used in the header of the output Excel file.
So I thought OCR would be a better option.
I would suggest inputting everything from the PDF, then using other Alteryx tools to parse and arrange the data as you desire. You could read the entire page into a single field, or use one of the PDF tools from the public Gallery to read the characters and input them to work with.
https://gallery.alteryx.com/#!search/undefined/pdf
So the first step is getting the entire PDF document input into Alteryx. After that, let us know if you need help with the parsing and formatting. Sharing a sample workflow with a Text Input built from your PDF is usually the best way to get help.
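To give a rough idea of the parsing step, here is a minimal sketch in Python (it could run in Alteryx's Python tool or as a standalone script; the same logic can be built with the RegEx and Text To Columns tools). It assumes the PDF text has already been read in as one string per line. The label "Date of generation", the dd/mm/yyyy date format, the two-or-more-spaces column separator, and the function name `split_header_and_rows` are all assumptions for illustration, not something taken from your file.

```python
import re
from datetime import datetime

# Hypothetical header label and date format (dd/mm/yyyy); adjust to your PDFs.
DATE_PATTERN = re.compile(
    r"Date of generation\s*[:\-]?\s*(\d{1,2}/\d{1,2}/\d{4})",
    re.IGNORECASE,
)


def split_header_and_rows(lines):
    """Return (generation_date, data_rows) from the raw text lines of one PDF."""
    generation_date = None
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        match = DATE_PATTERN.search(text)
        if match and generation_date is None:
            # Keep the first matching date as the header value.
            generation_date = datetime.strptime(match.group(1), "%d/%m/%Y").date()
            continue
        # Assumption: columns are separated by runs of two or more spaces.
        rows.append(re.split(r"\s{2,}", text))
    return generation_date, rows
```

Running this once per PDF gives you the header value and the table rows separately, so each file can be written to its own Excel output with the date placed in the header.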
Hi @CharlieS ,
It looks like the above link no longer takes you anywhere. Is this what you were referring to?
https://community.alteryx.com/t5/Public-Community-Gallery/PDF-Input/ta-p/887038
Additionally, are there any other OCR techniques in Alteryx? I am looking to scan a postcard and read the PDF/image into Alteryx to pull off a tracking code printed on the postcard.
Thank you!
Trevor