Alteryx Designer Desktop Discussions

SOLVED

PDF to Excel using OCR

Dhananjay_Galphat1
7 - Meteor

Hi,

 

I am trying to convert a PDF to Excel using the PDF Input tool, but the extracted data comes out scattered and out of order. Is there any option to convert PDF to Excel or text using OCR in Alteryx?

4 REPLIES
CharlieS
17 - Castor

Hi @Dhananjay_Galphat1 

 

Edit: Exactly what issues are you having? Is the tool or the data misbehaving?

Dhananjay_Galphat1
7 - Meteor

The data is misbehaving. I want to keep my input dynamic, and I want to create a separate Excel file for each PDF. Some of the data, e.g. the date the PDF was generated, needs to go in the header of the output Excel file.

So I thought OCR would be a better option.

CharlieS
17 - Castor

I would suggest inputting everything from the PDF, then using other Alteryx tools to parse and arrange the data as you desire. You could read the entire page into a single field, or use one of the PDF tools from the public Gallery to read the characters and bring them in to work with.

 

https://gallery.alteryx.com/#!search/undefined/pdf 

 

So the first step is getting the entire PDF document input into Alteryx. After that, let us know if you need help with the parsing and formatting. Posting a sample workflow with a Text Input containing the data from your PDF is usually the best way to share it.
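
To illustrate the "input everything, then parse" idea, here is a rough Python sketch of the parsing step, as it might look if done in code rather than with Alteryx parsing tools. It is only an illustration under assumptions: the sample text, the "generated on" label, the date format, and the two-or-more-spaces column separator are all made up and would need to match your actual PDF.

```python
import re
from datetime import datetime

# Hypothetical raw text, standing in for one PDF page read into a single field.
RAW_PAGE = """Report generated on: 2021-03-15
Item    Qty    Price
Widget  4      9.99
Gadget  10     24.50
"""

# Assumed header label and ISO date format; adjust to the real document.
DATE_PATTERN = re.compile(r"generated on:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE)


def parse_page(raw_text):
    """Split one page of text into a generation date and a list of table rows."""
    match = DATE_PATTERN.search(raw_text)
    generated = (
        datetime.strptime(match.group(1), "%Y-%m-%d").date() if match else None
    )

    rows = []
    for line in raw_text.splitlines():
        # Skip blank lines and the header line that carries the date.
        if not line.strip() or DATE_PATTERN.search(line):
            continue
        # Columns are assumed to be separated by two or more spaces.
        rows.append(re.split(r"\s{2,}", line.strip()))
    return generated, rows


generated, rows = parse_page(RAW_PAGE)
print(generated)
for row in rows:
    print(row)
```

The date is kept separate from the table rows so it can be written into the header of the output file, while the rows become the body. Running this once per PDF would give one output per input file.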

AlteryxTrev
10 - Fireball

Hi @CharlieS ,

 

It looks like the above link no longer takes you anywhere. Is this what you were referring to?

https://community.alteryx.com/t5/Public-Community-Gallery/PDF-Input/ta-p/887038

 

Additionally, are there any other OCR techniques in Alteryx? I am looking to scan a postcard and read the PDF/image into Alteryx to extract a tracking code printed on the postcard.

 

Thank you!

Trevor
