
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

PDF Input Macro Inconsistent Import

trettelap
8 - Asteroid

I'm using the PDF Input macro from the Gallery, and the spacing of the PDF matters for my workflow. However, the imported rows don't seem to keep their position on the page.

Has anyone run into this and found a solution?

fmvizcaino
17 - Castor

Hi @trettelap ,

 

I'm not sure which PDF tool you are talking about, but I would suggest the two PDF input tools that I have found to work best.

 

PDF to Tabular: Best when your PDF is not an image, that is, when you can select words inside the PDF with your mouse.

https://gallery.alteryx.com/#!app/PDF-to-Tabular/5fc290ac0462d71998cc0fb8

 

PDF reader: Best when the PDF is an image. It has to convert the images to text, but it seems to work for high-quality PDFs.

https://gallery.alteryx.com/#!app/PDF-Input--Text-and-Image-/5be5ec8d0462d71ffce6deaa

Best,

Fernando Vizcaino

patrick_mcauliffe
14 - Magnetar

Occasionally I've run into this.  

It typically has to do with the way the PDF was written.

The best way around this, I've found, is to determine what the correct location should be for each piece of text.

In my case, it was PDF copies of spreadsheets that I was turning back into a data table.
Because of this, I knew that the first character of every column should line up.

So, I pulled in the data, pivoted the rows into name : value pairs, calculated the length of each row, and split to rows by single character.

::middle stuff::

data output.

Make sense? 🙂

I'll see if I can sanitize a sample.
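In the meantime, the "first character of every column lines up" idea can be sketched outside Alteryx. This is not the workflow described above (the middle steps are not shown in this thread); it is a minimal Python illustration that assumes the extracted text keeps its spacing and that every column is separated by at least one character position that is blank in every row. The function names `find_column_starts` and `split_fixed_width` are made up for this example.

```python
def find_column_starts(lines):
    """Return the character offsets at which a column begins.

    Assumption: a position is a column gap only if it is a space in
    every non-empty row, so text inside a cell may still contain spaces.
    """
    rows = [ln for ln in lines if ln.strip()]
    if not rows:
        return []
    width = max(len(ln) for ln in rows)
    padded = [ln.ljust(width) for ln in rows]
    # True where every row has a space at this character position.
    blank = [all(row[i] == " " for row in padded) for i in range(width)]
    # A column starts at a non-blank position that follows a blank one.
    return [i for i in range(width)
            if not blank[i] and (i == 0 or blank[i - 1])]


def split_fixed_width(lines):
    """Split aligned text rows into cells using the detected column starts."""
    rows = [ln for ln in lines if ln.strip()]
    starts = find_column_starts(rows)
    ends = starts[1:] + [None]
    return [[ln[s:e].strip() for s, e in zip(starts, ends)] for ln in rows]


sample = [
    "Name    Qty  Price",
    "Apple   3    1.50",
    "Kiwi    12   0.25",
]
table = split_fixed_width(sample)
```

If the PDF import has already shifted rows left or right, this will not recover the columns by itself; the rows would first need to be padded back to their expected positions.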

trettelap
8 - Asteroid

Thank you for those suggestions! I'm getting an error on the tabular macro, so I will have to work through that. I did try the second one, but for some reason it isn't reading the PDF clearly, even though the PDF itself is pretty clear.

fmvizcaino
17 - Castor

Hi @trettelap ,

 

It has been a long time since I first used that tool, but I think you need to open Alteryx as an administrator so that Python can install the libraries this tool needs in order to work.

trettelap
8 - Asteroid

I did try this and still got an error.

trettelap
8 - Asteroid

I wanted to close the loop on this: it looks like I had to install Ghostscript and add it to the Windows PATH. I'm not sure of the technical aspects, but it is working great!
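For anyone hitting the same error, here is a quick way to check whether Ghostscript is visible on the PATH before running the macro. It is only a sketch: it assumes the usual executable names (`gswin64c` or `gswin32c` on Windows, `gs` elsewhere), and `find_ghostscript` is a helper name made up for this example, not part of the macro.

```python
import shutil

# Common Ghostscript command-line executable names (assumption: a standard install).
GHOSTSCRIPT_NAMES = ("gswin64c", "gswin32c", "gs")


def find_ghostscript(which=shutil.which):
    """Return the full path of the first Ghostscript executable found on PATH, or None."""
    for name in GHOSTSCRIPT_NAMES:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_ghostscript()
    if location:
        print("Ghostscript found at:", location)
    else:
        print("Ghostscript not found on PATH; install it and add its bin folder to PATH.")
```

After changing the PATH, Alteryx Designer generally needs to be restarted so that it picks up the new value.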
