
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Read PDF and Extract Data in Tabular Format

Sasthana25
8 - Asteroid

Hi, 

 

I have a use case where I receive a PDF file from a client, extract the data from it, and populate the details in an Excel sheet with predefined headers. I have to automate this process end to end. A dummy PDF file is attached for your reference.

 

Here are a few things I want to achieve:

1. Extract the data from the PDF to Excel.

2. I read somewhere that this requires OCR tool integration, which is not supported in my org. Is there a way to convert the PDF to text, DOCX, or any other readable format from which I can pull the details into the Excel template?

 

Any help on this is massively appreciated. 

 

Thanks,
Swati 

 

18 REPLIES
Sasthana25
8 - Asteroid

Alteryx Admin Designer
Version : 2019.3.6.20285

mceleavey
17 - Castor

Try the one attached.

 

M



Bulien

Sasthana25
8 - Asteroid

Thank you so much @mceleavey. I am able to import the workflow. However, this is my first time working on such a use case, so I am having difficulty understanding it, starting from the file input. An image is attached below for reference. I also haven't encountered this question mark tool before. Could you please explain how this workflow functions? Currently it shows an error: The entry point is invalid.

 

Thanks again for your kind support & response. 

mceleavey
17 - Castor

@Sasthana25 ,

 

You don't have any of the tools installed.

 

First, download the PDF Reader macro and install it.

Close and re-open Alteryx and you should now see the first tool. A black square with a question mark means you don't have that tool.

 

Next, replace the second and third black squares with the "Data Cleansing" tool:

mceleavey_0-1624014210473.png

 

Your workflow should look like this:

 

mceleavey_1-1624014240372.png

In the first tool, simply replace the file path with the location of your PDF:

 

mceleavey_2-1624014278318.png

 

This should all have been done when you installed the last package I sent you, but something has gone wrong.

 

Try that and let me know.
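
If it helps, once the PDF text has been read in, the step of mapping it onto fixed Excel headers can also be sketched in Python. This is only a rough sketch under assumptions: the "Label: value" layout, the header names, the sample text, and the helper names `parse_record` and `to_csv` are all made up for illustration, since the actual PDF contents aren't shown in this thread. It writes CSV, which Excel can open.

```python
import csv
import io

# Hypothetical fixed headers for the Excel template (assumed, not from the real PDF).
HEADERS = ["Invoice Number", "Client Name", "Amount"]


def parse_record(text, headers=HEADERS):
    """Map 'Label: value' lines onto the fixed headers; missing labels stay empty."""
    record = {h: "" for h in headers}
    for line in text.splitlines():
        # Split on the first colon only, so values may themselves contain colons.
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in record:
            record[label] = value.strip()
    return record


def to_csv(records, headers=HEADERS):
    """Return the records as CSV text with the headers in their fixed order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample standing in for text already extracted from a PDF.
SAMPLE_TEXT = """Invoice Number: 1001
Client Name: Example Ltd
Notes: ignored because it is not a template header
Amount: 250.00
"""

csv_text = to_csv([parse_record(SAMPLE_TEXT)])
print(csv_text)
```

Labels that are not in the header list are skipped, and headers with no matching line are left blank, so the output columns always match the template.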

 

M.

 

 

 



Bulien

mceleavey
17 - Castor

@Sasthana25 ,

 

also...UPGRADE ALTERYX!

Your version is almost two years out of date.



Bulien

Sasthana25
8 - Asteroid

I know, right? It is sad to work on old versions, but it is what it is within my org.

But again, thank you so much for your kind support here.

mceleavey
17 - Castor

no problem.gif



Bulien

priya_mohana_dhl
7 - Meteor

Hi,

I want to read a PDF. I tried the PDF example given here and have downloaded and installed the PDF Reader macro. My Alteryx version is 2021.1.4.26400.

 

When I run the workflow, I get the following error.

priya_mohana_dhl_0-1636442303201.png

 

Thanks,

Priya.

qgpl
8 - Asteroid

Hi @mcleavey - I tried to download your workflow example, but I'm getting this error. Do you know where we can download this plugin, please? Sorry, I'm a beginner!

 

qgpl_0-1669982095523.png

 

 

Thank you
