Alteryx Designer Desktop Discussions

Read table-format data from a PDF as is, i.e. create the same columns in an Alteryx workflow.

Mohd-Siddiqui1
8 - Asteroid

Hi there,

 

I have a PDF page that contains text in the format shown below.

 

 

Some dummy text and paragraph on the page of pdf. Some dummy text and paragraph on the second page of pdf.

Some dummy text and paragraph on page of pdf. Some dummy text and paragraph on the page of pdf.

Some dummy text and paragraph on the page of pdf. Some dummy text and paragraph on page of pdf.

 

First Column name is coming in two lines | Second Column Name is coming in four lines | Third Column Name is coming in five lines

Serial no | ABcdefg | Pqrst

 

1. ab

2. dummy name

3. dummy name 2

4. dummy name 3

5. dummy name 4

6. dummy name 5

7. dummy name 6

 

Some more dummy text lines over here. Some more dummy text lines over here.

Some more dummy text lines over here. Some more dummy text lines over here.

 

Page 2 of 32

 

 

Expected output: the same columns as in the table on the 2nd page of the PDF should be created in Alteryx.

 

I am using the PDF Reader tool and the Text To Columns tool (with "\n" as the delimiter), but I am not getting the output I expect.
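
To illustrate the behaviour described here, the following is a minimal Python sketch, not an Alteryx workflow. It assumes the extracted header text arrives as a single newline-separated string, with the three column names wrapped over two, four and five lines as their wording in the sample says. The variable names are hypothetical. Splitting on "\n" then produces one fragment per wrapped line rather than one value per column.

```python
# Hypothetical sample: the header text from the post, assuming the PDF reader
# returns it as one string with a newline after every wrapped line.
extracted = (
    "First Column name is coming\n"
    "in two lines\n"
    "Second Column\n"
    "Name is coming\n"
    "in four\n"
    "lines\n"
    "Third Column\n"
    "Name is\n"
    "coming in\n"
    "five\n"
    "lines"
)

# Splitting on "\n" mirrors a Text To Columns step with "\n" as the delimiter.
fragments = extracted.split("\n")

# Each wrapped line becomes its own fragment, so the three column names
# are scattered across many pieces instead of forming three values.
print(len(fragments))
for fragment in fragments:
    print(repr(fragment))
```

The split alone cannot tell where one column name ends and the next begins, so some extra grouping step is needed after it.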

 

Could you please help?

 

Thanks

3 REPLIES
gabrielvilella
14 - Magnetar

Hi @Mohd-Siddiqui1, the approach you are taking with the PDF reader tool is one way of achieving this. However, it may require some complex data parsing to get to that table view. Another path is the Computer Vision tools, which let you create the table much more easily. Those tools are part of the Intelligence Suite package; if you don't have it, please contact your Alteryx sales representative.
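
As a rough idea of what that parsing could involve, here is a minimal Python sketch of one possible grouping step. It is not a tested workflow. It assumes the header text has already been split into one fragment per line, that the fragments arrive column by column, and that the number of lines per column name (2, 4 and 5 in the sample) is known in advance. The function name `group_header_lines` is hypothetical. Real PDF extraction may interleave lines from different columns, in which case this simple grouping would not apply.

```python
from itertools import islice


def group_header_lines(lines, lines_per_column):
    """Join consecutive line fragments into one name per column.

    lines_per_column lists how many fragments belong to each column,
    which is an assumption supplied by the caller, not detected here.
    """
    it = iter(lines)
    return [" ".join(islice(it, n)) for n in lines_per_column]


# Hypothetical input: header fragments from the sample, one per wrapped line.
header_lines = [
    "First Column name is coming", "in two lines",
    "Second Column", "Name is coming", "in four", "lines",
    "Third Column", "Name is", "coming in", "five", "lines",
]

columns = group_header_lines(header_lines, [2, 4, 5])
for name in columns:
    print(name)
```

Hard-coding the line counts keeps the sketch short, but it only works for a layout that never changes; a more general solution would need some rule for detecting where each column name starts.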

Mohd-Siddiqui1
8 - Asteroid

Hi @gabrielvilella 

 

As of now, I do not have the Intelligence Suite package, and it will take some time to get it installed.

Could you please help me resolve this issue using only the PDF Reader tool?

gabrielvilella
14 - Magnetar

We can help you with specific questions regarding the workflow you are trying to build. You can create a new post with that specific data parsing question. A sample data set would be helpful.
