
Regex for PDF

anom23
7 - Meteor

Hello,

Can Alteryx parse a PDF file using regex? If so, is there a way to parse the attached file into the format below?

 

Item Sequence | Deposit Date | Check # | Deposit Amount | Check Amount | Return Reason | Disposition
1234 | 2021-07-07 | 123452800000 | $280.00 | Stop Payment | X

 

Many thanks

3 REPLIES
TheOC
15 - Aurora

Hey @anom23

I'm not sure about regex, but you can parse a PDF using the Text Mining tools, specifically the Image to Text tool in the Intelligence Suite (https://help.alteryx.com/20213/designer/image-to-text).

Cheers,
TheOC


PhilipMannering
16 - Nebula

Hi @anom23 

 

You can use R, Python, or the Intelligence Suite to parse a PDF. For example, the following would parse your PDF:

[Screenshot: PhilipMannering_0-1640079089937.png]

This approach requires installing the Python module pdfplumber.
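
For anyone who cannot open the screenshot, here is a rough sketch of what the Python side might look like. It is not necessarily the code in the image. It assumes pdfplumber is installed and that the PDF contains real text rather than scanned images; the function names `clean_lines` and `pdf_to_lines` are placeholders of mine.

```python
def clean_lines(text):
    """Split extracted text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def pdf_to_lines(pdf_path):
    """Return the non-empty text lines of a PDF, in page order."""
    # Imported here so the helper above still loads where pdfplumber
    # has not been installed yet.
    import pdfplumber

    lines = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() can return None or an empty string for
            # pages with no text layer (e.g. scanned images).
            text = page.extract_text() or ""
            lines.extend(clean_lines(text))
    return lines
```

Inside the Alteryx Python tool, the resulting list could be put into a pandas DataFrame and sent downstream with `Alteryx.write(df, 1)`, so that the remaining clean-up happens in regular Alteryx tools.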

 

The rest of the data manipulation would then be easiest in Alteryx:

[Screenshot: PhilipMannering_1-1640079131043.png]
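
On the original regex question: once the text is out of the PDF, a regular expression can split each row into the requested columns. The sketch below is only an illustration. It assumes each row arrives as one line with the fields separated by whitespace, which may not match how the attached file actually extracts, and the sample values in the usage lines are made up. The name `parse_row` is hypothetical.

```python
import re

# One pattern per extracted row; assumes whitespace-separated fields
# in the column order requested in the question.
ROW = re.compile(
    r"^(?P<item_sequence>\d+)\s+"
    r"(?P<deposit_date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<check_no>\d+)\s+"
    r"(?P<deposit_amount>\d+)\s+"
    r"(?P<check_amount>\$[\d,]+\.\d{2})\s+"
    r"(?P<return_reason>.+?)\s+"   # free text, may contain spaces
    r"(?P<disposition>\S+)$"       # last token on the line
)


def parse_row(line):
    """Return a dict of column values, or None if the line is not a data row."""
    match = ROW.match(line.strip())
    return match.groupdict() if match else None


# Made-up sample row, not taken from the attached file.
row = parse_row("1001 2021-07-07 5555 900 $280.00 Stop Payment X")
```

Lines that do not fit the pattern, such as the header row, return `None` and can be filtered out. A similar pattern could be tried in the Alteryx RegEx tool, adjusted to however the rows actually come through.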

 

Solution attached.

 

anom23
7 - Meteor

Thank you, this might work. We don't have R or the Intelligence Suite, only Python. I will confirm with our IT group whether we can download pdfplumber.
