Alteryx Designer Desktop Discussions

Regex for PDF

anom23
7 - Meteor

Hello

 

Can Alteryx parse a PDF file using regex? If so, is there a way to parse the attached file into the format below?

 

Item Sequence | Deposit Date | Check # | Deposit Amount | Check Amount | Return Reason | Disposition
1234 | 2021-07-07 | 12345 | 2800000 | $280.00 | Stop Payment | X

 

Many thanks

3 REPLIES
TheOC
16 - Nebula

hey @anom23 

I'm not sure about regex, but you can parse a PDF using the Text Mining tools, specifically the Image to Text tool within the Intelligence Suite (https://help.alteryx.com/20213/designer/image-to-text).

Cheers,
TheOC

PhilipMannering
16 - Nebula

Hi @anom23 

 

You can use R, Python, or the Intelligence Suite to parse a PDF. For example, the following would parse your PDF:

[Image: PhilipMannering_0-1640079089937.png]

In this case it requires that you install the Python module pdfplumber.
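
For readers who cannot see the screenshot, here is a minimal sketch of what such Python tool code might look like. It is not necessarily the code in the image. It assumes pdfplumber is installed, and the file path in the usage comment is hypothetical. `pdfplumber.open` and `page.extract_text()` are pdfplumber calls; `split_lines` and `extract_lines` are names made up for illustration.

```python
from typing import List, Optional


def split_lines(page_texts: List[Optional[str]]) -> List[str]:
    """Flatten per-page text into a list of non-empty, stripped lines."""
    lines = []
    for text in page_texts:
        # extract_text() can return None for a page with no text layer
        for line in (text or "").splitlines():
            line = line.strip()
            if line:
                lines.append(line)
    return lines


def extract_lines(pdf_path: str) -> List[str]:
    """Read every page of the PDF and return its text lines."""
    import pdfplumber  # imported here so split_lines works without it

    with pdfplumber.open(pdf_path) as pdf:
        page_texts = [page.extract_text() for page in pdf.pages]
    return split_lines(page_texts)


# Usage (hypothetical path, point it at the attached PDF):
#     lines = extract_lines("returns.pdf")
```

Inside the Alteryx Python tool, the resulting lines would typically be put into a pandas DataFrame and passed downstream with `Alteryx.write`. Note that this only works for PDFs with a text layer; a scanned PDF would need OCR instead.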

 

The rest of the data manipulation would then be easiest in Alteryx:

[Image: PhilipMannering_1-1640079131043.png]
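
If the field splitting had to stay in Python rather than Alteryx, a regex along these lines might pull the columns out of each line of text. This is only a sketch under a strong assumption: that each record comes out of the PDF as one whitespace-separated line such as `1234 2021-07-07 12345 2800000 $280.00 Stop Payment X`. The real text layout of the attached file may differ. `ROW_RE` and `parse_row` are names made up for illustration.

```python
import re
from typing import Dict, Optional

# Assumed layout: sequence, ISO date, check number, deposit amount,
# dollar check amount, free-text return reason, one-token disposition.
ROW_RE = re.compile(
    r"^(?P<item_sequence>\d+)\s+"
    r"(?P<deposit_date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<check_number>\d+)\s+"
    r"(?P<deposit_amount>[\d,.]+)\s+"
    r"(?P<check_amount>\$[\d,]+\.\d{2})\s+"
    r"(?P<return_reason>.+?)\s+"
    r"(?P<disposition>\S+)$"
)


def parse_row(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields for a record line, or None if it does not match."""
    match = ROW_RE.match(line.strip())
    return match.groupdict() if match else None
```

Lines that do not fit the pattern, such as the header row, return `None` and can be filtered out.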

 

Solution attached.

 

anom23
7 - Meteor

Thank you, this might work. We don't have R or the Intelligence Suite, only Python. I will confirm with our IT group whether we can install pdfplumber.
