

Regex for PDF

anom23
7 - Meteor

Hello

 

Can Alteryx parse a PDF file using regex? If so, is there a way to parse the attached file into the format below?

 

Item Sequence | Deposit Date | Check # | Deposit Amount | Check amount | Return Reason | Disposition
1234 | 2021-07-07 | 12345 | 2800000 | $280.00 | Stop Payment | X

 

Many thanks

TheOC
16 - Nebula

hey @anom23 

I'm not sure about regex, but you can parse a PDF using the Text Mining tools, specifically the Image to Text tool within the Intelligence Suite (https://help.alteryx.com/20213/designer/image-to-text).

Cheers,
TheOC

PhilipMannering
16 - Nebula

Hi @anom23 

 

You can use R, Python or the Intelligence Suite to parse a PDF. For example, the following would parse your PDF:

[Screenshot: PhilipMannering_0-1640079089937.png]

This requires that you install the Python module pdfplumber.
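
The screenshot is not reproduced in text here, so the following is only a minimal sketch of what a pdfplumber step could look like, not necessarily the code in the attached solution. It assumes pdfplumber is installed and that the PDF contains selectable text rather than scanned images; the function names `extract_lines` and `split_lines` and the file path are hypothetical.

```python
from typing import List


def split_lines(text: str) -> List[str]:
    """Split extracted page text into non-empty, stripped lines."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def extract_lines(pdf_path: str) -> List[str]:
    """Read every page of a PDF and return its text as a list of lines.

    pdf_path is a hypothetical path supplied by the caller.
    """
    # Third-party module; imported here so split_lines works without it.
    import pdfplumber

    lines: List[str] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() can yield nothing for an empty page,
            # so fall back to an empty string.
            text = page.extract_text() or ""
            lines.extend(split_lines(text))
    return lines
```

In the Alteryx Python tool, the returned lines could then be placed in a pandas DataFrame and sent to an output anchor so the rest of the workflow can pick them up.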

 

The rest of the data manipulation would then be easiest in Alteryx:

[Screenshot: PhilipMannering_1-1640079131043.png]
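
For anyone who would rather do the regex step in Python as well, here is a rough sketch of splitting one extracted line into the requested columns. It assumes each record arrives as a single line with whitespace between fields, in the column order given in the question; the real layout of the attached PDF may differ, and the pattern `ROW` and the function `parse_row` are hypothetical names.

```python
import re
from typing import Dict, Optional

# Assumed layout: fields separated by whitespace, in the order from the question.
ROW = re.compile(
    r"^(?P<item_sequence>\d+)\s+"
    r"(?P<deposit_date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<check_number>\d+)\s+"
    r"(?P<deposit_amount>\S+)\s+"
    r"(?P<check_amount>\$[\d,]+\.\d{2})\s+"
    r"(?P<return_reason>.+?)\s+"      # may contain spaces, e.g. "Stop Payment"
    r"(?P<disposition>\S+)$"          # last token on the line
)


def parse_row(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields for a matching line, or None for other lines."""
    match = ROW.match(line.strip())
    return match.groupdict() if match else None


# Sample line spaced out by hand; the spacing is an assumption.
sample = "1234 2021-07-07 12345 2800000 $280.00 Stop Payment X"
record = parse_row(sample)
```

Lines that do not match, such as page headers, come back as None and can be filtered out before the data is passed on.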

 

Solution attached.

 

anom23
7 - Meteor

Thank you, this might work. We don't have R or the Intelligence Suite, only Python. I will confirm with our IT group whether we can install pdfplumber.
