Cleaning a scrambled PDF-to-Excel converted file

ibesmond
8 - Asteroid

Hello,

 

I was given this PDF to begin with.

KY Pdf.png

 

A colleague was able to convert it to Excel for me, in the format shown below. Now I have absolutely no idea where to go from here. I've used parsing, RegEx, and Text to Columns before, but never to this extreme: the data is formatted in several different ways and runs column to column and row to row. The real wild card is the section rows that appear between each set of records.

 

KY Excel.png

 

I basically broke it into 4 parts: a section header and its field (part 1), and the record headers and their fields (parts 2-4).
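
For readers who want to see the split outside of Designer, here is a minimal Python sketch of the same idea: tag each row as a section row, a header row, or a record row before parsing anything. The sample rows, the `County:` prefix, and the header labels are all hypothetical, since the actual layout only appears in the screenshots.

```python
import re

# Hypothetical rows as they might come out of a PDF-to-Excel conversion.
# Each row is a list of cells; values drift between columns.
rows = [
    ["County: Adair", "", "", ""],
    ["Name", "Permit", "", "Date"],
    ["Smith Farm", "", "A-101", "01/02/2020"],
    ["", "Jones LLC", "A-102", "03/04/2020"],
    ["County: Allen", "", "", ""],
    ["Name", "Permit", "", "Date"],
    ["Brown Co", "A-201", "", "05/06/2020"],
]

# Assumed pattern for a section row and assumed header labels.
SECTION_RE = re.compile(r"^\s*County:\s*(?P<county>.+?)\s*$")
HEADER_CELLS = {"Name", "Permit", "Date"}


def classify(row):
    """Return 'section', 'header', or 'record' for one row of cells."""
    cells = [c.strip() for c in row if c and c.strip()]
    if len(cells) == 1 and SECTION_RE.match(cells[0]):
        return "section"
    if cells and set(cells) <= HEADER_CELLS:
        return "header"
    return "record"


kinds = [classify(r) for r in rows]
for kind, row in zip(kinds, rows):
    print(kind, row)
```

Once every row carries a tag, each kind can go down its own branch, which is roughly what a Filter tool does in a workflow.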

 

KY yxmd shot.png

 

Any recommendations? Thank you.

2 REPLIES
T_Willins
14 - Magnetar

PDFs are not fun, but getting the data into Excel helps. Attached is a workflow that organizes your data. Because the same value can land in different fields from row to row, the workflow first normalizes the data into consistent fields so that rules can be applied to them. I added annotations, but let me know if you have questions.

 

Edit: I realized County Name wasn't in the final Join, so it has been added.
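
For anyone reading without the attachment, the normalize-then-join idea can be sketched in a few lines of Python. This is not the attached workflow, only an illustration under stated assumptions: the sample rows, the three-field record layout (`name`, `permit`, `date`), and the rule that a row with a single value is a county row are all hypothetical.

```python
# Hypothetical converted rows: values drift between columns, and a
# county row separates each group of records.
raw_rows = [
    ["Adair County", None, None, None, None],
    ["Smith Farm", None, "A-101", None, "2020-01-02"],
    [None, "Jones LLC", "A-102", "2020-03-04", None],
    ["Allen County", None, None, None, None],
    [None, "Brown Co", None, "A-201", "2020-05-06"],
]

FIELDS = ("name", "permit", "date")  # assumed record layout


def compact(row):
    """Drop empty cells so values line up regardless of source column."""
    return [str(c).strip() for c in row if c is not None and str(c).strip()]


def normalize(rows):
    """Build one dict per record, carrying the current county forward."""
    records, county = [], None
    for row in rows:
        cells = compact(row)
        if len(cells) == 1:
            # Assumption: a lone value is a section (county) row.
            county = cells[0]
        elif len(cells) == len(FIELDS):
            records.append({"county": county, **dict(zip(FIELDS, cells))})
        # Rows with any other shape are skipped in this sketch.
    return records


records = normalize(raw_rows)
for rec in records:
    print(rec)
```

Compacting cells only works when no field is genuinely blank in the source. If records can have missing values, the fields need to be identified by pattern instead of by position.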

 

PDF Formatting.png

 

 

ibesmond
8 - Asteroid

Wow!  This is amazing @T_Willins.  Thank you for taking the time to share this solution.  I almost didn't think it was possible.  Thanks for breaking down each step.  The more I see accomplished with this software, the more excited I get! 

 
