

Return output in Expected format

HW1
9 - Comet

I have output from a PDF parser tool that turns a matrix in a PDF into separate rows. Please find it attached.

I am unable to correctly parse the Site, Address, and Description columns so that they match the expected output.

Will regex work to extract the data in the correct format? If so, how?

Thanks.

 

1 REPLY
TrevorS
Alteryx Alumni (Retired)

Hello @HW1

Looking at your workflow, there are a few things going on here.

1. When you filter the data, you are left with rows (such as Rows 1, 2, and 22) that appear to be new headers.

2. The data is not separated by consistent characters. For example, Line 4 looks like "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66"

But Line 3 looks like "15/12/20 | JOB-2776383-N61T7 120L Clinical Waste Bin Service 1 34.45 34.45"

This adds another level of data prep, because you need consistent delimiters before you can separate the data into columns.
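To illustrate the delimiter problem, here is a minimal Python sketch of a single regex that tolerates both line shapes quoted above. It is not taken from the attached workflow. The field names (`date`, `job`, `description`, `qty`, `rate`, `amount`) are assumptions based only on the two sample lines, and the reading of the three trailing numbers as quantity, rate, and amount is a guess. The pattern is written for Python's `re` module, so it may need adjusting before use in the Alteryx RegEx tool.

```python
import re

# Assumed layout, inferred from the two sample lines only:
#   date [| JOB-reference] description qty rate amount
LINE_RE = re.compile(
    r"""
    ^(?P<date>\d{2}/\d{2}/\d{2})        # leading date, e.g. 15/12/20
    \s*(?:\|\s*(?P<job>JOB-\S+)\s+)?    # optional pipe plus job reference
    (?P<description>.+?)                # free text, matched lazily
    \s+(?P<qty>\d+)                     # assumed quantity
    \s+(?P<rate>\d+\.\d{2})             # assumed unit rate
    \s+(?P<amount>\d+\.\d{2})$          # assumed line amount
    """,
    re.VERBOSE,
)


def parse_line(line):
    """Return a dict of captured fields, or None if the line does not fit."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


line3 = "15/12/20 | JOB-2776383-N61T7 120L Clinical Waste Bin Service 1 34.45 34.45"
line4 = "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66"

for sample in (line3, line4):
    print(parse_line(sample))
```

Making the pipe and job reference optional lets one pattern handle both shapes, and the lazy description group stops at the last three numeric tokens. Lines that return `None` are the ones that need separate handling.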


I would recommend addressing #1 first, though, since each of these rows appears to start a new dataset. If so, what is their importance?
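If the header-like rows do mark separate datasets, one way to keep track of them is to tag each data row with a section number. The Python sketch below assumes that a data row always starts with a dd/mm/yy date and that anything else is a header row. That rule is a guess from the sample lines, not something confirmed by the attached file, and the header text in the example is made up for illustration.

```python
import re

# Assumption: data rows start with a dd/mm/yy date; all other rows are headers.
DATE_START = re.compile(r"^\d{2}/\d{2}/\d{2}\b")


def tag_sections(rows):
    """Pair each data row with the number of the header block above it."""
    section = 0
    prev_was_header = False
    tagged = []
    for row in rows:
        text = row.strip()
        is_data = bool(DATE_START.match(text))
        if is_data:
            tagged.append((section, text))
        elif not prev_was_header:
            # A run of consecutive header rows counts as one new section.
            section += 1
        prev_was_header = not is_data
    return tagged


rows = [
    "Hypothetical header A",
    "Hypothetical header B",
    "15/12/20 | JOB-2776383-N61T7 120L Clinical Waste Bin Service 1 34.45 34.45",
    "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66",
    "Hypothetical header C",
    "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66",
]

for section, text in tag_sections(rows):
    print(section, text)
```

The section number can then be used to group or split the rows before any column parsing is applied.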

 

Community Moderator