Alteryx Designer Desktop Discussions

HW1 · ‎01-14-2021

I have output from a PDF parser tool that splits a matrix in a PDF into separate rows.

Please find attached.

I am unable to parse the Site, Address and Description columns correctly.

I want the result to match the expected output.

Will regex work to extract the data in the correct format? If so, how?

Thanks.

TrevorS · ‎01-19-2021

Hello @HW1
Looking at your workflow, there are a few things going on here.

1. When you filter out the data, you are left with rows (like Rows 1, 2, and 22) that appear to be new headers.

2. The data within is not separated by the same characters. For example, Line 4 looks like "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66"

But Line 3 looks like "15/12/20 | JOB-2776383-N61T7 120L Clinical Waste Bin Service 1 34.45 34.45"

This adds another level of data prep, because the rows need consistent delimiters before they can be split into columns.

I would recommend addressing #1 first though, as each of these kinds of rows appears to start a new dataset. If so, what is the importance of those rows?
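
On the regex question and the mixed delimiters in point 2, here is a rough Python sketch of one pattern that handles both of the sample lines quoted above. It is only an illustration built from those two lines: the field names (date, job_ref, description, qty, unit_price, total) are my own guesses and not the column names from the attachment, and the pattern would likely need adjusting before use in the RegEx tool and against the full data.

```python
import re

# Assumed layout, inferred only from the two sample lines in this thread:
#   date, optional "| JOB-..." reference, free-text description,
#   then three trailing numbers (quantity, unit price, line total).
# The group names are hypothetical, not taken from the workflow.
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2})\s+"      # e.g. 31/12/20
    r"(?:\|\s*(?P<job_ref>JOB-\S+)\s+)?"    # optional "| JOB-xxxx" part
    r"(?P<description>.+?)\s+"              # lazy: stops before the trailing numbers
    r"(?P<qty>\d+)\s+"                      # whole-number quantity
    r"(?P<unit_price>\d+\.\d{2})\s+"        # price with two decimals
    r"(?P<total>\d+\.\d{2})$"               # total with two decimals
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not fit the pattern."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


samples = [
    "31/12/20 120L Clinical Waste Bin for the month of January Bin Rent 2 4.33 8.66",
    "15/12/20 | JOB-2776383-N61T7 120L Clinical Waste Bin Service 1 34.45 34.45",
]

for sample in samples:
    print(parse_line(sample))
```

Rows that return None (for example the repeated header rows from point 1) can be routed aside and inspected separately instead of being forced through the same pattern.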

Community Moderator
