
Alteryx Designer Desktop Discussions


PDF to text: Extracting table with empty columns in between

MT1107
5 - Atom

Hi Alteryx community,

 

I'm having trouble extracting a table from a PDF because of the empty columns in between. See below:

[Image: PDF]

 

I am able to extract the information using the PDF Input tool, but the amounts shift into different columns because of the empty columns in between. Below is the result.

 

[Image: PDF input tool output]

 

I have tried a RegexReplace formula with the pattern '\s{10,}' to account for the spaces between the amounts, but it does not give the result I want. Essentially, I would like to insert a |0| delimiter wherever there are 10 or more spaces, but the outcome varies from row to row because the number of spaces is not always consistent. I have tried adjusting the number of spaces in the expression to get the desired output (below), but to no avail.
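To show the issue outside Alteryx, here is a minimal Python sketch of the same idea. The rows, the 12-character column width, and the helper names `make_row` and `mark_gaps` are made up for illustration and are not taken from the actual PDF; Python's `re.sub` stands in for the Alteryx formula.

```python
import re

def make_row(name, a, b, c):
    # Hypothetical layout: every column padded to 12 characters.
    return name.ljust(12) + a.ljust(12) + b.ljust(12) + c

def mark_gaps(line, min_spaces=10):
    # Replace each run of at least `min_spaces` whitespace characters with |0|.
    return re.sub(r"\s{%d,}" % min_spaces, "|0|", line)

rows = [
    make_row("Item A", "100.00", "200.00", "300.00"),  # no empty column
    make_row("Item B", "100.00", "", "300.00"),        # one empty column
    make_row("Item C", "", "", "300.00"),              # two empty columns
]

for row in rows:
    print(mark_gaps(row))
```

In this sketch the row with two empty columns still receives only one |0|, because the two blank cells merge into a single long run of spaces. That is why no single space threshold works for every row.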

 

[Image: Desired result]

 

Is there a better way to split the strings into columns, given the long runs of white space between the columns?
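One possible direction, sketched in Python rather than as an Alteryx workflow, is to split each line at fixed character positions instead of counting spaces. This assumes the extracted text keeps the columns aligned at the same offsets on every row, which may not hold for every PDF. The offsets in `BOUNDARIES` and the function name `split_fixed_width` are hypothetical.

```python
def split_fixed_width(line, boundaries):
    """Slice a line at fixed character offsets; blank cells become '0'."""
    cells = []
    # Pair each start offset with the next one; the last cell runs to end of line.
    for start, end in zip(boundaries, boundaries[1:] + [None]):
        cell = line[start:end].strip()
        cells.append(cell if cell else "0")
    return cells

# Hypothetical column start positions, e.g. read off the header row.
BOUNDARIES = [0, 12, 24, 36]

line = "Item B".ljust(12) + "100.00".ljust(12) + " " * 12 + "300.00"
print("|".join(split_fixed_width(line, BOUNDARIES)))
```

Because each cell is taken from its own character range, an empty column stays in its own position and is filled with 0 instead of shifting the amounts to the left.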

 

Any help or tips would be appreciated. Thanks in advance!

 

Marc
