Hi Alteryx Community,
I’m working on a workflow to extract a specific table from a batch of PDFs using the "PDF to Text" tool with the "Line" method.
In the attached Excel file:
The "Lines method" tab shows the raw extracted data from the PDFs.
The "Table snips" tab includes screenshots of the tables from the PDFs, provided for reference.
The "Expected Result" tab shows the desired output format.
I'm struggling to format the extracted data into a structured table. The main issue is that the extracted values do not always line up under the correct headers, likely because spacing and formatting are inconsistent across the original PDF files.
Could anyone show me how to transform the data in the "Lines method" tab into the format shown in the "Expected Result" tab? Any suggestions or example workflows would be greatly appreciated.
Thank you in advance for your support!
Best regards,
Buddhi