Auto-detecting a table of data in a PDF with the new Computer Vision tools from Intelligence Suite is great new functionality, but figuring out how to make it work was not intuitive for me. It took a lot of trial and error to extract a table of data from a PDF. Here's how I made it work.
The output contains a Markup field. Connect the output to the T anchor of the Image to Text tool.
The output contains one pipe-delimited field called table0 (there may be additional table fields, depending on the structure of the input PDF).
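For anyone who wants to see what the parsing step has to do, here is a minimal sketch in plain Python. It assumes that table0 holds one table row per line, with cells separated by pipes, and that the first line is a header of column positions such as 0|1|2. The function name parse_pipe_table and the sample value are hypothetical and are not taken from the attached workflow or PDF. The point is that the field has to be split into rows on the line break first, and only then into columns on the pipe.

```python
def parse_pipe_table(raw: str) -> list[dict[str, str]]:
    """Split a newline-separated, pipe-delimited string into row dicts.

    Assumption: the first non-empty line is the header and every
    following non-empty line is one table row.
    """
    # Step 1: split to rows on line breaks, skipping blank lines.
    lines = [line for line in raw.splitlines() if line.strip()]
    if not lines:
        return []

    # Step 2: split each row to columns on the pipe character.
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        # Pad short rows with empty strings; zip() below drops any
        # cells beyond the number of header columns.
        cells += [""] * (len(header) - len(cells))
        rows.append(dict(zip(header, cells)))
    return rows


# Hypothetical sample value, made up for illustration only.
sample = "0|1|2\nName|Age|Class\nAlice|29|1\nBob|35"
rows = parse_pipe_table(sample)
for row in rows:
    print(row)
```

If the field really is laid out this way, splitting on the pipe alone would only ever see the first line, so the row split has to come first.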
My workflow (version 2021.3) and the PDF of titanic data are attached. I hope you find this helpful!
terry10
Thanks for sharing! It's good to know I'm not the only one who struggled to work this out.
The 'Parse Table Columns' container of this workflow is not operating properly for me. When the Text To Columns tool runs, it only grabs the first line of the Table0 column, which is 0|1|2|3 and so on, and does not grab any of the actual table data. I'm not sure how to get the underlying data into row format so I can continue processing.
Hello, I tried the steps above but was unable to extract the data in tabular format. How do I extract the table into Alteryx for further blending?
Appreciate your support.
Hi Juuustin,
Were you able to get the underlying data? I couldn't either.