Hi, I just got the Intelligence Suite and am trying to pull data from PDFs. I have a 5-column table that starts on page 1 and continues on page 2. How do I annotate both pages as one table instead of two? I see that the Image Template tool has arrows to go to the second page, but it makes me enter another name for the columns I select/annotate. How do I select the columns from both pages and give them a single name so the data comes in together? Thank you.
Hi @ELPC,
If it's table data, I'd suggest trying the automatic table detection first. You can refer to the example in the Image to Text tool.
@Luke_C Thanks for the suggestion but I do not see an example in the Image to Text tool. Normally when I click on a tool, there should be a link to "Open Example" but I don't see it under any of the Computer Vision tools.
@ELPC Not sure what version you're running but I'm on 2022.1.
You can set up table detection mode by connecting your image input to the optional input of the Image Template tool, then connecting it to the T anchor of the Image to Text tool. You should see a message like this:
Otherwise, in the past, if you wanted to apply the same template to multiple pages, you would use a Formula tool to update the 'page' field from the Image Input tool, but I think in newer versions it's a bit more complicated.
I tried using the Image Input tool and then connecting to the Image Template tool, but what would I do next to be able to extract two pages of information without having to set up specific extract areas for each page with the Image Template tool? Thanks.
Hey @ELPC,
Here's one way of setting your workflow up. Alteryx will automatically detect tables across all the pages in your document and output them in different rows. You would then be looking at parsing the contents in the final column.
Hope this helps!
Thank you! That method worked. The final piece I needed, which I found in a video, was how to parse the Table0 column into separate rows: use a Formula tool to replace the carriage returns with ~, then use the Text To Columns tool to split on ~, so the one row per page becomes the appropriate multiple rows of the PDF.
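For anyone who wants to see the logic of that parsing step outside Designer, here is a minimal Python sketch of the same idea: swap every line break in the per-page table text for ~, then split on ~ to get one row per table line. This is not how Alteryx implements it internally; the function names and the sample page text are made up for illustration, and it assumes the table text itself never contains a ~.

```python
from __future__ import annotations


def split_table_cell(cell: str, delimiter: str = "~") -> list[str]:
    """Replace line breaks with a delimiter, then split into rows.

    Assumes the delimiter does not occur in the table text itself.
    """
    # Handle Windows (\r\n), bare carriage return (\r), and Unix (\n) endings.
    normalised = (
        cell.replace("\r\n", delimiter)
        .replace("\r", delimiter)
        .replace("\n", delimiter)
    )
    # Drop empty fragments left by blank lines or trailing line breaks.
    return [row.strip() for row in normalised.split(delimiter) if row.strip()]


def explode_pages(pages: list[str]) -> list[tuple[int, str]]:
    """Turn one text blob per page into (page_number, row_text) pairs."""
    rows = []
    for page_number, cell in enumerate(pages, start=1):
        for row in split_table_cell(cell):
            rows.append((page_number, row))
    return rows


if __name__ == "__main__":
    # Hypothetical content standing in for the Table0 column, one entry per page.
    sample_pages = ["Name Qty\r\nWidget 2\r\nGadget 5", "Sprocket 1\nCog 7"]
    for page, row in explode_pages(sample_pages):
        print(page, row)
```

Keeping the page number next to each row makes it easy to check that rows from page 2 follow on from page 1 before you stack them into one table.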
Can you please send the link to the video you saw that shows how to parse the data from column 0?