
SOLVED

Extract table (13f filing) from PDF

craigja
8 - Asteroid

Hi,

I need to extract the data in the table from the PDF below. Can anybody help with this? I see there is now a tool available in Alteryx to do this, but I don't have access to it (not licensed), so I'm hoping there is another way to do it 🙂

 

https://www.sec.gov/files/investment/13flist2021q2.pdf

 

2 REPLIES
JoaoLeiteV
10 - Fireball

Hello @craigja,

 

Those tools require an additional license. As an alternative, you can use some macros made in R to extract the data. Be aware that you'll still need to clean and parse the output.

 

You can use the PDF Input (Text and Image) macro to read both text and image PDF files, but I find its output harder to clean. I prefer the PDF Input macro: all the data is split into rows, and then you just filter what you want and clean it.
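
Here is a minimal sketch of that filter-then-clean step, written in Python rather than with Alteryx tools (in Designer the rough equivalent would be a Filter tool followed by a RegEx tool). It assumes each row of the macro output is one line of text and that table rows start with a CUSIP printed as 6 + 2 + 1 characters, optionally followed by a "*" flag. The sample lines and the `parse_rows` name are made up for illustration, so check the pattern against what the PDF actually produces.

```python
import re

# Assumed row shape: CUSIP printed as 6 + 2 + 1 characters, an optional "*"
# flag, then free text (issuer name and description).
CUSIP_ROW = re.compile(
    r"^\s*([0-9A-Z]{6}) ([0-9A-Z]{2}) ([0-9A-Z])\s+(\*?)\s*(.+?)\s*$"
)


def parse_rows(lines):
    """Keep only lines that look like table rows and split them into fields."""
    records = []
    for line in lines:
        match = CUSIP_ROW.match(line)
        if match is None:
            continue  # page headers, footers, blank lines
        issuer, issue, check, flag, rest = match.groups()
        records.append({
            "cusip": issuer + issue + check,  # join the three printed parts
            "has_flag": flag == "*",
            "text": rest,
        })
    return records


# Made-up sample lines, not taken from the actual filing list.
sample = [
    "CUSIP NO ISSUER NAME ISSUER DESCRIPTION STATUS",
    "123456 10 9 EXAMPLE CORP COM",
    "ABC123 20 7 * SAMPLE HLDGS INC CALL",
    "",
]

for record in parse_rows(sample):
    print(record)
```

The remaining text is left as a single field on purpose: splitting the issuer name from the description usually depends on column positions, which you would need to confirm from the real output.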

 

Both require the Predictive tools to be installed in Alteryx. Check in your search bar or on the Developer Tools tab to see whether you have the "R" tool.

 

[Image: JoaoLeiteV_0-1626780210387.png]

 

Let me know if this was helpful! 

craigja
8 - Asteroid

Will give the macros a try. I do have the predictive tools installed, so they should work.
