SOLVED

Input PDFs (they are images)

clant
8 - Asteroid

Hello all!

 

I am a bit stuck. I originally posted this in the Designer forums but did not get many responses.

 

The problem I have is this: we currently receive PDF documents that have been faxed to us. These are handwritten order forms. We need to take these forms and run some form of handwriting OCR on them to help our data input team.

 

My idea is to write a Python tool that converts the PDF to a JPG. We can then upload this JPG to either the Azure or Google OCR service and get the results back. We have tested the Azure OCR and it worked great.

 

So far I have written the Python that does the PDF-to-JPG conversion, and I am trying and failing to make this into a tool in Alteryx. (At the moment I am getting a "TypeError: __init__() takes 2 positional arguments but 4 were given", but I think I can work this out.)
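For anyone hitting the same error, here is a minimal sketch of one likely cause. It assumes (this is not confirmed anywhere in the thread) that the Alteryx engine builds the plugin class by passing it three arguments: a tool ID, the engine, and an output anchor manager. `BrokenPlugin`, `FixedPlugin`, and `construct` are hypothetical names used only for illustration, and no Alteryx module is imported.

```python
class BrokenPlugin:
    # Hypothetical plugin: the constructor accepts only one argument besides self.
    def __init__(self, n_tool_id):
        self.n_tool_id = n_tool_id


class FixedPlugin:
    # Assumed shape of the constructor the host expects: three arguments besides self.
    def __init__(self, n_tool_id, alteryx_engine, output_anchor_mgr):
        self.n_tool_id = n_tool_id
        self.alteryx_engine = alteryx_engine
        self.output_anchor_mgr = output_anchor_mgr


def construct(plugin_class):
    """Mimic a host that passes three positional arguments to the constructor."""
    try:
        plugin_class(1, object(), object())
        return "ok"
    except TypeError as error:
        # Python counts self, so three arguments are reported as "4 were given".
        return str(error)


if __name__ == "__main__":
    print(construct(BrokenPlugin))
    print(construct(FixedPlugin))
```

If this is the cause, the fix is to give the plugin's `__init__` the same parameters the host passes, rather than changing how the tool is called.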

 

Can someone please advise if what I am doing will actually work or if there is a better way to do this?

 

Thank you!

 

Cheers

 

Chris

29 REPLIES
tlarsen7572
11 - Bolide

Hey @Awesomeville, so I ended up taking a shot at this Azure service today.  I was able to sign up for the free tier and start testing things out.

 

I was able to get a custom tool working that sends images and PDFs to the Azure endpoint, waits for Azure to process the files, and then downloads and parses the results.  I tested this on some handwritten sentences I wrote and scanned to PDF for testing, and am amazed at how well it works.  I can see a huge potential use case for my department regarding things like contract analysis.  This is a powerful OCR service Microsoft provides.  The tool is attached to this message if you want to try it out.  Let me know if you run into issues.  Also, you can view the code here on GitHub.
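The submit, wait, and parse flow described above can be sketched roughly as follows. This is not the code from the attached tool or the GitHub repository. It assumes the v3.2 Read operation of the Computer Vision REST API (`/vision/v3.2/read/analyze`, the `Operation-Location` response header, and the `analyzeResult.readResults` result layout); the tool itself may target a different API version. All function names here are hypothetical.

```python
import json
import time
import urllib.request


def submit_document(endpoint, key, data):
    """POST raw image or PDF bytes and return the URL to poll for the result."""
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"  # assumed API version
    request = urllib.request.Request(
        url,
        data=data,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["Operation-Location"]


def fetch_status(operation_url, key):
    """GET the current state of a submitted operation as a dict."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(fetch, interval=1.0, max_polls=60):
    """Call fetch() until the operation succeeds, fails, or max_polls is reached."""
    for _ in range(max_polls):
        body = fetch()
        status = body.get("status")
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError("OCR operation failed")
        time.sleep(interval)
    raise TimeoutError("OCR operation did not finish in time")


def extract_lines(body):
    """Flatten the recognized text lines from every page of the result."""
    return [
        line["text"]
        for page in body["analyzeResult"]["readResults"]
        for line in page["lines"]
    ]
```

A caller would chain these as `operation_url = submit_document(endpoint, key, file_bytes)`, then `body = wait_for_result(lambda: fetch_status(operation_url, key))`, then `extract_lines(body)`. Passing the fetch step in as a callable keeps the polling loop testable without a network connection.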

 

If you want to talk about how it works, feel free to start a discussion and I can walk you through the code.

Nick612Haylund
10 - Fireball

Not too shabby at all @tlarsen7572  (awesome)

 

TestingOCR.png

MattDuncan
7 - Meteor

Thanks for the awesome work!

 

I can't input the tool so will need to use the code on GitHub. Can you walk me through how to put this into a workflow? I'm quite new to the Python SDK world. 

 

If you could attach an example workflow showing how to convert a PDF into data, that would be perfect.

tlarsen7572
11 - Bolide

Hey @MattDuncan, welcome to the Python SDK world!

 

The easiest place to start would be installing the tool from the yxi.  What do you mean by, 'I can't input the tool'?  Inside the zip should be a yxi file.  Extract it and open it from Alteryx.  Alteryx will present an installation dialog.  Once you install the tool you can find it in the Laboratory tab:

OCR1.PNG

 

If you cannot find the Laboratory tab, click the plus sign at the right of the tabs and make sure Laboratory is selected:

OCR2.PNG

 

Once the tool is installed, start your workflow by creating a list of file paths you want converted.  I usually use the Text Input tool or the Directory tool for this:

OCR3.PNG

 

Add the OCR tool and configure it with the endpoint and key from your Azure portal:

OCR4.PNG

 

The easiest way to get the endpoint and key is to go to the Overview or Quick start sections on Azure.  This is what my Quick start looks like.  I can copy the endpoint and key right from this page and paste it into the Alteryx tool:

OCR5.png

 

And that should be it.  The beauty of the Python SDK is that there is no configuration required on your end beyond installing the tool with the YXI file.  If you are having an error doing so, let us know and we can troubleshoot.

Jamie12
5 - Atom

Hi @

 

Thank you for sharing! I was able to successfully install the OCR tool in Alteryx. However, I've been having trouble locating the endpoint to use in the configuration, since my Quick Start section in Azure doesn't look like yours in the screenshot. In an attempt to create an endpoint, I added a virtual machine in Azure with a static IP address and tried to use that as the endpoint, though I'm not sure whether that is correct or necessary.

 

I was also unsure if the Subscription Key needed for the configuration is the same as the Subscription ID that I see in Azure. I would greatly appreciate any tips you have on how to overcome this!

tlarsen7572
11 - Bolide

Hi @Jamie12!  Did you create a Computer Vision resource in your Azure portal?  I just checked my Quick Start and it hasn't changed its appearance.

 

From the home page of your Azure portal, click 'Create a resource'

Computer Vision 1.JPG

 

Search the marketplace for 'computer vision'.  You should see something like the screenshot below:

Computer Vision 2.JPG

 

Once you create the Computer Vision resource, you should have access to a Quick Start page that looks like mine, which will provide you with the key and the endpoint.

 

Does that help, or are you still unable to access the API?

Jamie12
5 - Atom

That did the trick! Thank you so much @

trettelap
8 - Asteroid

Awesome tool, @tlarsen7572! Is there any way to see the backend code behind the tool? I know you can do this for macros, but I can't figure it out in this case. I am guessing this was developed using the Python SDK?

tlarsen7572
11 - Bolide

Hi @trettelap, glad you like the tool!  It certainly was developed using the Python SDK.

 

The code is available on GitHub here.  Also, you can see the code on your local PC at one of the following paths:

 

If you installed the tool as admin: C:\ProgramData\Alteryx\Tools\OCR

If you installed the tool user-specific: C:\Users\Your User Name\AppData\Roaming\Alteryx\Tools\OCR
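If you are not sure which kind of install you have, a small sketch like the one below builds both candidate folders from the standard Windows `PROGRAMDATA` and `APPDATA` environment variables. It assumes those variables point at `C:\ProgramData` and your roaming profile, as they do on a default Windows setup; `candidate_tool_dirs` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def candidate_tool_dirs(env, tool_name="OCR"):
    """Return the admin and user-specific tool folders derived from env variables."""
    dirs = []
    # PROGRAMDATA covers the admin install, APPDATA the user-specific install.
    for variable in ("PROGRAMDATA", "APPDATA"):
        base = env.get(variable)
        if base:
            dirs.append(PureWindowsPath(base) / "Alteryx" / "Tools" / tool_name)
    return dirs
```

On a Windows machine you could call `candidate_tool_dirs(os.environ)` and check each result with `pathlib.Path(folder).is_dir()` to see which one actually exists.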

agendel
5 - Atom

Hey @tlarsen7572, I downloaded the OCR tool and was able to input a PDF into Alteryx. However, it doesn't seem to accept any PDF documents above 200 KB. Do you have any idea why, and how I could fix this if possible? Thanks 🙂