

PDF input to text includes random letters

qishao2
5 - Atom

Hi,

I'm trying to extract data from PDF forms that contain small separator marks, as shown in the screenshot. Alteryx sometimes reads them as a lowercase "i" or an uppercase "I", and in some cases I get a "b" or an "8" from empty spaces for no apparent reason. Is there a way to get rid of them?

Edit: I think the thresholding tools should be able to do what I need, since I only need the words written in black and nothing printed in light blue, but I couldn't figure out how to set the threshold so that it excludes the light blue in the background. Can somebody help, please?
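
To make clear what I mean by thresholding, here is a minimal sketch in plain Python of the result I'm after. It is not how Alteryx does it internally; the function names, the sample pixel values, and the cutoff of 128 are all my own assumptions for illustration. The idea is that black text has a very low brightness while light blue has a high one, so any cutoff between the two should keep the text and turn the blue marks white.

```python
def luminance(rgb):
    """Approximate perceived brightness of an (R, G, B) pixel, 0 to 255."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def keep_dark_pixels(pixels, threshold=128):
    """Return a black-and-white copy of the image.

    Pixels darker than `threshold` become black; everything else
    (including light blue separators) becomes white.
    The default of 128 is an assumed starting point, not a tuned value.
    """
    black = (0, 0, 0)
    white = (255, 255, 255)
    return [
        [black if luminance(p) < threshold else white for p in row]
        for row in pixels
    ]


# Hypothetical 2x2 image: black text, a light blue separator,
# white paper, and dark gray text.
sample = [
    [(0, 0, 0), (173, 216, 230)],
    [(255, 255, 255), (40, 40, 40)],
]
cleaned = keep_dark_pixels(sample)
```

In this sample the light blue pixel has a brightness of roughly 205, so it ends up white, while the black and dark gray pixels stay black. What I can't work out is how to get the equivalent cutoff in the thresholding tools.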
