
Alteryx Designer Desktop Discussions


PDF input to text includes random letters

qishao2
5 - Atom

Hi,

 

I'm trying to extract data from PDF forms that have small separators, as shown in the screenshot. Alteryx sometimes recognizes them as the letter i (or as an uppercase I), and in some cases I get a "b" or an "8" from empty spaces for no apparent reason. Is there a way to get rid of them?

 

edit: I think the thresholding tools should be able to do what I need, since I only need the words written in black and not anything in light blue, but I couldn't figure out how to set the threshold so that the blue in the background is dropped. Somebody help, please!
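
To show what I mean by thresholding out the blue, here is a minimal sketch in plain Python. It is not the Alteryx thresholding tool, only the idea behind it: convert each pixel to a brightness value and keep only the pixels darker than a cutoff. The cutoff of 128 and the light blue color (173, 216, 230) are assumptions for illustration, and `luminance` and `keep_dark_pixels` are hypothetical helper names.

```python
# Sketch only: plain-Python binarization, not the Alteryx tool.
# The cutoff (128) and the sample colors below are assumed values.

def luminance(rgb):
    """Approximate perceived brightness (0-255) of an (R, G, B) pixel."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def keep_dark_pixels(pixels, cutoff=128):
    """Binarize: pixels darker than cutoff become black (0), all others white (255)."""
    return [[0 if luminance(p) < cutoff else 255 for p in row] for row in pixels]


BLACK = (0, 0, 0)              # text I want to keep
LIGHT_BLUE = (173, 216, 230)   # assumed color of the separators
WHITE = (255, 255, 255)        # page background

sample = [
    [WHITE, BLACK, LIGHT_BLUE, BLACK],
    [LIGHT_BLUE, WHITE, BLACK, WHITE],
]

cleaned = keep_dark_pixels(sample)
print(cleaned)
```

With these assumed colors the light blue has a brightness of roughly 205, so any cutoff between the black text and that value would turn the separators white while keeping the text black. If the cutoff is set higher than the blue's brightness, the separators survive as black marks, which may be what is happening in my case.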
