
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Re: Inputting Data in Chinese, Japanese and Korean Characters

Marcel_Gavrila

Hello,

 

Do you know how I can read CJK (Chinese, Japanese and Korean) text from a PDF? I have some R code, but since I am new to R, I don't know how to amend it so that the extracted text comes out as Unicode. My code is below.

 

cond.install <- function(package.name) {
  options(repos = "http://cran.rstudio.com") # set repo
  # check for package in library; if it is missing, install it
  if (!(package.name %in% rownames(installed.packages()))) {
    install.packages(package.name)
  }
  # load the package whether or not it was just installed
  require(package.name, character.only = TRUE)
}

cond.install("pdftools")
cond.install("tesseract")

file <- "C:\\Users\\PDF\\file.pdf"
pngfile <- pdftools::pdf_convert(file, dpi = 200)
text <- tesseract::ocr(pngfile)
write.Alteryx(text, 1)

write.Alteryx(file,2)

 

Thank you,

 

Marcel
