
Alteryx Designer Desktop Discussions


Re: Inputting Data in Chinese, Japanese and Korean Characters

Marcel_Gavrila

Hello,

 

Do you know how I can read CJK text from a PDF? I have some R code, but since I am new to R, I don't know how to amend it to convert the data to Unicode. My code is below.

 

cond.install <- function(package.name) {
  options(repos = "http://cran.rstudio.com")  # set the CRAN repository
  # Install the package if it is missing from the library, then load it
  if (!(package.name %in% rownames(installed.packages()))) {
    install.packages(package.name)
  }
  require(package.name, character.only = TRUE)
}

cond.install("pdftools")
cond.install("tesseract")

file <- "C:\\Users\\PDF\\file.pdf"
pngfile <- pdftools::pdf_convert(file,dpi = 200)
text <- tesseract::ocr(pngfile)
write.Alteryx(text, 1)

write.Alteryx(file,2)

 

Thank you,

 

Marcel
