In case you missed the announcement: The Alteryx One Fall Release is here! Learn more about the new features and capabilities here
ACT NOW: The Alteryx team will be retiring support for Community account recovery and Community email-change requests after December 31, 2025. Set up your security questions now so you can recover your account anytime, just log out and back in to get started. Learn more here
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Re: Inputting Data in Chinese, Japanese and Korean Characters

Marcel_Gavrila
8 - Asteroid

Hello,

 

Do you now how can read CJK from PDF? I have a R code, but since I am new with R I don't know how to amend it in order to transform data in Unicode. below is my code.

 

cond.install <- function(package.name){
options(repos = "http://cran.rstudio.com") #set repo
#check for package in library, if package is missing install
if(package.name%in%rownames(installed.packages())==FALSE) {
install.packages(package.name)}else{require(package.name, character.only = TRUE)}}

cond.install("pdftools")
cond.install("tesseract")

file <- "C:\\Users\\PDF\\file.pdf
pngfile <- pdftools::pdf_convert(file,dpi = 200)
text <- tesseract::ocr(pngfile)
write.Alteryx(text, 1)

write.Alteryx(file,2)

 

Thnak you,

 

Marcel

0 REPLIES 0
Labels
Top Solution Authors