Re: Inputting Data in Chinese, Japanese and Korean Characters

Hello,

Do you know how I can read CJK text from a PDF? I have some R code, but since I am new to R I don't know how to amend it so that the extracted data comes out as Unicode. Below is my code.

cond.install <- function(package.name){
  options(repos = "http://cran.rstudio.com") # set repo
  # install the package if it is missing, then load it in either case
  if (!package.name %in% rownames(installed.packages())) {
    install.packages(package.name)
  }
  require(package.name, character.only = TRUE)
}

cond.install("pdftools")
cond.install("tesseract")

file <- "C:\\Users\\PDF\\file.pdf"
pngfile <- pdftools::pdf_convert(file,dpi = 200)
text <- tesseract::ocr(pngfile)
write.Alteryx(text, 1)

write.Alteryx(file,2)

Thank you,

Marcel
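
One possible direction, offered as a minimal sketch rather than a tested fix: `tesseract::ocr()` uses an English engine unless told otherwise, so CJK pages probably need an engine created for the right language (`chi_sim`, `chi_tra`, `jpn` or `kor`). The helper names `ocr_cjk_pdf` and `pages_to_df` below are hypothetical, and the choice of 300 dpi is only an assumption that a higher resolution helps with dense characters.

```r
# Sketch only: assumes the pdftools and tesseract packages are installed.
# ocr_cjk_pdf() and pages_to_df() are hypothetical helper names.

ocr_cjk_pdf <- function(pdf_path, lang = "chi_sim", dpi = 300) {
  # Download the trained data for the language if it is not available yet.
  if (!lang %in% tesseract::tesseract_info()$available) {
    tesseract::tesseract_download(lang)
  }
  engine <- tesseract::tesseract(language = lang)

  # Render every page to a PNG, then OCR the images with the CJK engine.
  png_files <- pdftools::pdf_convert(pdf_path, format = "png", dpi = dpi)
  text <- tesseract::ocr(png_files, engine = engine)

  # Make sure the result is marked as UTF-8 before passing it on.
  enc2utf8(text)
}

pages_to_df <- function(text) {
  # One row per page, so the result can be written out as a table.
  data.frame(page = seq_along(text),
             text = enc2utf8(text),
             stringsAsFactors = FALSE)
}
```

Inside the R tool this could be used as `write.Alteryx(pages_to_df(ocr_cjk_pdf(file, lang = "jpn")), 1)`, on the assumption that `write.Alteryx` wants a data frame rather than a bare character vector. If the PDF already contains a text layer (that is, it is not a scan), `pdftools::pdf_text(file)` may return the text directly without any OCR.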
