I'm running the R tool with the following code: it reads in a PDF, converts it to a PNG, and runs OCR four times, rotating the image 90 degrees between passes.
# read in the PDF file location which must
# be in a field called FullPath
File <- read.Alteryx("#1", mode="data.frame")
# Use pdf_convert() to render each page of the PDF
# to a PNG file, then read the PNGs with magick
pngfile <- pdftools::pdf_convert(as.character(File$FullPath), dpi = 600)
pngfile <- magick::image_read(pngfile)
# OCR the image and write the text to Alteryx output 1
text <- magick::image_ocr(pngfile)
cat(text)
write.Alteryx(text, 1)
# rotate image 90 degrees and write to separate Alteryx outputs
pngfile <- magick::image_rotate(pngfile, 90)
text <- magick::image_ocr(pngfile)
cat(text)
write.Alteryx(text, 2)
pngfile <- magick::image_rotate(pngfile, 90)
text <- magick::image_ocr(pngfile)
cat(text)
write.Alteryx(text, 3)
pngfile <- magick::image_rotate(pngfile, 90)
text <- magick::image_ocr(pngfile)
cat(text)
write.Alteryx(text, 4)

I'm using this in a batch macro that reads a folder of scanned PDFs and parses out the significant data from them. Each iteration leaves behind massive temp files from the magick functions (3+ GB per iteration) that R should be deleting automatically. These files fill up my /temp folder and cause later iterations to error out for lack of temp space. Each iteration creates a unique tmp folder, and most of its files are deleted automatically when the iteration completes, just not the magick ones.
Is there any way to automatically clear these between iterations since R is not doing it itself?
I'm on Alteryx 2018.2, so I'm limited to R 3.4, and I'm using magick version 2.0.
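One thing I've been considering is deleting magick's scratch files by hand at the end of each iteration. I believe ImageMagick writes its disk cache to files named like "magick-*" under the session temp directory (that naming pattern is an assumption on my part), so a base-R helper along these lines might work, run after releasing the image handle. Would something like this be safe?

```r
# Sketch of a per-iteration cleanup, assuming magick/ImageMagick names its
# scratch files "magick-*" in the temp directory (not verified).
cleanup_magick_tmp <- function(tmp_dir = tempdir()) {
  leftovers <- list.files(tmp_dir, pattern = "^magick-", full.names = TRUE)
  unlink(leftovers, force = TRUE)  # delete any leftover cache files
  invisible(leftovers)             # return what was removed, invisibly
}

# After the last write.Alteryx() call in an iteration, I'd run something like:
#   magick::image_destroy(pngfile)  # release the image handle first
#   gc()                            # let R finalize the external pointers
#   cleanup_magick_tmp()
```

I'd call image_destroy() and gc() first so magick isn't still holding the cache files open when unlink() runs, but I'm not sure whether deleting them out from under the library between iterations has side effects.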