Running R script in Alteryx workflow to convert PDF to excel

Ekta

Dear All, 

I am trying to convert a PDF to Excel and then pass the converted data from the output anchor to the downstream Alteryx tools.

Currently I run an R script outside of Alteryx to convert the PDF to Excel, and then use that converted Excel file as the input to the Alteryx workflow.

Now I am trying to move this conversion step into the Alteryx workflow itself.

The expected Alteryx workflow is:

Input PDF -> R script -> other tools to process the converted data

The R script I want to configure is:

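# Install the packages once from the R console rather than on every run:
# install.packages(c("pdftools", "stringr", "openxlsx"))
library(pdftools)
library(stringr)
library(openxlsx)  # write.xlsx() comes from here; xlsx and rJava are not needed

# Read the PDF; pdf_text() returns one string per page
tx <- pdf_text("C:/Users/Combined file.pdf")
# Split the page text into individual lines
tx2 <- unlist(str_split(tx, "[\\r\\n]+"))
# Split each trimmed line into at most 5 columns on runs of 2+ whitespace characters
tx3 <- str_split_fixed(str_trim(tx2), "\\s{2,}", 5)

# Inspect the parsed character matrix
tx3
# Convert the matrix to a data frame before writing it to Excel
write.xlsx(as.data.frame(tx3, stringsAsFactors = FALSE), file = "C:/Users/Combined file.xlsx")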

Please help me with this.

TIA

BrandonB
Alteryx

Is there a reason that you are trying to write it to Excel as part of your script? You can write a data frame right back into an Alteryx workflow so the Excel step isn't necessary. This is an example of a macro that leverages R to bring in PDFs that seems in line with what you are attempting: https://gallery.alteryx.com/#!app/PDF-Input--Text-and-Image-/5be5ec8d0462d71ffce6deaa
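
As a rough sketch of the idea above, the R tool could parse the PDF text into a data frame and send it straight to an output anchor instead of writing Excel. `read.Alteryx` and `write.Alteryx` are the functions available inside the Alteryx R tool; the helper name `parse_pdf_text`, the input field name `FullPath`, and the `Col1`..`Col5` column names are assumptions made for illustration, and the splitting logic is simply carried over from the script in the question. This has not been tested against any particular PDF.

```r
library(stringr)

# Hypothetical helper: split raw page text into rows and at most n_cols
# columns, mirroring the splitting logic of the script in the question.
parse_pdf_text <- function(pages, n_cols = 5) {
  lines <- unlist(str_split(pages, "[\\r\\n]+"))
  lines <- str_trim(lines)
  lines <- lines[lines != ""]                      # drop empty lines
  cells <- str_split_fixed(lines, "\\s{2,}", n_cols)
  df <- as.data.frame(cells, stringsAsFactors = FALSE)
  names(df) <- paste0("Col", seq_len(n_cols))      # assumed column names
  df
}

# read.Alteryx/write.Alteryx only exist inside the Alteryx R tool, so the
# guard lets the helper be tried out in plain R as well.
if (exists("read.Alteryx")) {
  # Assumption: input anchor #1 carries a string field named FullPath that
  # holds the PDF location (for example from a Directory or Text Input tool).
  files <- read.Alteryx("#1", mode = "data.frame")
  pages <- pdftools::pdf_text(as.character(files$FullPath[1]))
  # Send the parsed rows to output anchor 1 for the downstream tools.
  write.Alteryx(parse_pdf_text(pages), 1)
}
```

Outside Alteryx, calling `parse_pdf_text("a  b  c\nd  e")` gives a two-row data frame with five columns, which is a quick way to check the splitting rule before wiring it into the workflow.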

However, if you are looking for additional advanced features from drag and drop tools, I would also recommend taking a look at the Alteryx Intelligence Suite that was recently released: https://www.alteryx.com/products/alteryx-platform/intelligence-suite

It not only lets you bring in PDFs, but also lets you use templates to specify regions to extract across multiple PDFs, which helps you avoid regex or a long list of parsing rules to get to what you need. It also comes with a variety of text analysis tools and assisted modeling functionality.
