R Tool: Output Multiple Fields on the Same Output Branch
How would I go about using write.Alteryx to output two columns on the same output branch?
If that is not possible, what is a good way to pass the file data through to the output?
Thank you.
Code Sample:
metainfo <- read.AlteryxMetaInfo("#1")
data <- read.Alteryx("#1", mode="data.frame")
fileinfo <- file.path(data$FullPath)
fileinfostr <- toString(fileinfo, width = 255)
txt <- pdftools::pdf_text(file.path(data$FullPath))
df_txt <- data.frame(txt)
df_metainfo <- data.frame(metainfo)
df_fileinfo <- data.frame(fileinfo)
#Would like the file name or file path
#to be included in output one - preferably as a separate column
write.Alteryx(df_txt,1,source=fileinfostr)
write.Alteryx(df_metainfo,2)
write.Alteryx(df_fileinfo,3)
Hi @BGDAR,
If I'm reading that correctly, it looks like the input came from a Directory tool. If so, and if it was used to pinpoint a single PDF, then the following reads in that PDF and outputs the file name, the file directory, and the PDF text layer:
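data <- read.Alteryx("#1", mode="data.frame")
# pdf_text() returns one string per page; collapse the pages so the file yields a single row
file_txt <- paste(pdftools::pdf_text(as.character(data$FullPath)), collapse = "\n")
# Build a real data frame with named columns (t() would return a matrix,
# and names() on a matrix does not set column names)
dfOut <- data.frame(
  file_name = as.character(data$FileName),
  file_dir = as.character(data$Directory),
  file_txt = file_txt,
  stringsAsFactors = FALSE
)
write.Alteryx(dfOut, 1)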
This is very similar to what you had. To send everything out on one branch, build a single data frame in R with one column per field and write it once. There are likely other ways to accomplish the same thing.
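To see how one data frame maps to several fields on one branch, here is a minimal sketch that runs in plain R outside Alteryx. The values are made-up placeholders standing in for what the Directory tool and pdftools::pdf_text() would supply. Inside the R tool, the resulting data frame would be handed to write.Alteryx(dfOut, 1).

```r
# Placeholder values (hypothetical) standing in for the Directory tool fields
# and for the text returned by pdftools::pdf_text().
file_name <- "report.pdf"
file_dir  <- "C:/docs/"
pages     <- c("text of page 1", "text of page 2")  # one string per page

# Collapse the pages so the file occupies a single row.
file_txt <- paste(pages, collapse = "\n")

# Each named argument becomes one column, i.e. one field on the output branch.
dfOut <- data.frame(
  file_name = file_name,
  file_dir  = file_dir,
  file_txt  = file_txt,
  stringsAsFactors = FALSE
)

str(dfOut)
```

Calling str() is a quick way to confirm the object is a data frame with three named character columns before passing it on.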
As an aside, it would be easy to put this in a loop so that the Directory tool can send in every PDF found in all subfolders of a given path:
data <- read.Alteryx("#1", mode="data.frame")
# Start with an empty data frame that already has the output column names
dfOut <- data.frame(file_name=character(), file_dir=character(), file_txt=character(), stringsAsFactors=FALSE)
# seq_len() skips the loop cleanly when the input has no rows (1:nrow(data) would not)
for (i in seq_len(nrow(data))) {
  dataRow <- data[i,]
  # pdf_text() returns one string per page; collapse so each file yields one row
  file_txt <- paste(pdftools::pdf_text(as.character(dataRow$FullPath)), collapse = "\n")
  row <- data.frame(file_name=as.character(dataRow$FileName), file_dir=as.character(dataRow$Directory), file_txt=file_txt, stringsAsFactors=FALSE)
  dfOut <- rbind(dfOut, row)
}
write.Alteryx(dfOut, 1)
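As a possible alternative to growing the data frame inside a for loop, the rows can be built with lapply() and combined once with do.call(rbind, ...). This is a sketch under stated assumptions, runnable in plain R: read_text() is a hypothetical stand-in for pdftools::pdf_text(), and the input data frame is made up to mimic the Directory tool's FileName, Directory, and FullPath fields.

```r
# Made-up input mimicking the Directory tool output (hypothetical paths).
data <- data.frame(
  FileName  = c("a.pdf", "b.pdf"),
  Directory = c("C:/docs/", "C:/docs/sub/"),
  FullPath  = c("C:/docs/a.pdf", "C:/docs/sub/b.pdf"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for pdftools::pdf_text(): returns one string per "page".
read_text <- function(path) {
  c(paste("page 1 of", path), paste("page 2 of", path))
}

# Build one single-row data frame per input file.
rows <- lapply(seq_len(nrow(data)), function(i) {
  data.frame(
    file_name = data$FileName[i],
    file_dir  = data$Directory[i],
    file_txt  = paste(read_text(data$FullPath[i]), collapse = "\n"),
    stringsAsFactors = FALSE
  )
})

# Combine all rows in one step. Inside the R tool, this result would be
# passed to write.Alteryx(dfOut, 1).
dfOut <- do.call(rbind, rows)
print(dim(dfOut))
```

Combining once avoids repeatedly copying a growing data frame, which may matter when the Directory tool returns many files.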
Hope that helps!
John
Thank you, John. Your summary of the problem was spot on, and your solution works very well.
My next step was to work through the loop, which you also provided.
Thank you!
