Alteryx Designer Discussions


How to get a PDF that is being treated like one column to be treated as 3 separate columns

5 - Atom

Hello,

 

I am pretty new to Alteryx and still learning what all of the tools can do. I have attached the file I am working with. Alteryx currently treats it as one column, and I would like to parse it out into 3 separate columns. The first column would contain Part I and Part II, the second column would contain 1 through 14, and the third column would contain 15 through 20.

 

I have read about possibly using the RegEx tool and the Text To Columns tool, but I am not sure I understand how to use them correctly.

 

Any guidance would be greatly appreciated.

 

he-spas

Alteryx Certified Partner

Hi @he-spas!

 

How is your file being input to Alteryx?

Could you please share the file that contains the single column?

Also, are the field names for the 2nd and 3rd columns the same? In the PDF it looks like they are, but Alteryx does not allow repeated field names.

 

Cheers,

 

 

5 - Atom

Hi @Thableaus,

 

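I am inputting the file using the Text Input tool together with a bit of R code in the R tool:

# Read the incoming stream from input #1 (it carries the FullPath column)
data <- read.Alteryx("#1", mode="data.frame")

# pdf_text() returns one character string per PDF page
txt <- pdftools::pdf_text(file.path(data$FullPath))

df_txt <- data.frame(txt)

# Send the result to output anchor 1
write.Alteryx(df_txt, 1)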

 

Instead of recognizing each of the boxes separately, Alteryx reads the text straight across the page, so the three columns end up merged into single lines. The field names will always be the same for the 2nd and 3rd columns.

 

Thanks
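
For text that is read straight across the page as described above, one possible approach is to split each line on the wide gaps between the boxes. The following is only a minimal sketch in R. The sample `page` string, the three output names (`Part`, `Box_1_14`, `Box_15_20`), and the assumption that the boxes are separated by two or more spaces are all hypothetical and would need to be checked against the real PDF.

```r
# Hypothetical stand-in for one page of pdftools::pdf_text() output:
# three side-by-side boxes flattened into lines, separated by runs of spaces.
page <- paste(
  "Part I     1 item        15 item",
  "Part II    2 item        16 item",
  sep = "\n"
)

# Split the page into lines and drop empty ones.
lines <- trimws(strsplit(page, "\n", fixed = TRUE)[[1]])
lines <- lines[nzchar(lines)]

# Split each line wherever two or more spaces occur (assumed column gap).
parts <- strsplit(lines, " {2,}")

# Pad short rows with NA and truncate long ones so every row has three fields.
n_cols <- 3
rows <- lapply(parts, function(p) {
  c(p, rep(NA_character_, n_cols))[seq_len(n_cols)]
})

# Bind the rows into a data frame with distinct (made-up) field names.
df_cols <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(df_cols) <- c("Part", "Box_1_14", "Box_15_20")
```

Inside the R tool, `df_cols` could then be passed to `write.Alteryx(df_cols, 1)` in place of `df_txt`. Distinct names are used for the 2nd and 3rd columns because repeated field names are not allowed. If a box is empty on some line, the gap-based split will shift values to the left, so the result should be inspected before relying on it.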
