I am pulling in PDF data using the PDF Input tool from the Alteryx Gallery. The PDF has an unusual structure: each record is split across four stacked lines. It also has repeating column headers and various summary totals scattered throughout the file. I've attached a snippet of the information below. Would anyone be able to help me parse this effectively?

My current idea is to:
1. Use the Text To Columns tool, splitting on the dashes that delimit the columns in the PDF report.
2. Remove the repeating column headers.
3. Use the Sample tool to take every 4th line, "unstacking" the four lines of each record into separate streams.
4. Join the streams back together so that I end up with usable Excel data.
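To make the unstacking idea concrete, here is a minimal Python sketch of the same logic outside of Alteryx: drop header, dash-ruler, and summary lines, then group what remains into blocks of four lines, one block per record. The marker strings (`"ACCOUNT"`, `"TOTAL"`), the function names, and the sample lines are all hypothetical, since the actual report layout is only shown in the attached snippet; they would need to be replaced with whatever text really identifies the header and total rows.

```python
from typing import Iterable, List, Sequence


def is_noise(line: str, markers: Sequence[str]) -> bool:
    """Return True for blank lines, dash rulers, and header/summary lines."""
    stripped = line.strip()
    if not stripped:
        return True
    # A ruler line consists only of dashes and spaces, e.g. "------ ----".
    if set(stripped) <= set("- "):
        return True
    # Hypothetical markers: text that header or total lines start with.
    return any(stripped.startswith(marker) for marker in markers)


def unstack_records(
    lines: Iterable[str],
    lines_per_record: int = 4,
    markers: Sequence[str] = ("ACCOUNT", "TOTAL"),  # assumed, not from the PDF
) -> List[List[str]]:
    """Group the remaining lines into records of `lines_per_record` lines."""
    kept = [line.strip() for line in lines if not is_noise(line, markers)]
    if len(kept) % lines_per_record != 0:
        # A leftover partial record usually means a header or total line
        # slipped through the filter.
        raise ValueError(
            f"{len(kept)} data lines is not a multiple of {lines_per_record}"
        )
    return [
        kept[i : i + lines_per_record]
        for i in range(0, len(kept), lines_per_record)
    ]


# Made-up sample input that mimics the described structure.
sample_lines = [
    "ACCOUNT   NAME",
    "-------   ----",
    "1001      Alpha",
    "alpha line 2",
    "alpha line 3",
    "alpha line 4",
    "TOTAL     1",
    "ACCOUNT   NAME",
    "-------   ----",
    "1002      Beta",
    "beta line 2",
    "beta line 3",
    "beta line 4",
]

records = unstack_records(sample_lines)
```

Each entry in `records` holds the four lines belonging to one record, so the position within the block (first, second, third, or fourth line) plays the role of the four separate streams before they are joined. The length check is a deliberate design choice: if the filter misses a header or total line, the grouping fails loudly instead of silently shifting every later record.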