I have a .csv file that I'm bringing in, and it's quite ugly. The data spans a lot of pages, and a single record wraps onto 3 lines on a page. My data looks like it should be tab delimited, but when I bring it in, it looks like Alteryx is reading it as space delimited. I've gotten far enough to put each record on one line, but then I can't get the columns to line up right. If I can get the columns to line up, I can handle cleaning up the null spaces and the repeated report headers and footers. Each page looks similar to the sample data, with the page headers, etc. I need to capture the data between the headers and the records.
I feel like my solutions are close but they don't get me there.
Hi @lwolfie
I would take this approach: to me it looks like your report is laid out by character position. That means your company or the provider of the file should have some documentation describing the layout (for example, 'characters 1 to 10 hold the first column, characters 11 to 15 hold the second column').
If you configure the input file as .flat, you can parse the file based on the positions that the documentation provides.
Once you know the length of each column, you can parse it and do whatever else is necessary. I set the widths just from what I saw in the sample. It seems correct, but check the official documentation for this file.
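Outside of Alteryx, the same position-based idea can be sketched in a few lines of Python. This is only an illustration: the column names and character offsets in `LAYOUT` are made up, and the real ones would have to come from the file's documentation or from inspecting the report.

```python
# Hypothetical layout: (column name, start, end) as 0-based, end-exclusive offsets.
# The real offsets must come from the file's documentation.
LAYOUT = [
    ("account", 0, 10),
    ("description", 10, 30),
    ("units", 30, 38),
]


def parse_fixed_width(line, layout=LAYOUT):
    """Slice one line by character position and trim the padding."""
    return {name: line[start:end].strip() for name, start, end in layout}


# Build a sample line that matches the hypothetical layout above.
sample = f"{'ACC0001':<10}{'Widget assembly':<20}{'12.50':>8}"
row = parse_fixed_width(sample)
print(row)
```

Because the columns are cut by position rather than by delimiter, spaces inside a value such as the description do not shift the columns after it.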
Thank you for that sample. That's close, but the description wraps onto the two lines below the first line of each record. So the line almost needs to be parsed after the Cur Period Units column. Your workflow doesn't account for the description column label being blank on the next two lines, but the description still needs a spot there.
Felipe_Ribeir0
How would you suggest wrapping each record onto one line at that point?
I appreciate both responses and took a combined approach. I brought each record onto one line and then parsed out based on column length. Thank you both!
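For anyone who wants to see the combined approach outside of Alteryx, here is a rough Python sketch. It assumes every record occupies exactly 3 physical lines, that header, footer, and blank lines can be recognized by a simple test, and that padding each line to a fixed width keeps the column positions stable after joining. The function name, the width, and the sample lines are all hypothetical and are not taken from the actual report.

```python
LINES_PER_RECORD = 3  # from the thread: one record wraps onto 3 lines
LINE_WIDTH = 20       # hypothetical; use the real report line width


def join_wrapped(lines, is_noise, width=LINE_WIDTH, per_record=LINES_PER_RECORD):
    """Drop header/footer/blank lines, pad the rest, and join every 3 lines."""
    data = [ln.rstrip("\n").ljust(width)[:width] for ln in lines if not is_noise(ln)]
    return [
        "".join(data[i:i + per_record])
        for i in range(0, len(data) - per_record + 1, per_record)
    ]


# Made-up page: a header, one record whose description wraps, and a blank line.
page = [
    "REPORT HEADER page 1",
    "ACC0001   Widget",
    "          assembly",
    "          kit",
    "",
]


def is_noise(ln):
    # Hypothetical rule: blank lines and lines starting with "REPORT" are noise.
    return not ln.strip() or ln.startswith("REPORT")


records = join_wrapped(page, is_noise)
record = records[0]

# After joining, line k of the record starts at offset k * LINE_WIDTH,
# so the wrapped description pieces sit at fixed positions.
description = " ".join(
    record[k * LINE_WIDTH + 10:(k + 1) * LINE_WIDTH].strip()
    for k in range(LINES_PER_RECORD)
)
print(description)
```

Padding before joining is what gives the blank description label on the second and third lines "a spot", so the pieces can still be cut by column length.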