Hi there,
I have a large (2GB) CSV file, and unfortunately for some reason the first 3 rows contain some notes. The headers are in row 4, and the data is all below that. Is there anyway in Alteryx to import this data and skip the first 3 rows and use row 4 for headers?
Thanks so much
Ben
Solved! Go to Solution.
You have to read it by hand, but its not that hard.
There's a post on the Ideas page that deals with roughly the same issue. Here's a link: Specify-the-row-where-data-starts
As usual, Ned has a different way of accomplishing the task because he's awesome!
I agree Alteryx is very powerful! One of my favorite things about Alteryx is that while you wait for features to be added (I hope you starred the Idea entry linked above!), you can often build a macro to do the job. I spent about 20 minutes building this out and testing it. I made some assumptions here (1. your file has headers and 2. you know what row they are on) but I believe this shows how in one tool (at least in regards to appearance and number of drag-and-drops) this can be solved. Please feel free to open up the macro to see exaclty how I accomplished this. It is by no means comprehensive of all possible configurations, but I believe it demonstrates the capability. You could also add additional Interface tools and connect them appropriately to make it more flexible!
One of the things I love about Alteryx, the forum is active! Awesome solution to the very common problem of junk rows at the start of a file!
This solution comes sooooo close for me. My problem is that all of our internal reports get dumped out as .prn. I've set up Alteryx to read is as a csv with no delimeters and parse it using numerous tools, but this macro won't recognize the file. I'm still hoping for a built-in solution similar to your macro where you can just tell Alteryx where the column headers are and it will go from there, regardless of whether the file is .xls or .csv or .txt or, as in my case .prn. Perhaps a delimeter would still need to be specified in that case, but if the import begins where the headers are, that shouldn't be a problem. I could open up all the files and save them as .csv, but at that point I may as well delete the headers too. My multiple tool system is working, and I may play around with your macro to see if I can tweak it and make it work in my case. I just think the whole thing could be made so much simpler in the software.
In the macro that @CailinS provided, the initial "File Browse" tool can be modified to pull in .prn files (or any type of file) using the "arbitrary file specification."
I tried that... the macro gives an error saying it's not a known file format. In my multi-step process, I have to trick Alteryx into reading it as a csv instead. Not sure yet how to make the macro do that, but I'll keep playing with it.
Her macro also has a CSV in the defaul input tool's configuration; perhaps replacing that with a sample .prn file will help get Alteryx out of the csv mindset? (Sorry I'm also unsure how it's fitting into your multi-step process).