Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Import CSV that doesn't start on the first row?

buchanan_ben
5 - Atom

Hi there, 
I have a large (2GB) CSV file, and unfortunately for some reason the first 3 rows contain some notes. The headers are in row 4, and the data is all below that. Is there anyway in Alteryx to import this data and skip the first 3 rows and use row 4 for headers?

Thanks so much
Ben

15 REPLIES 15
Ned
Alteryx Alumni (Retired)

You have to read it by hand, but its not that hard. 

  1. Use a CSV input with the delimiter set to \0 to read it raw (make sure to specify a large data size too)
  2. Use a sample tool to skip 1st 3 rows
  3. Use a Text to Columns tool to parse on Comma
  4. Use a select tool to remove the original field
  5. Use a Dynamic Rename tool to take fields from 1st row
tom_montpool
12 - Quasar

There's a post on the Ideas page that deals with roughly the same issue. Here's a link: Specify-the-row-where-data-starts

As usual, Ned has a different way of accomplishing the task because he's awesome!

cbridges
11 - Bolide
OK, I'm new here, but I think this issue is ridiculous. I think Alteryx is a very powerful tool, and its inability to do something so very simple is crazy.
@thizviz
CailinS
Alteryx
Alteryx

I agree Alteryx is very powerful! One of my favorite things about Alteryx is that while you wait for features to be added (I hope you starred the Idea entry linked above!), you can often build a macro to do the job. I spent about 20 minutes building this out and testing it. I made some assumptions here (1. your file has headers and 2. you know what row they are on) but I believe this shows how in one tool (at least in regards to appearance and number of drag-and-drops) this can be solved. Please feel free to open up the macro to see exaclty how I accomplished this. It is by no means comprehensive of all possible configurations, but I believe it demonstrates the capability. You could also add additional Interface tools and connect them appropriately to make it more flexible!

Cailin Swingle
Customer Experience
Shaunak
5 - Atom

One of the things I love about Alteryx, the forum is active!  Awesome solution to the very common problem of junk rows at the start of a file!

cbridges
11 - Bolide

This solution comes sooooo close for me. My problem is that all of our internal reports get dumped out as .prn. I've set up Alteryx to read is as a csv with no delimeters and parse it using numerous tools, but this macro won't recognize the file. I'm still hoping for a built-in solution similar to your macro where you can just tell Alteryx where the column headers are and it will go from there, regardless of whether the file is .xls or .csv or .txt or, as in my case .prn. Perhaps a delimeter would still need to be specified in that case, but if the import begins where the headers are, that shouldn't be a problem. I could open up all the files and save them as .csv, but at that point I may as well delete the headers too. My multiple tool system is working, and I may play around with your macro to see if I can tweak it and make it work in my case. I just think the whole thing could be made so much simpler in the software.

@thizviz
JohnJPS
15 - Aurora

In the macro that @CailinS provided, the initial "File Browse" tool can be modified to pull in .prn files (or any type of file) using the "arbitrary file specification."

cbridges
11 - Bolide

I tried that... the macro gives an error saying it's not a known file format. In my multi-step process, I have to trick Alteryx into reading it as a csv instead. Not sure yet how to make the macro do that, but I'll keep playing with it.

@thizviz
JohnJPS
15 - Aurora

Her macro also has a CSV in the defaul input tool's configuration; perhaps replacing that with a sample .prn file will help get Alteryx out of the csv mindset?  (Sorry I'm also unsure how it's fitting into your multi-step process).

Labels