Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

csv import where the starting row changes

Carolyn
12 - Quasar
12 - Quasar

I have a csv report format where the first x rows are things like report name, layout name, who ran it, when they ran it, which filters they had, then there's a blank row, and then the data actually starts. The report can be >1M lines (and Excel "helpfully" modifies my data), so Save As-ing as xlsx isn't an option

 

I can import this csv into Alteryx by setting it start the data on row 10. However, I'm trying to make the workflow more dynamic so if someone adds an extra filter, which would push the data down, or removes a filter, so it would pull the data up, Alteryx would just automatically adjust without me needing to modify the "start import on row" line. Also, the number of columns may change. 

 

I can import the data by using the "\0" delimiter and then (theoretically) figure out the RegEx to parse it out. 

 

I saw an answer about using an Analytics App for this, but because of how we're currently running our workflows, it's not an option (happy to elaborate)

 

I'm wondering if there's a simpler way. I want Alteryx to just look at it and recognize that the data doesn't start until row 10 and just happily consume the report starting on row 10

 

Report example.png

8 REPLIES 8
SPetrie
13 - Pulsar

Is every row in column b populated after the header?

If so, you could use a filter tool to remove any row where b is empty and then a dynamic rename to add the appropriate headers.

 

alexnajm
18 - Pollux
18 - Pollux

I would read in the data, use a Filter to move those extra rows (something like [Field_2] = Null() AND [Field_3] = Null() or whatever logic would work well), then use a Dynamic Rename to move that first row (which presumably you want to be your header) to be the header row!

Carolyn
12 - Quasar
12 - Quasar

@alexnajm @SPetrie , 

Thanks for the reply. 

 

I don't know if every row in Column B will be populated and don't want to assume. But I figure I can do a Filter to exclude if Col D and Col E and Col F is null or some version of that (like alexnajm) described

 

My issue with just reading everything in and then ignoring the top rows is I can't get the csv to import if the first row is set to 1

 

Untitled.png

alexnajm
18 - Pollux
18 - Pollux

There are input data settings you can adjust to make sure it avoids those errors - however to best help can you provide a sample dataset?

Carolyn
12 - Quasar
12 - Quasar

If I do an import with the delimiter set to "\0", I can get the data in fine and then did a Multi Row + Filter to exclude the Header rows. Which works but then I'm stuck with trying to figure out how to parse it out, since I need to ignore commas in double quotes and fields sometimes have double quotes but sometimes don't, and I haven't figured out how to do that with RegEx yet

 

Untitled.png

 

alexnajm
18 - Pollux
18 - Pollux

There’s an option in Text to Columns to ignore delimiter in double quotes!

Carolyn
12 - Quasar
12 - Quasar

OMG THERE IS!! THANK YOU!!!!

alexnajm
18 - Pollux
18 - Pollux

Happy to help!

Labels
Top Solution Authors