community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Dynamic "Start data Import on Line"

Highlighted
Meteoroid

So I have this case and can't figure out a way to overcome the situation.

 

I'm reading different excel files from one directory and each excel contains X number of tabs (might differ from file to file) (Using this solution), turns out that I need to read parts of each tab and union them into one single output. The challenge is that each tab has a different line to start (Row number) the reading of the rows cause the first N rows could contain information that is not important for my purposes, the only common thing that each Tab has is that the portion of the information that I need to start has the same Title on top.

As a brief, I have , X number of Files, Y number of tabs. Each Tab has a Text that Reads "Examples Below" and after that text is the information I need to copy into one single master file, caveat here: Title "Examples Below" could be in Row 10 in one Tab, and Row 400 in the next one and so on..

 

As you can imagine my first approach was to try to use the Start data Import on Line feature in an input data tool, but I see this as a hard number not able to change dynamically.

 

Thanks in advance, I know this one might be difficult to achieve.

Alteryx
Alteryx

I have overcome this before by getting creative with the formula tool, a running total, and a filter. I use a formula tool that checks a certain column for text that would indicate it is the first row of data and if it does, puts the number 1 into a new column. Then, I use a running total tool on that column, and finally a filter that says running total is greater than or equal to 1. You can also do a "group by" in the running total tool, so if you output the file name as a field, you could group by the file name and create the running total as described above, so only the records following the starting point row would be output. Please let me know if you would like further clarification!

Meteoroid

HI,

 

Thank you for the input, I am kinda new to Alteryx but I will try to follow the logic in the comment and come back with results or questions.

 

Thank you again!

Alteryx
Alteryx

I have attached an example workflow to help you visualize the example logic:

 

Changing line input.png

Alteryx
Alteryx

Step by step walkthrough:

 

 

1.) Data starts on row 5 in this example

 

1.png

 

 

 

 

 

Step 2.) Create formula that puts a flag where the field 1 contains text that would indicate the field name at the start of your data

2.png

 

 

 

 

 

 

 

Step 3.) Create a running total that works off of the flag column. This is useful because every row after the start of your data will have a value at least equal to 1 which sets us up to use a filter tool after. 

 

3.png

 

 

 

 

 

 

 

 

Step 4.) Filter rows where the run total flag is greater than or equal to 1

4.png

 

 

 

 

 

Step 5.) Use a select tool to uncheck the flag and running total columns because they are no longer needed.

 

5.png

 

Finally, a dynamic rename tool can be used at the end with the setting "Take Field Names from First Row of Data" if you want to make the first row of data your new column headers. 

Meteoroid

Wow,

 

Thank you very much, this totally helps me with the case I presented.

 

Thanks a lot

Meteoroid

Hey Brandon,

 

Quick question here. Does some solution comes to your mind when we don't know the name of the Column to look for the Start of Data value?

 

Was implementing the solution and I realized that I am looping through different Excel Files and Tabs and each Tab will have a different Header name for the column where the text I need is located. So using your example, my first Tab's column might be "Field1", the second could be "Donald Duck", the next one "Field1" again and so on every time will be unknown. 

 

Any ideas around this one?

 

Thanks again!

Alteryx
Alteryx
Yep! In the file input tool just check the box that first row contains data. That way the column names will be the same across all fields because it doesn’t treat the first row as column headers. Let me know if that works!
Meteoroid

Worked as expected.. thank you

Labels