Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove extra Rows and Columns

cireost
8 - Asteroid

Hi - I have a table that will be always in the same format.  However, the position of the table in excel can vary (e.g. could be random # of rows on top or columns to the left.)  Does anyone have a sample workflow that can dynamically remove unneeded rows and columns so that the data starts with the table?  

 

Example excel attached.  I would like the fields to be Sample Number, Property Type, and Tenant Name

12 REPLIES 12
rarmstrong
8 - Asteroid

When I try it in Alteryx, it automatically starts the table where it should, no matter how many NULL columns/rows I have to the left or above the table.

If it isn't doing that for you, then you can give this a try:

 

- In the input data tool, select First Row Contains Data

- Add Auto Field Tool (this changes fields that are all NULL to boolean)

- Add Dynamic Select Tool. In the drop down, select 'Select Field Types' and deselect Bool

(At this point, all fields that contain only NULL cells should be removed)

To remove NULL rows:

- Add Dynamic Rename tool. In the Rename Mode dropdown, select Formula. Then make sure all fields are selected, including the Dynamic or Unknown Fields box

The expression should be 1 (this will change the name of the 3 fields that you have left over to 1, 1_2, and 1_3)

- Add a Filter tool. The expression should be: 

!IsNull([1])
AND
!IsNull([1_2])
AND
!IsNull([1_3])

- Add another Dynamic Rename tool and in the rename mode select 'Take Field Names from First Row of Data'. Again, make sure all fields are selected including the Dynamic or Unknown Fields box.

 

I attached a copy of the sample workflow I created

 

 

 

cireost
8 - Asteroid

Great thank you.

 

Would this still work if the rows and columns that are unneeded have some junk data in there? I ask because this can occur (apologies for not mentioning it originally) and therefore, not sure if In the input data tool, select First Row Contains Data would work in these situations.

rarmstrong
8 - Asteroid

Can you give an example of what kind of junk it may contain? 

This only works if those extra fields are NULL. You may have to add a Data Cleansing tool after your input to remove all Punctuation. The cells that used to contain punctuation are now empty instead of NULL so the existing Dynamic Select tool would need to be updated to reflect this difference. In the Dynamic Select tool, change the dropdown to "Select via a Formula". And the Expression should be [Size] > 1.

 

I updated the sample workflow to reflect these changes.

If you have more than just punctuation in those extra columns, then we may have to look into something other than the Data Cleansing tool to remove the unwanted characters.

cireost
8 - Asteroid

Here is an updated example.  There is now random text above, below, and to the left of the table that I am trying to use.

 

Thanks.

cireost
8 - Asteroid

The table is always consistent, so not sure if there is way to write it so that once we find the row with the word 'Sample number' then the header/data starts there?

danrh
13 - Pulsar

Attached is a solution that gets you halfway there. It finds where "Sample number" is in the data and grabs everything to the right and below of that. Unfortunately, it looks like you have additional text below the table. Is there some sort of pattern at the bottom/on the right side that would let us filter out that noise as well?

cireost
8 - Asteroid

This is awesome.  There is no text under tenant name in all situations so would we just throw in a tool at the end of the workflow to filter out Empty and Null?  

 

If you could annotate some of the tools that would be greatly appreciated!

 

Specific question I have:

 

-Multi-Field Formula Tool - looks like we are looking for Nulls and then converting them to 'real' Nulls?  What exactly is this doing?  Does this work if a field is empty instead of Null or both?  Does this affect the actual data in the table if there are Nulls?

 

Thanks!!

 

danrh
13 - Pulsar

Sounds like a filter at the end would work well.

 

The Multi-Field Formula tool was just me being lazy. The way your data copied into the Text Input created a bunch of fields with '[Null]' instead of actual nulls. I could have done the same thing by manually deleting the contents. Chances are you don't need that tool at all.

cireost
8 - Asteroid

Thanks.  what exactly is the diff b/w '[Null]' and actual nulls?  How does that even happen? 

Labels