Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Row Deletion

Inactive User
Not applicable

Hi everyone,

How to remove unwanted rows from a data sheet, find sample data below:

 

Name  num

A              1

A               1

A               1

A               2

A               2

A               2

 

How can i remove first 3 row or last 3 row or all rows, can anyone help ?

 

Thanks in advance

15 REPLIES 15
DavidSkaife
13 - Pulsar

Hi @Inactive User 

 

Depending on the structure of all your data you have several options:

 

To remove just the first 3 or last 3 (or any number) of rows the best option is to use the Sample tool, example setup for first 3 rows below.

DavidSkaife_0-1655123777739.png

This will work dynamically so no matter what your data looks like/how many rows it will always remove the specified number of rows.

 

Another option is to use the Filter tool and filter on Column 'Num' not equal to 1, or 2, etc. This will not be dynamic IF you have rows that contain the number 1 or 2 that you do not want removed.

 

In terms of removing all rows i'm not sure why you'd want to do that as no data will then flow through, but, you could again use a Filter tool and filter on column 'Name' does not equal to A (based on your example dataset). If the dataset contains other data you can just not carry that datastream onwards.

 

Cheers,

 

David

binuacs
20 - Arcturus

@Inactive User  One way of doing this with the help of the tile tool

 

binuacs_0-1655123915235.png

 

 

DataNath
17 - Castor

Hey @Inactive User, for the small sample provided here, you could just use a filter with the following expression:

 

[num] != 1

 

The 'True' anchor would remove the first 3 rows as these = 1, the 'False' anchor would drop the last 3 rows as these = 2.

 

However, if you're looking for something to apply to a larger data set where you just want to remove 'x' rows from the top/bottom, not dependent on a certain condition, you can use the 'Sample' tool within Alteryx.

 

https://help.alteryx.com/20221/designer/sample-tool

 

In your case, to remove the first 3 rows, you would just select 'Skip first N rows' and make 'N' = 3.

To remove the bottom 3, you can look at how many records you have and just take 3 off this. For example, if you had 100 rows and wanted to drop the last 3, you would just select 'First N rows' and make 'N' = 97.

 

There's also a CReW macro for 'Skip last N rows', which does exactly what it says on the tin! You can download the CReW macro package, containing that and loads of other useful tools, here: http://www.chaosreignswithin.com/p/macros.html

sbatra116
8 - Asteroid

@Inactive User 

Are you looking for Unique data OR need to remove specific rows. 

 

for Unique data:    Unique tool.

 

for specific rows: use the solution mentioned by @DavidSkaife 

 

 

Inactive User
Not applicable

Thank you all for your rply, but i have large data set 50,000+ rows which contains multiple values for multiple fields, first need to find duplicate value from those fields in the basis of some criteria then need to remove it. Kindly help to resolve the issue.

binuacs
20 - Arcturus

@Inactive User Can you provide an updated sample file and expected output ?

DataNath
17 - Castor

In that case, you'll want to use the Unique tool as mentioned by @sbatra116:

 

When you say criteria, you'll need to select (within the tool configuration) which fields you're going to use to determine whether or not an entry is a duplicate.

 

For example, if you had the table provided in your original post:

 

Name  num

A              1

A               1

A               1

A               2

A               2

A               2

 

If you were to add a Unique tool to this, with 'Name' selected as the unique identifier, Alteryx would get rid of everything apart from the first row, as the programme looks for the first instance of each name and removes any others that are the same, judging these as duplicates (and in your case they're all 'A').

 

Name  num

A              1

 

However, if you select both 'Name' and 'num', Alteryx would keep the first of each unique combination of these two fields, i.e. the first and 4th row, leaving you with:

 

Name  num

A              1

A               2

 

Hope this helps, more info here: https://help.alteryx.com/20221/designer/unique-tool

Inactive User
Not applicable

Hi,

 

PFB sample data

 

Region  City   Category

East       Boston   Bars

East       Boston   Bars

East       Boston   Bars

East       LA           Bars

East       LA           Bars

East       LA           Bars

 

City should be same, if it is different then i have to remove all these rows(6), i have data of 50000+rows.

sbatra116
8 - Asteroid

@Inactive User let me understand your requirement correctly.

 

you are looking for output as below : 2 records 

 

Region  City   Category

East       Boston   Bars

East       LA           Bars

 

BUT if the input changes as below 

 

Region  City   Category

East       Boston   Bars

East       Boston   Bars

East       Boston   Bars

East      Boston   Bars

East      Boston   Bars

East      Boston   Bars

 

then output should be: 1 record

 

Region  City   Category

East       Boston   Bars

Labels