Hi everyone, How to remove unwanted rows from a data sheet, find sample data below: Name num A 1 A 1 A 1 A 2 A 2 A 2 How can i remove first 3 row or last 3 row or all rows, can anyone help ? Thanks in advance

@Inactive User * Summarize: Group By: Region, Group By: Category, CountDistinctNonNull_City * Append CountDistinctNonNull_City to input * Fillter: Custon Filter expression: [CountDistinctNonNull_City ] = 1

Row Deletion

unknown

Hi everyone,

How to remove unwanted rows from a data sheet, find sample data below:

Name num

A 1

A 2

How can i remove first 3 row or last 3 row or all rows, can anyone help ?

Thanks in advance

Help

Accepted answers

All comments

davidskaife

Hi @Inactive User

Depending on the structure of all your data you have several options:

To remove just the first 3 or last 3 (or any number) of rows the best option is to use the Sample tool, example setup for first 3 rows below.

This will work dynamically so no matter what your data looks like/how many rows it will always remove the specified number of rows.

Another option is to use the Filter tool and filter on Column 'Num' not equal to 1, or 2, etc. This will not be dynamic IF you have rows that contain the number 1 or 2 that you do not want removed.

In terms of removing all rows i'm not sure why you'd want to do that as no data will then flow through, but, you could again use a Filter tool and filter on column 'Name' does not equal to A (based on your example dataset). If the dataset contains other data you can just not carry that datastream onwards.

Cheers,

David

binu_acs

@Inactive User One way of doing this with the help of the tile tool

Tile_13_06.yxmd

DataNath

Hey @Inactive User, for the small sample provided here, you could just use a filter with the following expression:

[num] != 1

The 'True' anchor would remove the first 3 rows as these = 1, the 'False' anchor would drop the last 3 rows as these = 2.

However, if you're looking for something to apply to a larger data set where you just want to remove 'x' rows from the top/bottom, not dependent on a certain condition, you can use the 'Sample' tool within Alteryx.

https://help.alteryx.com/20221/designer/sample-tool

In your case, to remove the first 3 rows, you would just select 'Skip first N rows' and make 'N' = 3.

To remove the bottom 3, you can look at how many records you have and just take 3 off this. For example, if you had 100 rows and wanted to drop the last 3, you would just select 'First N rows' and make 'N' = 97.

There's also a CReW macro for 'Skip last N rows', which does exactly what it says on the tin! You can download the CReW macro package, containing that and loads of other useful tools, here: http://www.chaosreignswithin.com/p/macros.html

sbatra116

@Inactive User

Are you looking for Unique data OR need to remove specific rows.

for Unique data: Unique tool.

for specific rows: use the solution mentioned by @davidskaife

unknown

Thank you all for your rply, but i have large data set 50,000+ rows which contains multiple values for multiple fields, first need to find duplicate value from those fields in the basis of some criteria then need to remove it. Kindly help to resolve the issue.

binu_acs

@Inactive User Can you provide an updated sample file and expected output ?

DataNath

In that case, you'll want to use the Unique tool as mentioned by @sbatra116:

When you say criteria, you'll need to select (within the tool configuration) which fields you're going to use to determine whether or not an entry is a duplicate.

For example, if you had the table provided in your original post:

Name num

A 1

A 2

If you were to add a Unique tool to this, with 'Name' selected as the unique identifier, Alteryx would get rid of everything apart from the first row, as the programme looks for the first instance of each name and removes any others that are the same, judging these as duplicates (and in your case they're all 'A').

Name num

A 1

However, if you select both 'Name' and 'num', Alteryx would keep the first of each unique combination of these two fields, i.e. the first and 4th row, leaving you with:

Name num

A 1

A 2

Hope this helps, more info here: https://help.alteryx.com/20221/designer/unique-tool

unknown

Hi,

PFB sample data

Region City Category

East Boston Bars

East LA Bars

City should be same, if it is different then i have to remove all these rows(6), i have data of 50000+rows.

sbatra116

@Inactive User let me understand your requirement correctly.

you are looking for output as below : 2 records

Region City Category

East Boston Bars

East LA Bars

BUT if the input changes as below

Region City Category

East Boston Bars

then output should be: 1 record

Region City Category

East Boston Bars

unknown

Hi @sbatra116

I wanna apply below mention condition

City should be same, if it is different then i have to remove all these rows(6).

Thanks

DataNath

@Inactive User when you mention 'City should be the same' - what are you referencing, i.e. same as what? I'm not sure what you're trying to achieve as, in the example you have provided, you say that you should remove all 6 rows, which would leave you with nothing. Surely in your real data of 50,000 rows, you also have different cities, and so if you remove all of them you'll also be left with nothing? Can you provide a little more detail around the logic of the comparison?

unknown

Hi Datanath,

Thanks for your rply, there may be 2000 or 4000 row matches the given criteria and rest do not match the same criteria, so will retain only those rows who doesnt match criteria and delete rest all.

DataNath

Is your criteria fixed? Or does it come from another field?

If it's fixed, i.e. you just want to keep rows where the target field isn't 'A', then you could just use a filter with:

[Target field] != 'A'

Where your 'Target field' is the column you're checking for a match.

If it's dynamically coming from another field then you could also use a filter like so:

[Checking field] != [Criteria]

Where 'Checking field' is the column you want to check, and 'Criteria' is the field you are comparing it against. As you're using !=, the outcome you're looking for will come out of the top (True) anchor of the filter.

HomesickSurfer

@Inactive User

Summarize: Group By: Region, Group By: Category, CountDistinctNonNull_City
Append CountDistinctNonNull_City to input
Fillter: Custon Filter expression: [CountDistinctNonNull_City ] = 1

sbatra116

@Inactive User

provide sample data in good format.

Quick Links

This months top contributors

LanisC 335

hhaigh1 280

erich_shiino 280

FláviaB 262

JORGE4900 235