Hi everyone,
How to remove unwanted rows from a data sheet, find sample data below:
Name num
A 1
A 2
How can i remove first 3 row or last 3 row or all rows, can anyone help ?
Thanks in advance
Hi @Inactive User
Depending on the structure of all your data you have several options:
To remove just the first 3 or last 3 (or any number) of rows the best option is to use the Sample tool, example setup for first 3 rows below.
This will work dynamically so no matter what your data looks like/how many rows it will always remove the specified number of rows.
Another option is to use the Filter tool and filter on Column 'Num' not equal to 1, or 2, etc. This will not be dynamic IF you have rows that contain the number 1 or 2 that you do not want removed.
In terms of removing all rows i'm not sure why you'd want to do that as no data will then flow through, but, you could again use a Filter tool and filter on column 'Name' does not equal to A (based on your example dataset). If the dataset contains other data you can just not carry that datastream onwards.
Cheers,
David
@Inactive User One way of doing this with the help of the tile tool
Hey @Inactive User, for the small sample provided here, you could just use a filter with the following expression:
[num] != 1
The 'True' anchor would remove the first 3 rows as these = 1, the 'False' anchor would drop the last 3 rows as these = 2.
However, if you're looking for something to apply to a larger data set where you just want to remove 'x' rows from the top/bottom, not dependent on a certain condition, you can use the 'Sample' tool within Alteryx.
https://help.alteryx.com/20221/designer/sample-tool
In your case, to remove the first 3 rows, you would just select 'Skip first N rows' and make 'N' = 3.
To remove the bottom 3, you can look at how many records you have and just take 3 off this. For example, if you had 100 rows and wanted to drop the last 3, you would just select 'First N rows' and make 'N' = 97.
There's also a CReW macro for 'Skip last N rows', which does exactly what it says on the tin! You can download the CReW macro package, containing that and loads of other useful tools, here: http://www.chaosreignswithin.com/p/macros.html
@Inactive User
Are you looking for Unique data OR need to remove specific rows.
for Unique data: Unique tool.
for specific rows: use the solution mentioned by @davidskaife
Thank you all for your rply, but i have large data set 50,000+ rows which contains multiple values for multiple fields, first need to find duplicate value from those fields in the basis of some criteria then need to remove it. Kindly help to resolve the issue.
@Inactive User Can you provide an updated sample file and expected output ?
In that case, you'll want to use the Unique tool as mentioned by @sbatra116:
When you say criteria, you'll need to select (within the tool configuration) which fields you're going to use to determine whether or not an entry is a duplicate.
For example, if you had the table provided in your original post:
If you were to add a Unique tool to this, with 'Name' selected as the unique identifier, Alteryx would get rid of everything apart from the first row, as the programme looks for the first instance of each name and removes any others that are the same, judging these as duplicates (and in your case they're all 'A').
However, if you select both 'Name' and 'num', Alteryx would keep the first of each unique combination of these two fields, i.e. the first and 4th row, leaving you with:
Hope this helps, more info here: https://help.alteryx.com/20221/designer/unique-tool
Hi,
PFB sample data
Region City Category
East Boston Bars
East LA Bars
City should be same, if it is different then i have to remove all these rows(6), i have data of 50000+rows.
@Inactive User let me understand your requirement correctly.
you are looking for output as below : 2 records
BUT if the input changes as below
then output should be: 1 record
Hi @sbatra116
I wanna apply below mention condition
City should be same, if it is different then i have to remove all these rows(6).
Thanks
@Inactive User when you mention 'City should be the same' - what are you referencing, i.e. same as what? I'm not sure what you're trying to achieve as, in the example you have provided, you say that you should remove all 6 rows, which would leave you with nothing. Surely in your real data of 50,000 rows, you also have different cities, and so if you remove all of them you'll also be left with nothing? Can you provide a little more detail around the logic of the comparison?
Hi Datanath,
Thanks for your rply, there may be 2000 or 4000 row matches the given criteria and rest do not match the same criteria, so will retain only those rows who doesnt match criteria and delete rest all.
Is your criteria fixed? Or does it come from another field?
If it's fixed, i.e. you just want to keep rows where the target field isn't 'A', then you could just use a filter with:
[Target field] != 'A'
Where your 'Target field' is the column you're checking for a match.
If it's dynamically coming from another field then you could also use a filter like so:
[Checking field] != [Criteria]
Where 'Checking field' is the column you want to check, and 'Criteria' is the field you are comparing it against. As you're using !=, the outcome you're looking for will come out of the top (True) anchor of the filter.
provide sample data in good format.