Hi,
I have a dataset with many rows. I ultimately add up the columns to get a total for each row. I want to filter out the rows that contain zeroes in every column. How can I accomplish this?
Transpose with a Record ID and add a summation on the fields, grouped by Record ID. Filter where the sum is zero. Join on Record ID to the original dataset. Voila.
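For anyone who wants to see the logic outside of Alteryx, here is a rough sketch of those steps in plain Python. The sample rows and the column names A, B and C are made up for illustration, and the dictionaries only stand in for the Record ID, Transpose, Summarize, Filter and Join tools.

```python
# Hypothetical sample data; the column names are assumptions for illustration.
rows = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 3, "B": 0, "C": 1},
    {"A": 0, "B": 2, "C": 0},
]

# Record ID: tag each row with a sequential id.
records = {record_id: row for record_id, row in enumerate(rows, start=1)}

# Transpose: one (RecordID, Name, Value) triple per cell.
long_form = [
    (record_id, name, value)
    for record_id, row in records.items()
    for name, value in row.items()
]

# Summarize: sum Value grouped by RecordID.
sums = {}
for record_id, _name, value in long_form:
    sums[record_id] = sums.get(record_id, 0) + value

# Filter where the sum is zero, then join back to the original rows by id.
zero_ids = {record_id for record_id, total in sums.items() if total == 0}
all_zero_rows = [row for rid, row in records.items() if rid in zero_ids]
kept_rows = [row for rid, row in records.items() if rid not in zero_ids]
```

Note that this only works as intended when no row mixes positive and negative values that cancel out to zero.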
Hi @janety0127
If your data set includes negative values, you have to add an extra step to handle the case where the values add up to zero but the individual values are not zero. In the following data set you want to remove row 4 but keep row 5, since its values are not all zero.
Add a RecordID and transpose as @apathetichell suggested. In the Summarize tool, group by RecordID and add a Sum and a Count Distinct operation on the Value field.
The extra Count Distinct operation counts the number of different values in the row. Since the only combination of values where the sum is 0 and the values are all the same is when all the values are 0, add a filter with the following condition:
[Sum_Value] != 0 or ([Sum_Value]=0 and [CountDistinct_Value] != 1)
This will pass through the records where the sum is not equal to 0, or where the sum equals 0 but the values are not all 0. Join back to the original data on Record ID. The L output gives the rows where the values are all 0. The J output gives the rows that you want to keep.
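To make the filter logic concrete, here is a small sketch in plain Python rather than Alteryx. The rows are hypothetical and are not the data set from this post; `sum_value` and `count_distinct_value` are just stand-ins for the [Sum_Value] and [CountDistinct_Value] fields.

```python
# Hypothetical rows, including one where non-zero values cancel out.
rows = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 0, "B": 0, "C": 0},   # all zero: should be dropped
    {"A": 5, "B": -5, "C": 0},  # sums to zero but is not all zero: keep
]

def keep_row(row):
    """Mirror the filter condition on the per-row sum and distinct count."""
    values = list(row.values())
    sum_value = sum(values)
    count_distinct_value = len(set(values))
    return sum_value != 0 or (sum_value == 0 and count_distinct_value != 1)

kept = [row for row in rows if keep_row(row)]
dropped = [row for row in rows if not keep_row(row)]
```

The third row is the case a sum-only filter would get wrong: its sum is 0, but it has three distinct values, so it is kept.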
Dan
Hi @janety0127 ,
@danilang raised a very good point about the negative values. I'm a fan of the 'remove null rows' option in the Data Cleansing tool, so I would suggest the following.
First, use a Multi-Field Formula tool to set every value equal to 0 to NULL() in all columns.
Then, use the Data Cleansing tool to remove null rows and replace nulls with 0 (Numeric Fields).
Lastly, obtain the sums: create a Record ID, Transpose with Record ID as the key, and Sum the values by Record ID.
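In case it helps to see the idea written out, below is a minimal sketch of the same three steps in plain Python. The data is made up, and `None` stands in for NULL(); this only illustrates the logic and is not how the tools are implemented.

```python
# Hypothetical rows; the column names are assumptions for illustration.
rows = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 4, "B": -4, "C": 0},
    {"A": 0, "B": 7, "C": 2},
]

# Step 1 (Multi-Field Formula): turn every 0 into None, standing in for NULL().
nulled = [{k: (None if v == 0 else v) for k, v in row.items()} for row in rows]

# Step 2 (Data Cleansing): drop rows where every field is null,
# then replace the remaining nulls with 0 again.
non_empty = [row for row in nulled if any(v is not None for v in row.values())]
cleaned = [{k: (0 if v is None else v) for k, v in row.items()} for row in non_empty]

# Step 3: total each remaining row.
totals = [sum(row.values()) for row in cleaned]
```

One thing to be aware of with this approach: any value that was already null in the original data also ends up as 0, and a row that was entirely null to begin with is removed too.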
Hope this helps.
Cheers,
Lelia
Thank you! These were all really helpful!