I have a large data set in which I am trying to filter out rows based on certain criteria. The data is grouped by a unique number, and each row includes a code that indicates whether we want to keep or remove it. Below is a simplified version of what I am looking at. Every group that has even one occurrence of "gold" needs to stay, with all of its rows. Groups with no "gold" at all need to be removed. I figured I would just use a basic filter for that, unless there is an easier way.
Hey @Crstone09! The way I approached this was to add a column that is 1 if the color is gold and null otherwise. I could then use a Summarize grouped by ID to count the non-null values, and a Filter to keep only the IDs with a count greater than or equal to one. Lastly, I joined back to the original data by ID to get all of the rows for those IDs. Hope this helps!
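For anyone who wants to see the logic outside of a workflow, here is a minimal Python sketch of the same flag, count, and join-back steps. The sample rows, the column order (ID, Color), and the function name `keep_groups_with_gold` are all made up for illustration and are not from the original data.

```python
from collections import Counter

# Hypothetical sample data: (ID, Color) pairs standing in for the real rows.
rows = [
    (1, "gold"), (1, "blue"),
    (2, "red"), (2, "green"),
    (3, "blue"), (3, "gold"), (3, "gold"),
]


def keep_groups_with_gold(rows):
    """Keep every row whose ID has at least one 'gold' row."""
    # Flag + Summarize: count the gold rows per ID.
    # Non-gold rows play the role of the null flag and are not counted.
    gold_counts = Counter(
        id_ for id_, color in rows if color.lower() == "gold"
    )
    # Filter + Join: keep rows whose ID has a count >= 1.
    # A Counter returns 0 for IDs it has never seen.
    return [(id_, color) for id_, color in rows if gold_counts[id_] >= 1]


kept = keep_groups_with_gold(rows)
for row in kept:
    print(row)
```

In this sketch group 2 is dropped because it has no gold row, while groups 1 and 3 keep all of their rows, including the non-gold ones.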
Hi @Crstone09,
I think this is the easiest way (with the filter):
Just a Summarize to concatenate the colors per ID, a Filter with Contains([Color], "Gold"), and a split to rows.
Generally I prefer not to use a Join with large data sets because it tends to perform poorly.
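As a rough illustration of the concatenate, filter, and split idea, here is a small Python sketch. The sample rows, the comma separator, and the function name `filter_by_concatenation` are assumptions for the example only. Note that it only carries ID and Color, so any other columns would need to be handled separately, and that a substring test would also match values such as "golden".

```python
# Hypothetical sample data: (ID, Color) pairs standing in for the real rows.
rows = [
    (1, "gold"), (1, "blue"),
    (2, "red"), (2, "green"),
    (3, "blue"), (3, "gold"), (3, "gold"),
]


def filter_by_concatenation(rows, sep=","):
    """Keep groups whose concatenated colors contain 'gold', without a join."""
    # Summarize: concatenate the colors of each ID into one string.
    grouped = {}
    for id_, color in rows:
        grouped.setdefault(id_, []).append(color)
    concatenated = {id_: sep.join(colors) for id_, colors in grouped.items()}

    # Filter: keep only IDs whose concatenated string contains "gold"
    # (case-insensitive substring match).
    kept = {
        id_: text for id_, text in concatenated.items()
        if "gold" in text.lower()
    }

    # Split to rows: turn each concatenated string back into one row per color.
    return [
        (id_, color)
        for id_, text in kept.items()
        for color in text.split(sep)
    ]


result = filter_by_concatenation(rows)
for row in result:
    print(row)
```

This assumes the separator never appears inside a color value; if it can, a different delimiter would be needed.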