Hello! Any help is appreciated (feel free to refer me to another discussion if my question has already been answered).
I have a large dataset with parts as the rows and attributes as the columns (think length, material, etc.). I want to return the subset of parts that have three or more matching attributes (there are ~800 attributes in total). What is the best way to go about this? I wasn't sure how to approach it because any three of the attributes can match, rather than a specific set of attributes. Nulls do not count as matches.
Thanks!
If you transpose the attributes and then use a summarise with count or countNonNull, you can get a total of how many attributes each part has. Once the data is transposed into Name/Value pairs, there are many different ways to cut it. You can also easily join the result back onto the original dataset if you need the rest of the info.
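For anyone who wants to try the transpose-then-count idea outside a workflow tool, here is a minimal sketch in pandas. It is only one reading of the question: it assumes "match" means two parts share the same non-null value for the same attribute. The sample data, the column names (`part_id`, `name`, `value`), and the threshold variable are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one row per part, one column per attribute.
parts = pd.DataFrame(
    {
        "part_id": ["A", "B", "C", "D"],
        "length": [10, 10, 12, None],
        "material": ["steel", "steel", "steel", None],
        "color": ["red", "red", "blue", "red"],
        "finish": [None, None, "matte", "matte"],
    }
)
MIN_MATCHES = 3  # assumed threshold, taken from "three or more"

# "Transpose": one row per (part, attribute name, value), with nulls dropped
# so they can never count as a match. Values are cast to strings so that
# attributes of different types compare consistently in the join below.
long = (
    parts.melt(id_vars="part_id", var_name="name", value_name="value")
    .dropna(subset=["value"])
    .assign(value=lambda d: d["value"].astype(str))
)

# Equivalent of a CountNonNull summarise: populated attributes per part.
non_null_counts = long.groupby("part_id").size()

# One way to "cut" the Name/Value data: self-join on name and value to pair
# up parts that share an attribute value, keeping each pair only once.
pairs = long.merge(long, on=["name", "value"], suffixes=("_a", "_b"))
pairs = pairs[pairs["part_id_a"] < pairs["part_id_b"]]

# Count shared attributes per pair and keep pairs at or above the threshold.
match_counts = (
    pairs.groupby(["part_id_a", "part_id_b"]).size().reset_index(name="matches")
)
matched = match_counts[match_counts["matches"] >= MIN_MATCHES]

# Join back to the original dataset to recover the full rows.
matched_ids = set(matched["part_id_a"]) | set(matched["part_id_b"])
subset = parts[parts["part_id"].isin(matched_ids)]
print(subset)
```

With this sample data, only parts A and B share three attribute values (length, material, color), so they are the only rows returned. Note that the self-join can get very large when many parts share a common value, so on a big dataset with ~800 attributes it may be worth filtering out very common values or processing attributes in batches first.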