Hi,
New to the community here.
I have a file with few character fields. I'm trying to get a count distinct of 'Key' variable grouped by every individual field. So count of key by customer number, count of key by sport type code and so on. Post that, I'd like to collapse categories from customer number / sport type code by a logic - if count is < 5, replace the level with "Other". For instance, ARCH within sport type code should change to other. How would i do this for all fields and re-create the same file. Thanks in advance.
Solved! Go to Solution.
Hi @aman_goswami ,
you can count the number of records for each customer/sports_type using the Summarize tool, join the count to your original data and then replace, if count < 5. You can do this for selected character fields individually, but also for a combination of fields. Does this help? I've attached a sample workflow.
Best regards
Roland
Hi Roland,
Thanks for the quick response. I was able to do it in the similar manner for a single field. I'm looking for a way to automate this. Ideally, for any dataset I'd like a count of one field via others and then manipulate each one of them with a logic. Basically, use count of key via all others individually (without manual selection) and then use the same logic to manipulate the individual field.
Would this be possible ?
Thanks
I think this is what you're looking for, a count of Key grouped by every other field with "other" applied if the count <5
The key here is to transpose the data so that the columns/values end up in one pair of name/value fields. From there, you count all the keys grouped by Name(original column name) and value. Use a formula to replace the value with Other in the count is less than 5 and then Sum the "Other" values.
The results look like this, showing that BS is the dominant Value for Catalog_code with 22 BN values and 3 that fall into the other category. These results are repeated for each of the original field/value combinations which now show up as rows
Dan