Alteryx Designer Desktop Discussions

Hagar-Usama · ‎07-04-2023

Hello,

I am trying to create a density column based on another column. Specifically, if the count of a current item exceeds a certain threshold, then the corresponding value in the density column will be designated as "high"; otherwise, it will be designated as "low".

For example,

given col1, col2 (Density) , threshold = 2

Col 1	Col 2 (Density)
A	High
A	High
A	High
B	Low
C	Low

Can I use summarize in formula tool to get the count?

Thanks

DataNath · ‎07-04-2023

Hey @Hagar-Usama, you can use the Summarize to group by [Col 1] and perform a count. We then just write an expression to assign the 'High' label if this is above the threshold of 2. After that, we can just join the labels back to the correct Col 1 value:

Hagar-Usama · ‎07-04-2023

Hey @DataNath,

Thanks you, works fine!

I'm wondering if it's possible to utilize the total count (original input size) in order to set the threshold as a percentage, rather than a fixed number

BCD · ‎07-04-2023

Hello,
You can use a summarize to count values then a join or find&replace.

DataNath · ‎07-04-2023

Hey @Hagar-Usama, yep that's also possible! Here we'd want to use a second Summarize tool, with no grouping and just take an overall Count. We would then append this to the data set (post-join), so that this overall count is present in every row. From there, we can again just write a simple Formula expression for if <Grouped Count>/<Total Count> is above the threshold then assign 'High'. I've currently set the percentage to 50% (0.5), but you can just adjust this as you see fit.

Alteryx Designer Desktop Discussions

Creating a Density Column Based on Thresholds of Item Counts