Hello, I am trying to iterate through a certain number of loans based on branch. My goal is to select a random 25% of loans from each branch every week. The number of loans, and the branches that have consumer loans, can vary from week to week. For instance, this week I can have branches A, B, and C with different loan counts, and next week I can have branches B, D, and E with different loan counts. Here is my sample data:
Desired Output:
Randomly select 25% from each branch:
Application# Branch
55 Detroit
4312 Contact Center
1111 Contact Center
Try the Sample tool in the Preparation category
Tool Mastery article: https://community.alteryx.com/t5/Tool-Mastery/Tool-Mastery-Sample/ta-p/36758
Chris
@ChrisTX I did try that earlier with the first N% of rows where N = 25, but it also picks loans where 25% gives a value of 0.5. For instance, in my data set I don't want to pick from the Howell and Brighton branches, as they are not more than 1, but 1 loan still gets picked because the 0.5 result is somehow being treated as 1. Any tips on how to eliminate these loans?
I also tried your 1 in 4 chance for every row, but it also picks up branches with just 1 loan. I don't want to go the route of selecting 1 loan out of 1 reported loan. Thank you.
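To make the rounding question concrete, here is a small Python sketch of the arithmetic (not an Alteryx workflow). The branch counts are made-up values for illustration, and whether the Sample tool actually rounds the 25% target up is an assumption based only on the behavior described above.

```python
import math

# Hypothetical loan counts per branch (illustrative values, not the real data set)
branch_counts = {"Detroit": 4, "Contact Center": 8, "Howell": 1, "Brighton": 2}

def sample_size(count, fraction=0.25, round_up=True):
    """Number of loans to pick from a branch that has `count` loans."""
    target = count * fraction
    return math.ceil(target) if round_up else math.floor(target)

# Rounding up turns a target of 0.25 or 0.5 into 1 loan
rounded_up = {b: sample_size(n, round_up=True) for b, n in branch_counts.items()}

# Rounding down turns the same targets into 0 loans
rounded_down = {b: sample_size(n, round_up=False) for b, n in branch_counts.items()}

print(rounded_up)
print(rounded_down)
```

With these counts, only the rounding direction changes the result for the small branches. The larger branches get the same sample size either way.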
For any rows that you don't want selected by the 25% sample rule, just filter those out before the Sample tool.
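You could use a Summarize tool to get the count by Branch, then a Filter tool to keep the rows where the Branch count = 1, then a Join tool back to the original data, joining on Branch. Continue processing only the left output anchor from the Join tool (this assumes the original data is connected to the Join tool's left input, so the L anchor holds the rows whose Branch did not match a single-loan branch).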
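For anyone who wants to check the logic outside Alteryx, the same count, filter, and sample steps can be sketched in plain Python. This is only an illustration under stated assumptions: `sample_by_branch` is a hypothetical helper, the loan list is made-up data, and rounding the 25% target down is a choice made here, not something the Sample tool is confirmed to do.

```python
import random
from collections import Counter, defaultdict

def sample_by_branch(loans, fraction=0.25, seed=None):
    """Pick a random `fraction` of loans per branch, skipping single-loan branches.

    loans: list of (application_number, branch) tuples.
    """
    rng = random.Random(seed)

    # "Summarize": group the application numbers by branch
    by_branch = defaultdict(list)
    for app, branch in loans:
        by_branch[branch].append(app)

    picked = []
    for branch, apps in by_branch.items():
        # "Filter" + "Join": drop branches that have exactly one loan
        if len(apps) == 1:
            continue
        # Assumption: round the target down, so 2 loans * 0.25 = 0.5 picks nothing
        k = int(len(apps) * fraction)
        picked.extend((app, branch) for app in rng.sample(apps, k))
    return picked

# Made-up example data: 4 Detroit loans, 8 Contact Center loans, 1 Howell loan
loans = (
    [(n, "Detroit") for n in (55, 56, 57, 58)]
    + [(n, "Contact Center") for n in (1111, 1112, 1113, 1114, 4309, 4310, 4311, 4312)]
    + [(9001, "Howell")]
)

picked = sample_by_branch(loans, seed=42)
per_branch = Counter(branch for _, branch in picked)
print(per_branch)
```

Which application numbers get picked depends on the seed, but the count per branch is fixed by the branch size: one from Detroit, two from Contact Center, none from Howell.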
Chris