Hi, Joanna. Interesting problem. I'm not sure if this satisfies the strictest definition of randomness, but here is a solution that might work for you: it uses two random numbers and compares them to one another to decide whether or not to include each record in the sample. This method ensures you get a different random count for each column.
Based on my interpretation of your requirements, here's a possible way to go about this.
The 2 controls in the Generate data set container just generate 1M records with random values, each assigned at random to one of 3 groups. The DataRow column is the unique key in this list. I generated 1M records to ensure that this method would run in a reasonable amount of time.
The real work starts after this. The Stratified Quantities input contains the number of records that you want from each category in the final output. Obviously, you can increase these quantities. I kept them small to be able to show the results in one image.
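For anyone who wants to reproduce the test data outside the workflow, here is a rough Python sketch of that generation step. It is only an approximation: the function name `generate_dataset`, the category labels "A", "B", "C", and the `Value` column are my own placeholders, and only `DataRow` and `Category` come from the workflow described above.

```python
import random


def generate_dataset(n_rows=1_000_000, categories=("A", "B", "C"), seed=None):
    """Build n_rows records, each with a random value and a random category.

    The category labels and the "Value" column name are hypothetical;
    DataRow is a unique sequential key, as in the workflow.
    """
    rng = random.Random(seed)  # seeded generator so runs can be repeated
    return [
        {
            "DataRow": i,                          # unique key
            "Category": rng.choice(categories),    # one of the groups, at random
            "Value": rng.random(),                 # random value in [0, 1)
        }
        for i in range(1, n_rows + 1)
    ]


# A small run for a quick look; raise n_rows to 1_000_000 to match the test.
rows = generate_dataset(n_rows=1000, seed=1)
```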
After joining this with the main data on Category, the Random SortKey formula tool generates a random number for each data row. The data is then sorted by Category and SortKey, giving a list grouped by category and randomized within each category. The Multi-Row tool assigns a sequential ExtractID to each row, restarting at 1 for each category. The Filter tool pulls out all the rows where ExtractID is less than or equal to the required quantity.
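If it helps to see the same logic outside the workflow, the steps above can be sketched roughly in Python. This is an approximation rather than what the tools do internally: the function name `stratified_sample`, the `Quantity` column, the category labels, and the demo quantities are all placeholders I made up for illustration.

```python
import random
from itertools import groupby


def stratified_sample(rows, quantities, seed=None):
    """Pick quantities[category] random rows from each category."""
    rng = random.Random(seed)

    # Join on Category and attach a random SortKey to every row.
    # Rows whose category has no requested quantity drop out (inner join).
    joined = [
        dict(r, Quantity=quantities[r["Category"]], SortKey=rng.random())
        for r in rows
        if r["Category"] in quantities
    ]

    # Sort by Category, then SortKey: grouped by category, shuffled within.
    joined.sort(key=lambda r: (r["Category"], r["SortKey"]))

    sample = []
    for _, group in groupby(joined, key=lambda r: r["Category"]):
        # ExtractID restarts at 1 for each category (the Multi-Row step).
        for extract_id, r in enumerate(group, start=1):
            # Keep rows up to the requested quantity (the Filter step).
            if extract_id <= r["Quantity"]:
                sample.append(dict(r, ExtractID=extract_id))
    return sample


# Hypothetical demo data: 30 rows spread evenly over three categories.
rows = [{"DataRow": i, "Category": "ABC"[i % 3]} for i in range(1, 31)]
quantities = {"A": 3, "B": 5, "C": 2}
sample = stratified_sample(rows, quantities, seed=42)
```

If a category holds fewer rows than the requested quantity, this sketch simply returns every row it has for that category.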
After running for about 3 seconds on my machine, you get the following results.
You can see that we get the required quantity from the Stratified Quantities input for each Category, with the data rows pulled at random.
If this isn't what you're looking for, leave a note with a clarification and I'll see what I can do.