Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Representative Sampling

eurban
5 - Atom

Trying to do a random sample selection in Alteryx that is proportional based on each region or state. For example, I have 7 stores across the US (California, Montana, Wisconsin, Florida, Illinois, Rhode Island, and New York), but I have more sales volume in New York than Rhode Island. I want the sample selection to be weighted appropriately based on population of sales. Also need every location to have at least one sample selected. Please see example below for how I would select samples manually. Is there a way to do this in Alteryx?

 

We would select 25 samples across the 7 locations below. 

 

Population

- California - 205 sales 

- Montana - 101 sales

- Wisconsin - 49 sales

- Florida - 326 sales

- Illinois - 22 sales

- Rhode Island - 89 sales

- New York - 444 sales

 

Total sales = 1,236

 

Sample Selection (total of 25 samples with each state at least getting one sale selected):

- California - 4 sales based on 17 % of total population

- Montana - 2 sales based on 8 % of total population

- Wisconsin - 1 sale based on 4% of total population

- Florida - 6 sales based on 26% of total population

- Illinois - 1 sale based on 2% of total population

- Rhode Island - 2 sales based on 7% of total population

- New York - 9 sales based on 36% of total population

 

3 REPLIES 3
gautiergodard
13 - Pulsar

Hey @eurban 

Built you a quick macro to help with this.

You can pass the data in the "I" macro input and the percentage of random sampling into the macro.

Hope this helps!

gautiergodard_0-1680543788865.png

 

eurban
5 - Atom

@gautiergodard 

 

Thank you for your response! We are trying to use the macro and we were able to get how many samples we want per division, but it is not coming back with how many records we have specified in our table below. For example we have 6 samples we need for Austin, but it is coming back with 59. It should be a total of 50 samples, but is coming back with 364. Is there something specific we need to have for the control parameter and data field within the macro?

Yoshiro_Fujimori
15 - Aurora

Hi @eurban ,

 

Here is one potential solution.

 

Summary process

1. Generate rows for each state (one row for one sales)

2. Pick one row for each state

3. For the rest of rows, generate random number to pick up the rest of rows (25 - 7 = 18)

4. Union the rows

 

Workflow

Yoshiro_Fujimori_0-1683185910865.png

 

Output

Yoshiro_Fujimori_1-1683186135652.png

 

Count by State

Yoshiro_Fujimori_2-1683186197278.png

 

I hope this matches with your requirement.

 

Labels