Alteryx Designer Desktop Discussions


Creating a sample based on several conditions

nwatzlaf
8 - Asteroid

Hi there,

 

I have a data set that looks like this...

 

Zone    Units     Rank
2       12,000    1.75
2       550       3.41
3       750       2.32
4       100       7.54

 

What I want to do is grab a percentage of the units from each zone, up to a certain total number of units. For example, I need 50% of the units to come from zone 2, but the total units across all zones should not exceed 1 million. I also want the units to be picked based on rank: rows with a lower Rank should be added to the sample set first.

 

 

Thank you for your time,

Natalia

OllieClarke
15 - Aurora

Hey @nwatzlaf, here's a simple workflow that uses a Multi-Row Formula tool and a Filter tool to get what you want:

OllieClarke_0-1603982172358.png

Here are the total units per zone (also as a percentage), so you can verify the result:

OllieClarke_1-1603982209486.png
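
For anyone reading without the screenshots, here is a rough Python sketch of the general running-total-then-filter pattern. It is only an illustration under assumptions, not the workflow itself: the function name `sample_by_running_total` and the per-zone caps in `zone_caps` are made up for the example, and the rows are the four sample rows from the question.

```python
def sample_by_running_total(rows, zone_caps):
    """Keep rows, lowest Rank first, while each zone's running total fits its cap.

    rows      -- list of dicts with "Zone", "Units" and "Rank" keys
    zone_caps -- hypothetical dict mapping each zone to its maximum units
    """
    running = {}  # running total of units per zone
    kept = []
    for row in sorted(rows, key=lambda r: r["Rank"]):
        zone = row["Zone"]
        # Running total grouped by zone (the Multi-Row Formula step).
        running[zone] = running.get(zone, 0) + row["Units"]
        # Keep the row only if the running total is still within the cap
        # (the Filter step).
        if running[zone] <= zone_caps[zone]:
            kept.append(row)
    return kept


# Sample rows from the question; the caps are illustrative assumptions.
rows = [
    {"Zone": 2, "Units": 12_000, "Rank": 1.75},
    {"Zone": 2, "Units": 550, "Rank": 3.41},
    {"Zone": 3, "Units": 750, "Rank": 2.32},
    {"Zone": 4, "Units": 100, "Rank": 7.54},
]
zone_caps = {2: 12_000, 3: 1_000, 4: 50}

kept = sample_by_running_total(rows, zone_caps)
for row in kept:
    print(row)
```

Because the running total only ever grows, filtering on it drops every row of a zone once that zone's cap has been passed.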

 

Hope that helps, Ollie

 

 

RobertOdera
13 - Pulsar

Hi, @nwatzlaf 

 

Thanks for the question!

 

Please like + mark as an acceptable solution if this works for you.

 

Based on your brief, I understood the following:

  • You have multiple Zones, but Zone 2 is over-weighted: it should make up 50% of the total in the selection set.
  • Selection into the sample set is based on unit thresholds (Zone 2 at 50% of the total, and the total no greater than 1 million units), and every row in the selection set has a predetermined, distinct Rank.

Your sample file does not cover all your criteria, so I generated an alternative sample to work through the mechanics and criteria of your actual use case.

 

A couple of things:

- You have an absolute cap on total units (1 million) but a percentage-based allocation weighting, so the cumulative Zone 2 units will not always add up to exactly 50%, i.e. 500,000 units. When the tally is below 500,000 units and the next record would push the cumulative total above 500,000, that record is omitted.

 

RobertOdera_0-1603991969476.png

 

- I've seeded prioritization, threshold indicators, and qualification flags, which forces a sequential approach to sample selection (first Zone 2, then all other Zones) rather than a fully randomized treatment.

 

- I've commented on the workflow, but please let me know if it's still as clear as mud 🙂 and if you need anything else.

 

RobertOdera_1-1603992201666.png
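
For readers who cannot open the attachment, the selection logic described in the points above can be roughly sketched in Python. This is a minimal illustration under assumptions, not the attached workflow: the function `select_sample` and its parameter names are hypothetical, the demo uses a scaled-down cap of 1,000 units instead of 1 million, and it assumes that after an oversized record is omitted, later smaller records may still be added.

```python
def select_sample(rows, total_cap=1_000_000, priority_zone=2, priority_share=0.5):
    """Two-pass selection: priority zone first, then all other zones.

    Rows are taken lowest Rank first. A row is omitted whenever adding it
    would push the relevant cumulative total past its cap.
    """
    priority_cap = total_cap * priority_share
    by_rank = sorted(rows, key=lambda r: r["Rank"])
    selected = []

    # Pass 1: priority zone, capped at its share of the total.
    priority_units = 0
    for row in by_rank:
        if row["Zone"] != priority_zone:
            continue
        if priority_units + row["Units"] > priority_cap:
            continue  # would overshoot the priority cap, so omit it
        priority_units += row["Units"]
        selected.append(row)

    # Pass 2: every other zone, capped by the overall total.
    total_units = priority_units
    for row in by_rank:
        if row["Zone"] == priority_zone:
            continue
        if total_units + row["Units"] > total_cap:
            continue  # would overshoot the overall cap, so omit it
        total_units += row["Units"]
        selected.append(row)

    return selected, priority_units, total_units


# Made-up demo rows with a scaled-down cap of 1,000 units.
demo_rows = [
    {"Zone": 2, "Units": 300, "Rank": 1.0},
    {"Zone": 2, "Units": 250, "Rank": 2.0},
    {"Zone": 2, "Units": 150, "Rank": 3.0},
    {"Zone": 3, "Units": 400, "Rank": 1.5},
    {"Zone": 4, "Units": 200, "Rank": 4.0},
]
selected, priority_units, total_units = select_sample(demo_rows, total_cap=1_000)
print(priority_units, total_units)
```

In the demo, the second Zone 2 record is omitted because 300 + 250 would exceed the 500-unit Zone 2 cap, so Zone 2 ends at 450 units rather than exactly 500, which is the effect described in the first point.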

 

The workflow is attached.

Cheers!
