Alert: There is a planned Community maintenance outage October 16th from approximately 10 - 11 PM PST. During this time the Alteryx Community will be inaccessible. Thank you for your understanding!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Creating a sample based on several conditions

nwatzlaf
8 - Asteroid

Hi there,

 

I have a data set that looks like this...

 

Zone       Units         Rank

2             12,000       1.75

2              550            3.41

3              750           2.32

4              100           7.54

 

What I want to do is grab a percentage of the units according to the zone up until a certain total number. Example I need 50% of the units to be from zone 2 but the total units for all zones should not exceed 1 million. But I also want it to grab the units based on rank, the lower ranking lines should be added to the sample set first.

 

 

Thank you for your time,

Natalia

2 REPLIES 2
OllieClarke
15 - Aurora
15 - Aurora

Hey @nwatzlaf here's a simple workflow that uses a multi-row formula and filter to get what you want

OllieClarke_0-1603982172358.png

Here's the total units per zone (also as a percentage) so you can verify

OllieClarke_1-1603982209486.png

 

Hope that helps, Ollie

 

 

RobertOdera
13 - Pulsar

Hi, @nwatzlaf 

 

Thanks for the question!

 

Please like + mark as an acceptable solution if this works for you.

 

Based on your brief, I understood the following:

  • You have multiple Zone, but Zone 2 has allocation over-weight (50% of Total) on the entire selection set.
  • Selection into the sample set is based on units thresholds (Zone 2 50% of Total, Total not greater than 1 MIL units, and the whole selection set has predetermined distinct Rank per row)

Your sample file does not cover all your criteria, so I generated an alternative sample to work through your actual use-case mechanics/ criteria.

 

A couple of things:

- You have an absolute number for Total Units (1 Mil), but a percentage based allocation weighting = cumulative total Zone 2 units will not always add up to exactly 50% or 500,000 units. At a tally < 500,000 units, the next incremental record may push the cumulative > 500,000, so it will be omitted.

 

RobertOdera_0-1603991969476.png

 

- I've seeded prioritization + threshold indicators + qualification = forced chronological approach to sampling selection (First Zone 2, then all other Zones) versus a fully randomized treatment

 

- I've commented on the workflow, but please let me know if it's still as clear as mud 🙂 and if you need anything else.

 

RobertOdera_1-1603992201666.png

 

The workflow is attached.

Cheers!

Labels