Alteryx Designer Desktop Discussions

Data Splitting Question

TheSAguy
7 - Meteor

Hi,

What is the easiest way to split a data set randomly into segments when you have 4 or more groups?

With 2 or 3 groups, I just use the "Create Samples" tool, but beyond 3 groups it gets a little trickier.

I have two scenarios, even groups or set % per group.

Group    Scenario 1    Scenario 2
1        17%           10%
2        17%           50%
3        17%           10%
4        17%           10%
5        17%           10%
6        17%           10%
TOTAL    100%          100%

How can I easily take a data set and split it randomly into these groups?

Thanks.

4 REPLIES
Amy_smart
11 - Bolide

Have you tried to create a random number in the formula tool and group based on that number?
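
To make that idea concrete outside of Designer, here is a minimal Python sketch of "draw a random number per row, then bucket it". It is only an illustration, not a workflow from this thread: the function name `assign_groups` and its parameters are made up for the example. Each row gets a uniform random number, which is compared against the cumulative group percentages. Because every row is drawn independently, the group sizes come out only approximately equal to the requested percentages.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def assign_groups(n_rows, weights, seed=None):
    """Return a 1-based group number for each of n_rows rows.

    weights: relative size of each group, e.g. [10, 50, 10, 10, 10, 10].
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    cuts = list(accumulate(weights))  # cumulative thresholds, e.g. 10, 60, 70, ...
    total = cuts[-1]
    last = len(weights) - 1
    groups = []
    for _ in range(n_rows):
        r = rng.random() * total  # uniform draw in [0, total)
        # Index of the first threshold strictly above r; clamp as a float-safety guard.
        groups.append(min(bisect_right(cuts, r), last) + 1)
    return groups


if __name__ == "__main__":
    sample = assign_groups(1000, [10, 50, 10, 10, 10, 10], seed=42)
    for g in range(1, 7):
        print(g, sample.count(g))
```

If you need exact group sizes rather than approximate ones, shuffling the rows and cutting the shuffled list into blocks is the usual alternative.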

geoff_zath
Alteryx

@TheSAguy Here is a potential solution for Scenario 1 (even groups) that assigns a "rand_group" value to each row. There are two output options. The first keeps only an even split and drops the leftover rows when the row count isn't divisible by the group count (for example, 100 rows / 6 groups gives 16 rows per group, leaving 4 rows out). The second assigns each leftover row to a random group and adds it back into the dataset.
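
The attached workflow itself is not reproduced in the text of this thread, so the following Python sketch only illustrates the logic described above; it is not the workflow. The name `even_split` and the `keep_remainder` flag are assumptions made for the example. Rows are shuffled, cut into equal blocks, and the leftover rows are either dropped or each sent to a random group.

```python
import random


def even_split(rows, n_groups, keep_remainder=False, seed=None):
    """Randomly split rows into n_groups equally sized groups.

    Hypothetical helper for illustration only.
    Returns a dict mapping a 1-based group number to its list of rows.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    per_group = len(shuffled) // n_groups  # e.g. 100 // 6 == 16
    used = per_group * n_groups            # rows that fit the even split
    groups = {g: [] for g in range(1, n_groups + 1)}

    # Consecutive blocks of the shuffled rows form the groups.
    for i, row in enumerate(shuffled[:used]):
        groups[i // per_group + 1].append(row)

    # Option 2: send each leftover row to a randomly chosen group.
    if keep_remainder:
        for row in shuffled[used:]:
            groups[rng.randint(1, n_groups)].append(row)
    return groups


if __name__ == "__main__":
    dropped = even_split(range(100), 6)
    print([len(v) for v in dropped.values()])  # six groups of 16, 4 rows left out
    kept = even_split(range(100), 6, keep_remainder=True, seed=1)
    print(sum(len(v) for v in kept.values()))  # all 100 rows are assigned
```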

geoff_zath
Alteryx

@TheSAguy Here is a solution for Scenario 2 (set split % per group). It is the same as the workflow above, except that you can now specify a split % for each group (row) in the input. Note that if you choose the option to add the remaining rows back in, I didn't weight them by the different split percentages, so they are just added back at random (I don't think that should have a large effect as long as your dataset is large enough). I did recalculate the split % so that info can be used. Hope this helps!
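
As with Scenario 1, the workflow is an attachment and is not shown here, so this is only a rough Python sketch of the described behaviour under my own assumptions. The name `percent_split` and its parameters are hypothetical. Each group takes the floor of its share of the shuffled rows, leftover rows are optionally added back to a random group without weighting (mirroring the choice described above), and the actual split % is recalculated at the end.

```python
import random


def percent_split(rows, percents, keep_remainder=False, seed=None):
    """Randomly split rows so that group i gets about percents[i] % of them.

    Hypothetical helper for illustration only; percents should sum to 100.
    Returns (groups, actual_percents), both keyed by 1-based group number.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n = len(shuffled)

    # Whole rows per group, rounded down so the total never exceeds n.
    counts = [int(n * p / 100) for p in percents]

    groups, start = {}, 0
    for g, count in enumerate(counts, start=1):
        groups[g] = shuffled[start:start + count]
        start += count

    # Leftover rows go to a uniformly random group (not weighted by split %).
    if keep_remainder:
        for row in shuffled[start:]:
            groups[rng.randint(1, len(percents))].append(row)

    # Recalculate the split % that was actually achieved.
    actual = {g: (100 * len(m) / n if n else 0.0) for g, m in groups.items()}
    return groups, actual


if __name__ == "__main__":
    groups, actual = percent_split(range(105), [10, 50, 10, 10, 10, 10],
                                   keep_remainder=True, seed=7)
    for g in groups:
        print(g, len(groups[g]), round(actual[g], 1))
```

With 105 rows the floor counts are 10, 52, 10, 10, 10, 10, which leaves 3 rows over; where those 3 land depends on the seed, which is why the recalculated percentages are worth returning.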

TheSAguy
7 - Meteor

@geoff_zath 

Thanks so much!!

I was out on PTO, hence the tardy reply.
