Hello all,
I have a question for which I just cannot find a suitable solution. I have a dataset from which I want to draw a sample whose size depends on the percentage of entries the user wants to use. So if I have 100 rows and the user wants to draw 10% of them, he/she receives an Excel sheet with 10 randomly selected entries. I was able to create a workflow that does this using the Random % Sample tool and the Numeric Up Down tool. What I also want, though, is a weighted random selection: I have one column with costs, and the higher the cost, the more likely the entry should be to make it into the 10% (in case the user wants 10% of the overall dataset). I guess what I want is monetary unit sampling where the user can decide how many entries he/she wants in the sample. Thank you in advance!
Hello @Al_ani ,
You could take a look at these articles, which have examples of how to do weighted sampling:
https://www.theinformationlab.co.uk/2017/06/16/weighting-survey-data-alteryx/
This one is much longer and goes more in depth on preparing data for a linear regression, but it may be useful as well.
Hope this helps.
Gabriel
Hello Gabriel,
Thank you very much for taking the time to answer. Those articles were very interesting. Maybe I did not express my question well enough, as the articles did not really relate to my problem. I solved my issue by writing the sampling task in R with the dplyr package, which allows me to draw a fraction and use weights. I then used Alteryx with an interface to update the command's size value, which is the fraction.
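For anyone who finds this thread later, here is a rough sketch of the idea of drawing a fraction of the rows with cost-based weights. It is not the R/dplyr code mentioned above, which was not posted; it is a plain Python illustration that uses the Efraimidis-Spirakis key method for weighted sampling without replacement. The function name `weighted_fraction_sample`, the column names `id` and `cost`, and the example data are all made up for the illustration, and skipping rows with zero or negative cost is an assumption.

```python
import heapq
import math
import random


def weighted_fraction_sample(rows, weight_key, fraction, seed=None):
    """Return round(fraction * len(rows)) rows, drawn without replacement,
    favouring rows with a larger value under weight_key."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)

    keyed = []
    for row in rows:
        weight = float(row[weight_key])
        if weight <= 0:
            continue  # assumption: rows without a positive cost are never selected
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is always defined
        # log(u) / weight orders rows the same way as u ** (1 / weight)
        keyed.append((math.log(u) / weight, row))

    # Sample size comes from the full row count, capped at the eligible rows
    k = min(round(fraction * len(rows)), len(keyed))
    # The k rows with the largest keys form the weighted sample
    return [row for _, row in heapq.nlargest(k, keyed, key=lambda pair: pair[0])]


if __name__ == "__main__":
    # Hypothetical data: 100 rows whose cost grows with the id
    data = [{"id": i, "cost": float(i)} for i in range(1, 101)]
    for row in weighted_fraction_sample(data, "cost", 0.10, seed=42):
        print(row)
```

The `fraction` argument plays the role of the value the user sets through the interface, so 0.10 on 100 rows gives 10 entries. Passing a `seed` makes the draw repeatable, which may be handy if the sample has to be reproduced later.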