
Alteryx Designer Discussions


SOLVED

Monte Carlo simulation sampling

Asteroid

Dear fellow Alteryx fans,

I would like to create a Monte Carlo simulation using the Simulation Sampling tool, and I'm a little confused about how to set it up. I have anonymized my data so that we can discuss the issue, and I hope that working this through on the forum will help others.

The example is a little contrived, so please bear with me!

I have three customer profiles. Assume my website serves each of these profile groups in proportion. I have a dataset containing 250,000 rows. I have analysed these customers across four different channels (A, B, C and D) and looked at the relationship between the number of page visits and the total revenue. Specifically, I wish to compare the percentage of page visits by channel with the percentage of revenue by channel.

Please find attached a simple Tableau representation of the full data. It shows:

Page visits by channel, split by the three profile groups
Revenue by channel, split by the three profile groups
Performance, i.e. the percentage of total revenue divided by the percentage of total page visits

We can see that for the entire dataset, in channel A customer profile group 1 outperforms by 11%, but group 2 underperforms by 7%.
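For anyone who wants to check the performance figure outside Alteryx or Tableau, here is a minimal Python sketch of the calculation. It assumes the shares are taken within each channel (a group's share of the channel's revenue divided by its share of the channel's page visits, minus 1), which is my reading of the description rather than something confirmed by the workbook. The function name and the toy numbers are invented for illustration; the numbers were chosen only to reproduce the 11% and -7% quoted above.

```python
from collections import defaultdict


def performance_by_channel_and_group(rows):
    """rows: iterable of (channel, group, page_visits, revenue) tuples.

    Returns {(channel, group): revenue_share / visit_share - 1}, where both
    shares are computed within the channel (an assumption, see the text).
    """
    visits = defaultdict(float)
    revenue = defaultdict(float)
    channel_visits = defaultdict(float)
    channel_revenue = defaultdict(float)
    for channel, group, v, r in rows:
        visits[(channel, group)] += v
        revenue[(channel, group)] += r
        channel_visits[channel] += v
        channel_revenue[channel] += r

    result = {}
    for (channel, group), v in visits.items():
        # Skip cells where a share would be undefined or zero.
        if v == 0 or channel_revenue[channel] == 0:
            continue
        visit_share = v / channel_visits[channel]
        revenue_share = revenue[(channel, group)] / channel_revenue[channel]
        result[(channel, group)] = revenue_share / visit_share - 1.0
    return result


# Toy, made-up totals for channel A only (not the real data).
toy_rows = [
    ("A", 1, 50, 55.5),
    ("A", 2, 30, 27.9),
    ("A", 3, 20, 16.6),
]
toy_performance = performance_by_channel_and_group(toy_rows)
```

With these toy totals, group 1 holds 50% of the visits and 55.5% of the revenue in channel A, so it comes out at roughly +11%, while group 2 comes out at roughly -7%.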

I would like to use the Monte Carlo simulation to construct levels of confidence around these percentages. I'm thinking that one way we can do this is by taking a sample of data and making the same calculation, and then repeating this exercise again and again. I would expect the mean outperformance in channel A for customer profile group 1 to be 11%, but what's the standard deviation around this? This is where the simulation tool would be useful if I knew how to use it!

Hopefully somebody out there will find this interesting!

Best wishes,

Jonathan

Nebula

Would something like this work for you:

Basically, this uses the Generate Rows tool to make a set of simulations. Within each simulation, it creates a set of trials and then picks random data points from the input data.

Finally, it calculates the percentages for each simulation and then takes the average and standard deviation across the whole set of simulations.
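The same idea can be sketched in Python for readers who want to see the logic without opening the workflow. This is not a translation of the attached sample, just a rough equivalent under stated assumptions: rows are drawn with replacement (the description above does not say which), performance is a group's within-channel revenue share divided by its visit share minus 1, and the names `performance`, `simulate`, `n_simulations`, `n_trials` and the synthetic `toy_rows` are all hypothetical.

```python
import random
import statistics
from collections import defaultdict


def performance(rows):
    """Within-channel revenue share / visit share - 1 for each (channel, group)."""
    visits, revenue = defaultdict(float), defaultdict(float)
    ch_visits, ch_revenue = defaultdict(float), defaultdict(float)
    for channel, group, v, r in rows:
        visits[(channel, group)] += v
        revenue[(channel, group)] += r
        ch_visits[channel] += v
        ch_revenue[channel] += r
    result = {}
    for (channel, group), v in visits.items():
        if v == 0 or ch_revenue[channel] == 0:
            continue  # share undefined for this cell in this sample
        visit_share = v / ch_visits[channel]
        revenue_share = revenue[(channel, group)] / ch_revenue[channel]
        result[(channel, group)] = revenue_share / visit_share - 1.0
    return result


def simulate(rows, n_simulations=200, n_trials=1000, seed=0):
    """Repeat the calculation on random samples and summarise the spread.

    Each simulation draws n_trials rows with replacement (an assumption),
    recomputes the performance figures, and stores them. The return value
    maps (channel, group) to (mean, standard deviation) across simulations.
    """
    rng = random.Random(seed)
    collected = defaultdict(list)
    for _ in range(n_simulations):
        sample = rng.choices(rows, k=n_trials)
        for key, value in performance(sample).items():
            collected[key].append(value)
    return {
        key: (statistics.mean(values), statistics.stdev(values))
        for key, values in collected.items()
        if len(values) > 1
    }


# Synthetic stand-in for the real 250,000-row dataset (values are made up).
_data_rng = random.Random(42)
toy_rows = []
for _ in range(2000):
    page_visits = _data_rng.randint(1, 10)
    toy_rows.append((
        _data_rng.choice("ABCD"),                     # channel
        _data_rng.choice([1, 2, 3]),                  # profile group
        page_visits,
        page_visits * _data_rng.uniform(0.5, 1.5),    # revenue
    ))

summary = simulate(toy_rows)
```

Here `summary[("A", 1)]` would hold the mean outperformance and its standard deviation for profile group 1 in channel A. The standard deviation depends on `n_trials`, so the sample size per simulation should be chosen to match the amount of data you want the confidence level to describe.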

A sample workflow is attached for you to look at.

Asteroid

Aurora,

Super kind of you to take the time to reply to my question.

It looks like a very comprehensive answer, though it avoids the simulation tool… No problem there, I just need to get the job done!

I will go through your answer in more detail and revert if I have any questions. In the meantime, thank you kindly.

Best wishes,

Jonathan
