Hi all,
I've developed a predictive model using Logistic Regression, and wanted to use the model to predict the outcome for a holdout dataset.
I used both the "Score" and "Simulation Scoring" tools, and received different prediction results.
May I ask how these two scoring tools differ, and in what ways one is more advantageous than the other? I'd be glad if someone could give a more in-depth explanation of how the "Simulation Scoring" tool is used, and which use cases it is more ... optimized for (pun intended).
Best,
Michael
Hi @yjd
Thank you for this question. The standard Score tool returns the mean predicted value estimated by a model for an input data set of predictor variables, whereas the Simulation Scoring tool also considers the error distribution to provide a range of possible values. It allows you to draw from the probability distributions of the model's estimates. The primary distinction is that the Simulation Scoring tool returns a whole distribution of predicted values for each input record, whereas the Score tool returns one estimate per input record.
Have you looked at the Simulation Sample Workflow at all? It can be found in Designer under Help > Sample Workflows > Prescriptive Analytics > 6 Simulation. You may have noticed that the input to the S anchor only includes three records, but the output in the D anchor returns 1,500 records. This is because the "How many Samples from error distribution per iteration" argument in the tool's configuration window is set to 500 - so for each input record (iteration) in the S anchor, the tool generates 500 predictions, and 3 x 500 = 1,500. This is useful because it captures the distribution of predictions for each condition (in this example, the condition being changed is the length of a warranty).
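To make the 3-in, 1,500-out behaviour concrete, here is a rough Python sketch of the idea. It is not how the Alteryx tools are implemented: the linear model, its coefficients, the residual standard deviation, and the function names `score` and `simulation_score` are all made up for illustration, and the error is simply assumed to be normally distributed.

```python
import numpy as np

def score(X, coef, intercept):
    """Return one point estimate (mean prediction) per input record."""
    return intercept + X @ coef

def simulation_score(X, coef, intercept, resid_sd, n_samples, rng):
    """Return n_samples simulated predictions per input record.

    Assumption: each draw is the mean prediction plus a normally
    distributed error term with standard deviation resid_sd.
    """
    mean = score(X, coef, intercept)
    # Repeat each record's mean n_samples times, then add sampled errors.
    repeated = np.repeat(mean, n_samples)
    noise = rng.normal(0.0, resid_sd, size=repeated.shape)
    record_id = np.repeat(np.arange(len(X)), n_samples)
    return record_id, repeated + noise

rng = np.random.default_rng(0)

# Hypothetical fitted model with one predictor (warranty length in years).
coef = np.array([2.0])
intercept = 10.0

# Three input records, like the three rows going into the S anchor.
X = np.array([[1.0], [2.0], [3.0]])

point = score(X, coef, intercept)
record_id, draws = simulation_score(
    X, coef, intercept, resid_sd=1.5, n_samples=500, rng=rng
)

print(point.shape)  # one estimate per record
print(draws.shape)  # 500 draws per record
```

The point estimates give a single number per record, while the simulated output keeps a `record_id` alongside each draw so the 500 predictions for each condition can be grouped and summarised afterwards.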
The Simulation Scoring tool is helpful for determining whether populations of predicted values differ significantly from one another when a predictor variable is changed. It accounts for uncertainty, whereas the standard Score tool does not. In summary, the Score tool is most appropriate when you are interested in the best estimate of a prediction for each record. The Simulation Scoring tool is most appropriate when you want to capture the uncertainty around those estimates.
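As a loose illustration of comparing two populations of simulated predictions, the Python sketch below summarises each condition with a mean and a 95% percentile interval, then checks how often one condition's draw exceeds the other's. The two sets of draws are generated here from made-up normal distributions; in practice they would be the simulated predictions for each condition. This is one simple way to look at the overlap, not a description of what the tool does internally.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 500

# Hypothetical simulated predictions for two conditions
# (e.g. two different warranty lengths). Values are invented.
draws_a = rng.normal(12.0, 1.5, n_samples)
draws_b = rng.normal(16.0, 1.5, n_samples)

def summarize(draws):
    """Return the mean and a 95% percentile interval of the draws."""
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return draws.mean(), lo, hi

mean_a, lo_a, hi_a = summarize(draws_a)
mean_b, lo_b, hi_b = summarize(draws_b)

# Share of paired draws in which condition B exceeds condition A.
share_b_higher = np.mean(draws_b > draws_a)

print(f"A: mean={mean_a:.2f}, 95% interval=({lo_a:.2f}, {hi_a:.2f})")
print(f"B: mean={mean_b:.2f}, 95% interval=({lo_b:.2f}, {hi_b:.2f})")
print(f"Share of draws where B > A: {share_b_higher:.2f}")
```

A single point estimate per condition would only say that B's prediction is higher; the intervals and the share of draws show how much the two distributions overlap.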
There is a helpful Community Post on using the Simulation Tools that you can look over here.
Does this sufficiently answer your question? Are there any other clarifications I might be able to make for you? Please let me know!
Thanks Sydney for the insights!! I have a much clearer picture now, and I think this tool is extremely useful to further enhance one's statistical model!
Best,
M