In what was intended to be the final post of this series (that is, until Germany's major flameout in this year's World Cup), I discuss how we developed the predicted probability that each team would advance from the group round into the knockout rounds of the 2018 FIFA World Cup. I also use this post as an opportunity to examine how well we fared overall, both in predicting individual matches and in predicting which teams advanced to the knockout phase of the tournament. As a preview, our match-level success rate for the group round is very consistent with the Model Comparison results that I published in the second post of this series. In addition, there is a YXDB file attached to this post with the match-level probabilities and the predicted and actual outcomes.
At the end of the post, I set the stage for what will now be the final post of this series, an attempt to quantify the World Cup champions' curse.
In the last post, I presented the combined models we used to predict the probability a team will win, lose, or draw in a match. This model was based on all non-friendly association football matches between February 1998 and March 2018, a total of 9,020 matches. We do not build a separate model to estimate the probability that each team will advance to the knockout rounds of the World Cup, since there are not enough World Cup competitions between 1998 and 2018 on which to build such a model (an important rule change was instituted in 1998, which makes using years prior to 1998 problematic). Instead, we combine the match-level win/lose/draw probabilities with Monte Carlo simulation methods to obtain probability estimates that a team will place first, second, third, and fourth within their group. In turn, since the first two teams within a group advance, the probability a team will advance is the sum of the probability that the team places first and the probability that it places second. To make this a bit more concrete, let's take the case of "team X" that has a 25% chance of being first in their group, a 30% chance of placing second, a 27% chance of placing third, and an 18% chance of placing fourth. In this case, the chance that team X advances is 25% plus 30%, or 55%.
To estimate the final group standing probabilities of each team within a group, we simulate the win/lose/draw results for each match 100,000 times. Unfortunately, the specific simulation we carry out is not currently possible with our Simulation tools, so we make use of custom R code instead. In the simulation, the estimated win/lose/draw probabilities for a match are used to parameterize a multinomial distribution in R, from which 100,000 independent draws are taken, resulting in a column with 100,000 win/lose/draw results for the focal team. Since a win or loss result for the focal team in a draw translates to the opposite result for the opposing team, we immediately get the simulated results for the opponent as well, which again is a column of 100,000 win/lose/draw results. We next convert the win/lose/draw results into Group Round points (3 for a win, 1 for a draw, and 0 for a loss). We then add a uniform random number between zero and one to the point total in order to randomly break potential ties (tiebreakers based on predicted goals scored and predicted goals against would require a lot more modeling, and the uncertainty associated with those estimates would be very high, making them little better than the random tiebreaker approach we use). For each draw in a sequence (which is more appropriately described as a trial at this point), we calculate the standings table (the total number of points for each team) based on the random draw from each match's multinomial distribution. Using these trial-level standings tables, we count the number of trials in which each team is in first, second, third, and fourth place within their group. We convert these counts to estimated probabilities by dividing by 100,000 (the total number of trials).
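The simulation itself runs as custom R code inside an Alteryx macro, which is not reproduced in this post. As a rough illustration only, the steps described above can be sketched in Python using just the standard library. Everything specific in this sketch is an assumption made for the example: the team names A through D, the match probabilities, the function name `simulate_group`, the seed, and the smaller trial count are hypothetical and are not taken from our models or our macro.

```python
import random
from collections import Counter

# Hypothetical (win, draw, loss) probabilities for the focal (first-named)
# team in each group match. Illustrative values only, not model estimates.
MATCHES = [
    ("A", "B", (0.50, 0.25, 0.25)),
    ("C", "D", (0.40, 0.30, 0.30)),
    ("A", "C", (0.45, 0.30, 0.25)),
    ("B", "D", (0.35, 0.30, 0.35)),
    ("A", "D", (0.55, 0.25, 0.20)),
    ("B", "C", (0.30, 0.30, 0.40)),
]

# Group Round points as (focal team, opponent) for each simulated outcome.
POINTS = {"win": (3, 0), "draw": (1, 1), "loss": (0, 3)}


def simulate_group(matches, n_trials=10_000, seed=2018):
    """Estimate each team's probability of finishing 1st to 4th in its group."""
    rng = random.Random(seed)
    teams = sorted({t for focal, opp, _ in matches for t in (focal, opp)})
    place_counts = {t: Counter() for t in teams}
    for _ in range(n_trials):
        # Start each total with a uniform [0, 1) jitter. Points are whole
        # numbers, so the jitter only matters when teams are level on points.
        totals = {t: rng.random() for t in teams}
        for focal, opp, probs in matches:
            # One draw from the match's win/draw/loss distribution.
            outcome = rng.choices(("win", "draw", "loss"), weights=probs)[0]
            focal_pts, opp_pts = POINTS[outcome]
            totals[focal] += focal_pts
            totals[opp] += opp_pts
        # Standings table for this trial, best total first.
        ranking = sorted(teams, key=totals.get, reverse=True)
        for place, team in enumerate(ranking, start=1):
            place_counts[team][place] += 1
    # Convert placement counts to estimated probabilities.
    return {
        t: {p: place_counts[t][p] / n_trials for p in range(1, 5)}
        for t in teams
    }


placement = simulate_group(MATCHES)
# A team advances if it places first or second in its group.
advance = {t: probs[1] + probs[2] for t, probs in placement.items()}
```

Because exactly two teams advance in every trial, the values in `advance` sum to 2 across the group, which is a handy sanity check on any implementation of this approach.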
Having described the process, it makes sense to take a look at the workflow and the macros we developed to implement it. Figure 1 shows the workflow used to run the simulation. Within this workflow, you will notice two custom macros. The first, and the one of greater interest, runs the actual simulations based on custom R code. It has been parameterized to allow the analyst to select the number of trials to use in the simulation and the number of machine cores to use in running simulations in parallel. The second custom macro is a pure Alteryx "groupby" batch macro that takes the raw simulation results and calculates both the standings table for each trial and the standing placement probabilities across trials.
Figure 2 shows the internals of the R-based macro that does the simulation. This macro illustrates how R can be used to provide more advanced statistical capabilities (in this case, making random draws from multinomial distributions) as part of an Alteryx workflow. In the figure, the Configuration window shows a portion of the R code used to do the simulation, while the Output window shows the raw output from the macro, which is then further processed by the second custom macro.
The results of our simulation (which are based on 100,000 trials, not the 3,000 trials shown in Figure 2) are covered extensively in the first post in this series.
We examine how well our models did from two perspectives: the individual matches themselves, and the probability each team would advance. In terms of the individual matches, Table 1 shows the confusion matrix between the predicted and actual match outcomes for the randomly selected focal team. The table shows results that closely match those found for the test sample in the train/test methodology used to create the models: draws are very difficult to predict, but the predictions are much better for games that actually ended in a win or loss. The accuracy of our predictions for the Group Round was 62.5% (30 of 48 matches), compared to the just over 63% we found in the test sample. This suggests that the success of our combined model is squarely in line with what we should expect to see based on the results in the earlier test sample.
Table 1. The Confusion Table for 2018 FIFA World Cup Opening Round Matches

| | Actual Draw | Actual Loss | Actual Win |
| --- | --- | --- | --- |
| Predicted Draw | 0 | 0 | 0 |
| Predicted Loss | 5 | 14 | 3 |
| Predicted Win | 4 | 6 | 16 |
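For readers who want to check the arithmetic, the short Python sketch below recomputes the overall accuracy from the Table 1 counts, along with the share of matches predicted correctly for each actual outcome. The counts come from Table 1. The variable names are just for illustration, and the per-outcome shares are computed here from the table rather than being figures reported elsewhere in this series.

```python
# Table 1 counts: outer keys are predicted outcomes, inner keys are actual
# outcomes, both from the focal team's point of view.
confusion = {
    "draw": {"draw": 0, "loss": 0, "win": 0},
    "loss": {"draw": 5, "loss": 14, "win": 3},
    "win": {"draw": 4, "loss": 6, "win": 16},
}

# Total matches and matches where the prediction matched the actual result.
total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[outcome][outcome] for outcome in confusion)
accuracy = correct / total

# Share of matches with a given actual outcome that were predicted correctly.
share_correct = {}
for actual in ("draw", "loss", "win"):
    n_actual = sum(confusion[predicted][actual] for predicted in confusion)
    share_correct[actual] = confusion[actual][actual] / n_actual

print(f"accuracy: {correct}/{total} = {accuracy:.1%}")
for actual, share in share_correct.items():
    print(f"actual {actual}: {share:.1%} predicted correctly")
```

The diagonal of the table (0, 14, and 16) accounts for 30 of the 48 group round matches, which gives the 62.5% figure. None of the nine actual draws was predicted, which is the pattern of hard-to-predict draws described above.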
In terms of which teams advanced, we did fairly well in most groups, but there are a few where we were less than perfect. For Groups A (Uruguay and Russia), B (Spain and Portugal), E (Brazil and Switzerland), and G (Belgium and England), we predicted the correct teams in the correct order. In Group D we selected the correct teams (Croatia and Argentina), but had them in the wrong order (Croatia, not Argentina, won the group). We also did fairly well in Group C, placing France first, but Denmark took second over Peru (we gave Peru a 54.1% chance of advancing and Denmark a 53.6% chance, so we were very close to getting this one correct as well).
One thing we really did not predict is Germany’s ugly fate in this year’s World Cup (I’ve heard that my colleague in this work, our German Sales Engineer Oliver Wahner, has gone into mourning following Germany’s loss to South Korea). We had Germany comfortably winning Group F, with Mexico and Sweden in a close battle for second. Mexico and Sweden did have a close battle (at least in the standings table), but it was for first place rather than for second. We predicted that there was only a 2.3% chance that Germany would place fourth in the group (in contrast, we predicted that Germany had a 70.3% chance of placing first), and based on our predictions, Germany’s loss to South Korea was the biggest upset in the group round. For Group H we correctly predicted Colombia winning the group, but we predicted that Poland would finish second and Japan fourth, which is the opposite of what happened.
Overall, we managed to correctly predict 13 of the 16 teams that advanced to the knockout rounds of this year’s World Cup (an accuracy of 81.25%). The improvement in accuracy for predicting whether a team will advance, as opposed to the match-level outcomes, is the classification model equivalent of “regression to the mean”, since it reflects an “averaging” of outcomes over multiple matches.
While the collapse of Germany, the reigning World Cup champion, in this year’s World Cup was not predicted, this kind of collapse is becoming something of a norm. In four of the last five World Cup tournaments (including this year), the previous champion has been eliminated in the Group Round. This phenomenon has become known as the World Cup curse. At this point, it has occurred enough times that there is sufficient data to attempt to estimate the magnitude of the “curse,” if it actually exists, which is the topic of the final post in this series.
Dr. Dan Putler is the Chief Scientist at Alteryx, where he is responsible for developing and implementing the product road map for predictive analytics. He has over 30 years of experience in developing predictive analytics models for companies and organizations that cover a large number of industry verticals, ranging from the performing arts to B2B financial services. He is coauthor of the book, “Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R”, which is published by Chapman and Hall/CRC Press. Prior to joining Alteryx, Dan was a professor of marketing and marketing research at the University of British Columbia's Sauder School of Business and Purdue University’s Krannert School of Management.