Welcome to the Football World Cup Analytics Blog. In this post, I’m going to share the workflows and information behind how we prepared the data to help us predict who will win the World Cup, which starts in a couple of weeks.


This comprised two aspects: historical analysis and model preparation, both outlined below. Ultimately, we want these to be available as resources for all of you to create your own analysis and predictions and apply your own expertise. Add to the discussion below with your views, opinions, and workflows!






The first workflow (Historical Data Prep) focused on analysing historical international fixtures to see what we could learn, and on the different ways you could cut the data to find interesting insights. This is a great framework for you to explore further and answer your own football-related questions!
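If you prefer to explore the same kind of question in code, here is a minimal Python sketch of one possible "cut" of the data: the share of matches each team has won. It is not the Alteryx workflow itself. The column names (`home_team`, `away_team`, `home_score`, `away_score`) and the three sample rows are made up for illustration, so adjust them to match the actual match dataset.

```python
from collections import defaultdict

# Made-up illustrative rows; the real dataset's column names may differ.
matches = [
    {"home_team": "Team A", "away_team": "Team B", "home_score": 2, "away_score": 1},
    {"home_team": "Team B", "away_team": "Team C", "home_score": 0, "away_score": 0},
    {"home_team": "Team C", "away_team": "Team A", "home_score": 1, "away_score": 3},
]


def win_rates(rows):
    """Return {team: share of matches won}, counting home and away fixtures."""
    played = defaultdict(int)
    won = defaultdict(int)
    for row in rows:
        home, away = row["home_team"], row["away_team"]
        played[home] += 1
        played[away] += 1
        if row["home_score"] > row["away_score"]:
            won[home] += 1
        elif row["away_score"] > row["home_score"]:
            won[away] += 1
        # Draws count as played but not as a win for either side.
    return {team: won[team] / played[team] for team in played}


rates = win_rates(matches)
for team, rate in sorted(rates.items()):
    print(f"{team}: {rate:.0%} of matches won")
```

The same pattern extends to other cuts, such as filtering by tournament or by decade before counting.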


I ran through this workflow and went into more detail on the analysis in a webinar earlier in the year, which you can watch back on demand.


The second workflow (Model Prep) builds on the first and shows how we prepared the data for modeling and for feeding into Alteryx Machine Learning, allowing us to predict the winner of the World Cup! In the Model Prep workflow, I have included a predictive model built with the R tools so that you all have an example to start with.




The Model Prep workflow gives a high-level view of the additional steps we undertook to make sure the data was correctly cleaned and formatted and contained the right information. Depending on what you currently have access to, the output can then be fed into Intelligence Suite, the R Predictive tools, or Alteryx Machine Learning!
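To give a feel for what "model-ready" data can look like, here is a small Python sketch that joins match results to a ranking lookup, derives a ranking-difference feature, and labels each match with its outcome. This is an assumption about one reasonable preparation step, not a description of the steps inside the packaged workflow. The field names, the `rank_diff` feature, and the sample values are all hypothetical.

```python
# Hypothetical ranking lookup; a lower number means a higher-ranked team.
rankings = {"Team A": 3, "Team B": 12, "Team C": 25}

# Made-up illustrative rows; "Team D" has no ranking on purpose.
matches = [
    {"home_team": "Team A", "away_team": "Team B", "home_score": 2, "away_score": 1},
    {"home_team": "Team C", "away_team": "Team A", "home_score": 1, "away_score": 1},
    {"home_team": "Team D", "away_team": "Team B", "home_score": 0, "away_score": 2},
]


def outcome_label(home_score, away_score):
    """Turn a scoreline into the target class for a classifier."""
    if home_score > away_score:
        return "home_win"
    if home_score < away_score:
        return "away_win"
    return "draw"


def build_model_rows(match_rows, rank_lookup):
    """Attach rankings to each match and keep only fully populated rows."""
    rows = []
    for match in match_rows:
        home_rank = rank_lookup.get(match["home_team"])
        away_rank = rank_lookup.get(match["away_team"])
        if home_rank is None or away_rank is None:
            continue  # Skip matches where either team has no ranking.
        rows.append({
            "home_team": match["home_team"],
            "away_team": match["away_team"],
            # Positive when the home side is the higher-ranked team.
            "rank_diff": away_rank - home_rank,
            "outcome": outcome_label(match["home_score"], match["away_score"]),
        })
    return rows


model_rows = build_model_rows(matches, rankings)
for row in model_rows:
    print(row)
```

Dropping rows with missing rankings is only one option. Depending on the modeling tool, you might prefer to impute a value instead.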


Once you have built a model, you can use the ‘Fixtures to Score’ file to make your own predictions for the games.


Data Resources


We’ve compiled the datasets used during this series, which you can now use for your own purposes. All of the datasets outlined below are included within the packaged workflows.

  • Every International Football Match
  • Penalty Shootout Results
  • World Rankings Data
  • FIFA (game) Player Data
  • World Cup 2022 Fixtures


Stay tuned for future blogs and webinars as we share how our model performed!




Additional Resources:

  • Join our upcoming webinar on the 17th of November with ex-professional and football legend Tim Howard, who has played in three World Cups – Link to register
  • How to create a winning soccer bracket with analytics – Link to register