With the World Cup now in full flow in Brazil, worldwide anticipation has reached fever pitch. In South America, and Brazil in particular, office pools and group bets are popping up, with soccer fans hoping to pick the winning team. At Palisade, we don’t believe in taking wild guesses, which is why we developed a model using DecisionTools Suite software to forecast the probability of each of the 32 national soccer teams winning this ultimate championship.
Using FIFA rankings of more than 200 national teams over the past four years (2011-2014), Palisade created a model that uses @RISK to determine the probabilities of different teams winning at different stages, and PrecisionTree to map this information into a tree, or bracket, format. @RISK uses Monte Carlo simulation to compute thousands of different possible outcomes for the tournament automatically. The historical strengths and weaknesses of each team are accounted for in the statistical models used to represent each matchup.
To build this model, I first classified teams into ten equally-weighted probability “bins,” or categories, based on their past performance. I then calculated the odds of each team winning, losing, or tying with its ranked adversary in a given match. For example, a team in the highest bin (number 10) would be a strong favorite against a team in an intermediate bin (say, number 5). In this case, historical data predicts that the highest-ranked team would have an 86% chance of beating the intermediate team, a 7% chance of losing and a 7% chance of drawing. That is part of the beauty and mystery of soccer: the chance that an underdog can, from time to time, beat the favorite! The model also gives the option to consider possible home field advantage during games, using data on a team’s location during previous wins and losses.
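The binning and outcome table described above can be sketched in a few lines of Python. This is only an illustration, not the @RISK model itself: the helper names (`assign_bins`, `sample_outcome`, `estimate_frequencies`) are hypothetical, equal-sized bins are an assumption about what "equally-weighted" means, and the only table entry taken from the article is the 86% / 7% / 7% split for bin 10 against bin 5.

```python
import random

def assign_bins(scores, n_bins=10):
    """Map each team to a bin 1..n_bins (n_bins = strongest).

    Assumption: bins hold equal numbers of teams, ordered by score.
    """
    ordered = sorted(scores, key=scores.get)  # weakest team first
    n = len(ordered)
    return {team: (i * n_bins) // n + 1 for i, team in enumerate(ordered)}

# Only the (10, 5) entry comes from the article. A full table would hold
# one entry per pair of bins, estimated from historical results.
OUTCOME_TABLE = {
    (10, 5): {"win": 0.86, "draw": 0.07, "loss": 0.07},
}

def sample_outcome(bin_a, bin_b, rng):
    """Draw 'win', 'draw' or 'loss' from team A's point of view."""
    probs = OUTCOME_TABLE[(bin_a, bin_b)]
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def estimate_frequencies(bin_a, bin_b, n_trials=20_000, seed=0):
    """Simulate many matches and return the observed outcome frequencies."""
    rng = random.Random(seed)
    counts = {"win": 0, "draw": 0, "loss": 0}
    for _ in range(n_trials):
        counts[sample_outcome(bin_a, bin_b, rng)] += 1
    return {outcome: c / n_trials for outcome, c in counts.items()}
```

Calling `estimate_frequencies(10, 5)` should return frequencies close to the table entry, which is a quick sanity check that the sampler reproduces the probabilities it was given.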
After calculating these probability tables, I modeled all of the first 48 games, which make up the group stage. For those of you who are unfamiliar, here’s a brief primer on how the tournament functions: the 32 teams are allocated into eight groups of four. Within each group, every team plays each of the other three once. A win at this stage is worth 3 points, a defeat 0 points, and a draw 1 point for each team. The top two teams from each group advance to a round of 16.
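The group-stage arithmetic is easy to check in code. The sketch below lists the fixtures for one group and tallies points under the 3/1/0 scheme; the function names and the example results are hypothetical and serve only to illustrate the rules described above.

```python
from itertools import combinations

def group_fixtures(teams):
    """Every pair of teams in a group meets once (round robin)."""
    return list(combinations(teams, 2))

def tally_points(teams, results):
    """Sum group points: 3 for a win, 1 each for a draw, 0 for a defeat.

    results maps (team_a, team_b) to 'win', 'draw' or 'loss',
    always read from team_a's point of view.
    """
    points = {team: 0 for team in teams}
    for (a, b), outcome in results.items():
        if outcome == "win":
            points[a] += 3
        elif outcome == "loss":
            points[b] += 3
        else:  # draw
            points[a] += 1
            points[b] += 1
    return points

if __name__ == "__main__":
    fixtures = group_fixtures(["A", "B", "C", "D"])
    # Six matches per group, eight groups in the tournament.
    print(len(fixtures), 8 * len(fixtures))
```

A group of four produces six fixtures, and eight groups of six fixtures give the 48 group-stage games.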
An astounding number of results are possible. Even though teams may tie on points in their quest to qualify for the second round, rules exist to break these ties. One rule states that the team with the greatest net goal difference (goals scored minus goals conceded) to that point in the tournament will proceed. If that figure is the same for both teams, then the result of the match between the two tied teams is considered. Finally, if a tie still persists, a coin toss determines which team advances.
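The tie-break sequence above (points, then goal difference, then the head-to-head result, then a coin toss) can be expressed as a comparison function. This is a minimal sketch that follows only the rules as stated in this article; `rank_group` and its input layout are hypothetical. The coin toss is drawn once per team up front so the ordering stays consistent during sorting, and the head-to-head step is only well defined when exactly two teams are tied.

```python
import random
from functools import cmp_to_key

def rank_group(stats, head_to_head, rng):
    """Order the teams of one group from first to last.

    stats:        team -> (points, goal_difference)
    head_to_head: frozenset({a, b}) -> winning team, or None for a draw
    rng:          random.Random instance used for the coin toss
    """
    # One random "lot" per team stands in for the coin toss.
    lots = {team: rng.random() for team in stats}

    def compare(a, b):
        pts_a, gd_a = stats[a]
        pts_b, gd_b = stats[b]
        if pts_a != pts_b:                       # 1. points
            return -1 if pts_a > pts_b else 1
        if gd_a != gd_b:                         # 2. net goal difference
            return -1 if gd_a > gd_b else 1
        winner = head_to_head.get(frozenset((a, b)))
        if winner == a:                          # 3. head-to-head result
            return -1
        if winner == b:
            return 1
        return -1 if lots[a] > lots[b] else 1    # 4. coin toss

    return sorted(stats, key=cmp_to_key(compare))

if __name__ == "__main__":
    stats = {"A": (7, 4), "B": (4, 1), "C": (4, 1), "D": (1, -6)}
    h2h = {frozenset(("B", "C")): "C"}
    print(rank_group(stats, h2h, random.Random(1)))
```

In the example, B and C are level on points and goal difference, so C's head-to-head win places it second and sends it through with A.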
I incorporated all of these rules for ties into the Monte Carlo simulation model. At its core, the model also considered historical goal-scoring records for winning, drawing and losing teams. On average, teams that win a match score more goals than teams that draw or lose.
Once the 16 teams have qualified, they engage in a single-elimination, bracket-style tournament (like March Madness), leaving eight, then four and then two teams for a final game. These additional 16 games (including a game to determine third place between the two losers of the semifinals) were also simulated. After 50,000 iterations, the model yields each team’s odds of winning the tournament.
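The knockout stage lends itself to the same Monte Carlo treatment: play the bracket many times and count how often each team lifts the trophy. The sketch below is an assumption-laden illustration, not the @RISK and PrecisionTree model. `play_knockout`, `champion_odds` and the `win_prob` callback are hypothetical names, and the third-place game is left out because it does not affect who becomes champion.

```python
import random
from collections import Counter

def play_knockout(bracket, win_prob, rng):
    """Play one single-elimination bracket and return the champion.

    bracket:  list of teams in bracket order (length a power of two)
    win_prob: function (a, b) -> probability that a beats b
    """
    teams = list(bracket)
    while len(teams) > 1:
        next_round = []
        # Neighbours in the list meet each other: (0, 1), (2, 3), ...
        for a, b in zip(teams[0::2], teams[1::2]):
            next_round.append(a if rng.random() < win_prob(a, b) else b)
        teams = next_round
    return teams[0]

def champion_odds(bracket, win_prob, n_iter=50_000, seed=0):
    """Estimate each team's chance of winning by repeated simulation."""
    rng = random.Random(seed)
    wins = Counter(play_knockout(bracket, win_prob, rng) for _ in range(n_iter))
    return {team: wins[team] / n_iter for team in bracket}

if __name__ == "__main__":
    # Hypothetical strengths; a real model would use the bin-based tables.
    strength = {f"T{i}": 1.0 + i for i in range(16)}
    odds = champion_odds(
        list(strength),
        lambda a, b: strength[a] / (strength[a] + strength[b]),
    )
    for team, p in sorted(odds.items(), key=lambda kv: -kv[1])[:3]:
        print(team, round(p, 3))
```

A bracket of 16 teams needs 8 + 4 + 2 + 1 = 15 games to crown a champion; the third-place game brings the total to the 16 knockout games mentioned above.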
Depending on certain ranking assumptions, the odds calculations may vary significantly. However, a robust approach that considers both historic and current rankings yielded the following results:
With home advantage incorporated, Brazil has the largest probability of becoming champion at 17%, with Spain next at 12%. Aside from those two pack-leaders, the following six teams have the next greatest chances of taking the cup (in descending order): Switzerland (8%), Greece (8%), Germany (7%), Colombia (7%), Argentina (6%) and Uruguay (5%). Clearly, these probabilities are close together, which could make for an unpredictable championship if the two front-runners fail to win.
Without considering home advantage, Germany would be the most likely winner, with a 19.9% chance.
However, I have to concede that my trust in risk modeling may override national pride when it comes to placing bets on the World Cup. I am still not sure whether I would bet on my country, Costa Rica, in the office pool. In a group with three former World Champions (Uruguay, Italy and England), it stands only a 23% chance of qualifying for the second round, and only one chance in 440 of becoming World Champion.