MANILA, Philippines - Really? We can't say for sure, but we can say that the probability is about 9 out of 100.
We can make a better prediction if we say the Netherlands will score two or more goals, Spain one goal or none at all. The probability of that happening is about 29 out of 100 or a little less than one in three.
Part of the excitement of the World Cup is that we can't be entirely sure of the outcome. It's thrilling to watch Van Bronckhorst slam the ball into the corner of Uruguay's net from nearly 30 yards out or to see Oezil's fast break pass to Mueller zoom by the valiant goalkeeper James, courtesy of the German midfielder's boot. But it's just as exciting when, time running out and penalty kicks just about to fire, David Villa unexpectedly scores Spain's death blow to Paraguay at 83'.
"Nothing is certain in this world," said Benjamin Franklin, "but death and taxes." Yet uncertainty never stopped humanity from trying hard to predict the future, and predicting the future got better when the science of probability was discovered by Pascal and Fermat in the seventeenth century. The science of probability makes it possible for us to predict what will happen.
This science shows us how to use mathematics to create models of events happening in the world. We gather observations about the world, which, together with our model, let us say something about the events happening in our world, including predicting the future. Often we can't be 100% sure about what we say, but we can be sure enough to say something useful, sometimes very useful.
The entire process is called inference. Prediction is a special kind of inference.
Playing with 2010 World Cup numbers, I came up with my own model to predict who is going to win each match from the Round of 16 up to the finals. The predictions are significantly better than chance outcomes.
Among the matches leading to the finals beginning with the Round of 16, the model correctly predicted wins and losses in 9 out of 14 matches. The modest 64% success rate, we can say (and we are 72% sure about this), is not due to chance. Pure chance would result in getting about half the predictions right.
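The article does not say how the 72% figure was obtained. As a rough cross-check, here is a minimal sketch of one common approach, an exact one-sided binomial test, which asks how often coin-flip guessing would get 9 or more of 14 matches right. The function name `binomial_tail` is illustrative, and treating every match as an independent 50/50 guess is an assumption, not the author's stated method.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: each of the 14 predictions is an independent coin flip.
p_by_chance = binomial_tail(9, 14)
print(f"Chance of 9 or more correct by guessing: {p_by_chance:.3f}")
print(f"Complement: {1 - p_by_chance:.3f}")
```

Under these assumptions the chance of doing this well by guessing is roughly one in five, so the complement lands in the same neighborhood as the article's figure without matching it exactly. A different test would give a different number.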
The predictive model is an equation that predicts the winner based on a rank corresponding to the average goals scored in the tournament:
Predicted average goals scored = a + b*Average goal difference
Goal difference is the goals scored by a team minus the goals conceded to the opponent. It turns out to be a rather good predictor of the winner. Winning teams tend to post larger margins of victory.
Teams with higher predicted average goals scored in the tournament are higher ranked, and a higher rank means they are more likely to win a match.
In two matches (Paraguay versus Japan, and the Netherlands versus Uruguay), the model predicted a tie. In the case of ties, it was necessary to pick the winner by making a judgment call, which in this case was based on the numbers posted by professional odds makers. Paraguay, odds of winning 33 to 1, was picked over Japan, 66 to 1. The Netherlands, odds of winning 23 to 10, was chosen over Uruguay, 8 to 1.
Which predictions didn't come true? Upsets: Ghana over the United States, Germany over Argentina, Holland over Brazil. In all three matches, the bookies had favored the eventual loser.
Most bookies in fact wouldn't accept a bet against Brazil, which looked like it was dancing samba-style in a Mardi Gras parade toward the finals.
The model predicted Portugal would defeat Spain. In this instance the crystal ball was cloudy because when Portugal decimated North Korea, 7-0, the extreme goal difference pushed Portugal's predicted performance off the mark. It was Spain (21 to 10) that the bookies had chosen to win over Portugal (9 to 2), which is what happened: Spain 1, Portugal 0.
Teams were more evenly matched as they got closer to the finals. The winner of Germany versus Spain was a close call. The model incorrectly predicted that the higher scoring Germany would win. Odds makers got it right by giving an ever so slight edge to Spain.
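To see how the bookmakers' odds quoted above compare, here is a small sketch that converts them to implied probabilities. It assumes the figures are fractional odds quoted against each team, so "33 to 1" means 1 chance in 34, and it ignores the bookmaker's margin. The article does not say which betting market the odds come from, and the helper name `implied_probability` is illustrative.

```python
from fractions import Fraction


def implied_probability(against: int, in_favor: int) -> Fraction:
    """Convert fractional odds 'against to in_favor' into an implied probability."""
    return Fraction(in_favor, against + in_favor)


# Odds exactly as quoted in the article, written as (against, in_favor).
quoted_odds = {
    "Paraguay": (33, 1),
    "Japan": (66, 1),
    "Netherlands": (23, 10),
    "Uruguay": (8, 1),
    "Spain": (21, 10),
    "Portugal": (9, 2),
}

implied = {team: implied_probability(*odds) for team, odds in quoted_odds.items()}
for team, prob in implied.items():
    print(f"{team}: {float(prob):.3f}")
```

Shorter odds translate to a higher implied probability, which is why Paraguay was picked over Japan, the Netherlands over Uruguay, and Spain over Portugal.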
What does the model say about the Netherlands versus Spain match? Holland, with higher predicted average goals scored (1.82), will defeat Spain (1.60).
And what is the probability of the forecast coming true? One way of calculating the probability of a Holland win over Spain is to guess at a specific probable outcome and then calculate its probability.
The predicted average number of goals scored, which our model predicts for Holland and Spain, respectively, will nicely serve our purpose of choosing a probable outcome. We predict that Holland will score two goals (the whole number closest to the predicted average) or more in the match. Since we predict that Spain loses, it has to score one goal or less.
We won't deal with the probability of a tie because then we will have to calculate the probability of either team winning the penalty kick shootout. We don't have a model to predict the outcome of the penalty kick contest following extra time.
We will use what's called the Poisson distribution to model the probability of predicted average number of goals scored. According to this model, the probability of the Netherlands scoring two or more goals is 0.54, while the probability of Spain scoring one goal or less is 0.53.
A bit of complicated statistical analysis shows that the predicted average number of goals scored by the Netherlands is "independent" of, meaning, not directly affected by, the predicted average number of goals scored by Spain, and vice-versa.
This piece of information allows us to multiply the respective probabilities for the two countries to get the probability of both events occurring together. When we do our multiplication, we get the probability of Holland scoring two goals or more while Spain scores one goal or less equals 0.29.
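The arithmetic in the preceding paragraphs can be reproduced with a short sketch. It assumes each team's goal count follows a Poisson distribution whose mean is the predicted average quoted in the article (1.82 for the Netherlands, 1.60 for Spain) and that the two counts are independent. The function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when goals follow Poisson(mean)."""
    return exp(-mean) * mean**k / factorial(k)


def poisson_cdf(k: int, mean: float) -> float:
    """Probability of k goals or fewer."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


# Predicted average goals scored, as quoted in the article.
NETHERLANDS_MEAN = 1.82
SPAIN_MEAN = 1.60

p_nl_two_plus = 1 - poisson_cdf(1, NETHERLANDS_MEAN)  # two or more goals
p_es_one_or_less = poisson_cdf(1, SPAIN_MEAN)         # one goal or none

# Independence lets us multiply the two probabilities.
p_joint = p_nl_two_plus * p_es_one_or_less

# Exact 2-1 scoreline under the same assumptions.
p_exact_2_1 = poisson_pmf(2, NETHERLANDS_MEAN) * poisson_pmf(1, SPAIN_MEAN)

print(f"Netherlands scores 2 or more: {p_nl_two_plus:.2f}")
print(f"Spain scores 1 or fewer:      {p_es_one_or_less:.2f}")
print(f"Both together:                {p_joint:.2f}")
print(f"Exactly 2-1:                  {p_exact_2_1:.2f}")
```

With the rounded means, this gives about 0.54 for the Netherlands and about 0.52 for Spain, slightly below the article's 0.53, presumably because the published averages are themselves rounded. The product still comes out near 0.29. The exact 2-1 scoreline works out to about 0.09, which is consistent with the "9 out of 100" figure at the top, though the article does not say which outcome that figure refers to.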
Holland defeats Spain, odds, one in three. That might seem like the best we can do, but it isn't.
We can enjoy the game, not knowing for sure whether favorite Holland will be drinking orange juice in celebration. "Uncertainty and expectation are the joys of life," said William Congreve. That is true in life, and just as true in world championship football.
Joseph I. B. Gonzales, Ph.D. is a communication and research consultant. He teaches Methods of Research in Management, and Managerial Statistics at the Ateneo Graduate School of Business. He has been involved as the writer or editor of over 70 academic and industry publications in the U.S. and the Philippines and has managed up to 40 qualitative and quantitative professional research projects. He is the editor of Techne: Managing through Numbers, an Ateneo Graduate School of Business publication.