One of the things I enjoy trying to do from time to time is developing predictions of sporting events, such as the NHL Playoffs. So when I heard that people were trying to predict the medal counts for the 2014 Sochi Olympics, naturally I became intrigued and tracked some of their results.
I found four different published predictions:
- Infostrada Sports: These guys used results from "Olympics, World Championships, and World Cups (or equivalent)" since the 2010 Vancouver Olympics to develop a likely scenario for who would win each event. Their model weighted each result by its importance, how recently it happened, and the nature of the event. They only ranked the top 15 countries on their medal table, and it was last updated three days before the opening ceremony.
- Wall Street Journal: The prestigious journal interviewed experts, rated recent performances, and assigned probabilities to various outcomes. They claim to have been accurate to "within a few medals" in the last two Olympics, but were actually just alright for the 2012 London games, and only good at predicting a few countries in Vancouver in 2010.
- SportsMyriad: I think this is a blog? Either way, it's a fun website if you like sports stats. No real idea where their predictions came from (apart from the disclaimer "It'll change from injuries, form, whims, etc.").
- Andreff & Andreff (2014): A working paper from the International Association of Sports Economists, also posted to the Freakonomics blog, in which the authors used factors such as population, per-capita income, political regime, average snowfall, and number of ski resorts to predict medal counts. This sort of approach has been used for summer games before (probably not with ski resorts as a major factor...), but apparently not for the winter Olympics. These were the only guys to include upper and lower bounds on their predictions.
How did they all turn out? Sort of alright, I guess. Sort of.
The best prediction was the Wall Street Journal's, with a coefficient of determination (R²) of 0.77 for total medals and 0.63 for golds (1.0 being a perfect prediction).
Notable exceptions were the Netherlands (double their expected medals, whoops) and South Korea (half their prediction), but otherwise the Journal's picks were pretty decent.
Next best was the SportsMyriad site, only slightly behind at 0.75 for total medals but further off at 0.58 for golds.
Andreff and Andreff were next, with a coefficient of determination of 0.68 for total medals (their model didn't break medals down into colours). Their upper and lower bounds proved a bit silly in practice, since only 35% of countries actually finished within the bounds given to them. They were also the most wrong about the Netherlands, very confidently predicting 5-7 medals; the Dutch won 24.
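That 35% coverage figure is just the fraction of countries whose actual medal count landed inside its predicted range. Here's a minimal sketch of the check, using made-up bounds and medal counts rather than Andreff & Andreff's real numbers:

```python
# Prediction-interval coverage check: what fraction of countries' actual
# medal counts fell inside the predicted [low, high] bounds.
# All numbers here are hypothetical, purely for illustration.
bounds = {"A": (5, 7), "B": (20, 26), "C": (8, 12)}  # hypothetical predictions
actual = {"A": 24, "B": 22, "C": 6}                  # hypothetical results

hits = sum(low <= actual[c] <= high for c, (low, high) in bounds.items())
print(f"{hits / len(bounds):.0%} of countries within bounds")  # prints "33% of countries within bounds"
```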
Infostrada was the furthest off, with a coefficient of determination of 0.22 for total medal count. A direct comparison is a bit unfair, though, since they only listed their top 15 countries, and adding 10 lower-performing countries would likely have bumped that number up. Even comparing just the top 15 across all the models, however, they still came last.
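For anyone who wants to score the next round of predictions themselves, the metric used throughout this comparison, the coefficient of determination, takes only a few lines of Python. The medal counts below are hypothetical placeholders, not any of the forecasters' real figures:

```python
# Coefficient of determination (R^2) between predicted and actual
# medal counts: 1.0 means a perfect prediction.

def r_squared(predicted, actual):
    """R^2 = 1 - SS_res / SS_tot over paired predictions and outcomes."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

predicted = [30, 25, 20, 15, 10]  # hypothetical forecast
actual = [33, 28, 15, 17, 9]      # hypothetical results
print(round(r_squared(predicted, actual), 2))  # prints 0.88
```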
In general, the Olympics are tough to predict, for loads of reasons. Even the best in a sport don't win every event they enter, and trying to predict the result of a single mogul run or figure skating performance is an exercise in futility. Team sports are rough to call since full national squads only rarely face each other with their exact Olympic line-ups between Games, and occasionally Olympic berths are won by teams or athletes who don't even end up competing. Socio-economic data is probably fine for getting a general picture of a country's winter-sport abilities, but it ignores the fact that sometimes people are just good at something despite their surroundings.
That being said, I admire the effort by these would-be predictors, and look forward to seeing how they do next time around!