Wednesday, June 26, 2013

NHL 2013 Wrap-up

So the Stanley Cup has been awarded. Congrats to the Chicago Blackhawks!

As I previously mentioned, I've been running a model to try to predict the NHL finals for a few years, and I ran it again this year. The model outputs looked like this over the last few weeks:

Again, as the playoffs progress (x-axis), each team's probability of winning is shown as the height of its bar, so that at any time all teams' probabilities add up to 100%. Originally I had predicted that the Senators had the highest chance of winning, with the Bruins in second and the Blackhawks in third. Fortunately enough, two out of those top three actually made it to the final...

What's important though is to check if I'm anywhere near accurate. It's convenient, I suppose, to say that my top teams all did fairly well, but how well did I actually do? Take a look at this (format stolen straight-up from Wikipedia):

Each entry: seed, team, (model probability), games won.

Eastern Conference Quarterfinals
  1  Pittsburgh Penguins   (64.2%)  4
  8  New York Islanders    (35.8%)  2

  2  Montreal Canadiens    (15.9%)  1
  7  Ottawa Senators       (84.1%)  4

  3  Washington Capitals   (37.9%)  3
  6  New York Rangers      (62.1%)  4

  4  Boston Bruins         (76.7%)  4
  5  Toronto Maple Leafs   (23.3%)  3

Eastern Conference Semifinals
  1  Pittsburgh Penguins   (22.8%)  4
  7  Ottawa Senators       (77.2%)  1

  4  Boston Bruins         (59.8%)  4
  6  New York Rangers      (40.2%)  1

Eastern Conference Finals
  1  Pittsburgh Penguins   (31.7%)  0
  4  Boston Bruins         (68.3%)  4

Western Conference Quarterfinals
  1  Chicago Blackhawks    (77.0%)  4
  8  Minnesota Wild        (23.0%)  1

  2  Anaheim Ducks         (40.9%)  3
  7  Detroit Red Wings     (59.1%)  4

  3  Vancouver Canucks     (37.8%)  0
  6  San Jose Sharks       (62.2%)  4

  4  St. Louis Blues       (43.1%)  2
  5  Los Angeles Kings     (56.9%)  4

Western Conference Semifinals
  1  Chicago Blackhawks    (58.2%)  4
  7  Detroit Red Wings     (41.8%)  3

  5  Los Angeles Kings     (34.3%)  4
  6  San Jose Sharks       (65.7%)  3

Western Conference Finals
  1  Chicago Blackhawks    (70.3%)  4
  5  Los Angeles Kings     (29.7%)  1

Stanley Cup Finals
  E4  Boston Bruins        (52.4%)  2
  W1  Chicago Blackhawks   (47.6%)  4

(Pairings are re-seeded after the first round.)

In each bracket, the percentages are the probabilities assigned by my model. A first-glance look at this bracket shows that in 12 of the 15 series, the team I gave the higher odds of winning ended up winning (the exceptions are Kings v. Sharks, Penguins v. Senators, and, awkwardly, Bruins v. Blackhawks).

If someone were flipping a coin, they would expect to predict 12/15 or better only about 1.76% of the time. This percentage is known as a p-value, and a standard convention in, say, medical experiments is to use a threshold of 5% for deciding whether the results are significant or just arose as a matter of chance. Regardless of whether this is a reasonable standard (it could allow 5% of the results of studies to be due to chance, and is a debate worth having), a "conventional" medical study with a p-value of 1.76% would likely be accepted. That leads me to humbly suggest that my model this year was statistically significantly above chance when it comes to predicting the outcomes of series (though not necessarily the playoffs as a whole).
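
For anyone who wants to check the 1.76% figure, here is a minimal sketch in Python. It assumes each of the 15 series is an independent fair coin flip and adds up the binomial probabilities of getting 12, 13, 14, or 15 right. The function name `coin_flip_tail` is just made up for this illustration.

```python
from math import comb

def coin_flip_tail(n_series: int, n_correct: int) -> float:
    """Chance of calling at least n_correct of n_series right by fair coin flips."""
    # Count the win/loss sequences with n_correct or more correct calls...
    favourable = sum(comb(n_series, k) for k in range(n_correct, n_series + 1))
    # ...and divide by the total number of equally likely sequences.
    return favourable / 2 ** n_series

p_value = coin_flip_tail(15, 12)
print(f"{p_value:.4f}")  # 0.0176
```

That works out to 576 of the 32,768 possible outcomes, or about 1.76%.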

Another way of measuring the accuracy is by using a Brier score (similar to what I had in my previous post) to actually measure the accuracy of the probabilities assigned. It's all well and good to say the teams with the best odds often won, but how good were the odds that I assigned?

The average Brier score for all 15 series was 0.8182 out of 1, where a 50/50 guess for any given series would give a value of 0.75. My results are basically the equivalent of assigning a value of 57% at the outset to every team that actually ends up winning (as opposed to a random value of 50%) - in other words perhaps not definitively accurate and clairvoyant, but still likely significant.
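
Here is a rough sketch of how those numbers can be reproduced from the bracket above. It assumes the per-series score is 1 - (1 - p)^2, where p is the probability the model gave to the team that actually won. That formula is my reading rather than a quoted definition: it gives 0.75 for a 50/50 guess and matches the 0.8182 average. The names `winner_probs` and `series_score` are only for illustration.

```python
from math import sqrt

# Model probability given to the eventual winner of each series,
# read off the bracket (15 series in total).
winner_probs = [
    0.642, 0.841, 0.621, 0.767,  # East, first round
    0.770, 0.591, 0.622, 0.569,  # West, first round
    0.228, 0.598,                # East, second round
    0.582, 0.343,                # West, second round
    0.683, 0.703,                # conference finals
    0.476,                       # Stanley Cup final
]

def series_score(p_winner: float) -> float:
    """Assumed score: 1 minus the squared error on the winner (higher is better)."""
    return 1.0 - (1.0 - p_winner) ** 2

average = sum(series_score(p) for p in winner_probs) / len(winner_probs)

# Constant probability that would earn the same average score in every series.
equivalent = 1.0 - sqrt(1.0 - average)

print(f"{average:.4f}")     # 0.8182
print(f"{equivalent:.0%}")  # 57%
```

Note that with this convention a higher score is better, which is the reverse of the usual Brier score, where 0 is perfect.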

One last thing to try is to plot the overall Brier score over the entirety of the playoffs, and compare that to random chance. It would look like this:


That's actually not so good. Compared to last year, there's virtually no difference between the model's curve and the chance curve. At least in the parts where the two differ (mid-May, for instance), the model is higher than chance, but everywhere else they are about even. The similarity between the two suggests that there weren't a lot of series that ended in remarkable against-the-odds comebacks, though when one did (Blackhawks v. Red Wings, for instance), my model predicted it better than chance.

What's encouraging about all of this is that the different ways of analyzing the accuracy all converge on suggesting that the parameters my model uses are reasonable and significant. Also, if I had bet on the first round, I would have made a killing. Too bad...
