Friday, January 18, 2013

The Downsides to Hockey Betting

Hey, have you ever "registered or recorded bets or [sold] a pool"? Or have you ever disseminated any information likely to "be of use in gambling, book-making, pool-selling or betting on a horse-race, fight, game or sport"?

If you have, welcome to the dark side of Canada's criminal underworld.

Turns out that, under Section 202 of the Criminal Code of Canada, those things are illegal. I'm not an expert by any means, but it kinda sounds like having any sort of unregistered group of people who bet on things, and who discuss information pertinent to their outcome, is illegal. Welcome to two years in jail!

Looking deeper into the Criminal Code, there's another fun point: even if you're betting with a legitimate lottery corporation (like, say, SportSelect), you cannot bet on the outcome of a single game (Section 207 (4) (b)). This was recently debated in the House of Commons, where they tried to scrap the rule, but apparently the Senate might be fighting back on this one for a change.

These rules force legitimate lotteries and betting houses to combine bets. SportSelect, for instance, specifies that gamblers have to bet on a combination of between 3 and 6 games, where they will only win if they get all of their predictions correct.

This has a severely negative effect on the gamblers: even if they are perfectly correct about the probability of each individual outcome, a combined bet pays off so much less frequently that they may not be able to afford to keep playing.

For example, let's take a look at the odds that SportSelect is offering for this weekend's hockey games. There are 13 games going on tomorrow, representing the vast majority of teams in the league. Taking a look at these games shows that the average payout for a team to win away is 2.28, win at home is 1.83, and for the games to go to a shoot-out is 5.73. In fact, 11 out of the 13 games were predicted to be won by the home team (but that's not important right now).

Since these are the first games of the season, these odds are likely close to pure speculation - lots of these teams haven't even played together yet. That makes it reasonable to treat them as an 'average' set of games and compare them against an average of historical results.

Which I did.

Last year, which was a full regular season, had 1,230 games. Of these, 24.4% went into overtime, and 60.3% of those went to a shoot-out - meaning in total 14.7% of games went to a shoot-out. Of the rest, an impressive 57.7% were won by the home team (oh ok that previous thing now does look important...). Similarly, the previous year had 12.1% shoot-out games, and 53.7% of the rest were won at home.
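For anyone who wants to check the arithmetic, here's a rough sketch of how those season figures combine into home / away / shoot-out chances. It assumes "the rest" means every game that didn't end in a shoot-out, and that the two seasons are averaged with equal weight; the function and variable names are just mine.

```python
# Season figures quoted above, as fractions of all regular-season games.
last_year_overtime = 0.244           # games that went to overtime
last_year_so_given_ot = 0.603        # share of overtime games decided by shoot-out
last_year_home_given_no_so = 0.577   # home wins among non-shoot-out games
prev_year_shootout = 0.121
prev_year_home_given_no_so = 0.537


def outcome_split(p_shootout, p_home_given_no_shootout):
    """Return (home, away, shoot-out) probabilities that sum to 1."""
    p_home = (1 - p_shootout) * p_home_given_no_shootout
    p_away = 1 - p_shootout - p_home
    return p_home, p_away, p_shootout


last_year = outcome_split(last_year_overtime * last_year_so_given_ot,
                          last_year_home_given_no_so)
prev_year = outcome_split(prev_year_shootout, prev_year_home_given_no_so)

# Assumption: a simple unweighted mean of the two seasons.
average = tuple((a + b) / 2 for a, b in zip(last_year, prev_year))

for name, p in zip(("home", "away", "shoot-out"), average):
    print(f"{name}: {p:.1%}")
```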

So let's say on average any given game has a 13.4% chance of going to a shoot-out, a 48.2% chance of being won by the home team outright, and a 38.4% chance of going to the away team. If we pretend that those SportSelect odds represent a more or less 'average' set of games, we can compare them like this:

Outcome      Probability   Payout   Expected Result
Home Win     0.4817        1.8277   -12.0%
Away Win     0.3841        2.2769   -12.5%
Shoot Out    0.1341        5.7308   -23.1%
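The last column is just probability times payout, minus the dollar you staked. A minimal sketch, assuming the payout figure already includes your returned stake (the `expected_return` helper is my own name, not anything from SportSelect):

```python
# (probability, payout per $1 staked), copied from the table above.
bets = {
    "Home Win": (0.4817, 1.8277),
    "Away Win": (0.3841, 2.2769),
    "Shoot Out": (0.1341, 5.7308),
}


def expected_return(probability, payout):
    """Expected profit per $1 staked, assuming payout includes the stake."""
    return probability * payout - 1


for outcome, (p, payout) in bets.items():
    print(f"{outcome}: {expected_return(p, payout):+.1%}")
```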

So there you go - if you knew nothing else about a game other than long term trends, you'd expect to lose between 12 and 23% of your money on any given bet. That would put the SportSelect odds somewhere between a casino game and a lottery in terms of expected money lost.

But wait, it gets worse. Watch what happens when you have to bet on at least three games:

Outcome        Probability   Payout   Expected Result
3 Home Wins    0.1118        6.1053   -31.8%
3 Away Wins    0.0567        11.804   -33.1%
3 Shoot Outs   0.0024        188.21   -54.6%
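Those combined numbers come from multiplying the single-game figures together. Here's a sketch under two assumptions that the table seems to use: the games are independent, and the combined payout is the product of the individual payouts. The `parlay` function is hypothetical, purely for illustration.

```python
# Single-game (probability, payout) pairs from the earlier table.
single = {
    "Home Win": (0.4817, 1.8277),
    "Away Win": (0.3841, 2.2769),
    "Shoot Out": (0.1341, 5.7308),
}


def parlay(probability, payout, legs):
    """Chance, payout and expected profit for `legs` identical picks.

    Assumes independent games and multiplicative payouts.
    """
    p_all = probability ** legs
    combined_payout = payout ** legs
    return p_all, combined_payout, p_all * combined_payout - 1


for outcome, (p, payout) in single.items():
    p_all, combined, ev = parlay(p, payout, legs=3)
    print(f"3 x {outcome}: p={p_all:.4f} payout={combined:.2f} ev={ev:+.1%}")
```

Since each leg loses money on average, stacking legs compounds the loss: bumping `legs` up to 6 (the SportSelect maximum) makes every row worse.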

Oh dear.

So basically, the way it's set up now dooms you to failure. What makes it even worse is that, even if you could overcome the 30-50% penalty by knowing more about the system than a historical average analysis, you'd still only win about one set of bets out of every eight or so. Even with a good advantage, it would likely take a really long time to see any real profits from it.
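To put a rough number on "a really long time", here's a small sketch of how long a dry spell can last if you win one ticket in eight. It assumes every ticket is independent with the same win chance, which is a simplification, and both helper names are mine.

```python
import math


def prob_no_win(p_win, n_tickets):
    """Chance of losing every one of n independent tickets."""
    return (1 - p_win) ** n_tickets


def tickets_for_confidence(p_win, confidence):
    """Smallest n with P(at least one win in n tickets) >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_win))


p_win = 1 / 8  # the "one set of bets out of every eight" figure above
print(f"No win in 10 tickets: {prob_no_win(p_win, 10):.1%}")
print(f"Tickets for a 95% chance of at least one win: "
      f"{tickets_for_confidence(p_win, 0.95)}")
```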

Oh well. Enjoy the hockey, I guess.

Wednesday, January 2, 2013

When Life Gives you Weather Stats...

...make Statsonade!

So here's my dilemma. I have this 'weblog', and it's really cool when people read it. (Not quite as cool as before, because apparently somebody spam-clicked my ads and now I no longer get money. On the other hand - no ads!) By far the most popular posts are when I talk about weather stuff or SU stuff, and as there are no present SU elections to write about and I've been doing weather posts at the end of each season, I may have nothing good to offer for this week.

Instead, I'm going to make statsonade from the stats that life tossed me. Yum!

In my last post, I presented the weather forecast comparison I had for six weather stations in Edmonton during autumn. It was pretty fun, and only one of the six forecasters outright told me I was wrong.

One of the accuracy measures I used in that analysis was the percentage of time that a forecaster was within three degrees of the true high temperature. An alternative way of presenting these results would be to just outright plot the predicted vs. actual temperature results for each station. Maybe it would look something like this:

There are some wicked fun facts from these graphs. In all of them, the red line is perfection, where what you predict is exactly what you end up with. The data only covers autumn, and the stations are all quite close to perfect, as well as close to each other - the R-squared values range from 0.900 to 0.926. In general it appears as though most of the stations over-predict the temperature at the higher end of the range.
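If you want to compute these two accuracy measures yourself, a sketch might look like the following. I'm assuming R-squared here means the squared correlation between predicted and actual highs (what a spreadsheet trendline reports), and that "within three degrees" includes exactly three. The temperatures are made up for illustration and are not from my data set.

```python
def fraction_within(predicted, actual, tolerance=3.0):
    """Share of forecasts within `tolerance` degrees of the actual high."""
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)


def r_squared(predicted, actual):
    """Squared Pearson correlation between predicted and actual values."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


# Made-up example highs in degrees Celsius (NOT real forecast data).
predicted = [12.0, 8.0, 15.0, 3.0, -2.0]
actual = [10.0, 9.0, 11.0, 4.0, -1.0]

print(f"Within 3 degrees: {fraction_within(predicted, actual):.0%}")
print(f"R-squared: {r_squared(predicted, actual):.3f}")
```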

That's all well and good, but what if we wanted to take this a step further? Is there some combination of stations that gets you better than any individual station? That would be like a weather model or something.

It turns out that you can actually get a marginally better prediction by using a weighted average of the stations. Consider the following:

T = 0.085 TAD + 0.301 EC + 0.148 GB + 0.155 WN + 0.483 WC - 0.172 CTV

After all that work, our R-squared value is a whopping 0.944. Though this method of aggregating weather forecasts is a minor improvement over the best single station (0.926), it's likely not worth the effort in terms of predicting the weather.
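Applying the formula is just a weighted sum. In this sketch the weights are the ones from the formula above, keyed by the same station abbreviations, while the six forecast values are invented for illustration.

```python
# Weights from the formula above, keyed by its station abbreviations.
WEIGHTS = {
    "TAD": 0.085,
    "EC": 0.301,
    "GB": 0.148,
    "WN": 0.155,
    "WC": 0.483,
    "CTV": -0.172,
}


def blended_high(forecasts):
    """Weighted sum of the six stations' predicted highs."""
    return sum(WEIGHTS[station] * forecasts[station] for station in WEIGHTS)


# Hypothetical forecasts for one day, in degrees Celsius (not real data).
forecasts = {"TAD": 5.0, "EC": 4.0, "GB": 6.0, "WN": 5.0, "WC": 4.0, "CTV": 7.0}

print(f"Blended high: {blended_high(forecasts):.2f}")
print(f"Sum of weights: {sum(WEIGHTS.values()):.3f}")
```

The weights add up to 1, so if all six stations agree the blend returns that same temperature. Because the CTV weight is negative, nudging only the CTV forecast upward pulls the blended value down.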

One fun result of the regression is that it would probably be easier to just take a weighted average of Environment Canada's and the Weather Channel's predictions, as they make up the majority of the formula. What's really strange, though, is that the CTV predictions get factored in as a negative value. CTV itself has a completely respectable correlation between predicted and real temperatures, but for some reason subtracting a weighted version of their numbers improves the overall prediction (when using them alongside at least two other weather stations). Mysterious...