Thursday, September 27, 2012

Election Arithmetic

Pretend for a minute that you're a general of a vast army, and that you are in charge of defending your city from the ruthless Colonel Blotto. You know that you both have 1,000 troops, and there are ten battlefields between your two armies.

Assuming that whoever sends the most troops to a battlefield wins that battle, and whoever wins the most battles wins the war, what is the optimal distribution of troops in order to maximize your chances of winning? Think about it for a minute.

Certainly there are some very poor possible choices - for instance, allocating all 1,000 troops on one battlefield means that the best you can do is tie if your opponent does the exact same, and you lose under any other configuration. But is there a best strategy?

As you may have guessed, this is a classic and well-established game theory game (they're called games but are often not fun) known as the Blotto game. However, unlike many games there is no perfect strategy to this one. No matter what you do, your opponent can beat you if only they know your strategy. For instance, a strategy of putting 166 troops on each of the first six battlefields and 0 on the remaining four gives you a very high chance of winning six of the fields, but an opponent who knew your plan could counter it very easily, with troops to spare.
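To make that counter concrete, here is a minimal Python sketch. The function name `blotto_result` and the particular counter-allocation are my own illustrative assumptions, and I'm assuming a tied battlefield counts as a win for neither side.

```python
def blotto_result(a, b):
    """Return (battles won by a, battles won by b).

    Assumption: a tied battlefield counts for neither side.
    """
    wins_a = sum(x > y for x, y in zip(a, b))
    wins_b = sum(y > x for x, y in zip(a, b))
    return wins_a, wins_b


# The strategy from the text: 166 troops on each of the first six fields.
mine = [166] * 6 + [0] * 4

# One possible counter (hypothetical): barely outbid two of the six
# defended fields and take the four empty fields with one troop each.
counter = [167, 167, 0, 0, 0, 0, 1, 1, 1, 1]

print(blotto_result(mine, counter))  # (4, 6): the counter wins the war
print(sum(counter))                  # 338 troops used out of 1,000
```

The counter only needs 338 of its 1,000 troops, which is the "troops to spare" part.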

This has led to a classification of a large series of games as "Blotto" games - there is an optimal strategy, but it is only identifiable after the fact (once you know your opponent's strategy). Another example of this type of game is rock-paper-scissors - your best bet is to pick randomly from any number of good strategies, but after the game is over both sides can identify the one strategy that could have beaten their opponent.

A wonderful paper published fairly recently hypothesized that the American elections could be modeled as a Blotto game, but with a couple other parameters tossed in as well. A Blotto game is fairly similar to the electoral college system, they thought, because each side has a fixed amount of money to distribute across 51 separate contests (the 50 states plus D.C.), with the winner being whoever can get at least 270 electoral college votes.

First, though, they had to consider how much the strategic variables, which they chose to be polling numbers ten weeks before the election and campaign spending ratios, impacted the vote (if at all). They came up with three game possibilities: the election could be modeled as a Lotto, Blotto, or Frontrunner game (which is, coincidentally, the name of the paper). In a Lotto game, knowing your opponent's strategy couldn't help you - much like in a lottery, you can't identify any strategy that could lead to victory beforehand, even though afterwards you could see where you went wrong. In a Frontrunner game, there is an identifiable connection between strategic variables and victory, but one side has such an insurmountable advantage that they cannot lose.

The paper analyzed the 1996 and 2000 American presidential elections, and determined that campaign spending per state did have a strong impact on winning or losing that state. They further determined that the 1996 election was a Frontrunner game - Clinton had so much money and such favorable polls that it was easy for him to pick a winning strategy, and even if Dole had known that strategy he still couldn't have won.

Figure: The results from the 2008 election. In general you'd expect states up and to the right to be won by Democrats (due either to higher spending or higher poll numbers). This is indeed the case.

Much more interestingly, though, the 2000 election was in fact determined to be a Blotto game - in other words, Al Gore could have re-allocated his money in such a way that Bush would have lost the election. The authors of the paper estimated that he only had about a 4% chance of choosing the strategy at random, but it was still a possibility.

Fast-forward to 2012, and these results have an interesting impact on the upcoming presidential election. A glance at the current polling numbers shows that this election is nearly as tight as the 2000 election, suggesting that a Blotto-like model may turn out to be valid. Considering that Obama is currently leading the expected vote count as well as fundraising, though, it seems more likely than ever that Romney has a steep uphill climb ahead of him.

Monday, September 24, 2012

Summer Weather

Weather forecasting is insane.

As a career I couldn't even imagine how un-rewarding it is - you could pour hours and hours into developing new algorithms that only get tiny increases in accuracy due simply to the massive complexity of the system you're trying to model. When you're right people take you for granted, and when you're wrong you take a lot of blame.

That being said, a while ago I noticed that sometimes different weather forecasters will predict radically different weather for the same day, given the same data. Also, I noticed that the weekend forecast made on Monday could be substantially different from the one made on Friday. These are all fair differences - tweaks to models could cause differences of opinion between meteorologists, and the closer the day being predicted is to when the prediction is made, the more accurate we'd hope it would be.

I was curious as to how much of a change there would be, though, which is why I decided to keep track of it. Since the beginning of June I've kept track of the six-day forecasts for High temperature, Low temperature, and Probability of Precipitation (POP) for five different forecasting stations: timeanddate.com, Environment Canada, Global Weather, the Weather Network, and the Weather Channel. Environment Canada, Global, and the Weather Network were chosen based on the sites visited most frequently by myself and my friends, the Weather Channel was chosen as it is the basis of Yahoo! weather, and subsequently the commonly-used Apple weather app, and timeanddate.com was chosen because it's a large multinational site. All stations were chosen at the Edmonton downtown location, not the international airport, and data for predictions was collected between 11 am and noon for consistency in comparison.

Now that summer's over, I have some preliminary results. And the winner (by a hair) is the Weather Network!

Score (out of 100):
  • Weather Network: 66.92
  • Global Weather: 66.02
  • Weather Channel: 63.99
  • Environment Canada: 55.00
  • TimeandDate.com: 54.25

The score is based on a weighted average that was more or less arbitrarily decided by me: each subsequent day in the future was weighted less (so that a prediction for tomorrow's weather is worth more than a prediction for next week's), and POP was worth more than the High prediction, which was in turn weighted more than the Low prediction.
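
For the curious, a weighted average of this shape can be sketched in Python as below. The specific weights here (6 down to 1 for the days, and 3/2/1 for POP/High/Low) are hypothetical placeholders that only respect the ordering described above, not the actual weights used, and the sample scores are made up.

```python
def weighted_score(scores, day_weights, metric_weights):
    """Combine per-day, per-metric scores (0-100) into one number.

    scores: dict mapping metric name -> list of scores, one per forecast day
            (index 0 is the 1-day forecast).
    """
    total = 0.0
    for metric, per_day in scores.items():
        # Weighted average across forecast days for this metric.
        day_avg = (sum(w * s for w, s in zip(day_weights, per_day))
                   / sum(day_weights))
        total += metric_weights[metric] * day_avg
    return total / sum(metric_weights[m] for m in scores)


# Hypothetical weights: nearer days count more; POP > High > Low.
DAY_WEIGHTS = [6, 5, 4, 3, 2, 1]
METRIC_WEIGHTS = {"pop": 3, "high": 2, "low": 1}

# Made-up scores for one imaginary station.
example = {"pop": [100] * 6, "high": [50] * 6, "low": [0] * 6}
print(round(weighted_score(example, DAY_WEIGHTS, METRIC_WEIGHTS), 2))
```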

Some fun facts!

Best High temperature prediction: Weather Channel 1-day prediction (96.79% within 3 degrees)
Best Low temperature prediction: Environment Canada 2-day prediction (96.07% within 3 degrees)
Best POP: Global 4-day prediction (p-value 0.346)

Worst High temperature prediction: TimeandDate 6-day prediction (55.20% within 3 degrees)
Worst Low temperature prediction: Global 6-day prediction (68.57% within 3 degrees)
Worst POP: TimeandDate 3-day prediction (p-value 0.038)

Some graphs!

Temperature score was based on the percentage of predictions that were within 3 degrees of the actual temperature. In general there was a very strong downward trend for the high temperature predictions - almost all stations had better than 95% accuracy at predicting tomorrow's weather, and they were all about 70% accurate at predicting the weather a week from now. There was less of a trend noted for the low predictions, but those are typically less useful apart from determining the likelihood of frost.
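
The temperature score is simple enough to sketch in a few lines of Python. The function name and the sample temperatures are made up for illustration, and I'm assuming "within 3 degrees" includes a miss of exactly 3.

```python
def within_tolerance_rate(predicted, actual, tolerance=3.0):
    """Percentage of predictions within `tolerance` degrees of the actual value.

    Assumption: a miss of exactly `tolerance` degrees still counts as a hit.
    """
    pairs = list(zip(predicted, actual))
    hits = sum(abs(p - a) <= tolerance for p, a in pairs)
    return 100.0 * hits / len(pairs)


# Made-up highs: forecast vs. what actually happened.
forecast_highs = [20, 25, 30, 15]
actual_highs = [22, 29, 30, 18]
print(within_tolerance_rate(forecast_highs, actual_highs))  # 75.0
```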

The score for POP is based on the p-value for each category of prediction. In essence, I took all the days on which a given station predicted a particular POP (10%, for example), and compared that predicted probability to the fraction of those days on which it actually did rain. This doesn't translate directly into an accuracy percentage, which is why I call them 'scores' instead (though if every category had precisely the incidence of rain as predicted, it would end up with a score of 100).
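
One way to get a p-value for a single POP category is an exact two-sided binomial test, sketched below in Python. This is an assumption about the kind of test that fits the description, not necessarily the exact calculation behind the scores above, and the example counts are made up.

```python
from math import comb


def binomial_p_value(rain_days, total_days, forecast_pop):
    """Exact two-sided binomial p-value for one POP category.

    Sums the probabilities of every outcome that is no more likely than the
    observed one, assuming each day rains independently with probability
    `forecast_pop`.
    """
    pmf = [comb(total_days, k)
           * forecast_pop ** k
           * (1 - forecast_pop) ** (total_days - k)
           for k in range(total_days + 1)]
    observed = pmf[rain_days]
    # Small relative tolerance so float noise doesn't drop equal-probability outcomes.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Made-up example: a station said "10% POP" on 10 days.
print(binomial_p_value(1, 10, 0.1))  # rained once: as consistent as it gets
print(binomial_p_value(5, 10, 0.1))  # rained five times: very small p-value
```

A high p-value means the observed rain is consistent with the stated probability; a low one means the station's 10% probably wasn't really 10%.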


So there you go! Hopefully this helps you the next time you're planning a picnic (or whatever people check the weather for...).

Saturday, September 8, 2012

Higher Learning


Pick a professional career. Almost any will do. Now really take a moment to visualize how their job is done today - their tools, their projects, all sorts of the stuff that's required by their fields.

Now think about that same profession 300 years ago. Any chance it's changed?

Doctors have progressed from prescribing urine baths and bloodletting to minimally-invasive laparoscopic surgery. Engineers have moved from catapults to Mars rovers, and law enforcement has gone from corrupt court systems to advanced forensics (in most countries, at least). These and other similar advances have been undeniably amazing, and are largely responsible for our current quality of life.

Sadly, though, one profession in particular has been glaringly dragging its heels against this rapid change. Despite the enormous advances over the last few hundred years, university education has been largely unchanged. For some reason, an overwhelming majority of university classes are still taught by packing vast numbers of students into a theater and talking at them for an hour. Tweed jackets have come and gone, and the use of PowerPoint may have sped things up, but by and large the methods used to teach are virtually unchanged.

Courses are still taught largely by talking at students, assigning readings and homework, and then giving grades based on exams. Exams themselves are still mostly just forcing students to cram material for a couple days beforehand, then stuffing students in a room for two hours and making them answer random questions about the previous forty hours of lecture.

Why is this still the standard? How likely is it that, of all professions, how we teach people more or less peaked three hundred years ago? Why is it that the most common way of judging how well someone has learned is to cram them into a room and force them to recite things, and why on earth should the grading that results from that two hour test be worth up to 70% of their grade? The case has been made before that universities should get students focusing on learning how to learn, instead of what often seems to be the focus of getting students learning how to write exams, and I totally agree.

There have been some pretty exciting developments in expanding education options recently, though. For example, the Khan Academy has more than 3,300 video lessons and interactive exercises covering math all the way from preschool arithmetic to first-year calculus, which are free for anyone to take. The Academy also covers basic sciences, humanities, and finance.

If that isn't what you're looking for, why not learn a language? The BBC offers free courses on all the major European languages, and essential phrases for 40 languages. If you're interested in something less structured, free lecture videos on hundreds of topics can be found online by anyone who's really interested.

What's particularly cool, though, are the opportunities that are becoming available for a more formalized education. Recently, three of the biggest names in education (MIT, Berkeley, and Harvard) joined together to offer advanced university courses for topics ranging from solid state chemistry to artificial intelligence. These even offer 'certificates of completion' - certainly worth putting on a resumé, even if they don't have quite the same weight as an official transcript. Registration for the courses is open right now, and I strongly suggest you take a look at what's being offered.

The advantage of this new-found variety in fairly high-level education options is that it may (hopefully) end up pushing the envelope for education options in universities. Many post-secondary programs are starting to allow more open-ended education, such as the option to learn by correspondence or online, and with so much knowledge so freely available we may yet see a change in the 'classical' approach to lectures.

And for those of you who truly are here to learn how to learn, I strongly suggest taking a look at some of the links - I can guarantee there's something out there you'll find fascinating.