Wednesday, December 12, 2012

Predicting SU Elections

Recently, individual bloggers like Nate Silver and Éric Grenier have gained massive (deserved) notoriety for developing statistical models that prove to be very accurate in predicting the outcomes of major elections. If you haven't heard of them I strongly urge you to check them out!

At the end of the last SU election, I posted a very simple regression analysis of a short list of election statistics describing the executive elections on campus. Since then I've added to my model, and I have good reason to believe it's now much more accurate.

Spoiler alert: this post isn't going to have any spoilers. I'm not going to tell you any specific numbers. Sorry, potential candidates!

I considered a significant number of quantifiable parameters. I strictly chose to avoid anything subjective (like debate performance, quality of posters, how chatty they are when we hang out), and was able to break the parameters into three broad categories: popularity, experience, and campaigning.

Falling into these categories were measures like Facebook friends and interactions, number of years served on Students' Council or Faculty Associations, and amount of money spent or fines amassed during campaigning.

The coolest result of the analysis was the different impact of each factor. The lowest-weighted factors were the popularity factors (Facebook friends don't appear to translate very easily into votes), and the most important factors fell under the experience category. This is actually kind of reassuring, especially as it appears to suggest that the elections may be a tiny bit less of a popularity contest than normally thought!

The current analysis uses the results of 30 candidates running for 12 positions over two years (I skipped 2010/2011 because the lack of contested races really messed things up). While this is by no means a conclusive sample size, the fact that the results are so consistent, even between the two years individually, is really promising. Take a look at this graph:

The graph shows the relationship between the predicted number of first-round votes from the model and the actual number of first-round votes from the election. If the model were perfect, all the points would fall on a perfectly straight diagonal line. As it is, they fall on a pretty great line - out of an ideal coefficient of determination of 1.0, the model yielded a result of 0.945. Also, it correctly predicted the winner of each race, which isn't too shabby. I'm personally pretty happy with that result!
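
For anyone curious how those two checks work, here is a minimal sketch in Python. It is not the actual model or data: the vote counts and candidate labels are made up purely for illustration, and the helper names (r_squared, winners_match) are my own. It also assumes the coefficient of determination is computed as 1 - SS_res / SS_tot between predicted and actual votes. Fitting a trend line through the scatter and reporting its R² is a slightly different calculation that can give a different number.

```python
from collections import defaultdict


def r_squared(actual, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: how far predictions are from reality.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: how spread out the actual values are.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def winners_match(rows):
    """True if, in every race, the candidate with the most predicted
    votes is also the candidate with the most actual votes.

    rows: iterable of (race, candidate, predicted_votes, actual_votes)
    """
    by_race = defaultdict(list)
    for race, candidate, predicted, actual in rows:
        by_race[race].append((candidate, predicted, actual))
    return all(
        max(cands, key=lambda c: c[1])[0] == max(cands, key=lambda c: c[2])[0]
        for cands in by_race.values()
    )


# Hypothetical numbers for illustration only, NOT real election data.
rows = [
    ("President", "A", 1200, 1150),
    ("President", "B", 900, 980),
    ("VP Academic", "C", 700, 640),
    ("VP Academic", "D", 1100, 1180),
]
predicted = [r[2] for r in rows]
actual = [r[3] for r in rows]

print(round(r_squared(actual, predicted), 3))
print(winners_match(rows))
```

A value of exactly 1.0 would mean every prediction matched the actual vote count, which is the "perfectly straight diagonal line" case.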

So stay tuned during this year's election, because I'm going to try to use this model to predict some of the results. If that doesn't sound fun, then you need to work on your love of stats...
