Wednesday, June 30, 2021

Voting Patterns for Edmonton City Council's 2017-2021 Term

City Council, unlike other levels of government, doesn't rely on party systems for categorizing its members. That being said, there still can be, and in fact are, patterns in how members of council vote, and with the Open Data that's available on council voting records, these patterns can be examined.

There are a lot of different ways to visualize voting patterns, and I've played around with these before (see here and here - unfortunately, since most of the visuals for this blog relied on the now-dead Google Fusion Tables, there's really not much to see). I've settled on three favourite methods for the 2017-2021 Edmonton city council term - let's take a look!

First of all, as in previous years, I've disregarded all motions that were unanimous as they provide no particular differentiating information. That leaves the 2017-2021 term with 921 non-unanimous votes to examine (at time of writing).

The first pattern-finding method I like to use is to simply look at the success rates of each member of council. How often did a vote go the way they wanted it to? This can be a sign of consensus-building, or an indicator of work put in behind the scenes (perhaps at other committees), or potentially a matter of being a part of a majority bloc that tends to vote similarly:

While a direct comparison is perhaps unwise, these numbers in general follow the same pattern as my similar 2016 analysis. Of members of council who were present both years, Councillor Esslinger and Mayor Iveson are again the top two and Councillor Nickel is again the lowest. Councillors Walters, Knack, and Henderson are all within 5% of their 2016 results as well, with Councillor Caterina showing a slightly larger difference from before.

This of course is not intended to imply anything about the effectiveness of individual members of council, and is performed without a review of the motions themselves (whether they are procedural, multiple readings of the same bylaw, etc.).

Noteworthy from the last analysis was the result that Mayor Iveson had only 'lost' 17 votes out of 358 non-unanimous motions in the previous term. For comparison, at the time of writing, this number is now 94 votes.
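A minimal sketch of how a success rate like this could be computed. The record layout (one dict per motion, mapping member to "yes" or "no", with absent members left out) and the function name are assumptions made for illustration, not the format of the open data set. A motion is treated as carried when yes votes outnumber no votes, which glosses over any special majority rules.

```python
from collections import defaultdict

def success_rates(motions):
    """Fraction of non-unanimous motions where the outcome matched each member's vote.

    `motions` is a list of dicts mapping member name -> "yes" or "no";
    members who were absent are simply left out of that motion's dict.
    """
    wins = defaultdict(int)
    cast = defaultdict(int)
    for votes in motions:
        yes = sum(1 for v in votes.values() if v == "yes")
        no = len(votes) - yes
        if yes == 0 or no == 0:
            continue  # unanimous motions carry no differentiating information
        # Assumption: simple majority carries the motion, and a tie fails.
        outcome = "yes" if yes > no else "no"
        for member, vote in votes.items():
            cast[member] += 1
            if vote == outcome:
                wins[member] += 1
    return {member: wins[member] / cast[member] for member in cast}

# Made-up example with three hypothetical members.
example = [
    {"A": "yes", "B": "yes", "C": "no"},
    {"A": "no", "B": "yes", "C": "no"},
    {"A": "yes", "B": "yes", "C": "yes"},  # unanimous, skipped
]
rates = success_rates(example)
```

Counting only the motions a member actually voted on keeps absences from dragging a rate down.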

A second pattern-finding visualization is how often members of council agree with each other. For the 2017-2021 term so far, that is:

The result from this analysis shows that a group of six members of council agree with each other more than 80% of the time across all pairings, and that a seventh member (Councillor Henderson) is just outside with a 79% minimum agreement rate (with Councillor Hamilton). With a council size of 13, seven members is a winning majority on most motions. Certainly, there is a correlation between the top six council vote winners and this group of six members of council - whether this group is ideologically similar or just more likely to compromise and build consensus is beyond the scope of this analysis though!
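The pairwise agreement rates could be tallied along these lines. This is a sketch only: the input layout (one dict of member to vote per non-unanimous motion) and the names are hypothetical, and a pair is only compared on motions where both members voted.

```python
from itertools import combinations

def agreement_rates(motions):
    """Return {(a, b): share of shared motions on which a and b voted the same way}."""
    shared = {}
    agreed = {}
    for votes in motions:
        # sorted() gives each unordered pair a single, stable key.
        for a, b in combinations(sorted(votes), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1
            if votes[a] == votes[b]:
                agreed[(a, b)] = agreed.get((a, b), 0) + 1
    return {pair: agreed.get(pair, 0) / n for pair, n in shared.items()}

# Made-up example with three hypothetical members.
example = [
    {"A": "yes", "B": "yes", "C": "no"},
    {"A": "no", "B": "yes", "C": "no"},
]
pair_rates = agreement_rates(example)
```

A "minimum agreement rate" for a group is then just the smallest value among the pairs inside that group.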

A third and final pattern-finding visualization that I quite like is adapted from the NOMINATE system used to scale members of the United States Congress. It is intended to represent ideological similarities and differences between members of council in a spatial manner - members closer to each other agree more, and further apart agree less frequently:

I'd like to stress at this moment that, as it's often tough to assign traditional political ideologies to city council bylaw amendments, this graph does not necessarily represent traditional 'left vs right wing' traits, nor traditional 'authoritarian vs libertarian' traits. The results of the graph are intended to model councillors as though their decisions are made based solely on two non-correlated factors, and the model above is oriented with the most significant factor aligned along the x-axis. 

It's totally cool if you want to stop now, but I actually really love this model system and I want to talk about it a bit more since it gained some interest when I did this for London. Effectively, the NOMINATE system models both councillors and motions along the two axes, then assigns a probability of each councillor voting one way or another based on the relative proximity to each "side" of a debate. The algorithm then iterates thousands of times, tweaking the positions of each councillor and motion in such a way as to optimize the probabilities of each decision.
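To make the idea concrete, here is a rough sketch of a proximity-based vote probability and the quantity an iterative fit would try to improve. It is a simplified logistic stand-in, not the actual NOMINATE utility function and not necessarily what was used for the plots here. The `sharpness` parameter, the function names, and the data layout are all assumptions, and coordinates are assumed to stay roughly within -1 to 1.

```python
import math

def prob_yes(member, yes_point, no_point, sharpness=5.0):
    """Probability of a yes vote given 2D positions.

    The closer the member sits to the "yes" point relative to the "no"
    point, the higher the probability. `sharpness` is a hypothetical
    tuning parameter controlling how quickly the odds change.
    """
    d_yes = math.dist(member, yes_point) ** 2
    d_no = math.dist(member, no_point) ** 2
    return 1.0 / (1.0 + math.exp(-sharpness * (d_no - d_yes)))

def log_likelihood(positions, motions, votes):
    """Score a full set of positions; an iterative fit would nudge positions to raise this.

    positions: member -> (x, y)
    motions:   motion id -> (yes_point, no_point)
    votes:     list of (member, motion id, voted_yes)
    """
    total = 0.0
    for member, motion, voted_yes in votes:
        yes_point, no_point = motions[motion]
        p = prob_yes(positions[member], yes_point, no_point)
        total += math.log(p if voted_yes else 1.0 - p)
    return total

# A member exactly halfway between the two sides is a coin toss.
midpoint = prob_yes((0.0, 0.0), (1.0, 0.0), (-1.0, 0.0))
leaning = prob_yes((0.5, 0.0), (1.0, 0.0), (-1.0, 0.0))
```

Anyone sitting on the line midway between the "yes" and "no" points gets exactly 50%, and the probability climbs as they move toward the "yes" side.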

The net result is that, using only two dimensions, this implementation of the NOMINATE algorithm currently assigns the correct vote to each councillor 93.4% of the time. While to some of you this may not seem perfect, a model that reduces the complexity of council decisions to two factors with over 90% accuracy is something I'm quite astounded by and happy with.

For instance, last week's vote to end the mask mandate effective July 1st broke down like this based on the model. Here, the orange coloring indicates 'voted no', and blue indicates 'voted yes', with clear circles for the locations of the decision-points:

Here, the percentages are the model's prediction of the odds of each councillor voting the way they did. The "yes" and "no" points are shown, and the dashed line indicates the mid-way point between the two positions. In this case, the model managed to accurately capture each member's vote (where accuracy here is defined by a yes vote with more than 50% probability, or a no vote with less than 50% probability). The probability doesn't necessarily reflect the difficulty a given member of council had in making their decision, and is more of a measure of the accuracy of the model.
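The accuracy definition above boils down to a simple count. A sketch, assuming each prediction is stored as a hypothetical (probability of yes, actually voted yes) pair:

```python
def classification_accuracy(predictions):
    """Share of votes the model calls correctly.

    predictions: list of (prob_yes, voted_yes) pairs. A vote counts as
    correct when a yes vote had more than 50% probability of yes, or a
    no vote had less than that. An exact 50% is treated as a predicted
    no here, which is an arbitrary choice for this sketch.
    """
    correct = sum(1 for p, voted_yes in predictions if (p > 0.5) == voted_yes)
    return correct / len(predictions)

# Made-up example: two correct calls, two misses.
accuracy = classification_accuracy(
    [(0.9, True), (0.2, False), (0.6, False), (0.4, True)]
)
```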

By looking at all votes together, the model slowly homes in on the best placement for each member of council. Not all votes are as clear-cut as this one - for instance, the vote on the solar power plant at EL Smith looked like this:

You can see here that the model was very close with councillor McKeen, and effectively swapped Caterina and Dziadyk. Again, as the model is probabilistic this doesn't mean it got these 'wrong', more that having these councillors and decision points in these locations is optimized over the entire term.

It's not a perfect model, but again I'm quite pleased with how accurately it is able to capture the voting term in only two dimensions!

So that's it - three different ways to look at the data, showing different aspects of what can be learned from it!

Monday, June 28, 2021

Which Edmonton City Councillor are you?

I've done this before, and had so much fun with it that I'm happy to once again present: 

A Buzzfeed-style quiz to get you more in touch with your elected representatives!

(it's totally ok if that doesn't excite you as much as it excites me)

Without further ado, here is a quiz for you to play around with. All decision points in the quiz are pulled from real votes in the 2017-2021 city council term, with information and sources provided.

Hopefully that was fun!

Like I said, I've done this before for Edmonton and London, and London was far more excited about it. The work that goes into these is an interesting mix of politics, whimsy, and data work.

The first step is to analyze the City of Edmonton open data set for Votes and Proceedings. For no discernible reason, the data set this term is inconsistent: partway through, it changes how votes are recorded, as well as how councillors are named. It's not particularly tricky to deal with, but it did have to be massaged a bit to be in a consistently usable format.

For this quiz, there's not much point in looking at unanimous procedural votes, so I focused on the 921 (at time of writing) non-unanimous votes. In an ideal world, a set of yes-no choices should require four or fewer questions in order to neatly sort into 13 possible answers (assuming approximately even splitting at each decision point). However, it's much more interesting and easy to answer the quiz when the questions are relevant and engaging. 
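The "four or fewer questions" figure comes from repeatedly halving the field: 13 possible answers need at most ceil(log2(13)) = 4 yes/no questions when every question splits the remaining councillors about evenly. A small sketch, with function names that are purely illustrative:

```python
import math

def min_questions(n_outcomes):
    """Fewest yes/no questions needed if each one splits the field as evenly as possible."""
    return math.ceil(math.log2(n_outcomes))

def split_balance(yes_count, no_count):
    """0.5 is a perfectly even split; values near 0 barely narrow things down."""
    return min(yes_count, no_count) / (yes_count + no_count)

questions_for_council = min_questions(13)
```

A 6-7 vote is close to an ideal quiz question by this measure, while a 12-1 vote only ever singles out one councillor.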

Most of the examples I chose for this quiz have news stories attached, which in my mind was a sign that I'd found adequately interesting votes to base this on. As a result, a user of the quiz can get to a councillor with anywhere from three to five questions, which I was satisfied with.

Hopefully you are too, because at one point in the design of this quiz one of the leading optimal votes was "That City Council waive the rules on providing notice of motion as set out in section 32 of Bylaw 18155 - Council Procedures Bylaw to allow Councillor S. Hamilton to make a motion without notice regarding the aerial mosquito program." It would've made things work so well but, well, it's hard to really care about it.

Each of the final results in the quiz genuinely leads to the member of City Council who voted in the same unique way as the answers you provided. One assumption was made: while Mike Nickel did not vote on his own censure, I assumed that he would have voted no if he had been forced to.

Hope you had fun!

Saturday, April 27, 2019

Which London city councilor are you?

Open data can be used for a lot of things, and public meeting minutes of elected representatives are crucial in holding representatives accountable, ensuring they represent their constituents, and promoting honesty and efficiency in our government.

Or they can be used to make Buzzfeed style personality quizzes. That's what I did.

We've now hit a point in the City Council meeting minutes from this council so far where all councillors have disagreed with each other on interesting votes at least once, which allows us to strongly differentiate between them. By presenting some of these votes, we can narrow down a few key motions that separate all the councillors, and present it in a Classification Chart. Since that's not as fun as a quiz, though, here it is in quiz format.
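One way to narrow the field down to a few separating motions is a greedy search: keep adding whichever motion breaks up the most still-tied councillors. This is a sketch of that general idea, not necessarily how the quiz was built. The data layout and names are hypothetical, and every councillor is assumed to have a recorded vote on every candidate motion.

```python
def split_groups(groups, records, motion):
    """Split each group of still-tied members by how they voted on `motion`."""
    result = []
    for group in groups:
        yes = [m for m in group if records[m][motion] == "yes"]
        no = [m for m in group if records[m][motion] != "yes"]
        result.extend(part for part in (yes, no) if part)
    return result

def greedy_pick(records, motion_ids):
    """Pick motions until every member has a unique voting pattern (or nothing helps)."""
    chosen = []
    groups = [list(records)]
    while any(len(group) > 1 for group in groups):
        best = max(motion_ids, key=lambda mid: len(split_groups(groups, records, mid)))
        new_groups = split_groups(groups, records, best)
        if len(new_groups) == len(groups):
            break  # no remaining motion separates the tied members
        chosen.append(best)
        groups = new_groups
    return chosen

# Made-up example with three hypothetical councillors and two motions.
records = {
    "A": {"m1": "yes", "m2": "yes"},
    "B": {"m1": "yes", "m2": "no"},
    "C": {"m1": "no", "m2": "no"},
}
picked = greedy_pick(records, ["m1", "m2"])
```

A real quiz would also weigh how interesting each motion is, which no greedy count can capture.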

Share widely, and tell me who you got! (It may take a second to load)

Monday, April 22, 2019

Alberta 2019 Election Post-mortem

Well that was fun!

How did I do?

For more than a year now I've been tracking Alberta election polls with the hope of developing a reasonably accurate prediction model. Overall, I'm happy to report that the party I predicted in the lead won in 80 out of 87 races, and my riding qualifiers broke out as follows:

  • "Solid" lead: 65/65 (100%)
  • "Likely" lead: 12/15 (80%)
  • "Lean" lead: 2/5 (40%)
  • "Toss-up" edge: 1/2 (50%)
I think this is a decent proof of concept, small "lean" sample size notwithstanding, and I want to talk a bit about what went right and what went wrong, and how I can improve if I want to keep doing this sort of thing.

First of all, the polls leading up to election day didn't turn out to be too accurate. Take a look at the province and regional splits:

Edmonton was remarkably accurate, Calgary was close, but the rest of the province and the top line results were off significantly. This is possibly a cause for concern, as it could suggest that my model was taking inaccurate data as inputs but then claiming credit for an accurate output, which it wasn't designed to do.

The NDP ended up under-performing relative to their polling numbers, and likely the only reason this didn't mess up too many election prediction models is that they under-performed mostly in areas like rural Alberta, where they were predicted to lose anyway. If the polls had been that wrong about the NDP in Edmonton, say, the predictions could have been far worse.

Similarly, my model and others like it likely wouldn't have fared too well if the NDP had over-performed their polling rather than under-performed. The same amount of polling error as actually occurred, applied in the other direction, could have had the NDP win the popular vote across the province.

My takeaway from this is that I need to adjust my topline polling tracker. Right now it runs under the implicit assumption that errors in individual polls will cancel each other out. This seemed reasonable given that polls are produced by different companies with different methods. That led to my full Alberta tracker having a narrow confidence interval for the NDP in particular, though, as several polls in a row provided the same result. If I instead make the assumption that at least part of the polling error is correlated between polls, perhaps due to something beyond the pollsters' control, then the final result from election night would have still been a surprise, but far less of one. Certainly something I'll take into account next time.
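The effect of correlated errors on a poll average can be sketched with the standard formula for the variance of a mean of equally correlated quantities. The numbers below are purely illustrative and are not taken from the tracker. `sigma` (the error of a single poll) and `rho` (the correlation between any two polls' errors) are assumed parameters.

```python
import math

def std_of_poll_average(sigma, n_polls, rho=0.0):
    """Standard deviation of the average of n polls.

    Each poll has standard error `sigma`, and every pair of polls shares
    an error correlation of `rho`. With rho = 0 the errors cancel out as
    1/sqrt(n); with rho = 1 averaging does not help at all.
    """
    variance = sigma ** 2 * (1 + (n_polls - 1) * rho) / n_polls
    return math.sqrt(variance)

independent = std_of_poll_average(3.0, 9, rho=0.0)        # errors fully cancel
partly_shared = std_of_poll_average(3.0, 9, rho=0.5)      # half the error is shared
fully_shared = std_of_poll_average(3.0, 9, rho=1.0)       # no benefit from averaging
```

Even a moderate shared component keeps the uncertainty of the average well above what the independent-errors assumption suggests.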

Other Metrics

Overall on a riding-by-riding level, I had an error of 6.4% vote share. That's not superb, but also not far from what my testing beforehand suggested, and was factored into my uncertainty. Comparing my final projection to actual results on election night doesn't look too bad:

If we ignore the Alberta Party and the Liberals, this leads to an overall R-squared value of 0.79, which I consider respectable. It's handy to ignore the low parties because they don't have much of a spread, and will skew the coefficient of determination calculation.
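For reference, a coefficient of determination can be computed as follows. This assumes R-squared here means 1 - SS_res/SS_tot between projected and actual vote shares. A squared correlation from a fitted trend line would give a somewhat different number, and the post doesn't say which was used. The sample values are made up.

```python
def r_squared(predicted, actual):
    """Coefficient of determination of predictions against actual values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up vote shares (in percent) for three hypothetical ridings.
score = r_squared([10, 20, 30], [12, 18, 33])
```

Because SS_tot is the spread of the actual values, a party whose results barely vary from riding to riding has a tiny denominator, so even small misses can swing its score wildly.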

Very fortunately for me, if I input the final actual regional results as though they were a poll result, my model does improve. This is a good hint that my model is behaving decently, especially so since this hasn't been the case with all other forecasters.

With the correct Calgary, Edmonton, and Rural results input as large polls, my model improved to 83/87 seats correctly predicted and an R-squared for party support per seat of 0.91. Very encouraging - too bad the polls weren't more correct!

Finally, I also provided an expected odds of winning each seat for each party. It's one thing to count a prediction as a success if you give it 100% odds of winning and it comes true, but how does one properly score oneself in the case of Calgary-Mountain View, where I gave the Liberals (10.8%), UCP (16.2%) and NDP (73%) different odds of winning, and only one (NDP) did?

In this case I've scored each riding using a Brier score. A score of 0 means a perfect prediction (100% to the winner and 0% predicted for all losers), a score of 1.0 means a perfectly wrong prediction (100% to one of the losers), and because of the math, a score of 0.19 for a complete four-way coin toss (I only predicted the four parties represented in the debate).

Overall, I scored a 0.027, which is considerably better than just guessing. It's hard to get an intuitive sense of what that score really means, but it's mathematically the same as assigning an 83.5% chance of something happening and having it come true. Not a bad prediction, but there's room to be sharpened.
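A sketch of a per-riding Brier score, assuming the squared errors are averaged over the four parties. That normalization is an assumption chosen because it reproduces the 0.19 figure for a four-way coin toss; other conventions scale the score differently, so the extremes may not line up exactly with the ones quoted above.

```python
def brier_score(probabilities, winner):
    """Mean squared error between forecast probabilities and the 0/1 outcome.

    probabilities: party -> forecast probability of winning the riding
    winner:        the party that actually won
    """
    return sum(
        (p - (1.0 if party == winner else 0.0)) ** 2
        for party, p in probabilities.items()
    ) / len(probabilities)

coin_toss = brier_score({"UCP": 0.25, "NDP": 0.25, "AP": 0.25, "LIB": 0.25}, "NDP")
perfect = brier_score({"UCP": 0.0, "NDP": 1.0, "AP": 0.0, "LIB": 0.0}, "NDP")

# The single-event comparison: an 83.5% forecast that comes true.
single_event = (1 - 0.835) ** 2
```

The last line is the source of the 83.5% equivalence: the squared miss on one event forecast at 83.5% is about 0.027.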

How did I stack up?

So like I said, there were a lot of us predicting the election this time around. I've tried to find as many as I can, and I apologize profoundly if I've missed anyone. I've only included forecasts that had either a vote breakdown per seat or anticipated odds of winning each seat for comparison purposes.

I've reported on three main measures (seat accuracy, R-squared per seat, and prediction Brier score), and I'll present as many of those for each forecaster as I was able to determine. Different forecasters win at different categories, so it's not necessarily a clear picture as to which one of us is the "best", so I'll mostly leave room here for interpretation:

I'm not claiming to be the second best, but it's important to note that being best in one measure doesn't necessarily mean best overall. There are also harder-to-evaluate measures in play here - for instance VisualizedPolitics and TooClosetoCall allow you to input poll values to see reactions for yourself, and both improved when given more accurate data (VisualizedPolitics also got to 83 seats accurately predicted, though still with a low R-squared value).

338Canada probably rightly can claim to have been the strongest this time around, but given the polling errors we were faced with I think it'll take several more elections to determine if anyone is really getting a significant edge consistently. This isn't the first time we've compared ourselves to each other, and I think it's an important exercise in evaluating our own models and whether there's a need for more.

Thursday, October 25, 2018

London Instant Runoff Breakdown

London (Ontario) just had its first election using instant-runoff balloting. As I've mentioned before, I'm very interested in different forms of electoral reform, so as a new resident of London I was intrigued as to how the vote would work out.

London's system is a bit unusual inasmuch as voters can only rank their first three choices, but otherwise follows a pretty classic Instant Runoff system. Many of the elections resulted in first round winners, and therefore don't have a lot of room for fun analysis, but some of them went deeper and I thought it might be fun to show how they progressed in a Sankey diagram!
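For anyone unfamiliar with the mechanics, here is a bare-bones instant-runoff count. It is a sketch under stated assumptions rather than London's actual counting rules: one candidate is eliminated per round, the winning threshold is a majority of the ballots still in play, and ties for last place are broken by name order. The ballots are made up, and the function assumes at least one non-empty ballot.

```python
from collections import Counter

def instant_runoff(ballots):
    """Run an instant-runoff count.

    ballots: list of ranked candidate lists (up to three names each,
    to mirror the three-choice limit). Returns (winner, rounds), where
    each round is (tally for that round, number of exhausted ballots).
    """
    remaining = sorted({c for ballot in ballots for c in ballot})
    rounds = []
    while True:
        tally = Counter({c: 0 for c in remaining})
        exhausted = 0
        for ballot in ballots:
            # A ballot counts for its highest-ranked candidate still in the race.
            choice = next((c for c in ballot if c in remaining), None)
            if choice is None:
                exhausted += 1  # every ranked candidate has been eliminated
            else:
                tally[choice] += 1
        rounds.append((dict(tally), exhausted))
        leader, top = tally.most_common(1)[0]
        continuing = len(ballots) - exhausted
        if 2 * top > continuing or len(remaining) == 1:
            return leader, rounds
        remaining.remove(min(remaining, key=lambda c: tally[c]))

# Made-up ballots for three hypothetical candidates.
ballots = [["A"]] * 5 + [["B", "A"]] * 4 + [["C", "B"]] * 2 + [["C"]]
winner, rounds = instant_runoff(ballots)
```

The ballot that ranked only the eliminated candidate is the kind that ends up exhausted, which is why the exhausted pile grows once voters run out of ranked choices.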

First of all, here's Ward 5 (my ward!):

As with all of the following, the leader in the first round ultimately ended up winning. Due to the lack of ability of voters to rank more than three candidates, the number of exhausted votes tends to grow quite quickly after the third round. Interesting patterns include the large number of Clarke supporters moving to Cassidy, and the relatively large number of Knott supporters preferring Warden over Cassidy at the end.

Ward 8
This race ended closer than it began, and likely didn't see any change in leader throughout the race due to the lack of strong trends in down-ballot rankings. 

Ward 9
This race ended quite quickly, with Hopkins getting more than 50% of the vote by the third round after preferential support from Charlebois' supporters.

Ward 12

Similar to Ward 9 - disproportionate support from Mohamed's voters to Peloza secured a win in the fourth round.

Ward 13
One of the tighter races of the election. Kayabaga drew large support from Warren and Hughes supporters, whereas Fyfe-Millar drew more support from Wilbee and Lundquist voters.

Ward 14

Pretty straightforward - along with being the top first choice, Hillier was the preferred alternate for both Tipping and Swalwell's voters leading to a more secure finish than start.


(Click to zoom and enhance!)

This one was far more lopsided than all the others. In the early rounds of voting, there was a small amount of jostling for positions 7-9 in the rankings, but apart from that no real changes occurred until Cheng's elimination. No abnormally strong trends in down-ticket voting occurred, though, so Holder held on through to the end.

The city clerk has promised more detailed information to come out soon, so stay tuned for further analysis!

Monday, September 17, 2018

London City Council

Wow it's been a while since my last post. My apologies!

A principal reason for this is that I've moved - I'm no longer an Edmontonian, and am now a Londoner! London, Ontario, that is. This almost definitely doesn't mean I'll stop posting about Edmonton, but it does mean that I'll be increasing my Ontario content.

London is currently in the midst of a civic election, so like any good new citizen to a city my first thought was to learn as much about the current council as I can so that I can make as informed a decision as possible. London's open data is pretty good, but their votes and proceedings aren't organized quite as well as Edmonton's are.

Nonetheless, with the votes and proceedings that are available, I thought to take a look at council relationships in London in a similar way to how I did in Edmonton two years ago.

Unanimous votes aren't interesting, so I've focused this analysis on the 638 non-unanimous roll call votes as recorded in meeting minutes. First of all, let's take a look at how often each councillor agrees with each other:

Matt Brown is the mayor, and currently enjoys at least 70% agreement with 11 out of 15 councillors, which isn't too shabby. In general, there appears to be a mild bloc of six people (Brown through Park) who all agree quite strongly with each other, another similar bloc (Park through Hubert) who do the same, and then a handful of councillors who seem to go their own way.

Another sign of consensus-building on city council is the frequency that each member of council has the outcomes of votes in line with how they voted. Again, looking only at non-unanimous votes:

The mayor has been on the losing side of 51 votes out of 610 in which he's been present or not recused, which suggests a reasonable level of consensus building (though not quite as high as Iveson in Edmonton).

If we plot a graph of councillors, and connect them only if they agree at least 67% of the time, we get the following:

The cut-off here was chosen in order to include councillor Turner while still highlighting differences in agreement rates. Unsurprisingly, councillors Turner, Helmer, and Squire are relative outsiders, with a strong cluster of the six councillors mentioned before in the center. Also, this type of graph is incredibly satisfying to play with - enjoy at your own risk!
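Building the edge list for a graph like this is a one-line filter over the pairwise agreement rates. A sketch with made-up rates for hypothetical members (the layout of `agreement` is an assumption):

```python
def agreement_edges(agreement, threshold=0.67):
    """Return the pairs whose agreement rate meets the cut-off.

    agreement: dict mapping (member_a, member_b) -> agreement rate (0 to 1).
    """
    return sorted(pair for pair, rate in agreement.items() if rate >= threshold)

# Made-up rates, for illustration only.
edges = agreement_edges({("A", "B"): 0.85, ("A", "C"): 0.67, ("B", "C"): 0.55})
```

Lowering the threshold pulls outsiders into the graph at the cost of making the central cluster look like one undifferentiated blob, which is the trade-off behind picking the cut-off.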

While showing relative outsiders, this plot doesn't really demonstrate any significant voting blocs. Another way to present the same data is to only connect members of council to whoever they agree with the most often. Doing that results in the following:

Here we get a more interesting structure. Nearly as many people agree more often with councillor Zaifman than Mayor Brown, though there are no separated islands of voting blocs. Only two members of council agreed with each other the most mutually, Matt Brown and Maureen Cassidy, an observation that is provided without further commentary.
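The "connect everyone to whoever they agree with most" version, along with the mutual pairs, could be derived like this. Again a sketch with invented rates and hypothetical names, and ties are resolved by whichever pair is seen first.

```python
def closest_ally(agreement):
    """Map each member to the member they agree with most often.

    agreement: dict mapping unordered pairs (a, b) -> agreement rate.
    """
    best = {}
    for (a, b), rate in agreement.items():
        for member, other in ((a, b), (b, a)):
            if member not in best or rate > best[member][1]:
                best[member] = (other, rate)
    return {member: ally for member, (ally, _) in best.items()}

def mutual_pairs(allies):
    """Pairs where each member is the other's closest ally."""
    return sorted(
        {tuple(sorted((m, a))) for m, a in allies.items() if allies.get(a) == m}
    )

# Made-up rates, for illustration only.
rates = {
    ("A", "B"): 0.9, ("A", "C"): 0.8, ("A", "D"): 0.6,
    ("B", "C"): 0.7, ("B", "D"): 0.5, ("C", "D"): 0.75,
}
allies = closest_ally(rates)
pairs = mutual_pairs(allies)
```

Since every member points at exactly one other, mutual pairs are naturally rare, which fits with only one showing up on council.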

The last way I'll look at voting patterns is to scale them using a variant of NOMINATE. This method was developed for analyzing US Congress voting patterns, and can assign voting members to a political spectrum without needing to know what the bills being voted on were. For more information, this link is a fascinating read.

Obviously a city council is going to be less partisan than a parliamentary system, but the relative placement of councillors on the graph correlates with how often they agree or disagree with each other, as well as an approximate alignment on issues. I'll detail how this was developed in a subsequent post, but the short version is that each vote is also given a numerical position, and councillors who are closer to the "yes" vote than the "no" vote are assigned probabilities to vote either way. This is then trained against the actual vote data, and thousands of iterations of machine learning later we get this distribution.

Hopefully this has been an interesting glimpse into London city council. Have a fun election!

Friday, June 8, 2018

Ontario Election Wrap-up

The 2018 Ontario General Election is over, and if your team won then congratulations to you!

Over the last month or so I've been tracking the election polls and testing out a few different ideas in order to improve a general model that I'll end up using for the upcoming Alberta election. Of course, I wasn't the only person doing this, and I was able to find at least six other sites tracking and projecting alongside.

But who did the best? Can we learn anything specific about which models produce more reliable results?

First of all, we can look at seat projections. As far as I could tell by mid-day June 7th, this was the seat projection distribution between the seven of us:

Forecaster            PC     NDP    LIB   GRN   OTH
CBC                   78     45     1     0     0
Too Close to Call     74     46     3     1     0
QC125                 70     47     6     1     0
Lispop                69     50     4     1     0
Teddy on Politics     60     55     8     1     0
Calculated Politics   71     44     8     1     0
Extreme Enginerding   70     45     9     0     0
Average               70.3   47.4   5.6   0.7   0
Actual                76     40     7     1     0

Ranking these by the root sum of squares difference from the actual results, we get:

  1. Calculated Politics (diff: 6.48). Their method involved seat-by-seat projections, suggesting a regional breakdown that seemed to work pretty well for them!
  2. Too Close to Call (diff: 7.48). They also provided seat-by-seat projections, and had regional factors involved to project those. Also, most handily, their simulator was interactive, but putting the correct values into it actually made their predictions slightly worse (still second place at 7.87 though).
  3. (Tie: CBC and Me) (diff: 8.12). We ended up with the same predictions for the NDP, but CBC was way under for the Liberals and I was quite a bit under for the PCs. My model didn't involve individual seat projections and instead just approximated historical trends for seat ranges based on party vote share, so that's a win for simplicity I suppose.
  4. QC125 (diff: 9.27). Another site with seat-by-seat projections. The actual seats fell well within their expected ranges, but were all off by a little bit. I'm unsure how they came up with the seat vote projections.
  5. Average (diff: 9.48). In this case, the wisdom of the crowds didn't pan out. 
  6. Lispop (diff: 12.57). Hypothetically they used a regional swing model similar to mine, so I'm not quite sure where the difference comes from here. It looks like they anticipated a much higher NDP voter base than actually happened.
  7. Teddy on Politics (diff: 21.95). It seems like Teddy paid more attention to leader favorability numbers than most of the rest of us, and that seems to have tilted the seat distribution against him. His was the only model to predict a minority government.
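The ranking above can be reproduced directly from the seat projections. The sketch below takes the root sum of squares to be the square root of the summed squared seat differences across parties. The projection numbers are the ones listed in this post; the variable and function names are just for illustration.

```python
import math

ACTUAL = {"PC": 76, "NDP": 40, "LIB": 7, "GRN": 1, "OTH": 0}

PROJECTIONS = {
    "CBC": {"PC": 78, "NDP": 45, "LIB": 1, "GRN": 0, "OTH": 0},
    "Too Close to Call": {"PC": 74, "NDP": 46, "LIB": 3, "GRN": 1, "OTH": 0},
    "QC125": {"PC": 70, "NDP": 47, "LIB": 6, "GRN": 1, "OTH": 0},
    "Lispop": {"PC": 69, "NDP": 50, "LIB": 4, "GRN": 1, "OTH": 0},
    "Teddy on Politics": {"PC": 60, "NDP": 55, "LIB": 8, "GRN": 1, "OTH": 0},
    "Calculated Politics": {"PC": 71, "NDP": 44, "LIB": 8, "GRN": 1, "OTH": 0},
    "Extreme Enginerding": {"PC": 70, "NDP": 45, "LIB": 9, "GRN": 0, "OTH": 0},
}

def root_sum_of_squares(projection, actual):
    """Square root of the summed squared seat differences across parties."""
    return math.sqrt(sum((projection[p] - actual[p]) ** 2 for p in actual))

diffs = {name: root_sum_of_squares(seats, ACTUAL) for name, seats in PROJECTIONS.items()}
ranking = sorted(diffs, key=diffs.get)
```

Squaring the differences means one badly missed party hurts far more than several small misses, which is why a big NDP overestimate drops a forecast so far down the list.
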
For most of the models, the seat projections came directly from the popular vote estimates. If we take a look at those, we get:

[Table: popular vote estimates for CBC, Too Close to Call, QC125, Lispop, Teddy on Politics, Calculated Politics, Extreme Enginerding, Average, and Actual; values not preserved]

Ranking these again by the same criteria we get:

  1. Me! (diff: 1.15) 
  2. CBC (diff: 2.69)
  3. Average (diff: 3.21) This is a better example of the group as a whole performing better than most individual members. This also probably makes sense as these numbers would have come mostly from the same pool of publicly available polls with a small amount of interpretation for trends and recency, as opposed to a large amount of interpretation as in the case with seat projections.
  4. Calculated Politics (diff: 3.29)
  5. Too Close to Call (diff: 3.56)
  6. QC125 (diff: 3.73)
  7. Lispop (diff: ~4.3) Note that Lispop didn't list their prediction for the green party vote total, despite projecting them to win a seat.
  8. Teddy on Politics (diff: 4.37)
Overall I'm really pleased with how I did, and I've learned a few tricks to use in upcoming elections. Next up will probably be Québec, hopefully with the same group of people, and we can see if this was a fluke for me or not!

Finally, here's my seat model with the actual results input as though they were one final gigantic poll at the end. Using these correct values would have resulted in the model being the most accurate seat projection of them all (diff: 4.24), which is an encouraging sign that the model itself was sound!

See you next election!