Monday, June 26, 2017

Edmonton City Council Gender Parity

Back in October I took a quick look at the success rates of female candidates getting into city council. In 2013, 22% of candidates were female, but only one out of the twelve council seats ended up being held by a woman. The aim of that post was to investigate some of the sources of the gender disparity on council - namely whether the distribution of female candidates in different races was causing the issue, or whether there was an inherent bias against female candidates.

Ultimately, I determined that the relative lack of successful female council winners was more likely due to the distribution of candidates across races than to individual bias - without accounting for incumbency, there was no evidence of anything other than roughly equal chances of winning between female and male candidates (i.e., the number of female winners since 2004 is more or less what you'd expect assuming all candidates are equally likely to win).

That was a pretty positive sign, as it suggests that the biggest factor holding back a demographically-balanced council is the availability of under-represented candidates to run (which is totally outside of the scope of this blog to discuss), and perhaps more importantly, the avoidance of clumping of under-represented demographics into the same few races.

One of the biggest issues with the 2013 election was that five wards had no women running at all, and half of all women were clustered into two wards. This drastically reduced the expected number of women elected to council, regardless of the relative proportion of candidates who put their names forward.

So with all that said, I've been keeping track of the candidates for the 2017 civic election, using the list maintained at Daveberta. For each candidate, I've tried to ascertain their gender, in order of preference: by how they refer to themselves (political candidates love speaking in the third person), by how they're referred to in third-party posts, and, if all else fails, by name and presentation assumptions. If you notice any errors, please let me know.

(Last updated July 14, 2017)

Based on the current 56 candidates, 21 are female and 35 are male (female ratio of 37.5%, up from 22% in 2013). However, based on the distribution between wards, an expected 4.03 seats will be won by female candidates, which could be considered a relatively inefficient allocation given the ratio of candidates. Some wards have no women running at all, and Ward 5 on its own has nearly a quarter of all female candidates.

Overall, it's most likely that the number of female councillors after the election will be between 2 and 6 (90% confidence).
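As a rough sketch of how an expected seat count and a 90% range can fall out of ward-by-ward candidate lists: if every candidate in a race is equally likely to win, the chance a woman wins that race is simply the share of women running in it, and the seat total follows a Poisson-binomial distribution. The ward counts below are made up for illustration (they are not the real 2017 list), and the function names are my own.

```python
def seat_distribution(share_by_race):
    """Poisson-binomial pmf: dist[k] = P(exactly k races are won by a woman)."""
    dist = [1.0]
    for p in share_by_race:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)      # a man wins this race
            new[k + 1] += prob * p        # a woman wins this race
        dist = new
    return dist


def central_interval(dist, coverage=0.90):
    """Range of seat counts left after trimming (1 - coverage) / 2 from each tail."""
    tail = (1 - coverage) / 2
    cum, low = 0.0, 0
    for k, prob in enumerate(dist):
        if cum + prob > tail:
            low = k
            break
        cum += prob
    cum, high = 0.0, len(dist) - 1
    for k in range(len(dist) - 1, -1, -1):
        if cum + dist[k] > tail:
            high = k
            break
        cum += dist[k]
    return low, high


# Hypothetical (women, total candidates) per ward; NOT the real candidate list.
wards = [(0, 4), (1, 3), (2, 5), (5, 8), (1, 4), (0, 3)]
shares = [women / total for women, total in wards]

expected = sum(shares)                    # expected number of seats won by women
dist = seat_distribution(shares)
low, high = central_interval(dist)
print(f"expected seats: {expected:.2f}, 90% range: {low} to {high}")
```

Clustering shows up directly in this arithmetic: five women in one ward can win at most one seat between them, and wards with no women contribute nothing to the total.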

Edmonton Council (38% female candidates)

Edmonton Catholic School Board (75% female candidates)

Edmonton Public School Board (43% female candidates)

I'll keep this updated as the campaign continues! Tune in for future updates. Again, this does not take into account extra factors such as experience or incumbency, and is only for entertainment purposes and whatnot.

Tuesday, May 9, 2017

Next Game Wins?

(Subtitle: Which Game Should You Win? Part 3)

Three years ago, my friend Andrew pitched in to the blog and asked which game in the playoffs was most worth winning. The results were a bit inconclusive, but from it he developed a database of all playoff outcomes since 1943, so a year later I looked at the dataset again and developed Markov-style chains of playoff odds based on different positions in the playoffs.

Now that it's playoff season again, people are naturally more interested in hockey than normal. I recently overheard someone comment that, though a series was currently at 2-1 for wins, the next team to win was undoubtedly going to win the series.

Good thing I have this handy database of all playoff outcomes ready, because that immediately intrigued me as to how likely it actually is that, at any given point, the next team to win a game will win the series overall. This is perhaps another way of asking the same question as before - how much does this upcoming playoff game matter to the grand scheme of things?

Before looking into the historical data, though, it's worth doing the math to see what the odds would be if the human element were removed (with all games having a 50/50 chance of going either way, and all games being independent). Obviously, if a best-of-seven playoff series is tied at 3-3, then the next game winner is guaranteed to win the series, so that's an easy starting point.

From there, it's not too hard to work backwards to figure out the rest of the odds. If a series is at 3-2, then there's a 50% chance that the leading team wins (which would give them the series win, and a 100% chance therefore of winning the series), and a 50% chance that we get to a 3-3 position, where the chance of the trailing team being the overall winner is again 50%. Overall, that makes the chance that the next game winner will be the series winner (50%*100%)+(50%*50%)=75%.

If we continue this way, then we can generate this table of values. For all following graphs, the 'home team' is the team that has home town advantage for the first two games:

So what's not surprising here is that the odds that the next team to win will be the series winner are always above 50%. That makes sense, because no matter what the position is beforehand the winner is improving their overall odds of winning the series. What's more interesting is how little games tend to matter when the series is lopsided.
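The working-backwards calculation for the coin-flip case can be sketched in a few lines. This is only an illustration of the arithmetic described above, with function names of my own choosing; every game is treated as an independent 50/50 toss.

```python
from functools import lru_cache

WINS_NEEDED = 4  # best-of-seven series


@lru_cache(maxsize=None)
def series_win_prob(a, b):
    """P(the team with `a` wins takes the series) when its opponent has `b` wins."""
    if a == WINS_NEEDED:
        return 1.0
    if b == WINS_NEEDED:
        return 0.0
    return 0.5 * series_win_prob(a + 1, b) + 0.5 * series_win_prob(a, b + 1)


def next_winner_takes_series(a, b):
    """P(whoever wins the next game also wins the series) from a score of a-b."""
    if_a_wins = series_win_prob(a + 1, b)        # A wins the game, then the series
    if_b_wins = 1 - series_win_prob(a, b + 1)    # B wins the game, then the series
    return 0.5 * if_a_wins + 0.5 * if_b_wins


for a in range(WINS_NEEDED):
    row = [f"{next_winner_takes_series(a, b):.1%}" for b in range(WINS_NEEDED)]
    print(f"{a} wins:", "  ".join(row))
```

A 3-2 series comes out at 75% and a 3-3 series at 100%, matching the hand calculation, while a 3-0 series sits just above a coin flip.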

Of course, games aren't all independent, nor are they all 50/50 toss-ups. Historically, home teams win 54.5% of games, so let's see what happens if we recreate this table with that factored in. It's a bit more complicated, but essentially the same analysis as before, to get this table:

Here we start to see the effects of the playoff structure and the pattern with which it allocates home games to different teams. For instance, when the original home team is up 3-0, the upcoming game almost doesn't matter at all, but the situation isn't quite the same if the original away team is up 3-0. Similarly, both 3-2 game situations have different values. This can be perhaps more easily rationalized - if the original home team is up 3-2, then the upcoming game is going to be in their opponent's home town, which makes it more likely that that other team will win, but if they do then it's tied coming back home, so that's less of a big deal. On the other hand, if the original away team is leading 3-2, they're more likely to win this upcoming game 6, and can lock the series up right there.
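Here is a hedged sketch of the same recursion with home advantage included. It assumes a 2-2-1-1-1 home schedule (games 1, 2, 5 and 7 hosted by the original home team), which is consistent with the description above but is my assumption rather than something stated outright. The 54.5% figure is the historical rate quoted earlier; the names are illustrative.

```python
from functools import lru_cache

P_HOME = 0.545          # historical home-team win rate quoted in the post
SCHEDULE = "HHAAHAH"    # assumed: H = original home team hosts that game


def p_original_home_wins(game_number):
    """Chance the original home team wins the given game (1-indexed)."""
    return P_HOME if SCHEDULE[game_number - 1] == "H" else 1 - P_HOME


@lru_cache(maxsize=None)
def series_win_prob(h, a):
    """P(original home team wins the series) from a score of h-a."""
    if h == 4:
        return 1.0
    if a == 4:
        return 0.0
    p = p_original_home_wins(h + a + 1)
    return p * series_win_prob(h + 1, a) + (1 - p) * series_win_prob(h, a + 1)


def next_winner_takes_series(h, a):
    """P(the winner of the next game also wins the series) from a score of h-a."""
    p = p_original_home_wins(h + a + 1)
    return p * series_win_prob(h + 1, a) + (1 - p) * (1 - series_win_prob(h, a + 1))


print(f"home team up 3-0: {next_winner_takes_series(3, 0):.1%}")
print(f"away team up 3-0: {next_winner_takes_series(0, 3):.1%}")
print(f"home team up 3-2: {next_winner_takes_series(3, 2):.1%}")
print(f"away team up 3-2: {next_winner_takes_series(2, 3):.1%}")
```

Under these assumptions the asymmetry described above appears directly: the upcoming game carries more weight when the original away team holds the lead.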

Of course, this is all fun and games from a theoretical point of view, but what's actually been happening in real playoff series? Here we go:

This is definitely more interesting! Here we have a clear outlier from the theoretical projections from before, where the 'least important' game is game 5 when the original away team is up 3-1. At this point, the original home team is playing back at home but is facing a significant deficit, so they end up with a fairly high 'last hurrah' win rate before ultimately losing the series 2-4.

On the other hand, there's a surprisingly high predictive score for whoever wins the game after the original home team gets up 1-0, at 74% (8 percentage points higher than what you'd expect in a coin toss scenario). I imagine this indicates that the original home team is likely to win their first game, and that if the original away team can't bounce back then the series is likely sorted out by that point (at least, in harder-to-quantify matters than you'd expect).

So the answer to the question 'which playoff game is the most important' remains a solid "it depends", but now you have three different ways of looking at the question. Use them wisely, and enjoy the 2017 playoffs!

Monday, February 6, 2017

How often does the best team win the championship?

Imagine we have a four team single-elimination tournament. Team A is good enough that you'd expect them to win about 80% of all games, Team B ought to win about 60%, Team C should win 40%, and Team D ought to win about 20% of games (against random opponents). If we seeded a single elimination tournament with these teams, it could look something like this:

Given the information above, how likely is it that the best team in this tournament, Team A, ends up being the winner? In other words, how effective is this tournament structure and seeding system at determining the best team out of the four?

The first tool we'll need for this is the Log5 formula - given the true winning percentages of two teams, this formula tells you the odds of a given team winning. So for instance, Team A playing Team D is pretty lopsided, and Team A has a 94.18% chance of winning that game based on this formula. Similarly, Team B and Team C are much closer in relative skill, so Team B only has a 69.23% chance of winning that game.

Based on all of this, we can come up with the relative chances of any of the four teams winning the tournament overall:

So in our hypothetical situation, this single-elimination four-player seeded tournament with known team skillsets resulted in the best team, Team A, winning 72.26% of the time, and the worst team winning 1.06% of the time. Not shabby.
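For anyone who wants to check the arithmetic, here is a minimal sketch of the Log5 formula and the four-team bracket, assuming the bracket pairs A with D and B with C. The function names are mine. Plain Log5 with these inputs lands within a few hundredths of a percentage point of the figures quoted above.

```python
def log5(p_a, p_b):
    """Log5 estimate of P(A beats B) from each team's overall win rate."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def four_team_odds(skills):
    """Title odds for four teams in seed order, with semifinals 1v4 and 2v3."""
    semis = [(0, 3), (1, 2)]
    reach_final = [0.0] * 4
    for i, j in semis:
        reach_final[i] = log5(skills[i], skills[j])
        reach_final[j] = 1 - reach_final[i]

    title = [0.0] * 4
    for own, other in [(semis[0], semis[1]), (semis[1], semis[0])]:
        for t in own:
            # Win the semifinal, then beat whichever team emerges from the other side.
            title[t] = reach_final[t] * sum(
                reach_final[u] * log5(skills[t], skills[u]) for u in other
            )
    return title


skills = [0.8, 0.6, 0.4, 0.2]   # Teams A, B, C, D from the example
for name, odds in zip("ABCD", four_team_odds(skills)):
    print(f"Team {name}: {odds:.2%}")
```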

However, what if you didn't know the skill levels of the teams going into the tournament? How confident could you be that the eventual winner of the tournament was, in fact, the best team when they signed up? One way to determine this is by running a Monte Carlo simulation - let's sign up four teams of random skill values, run them through a tournament exactly the same way as we just did with our sample teams, and see who the winner is. Then let's do that 10,000 times, and see how often the best team wins.

The results are interesting: with randomly drawn skill values for all teams (mean= 0.5, stdev=0.13 [see below**]), we'd expect the winner of the tournament to be the strongest team only 44.3% of the time. About 10.5% of the time, the weakest team in the tournament would end up winning the whole thing!
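A Monte Carlo run along these lines might look like the sketch below. It is not the original simulation: it assumes skills are drawn from a normal distribution with the mean and standard deviation quoted above, clamps them to stay strictly between 0 and 1, and seeds the bracket by true skill (1 vs 4, 2 vs 3). The function names and the random seed are arbitrary choices of mine.

```python
import random


def log5(p_a, p_b):
    """Log5 estimate of P(A beats B) from each team's overall win rate."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def winner(i, j, skills, rng):
    """Play one game between teams i and j and return the index of the winner."""
    return i if rng.random() < log5(skills[i], skills[j]) else j


def simulate(trials=10_000, mean=0.5, stdev=0.13, seed=2017):
    """Return (share of titles won by the best team, share won by the worst)."""
    rng = random.Random(seed)
    best = worst = 0
    for _ in range(trials):
        # Sorted strongest first, so index 0 is the best team and index 3 the worst.
        skills = sorted(
            (min(max(rng.gauss(mean, stdev), 0.01), 0.99) for _ in range(4)),
            reverse=True,
        )
        finalist_a = winner(0, 3, skills, rng)
        finalist_b = winner(1, 2, skills, rng)
        champion = winner(finalist_a, finalist_b, skills, rng)
        best += champion == 0
        worst += champion == 3
    return best / trials, worst / trials


best_share, worst_share = simulate()
print(f"best team wins {best_share:.1%}, worst team wins {worst_share:.1%}")
```

Exact percentages will wobble with the seed and the seeding assumption, but the best team should win well under two thirds of the time.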

So is a single elimination tournament a particularly good way of determining the best overall team from a pool of four teams? Probably not. What happens if we up the ante, and have a double elimination tournament? Double elimination tournaments are exactly what they sound like - any given team needs to lose twice before being eliminated. They tend to look something like this:

This sort of format ought to improve the chances of the best team winning, as a single unlucky (and unlikely) loss won't eliminate them too early. If we run the same sort of analysis, but with a double elimination tournament, we end up with the winner being the strongest team 51.1% of the time, and the winner being the weakest team only 7.7% of the time. A reasonable improvement all in all.

Unfortunately, this modest increase in chances of determining the truly best team is offset by the increased length and uncertainty in the tournament. A single elimination tournament needs three games total, whereas a double elimination tournament needs either six or seven games, depending on how the sixth one goes. This is also annoying to schedule and sell tickets for, as organizers have no way of knowing if the sixth game will be the exciting final or not.
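The six-or-seven game count is easier to see in code. Below is a sketch of one common four-team double elimination layout (winners' bracket, losers' bracket, and a grand final with a possible deciding game), which I am assuming matches the format described above. Skill draws, seeding and names follow my own assumptions rather than the original model.

```python
import random


def log5(p_a, p_b):
    """Log5 estimate of P(A beats B) from each team's overall win rate."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def play(i, j, skills, rng):
    """Play one game and return (winner, loser)."""
    if rng.random() < log5(skills[i], skills[j]):
        return i, j
    return j, i


def double_elimination(skills, rng):
    """Return (champion, games played) for four teams listed in seed order."""
    w1, l1 = play(0, 3, skills, rng)                     # winners' semifinal
    w2, l2 = play(1, 2, skills, rng)                     # winners' semifinal
    wb_champ, wb_loser = play(w1, w2, skills, rng)       # winners' final
    lb_survivor, _ = play(l1, l2, skills, rng)           # losers' round one
    lb_champ, _ = play(lb_survivor, wb_loser, skills, rng)  # losers' final
    champion, _ = play(wb_champ, lb_champ, skills, rng)  # grand final, game 6
    games = 6
    if champion == lb_champ:
        # The winners' bracket champion has only lost once, so a decider is needed.
        champion, _ = play(wb_champ, lb_champ, skills, rng)
        games = 7
    return champion, games


def simulate(trials=10_000, mean=0.5, stdev=0.13, seed=2017):
    """Share of tournaments won by the strongest of four random teams."""
    rng = random.Random(seed)
    best = 0
    for _ in range(trials):
        skills = sorted(
            (min(max(rng.gauss(mean, stdev), 0.01), 0.99) for _ in range(4)),
            reverse=True,
        )
        champion, _ = double_elimination(skills, rng)
        best += champion == 0
    return best / trials


print(f"best team wins {simulate():.1%} of double elimination tournaments")
```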

Expanding a bit, we can do the same analysis with single, double, and triple (!) elimination format tournaments for tournaments of as many teams as we want. Before we continue, though, I'll mention that for a 32 team single-elimination randomly-seeded tournament, the odds that the winner will have been the best team overall are 22.0%. Keep that in mind.

So why talk about these tiny little tournaments? It's to get you ready for the real deal: professional sports.

Professional sports leagues generally tend to fall into a regular season and playoffs, where the regular season is used to seed teams into some sort of order and filter out the best, who end up playing in a tournament bracket of one style or another. What happens if we do the same sort of analysis for each major (and some minor) sports league?

National Basketball Association

Overview: 30 teams play 82 games each in the regular season. Teams are sorted into conferences and divisions. The playoffs are a 16-team single elimination tournament bracket, where each round of the playoffs consists of a best-of-seven games series for elimination. Teams are seeded within conferences, and the bracket is fixed at the end of the regular season. This gives us:

Overall odds of the NBA Championship being won by the season's best team: 45.9%.

Pros: Best-of-seven series in playoffs reduces variability in results. Not seeding teams based on division standings, and only looking at conference standings, reduces the chances of weaker teams getting into the playoffs by virtue of leading weaker divisions.

Cons: Relatively long season for regular season, large range in potential length of playoffs.
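To put a number on how much a best-of-seven series reduces variability, here is a small sketch that turns a single-game win probability into a series win probability, assuming every game is independent with the same odds (no home advantage). The 60% figure is just an example value.

```python
from math import comb


def series_win_prob(p, best_of=7):
    """P(winning a best-of-n series) given a per-game win probability p."""
    need = best_of // 2 + 1
    # Win the clinching game after dropping exactly k games along the way.
    return sum(comb(need - 1 + k, k) * p**need * (1 - p) ** k for k in range(need))


for n in (1, 3, 5, 7):
    print(f"best-of-{n}: {series_win_prob(0.6, n):.1%}")
```

A team that wins 60% of individual games wins about 68% of best-of-five series and about 71% of best-of-seven series, so longer series tilt the result toward the stronger team.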

National Hockey League

Overview: 30 teams play 82 games each in the regular season. Teams are sorted into conferences and divisions, and are more likely to play teams inside their divisions and conferences than outside of them. The playoffs are a 16-team single elimination tournament bracket, where each round of the playoffs consists of a best-of-seven games series for elimination. Teams are seeded within divisions such that the top three teams of each division are guaranteed a spot in the playoffs, and the highest-performing two teams of the conference that remain get in as wildcards. All of this results in the following distribution:

Overall odds the Stanley Cup winner was the season's best team: 45.4%.

Pros: Best-of-seven elimination in playoffs reduces variance and increases the chances of better teams triumphing.

Cons: Relatively long season for regular season, large range in potential length of playoffs.

Fun fact: Only real difference between NHL and NBA results is the seeding into the playoffs and how wildcards are handled, and that results in hardly any change at all.

Major League Baseball

Overview: 30 teams play 162 games each in the regular season. Teams are sorted into leagues and divisions, and are more likely to play teams inside their divisions and leagues than outside of them. The playoffs consist of the winners of each division and the two wildcard runners-up from each league, and form a wonky sort of 10-team single elimination tournament where the first round consists of two single-game wildcard playoffs, followed by four best-of-five division series, then two best-of-seven league championship series. Finally, the winners of each league play each other in a best-of-seven series to determine the winner. From this, we get:

Overall odds the World Series winner was the season's best team: 45.4%.

Pros: Teams are more likely to be correctly seeded heading into playoffs due to extensive regular season. Fewer teams in playoffs makes it less likely for weaker teams to get lucky.

Cons: Short playoff season with sudden-death games and best-of-five series increases variance.

Fun fact: If all rounds (including wildcard) in the MLB playoffs were best-of-seven series, the odds that the winner was actually the best team increase to 46.4%.

National Football League

Overview: 32 teams play 16 games each in the regular season. Teams are sorted into conferences and divisions, and are more likely to play teams inside their divisions and conferences than outside of them. The playoffs are a true 12-team single-elimination tournament, where each division leader and two runner-up wildcards are seeded from each conference. This results in:

Overall odds the Super Bowl winner was the season's best team: 28.2%.

Pros: Short, fixed playoff schedule is predictable.

Cons: Single elimination sudden death games greatly increase the chances of weaker teams winning by chance and eliminating stronger teams. Relatively short season also doesn't guarantee accurate seeding of teams heading into playoffs.

Fun fact: Before, I mentioned that the chance of the winner of a randomly-seeded 32-team single-elimination tournament being the best team was 22.0%. That means the NFL season format isn't really all that much better than just having one six-week March Madness style showdown each season.

Some more minor tournaments that are still near and dear to my heart:

Canadian Football League

Overview: Nine teams play 18 games each in the regular season. Teams are sorted into two divisions. The playoffs are a single-elimination 5-team tournament, where the highest-ranked team in each division gets a bye to the division finals. Teams can cross-over into other divisions if the fourth-place team in one division has more points at the end of the regular season than the third-place team in the other division. This gives us:

Overall odds the Grey Cup will go to the season's best team: 38.0%.

Pros: Short, fixed playoff season is predictable. Still somehow have a longer regular season than the NFL. Higher ranked teams getting a bye to the division finals helps out the stronger teams.

Cons: Single elimination playoff format, which includes more than half the league, leads to a bit of a crapshoot. It's disappointing that the NHL, with over three times as many teams, is better suited to finding its best team each year.

Fun fact: Again, this isn't substantially better than just running a 9-team randomly-seeded single elimination tournament right at the beginning of the year. It's more fun, though.


Curling

Overview: Major curling tournaments involve 12 teams, who play each other once each in a round robin. The top four teams are seeded into a Page playoff system, where the top two teams are in a quasi-double elimination system, and the remaining teams are in single elimination. This gives us:

Overall odds the winner was the tournament's best team: 37.3%.

Pros: Fixed playoff system is short and predictable. Page playoff system gives a bonus to the teams who perform best after a fair and balanced round-robin.

Cons: Single elimination format of playoffs increases variability.

Fun fact: Curling is fun and you should try it.

So there you go. Unsurprisingly, sports leagues with longer regular seasons and best-of-seven playoff series are better suited for determining the actual best teams each season, whereas leagues with shorter seasons and single elimination tournaments are less well-suited. Now you have numbers to show for it, at least!

Thursday, February 2, 2017

Redistricting in Alberta

Every eight to ten years, the Alberta Electoral Boundaries Commission meets up to reconfigure the riding boundaries for upcoming elections. This is a fairly important part of our democracy, as cities are often growing, people tend to move around a lot, and keeping our boundaries updated to reflect current population distributions is a good way to ensure that everyone is equally represented in our legislature.

Fortunately enough, Alberta isn't like many US states, where elected politicians themselves decide where these boundaries will be drawn, which tends to lead to gerrymandering, as I've discussed before. But although Alberta's districts are determined by a neutral committee and appear to avoid obvious signs of manipulation for political purposes, they aren't really all that good at making sure everyone's vote counts the same no matter where they live. For instance, the Electoral Boundaries Commission Act specifies that the maximum population deviation from the average per riding is 25%. As well, provided the area is sparsely populated enough, up to 4 districts can have populations that are as much as 50% below the average population of a riding in the province.

This led to a situation where, based on the 2011 census, the largest district had a population of 51,800 people, more than twice the size of the smallest district at 23,050. Someone living in Dunvegan-Central Peace-Notley has nearly twice the voting power of the average Albertan when it comes to provincial elections, at a population a whopping 45% below provincial average. (As a side note, this still isn't as bad as on the federal stage, where Labrador is 73% below average, and five times less than the highest populated riding in Brantford-Brant, but that's a different story.)

So, as an infomercial might say at this point, "There must be a better way!"

The Electoral Boundaries Commission is accepting submissions now while they begin their redistricting process, and this seems as good a time as any to determine a better solution. What is a fair way to split the province up into 87 sections each with the same population?

One of the coolest solutions is to use the shortest splitline algorithm. As explained by CGP Grey, the shortest splitline algorithm is a repetitive process that searches for the shortest line that splits an area perfectly in two by population. Each half is then split again with the shortest line that produces equal halves, until ultimately we stop when we've gotten the desired number of sections split up, which are necessarily of exactly even population.

So let's try this for Alberta. The first thing we need is a population distribution of Alberta, which Statistics Canada helpfully has lying around on their website. It looks like this:

This is Alberta broken up into 5,711 census dissemination areas based on the 2011 census.

Next up, we would normally find the shortest line that crosses Alberta in such a way that exactly half of the population is above the line, and exactly half is below the line. Since Alberta has 87 districts, though, we actually want to find the shortest line that has 44/87 of the population above it, and 43/87 of the population below it. In my (slightly optimized) model, that looked like this:

Then we split each half again. The top half holds an even number of districts (44), so we can split it in two easily, whereas the bottom half (43 districts) has to be split into 22/43 and 21/43 segments. That gives us this:

And so on and so forth until we've split all of Alberta up into equal segments. The final result of this ends up being this lovely stained glass window:
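For readers who'd rather see the recursion than imagine it, here is a heavily simplified sketch. It is not the Excel model used for this post, and it is not a faithful shortest splitline implementation: it treats each census area as a point, tries a fixed set of cut directions, and uses the spread of the points along each candidate cut as a rough stand-in for the length of the line. The function names and the random demo data are made up.

```python
import math
import random


def split_counts(n):
    """Districts on each side of a cut, e.g. 87 -> (44, 43)."""
    return math.ceil(n / 2), n // 2


def best_cut(areas, n_a, n_b, n_angles=36):
    """Split (x, y, population) areas so side A holds about n_a / (n_a + n_b) of the people."""
    target = sum(a[2] for a in areas) * n_a / (n_a + n_b)
    best = None
    for step in range(n_angles):
        theta = math.pi * step / n_angles
        nx, ny = math.cos(theta), math.sin(theta)    # normal of the candidate line
        ordered = sorted(areas, key=lambda a: a[0] * nx + a[1] * ny)
        running, cut_at, cut_err = 0.0, None, None
        for i, area in enumerate(ordered[:-1], start=1):
            running += area[2]
            # Keep at least one area per district on each side of the line.
            if n_a <= i <= len(ordered) - n_b:
                err = abs(running - target)
                if cut_err is None or err < cut_err:
                    cut_at, cut_err = i, err
        along = [-a[0] * ny + a[1] * nx for a in ordered]
        length = max(along) - min(along)             # proxy for the line's length
        if best is None or length < best[0]:
            best = (length, ordered[:cut_at], ordered[cut_at:])
    return best[1], best[2]


def splitline(areas, n_districts):
    """Recursively cut the areas into n_districts groups of near-equal population."""
    if n_districts == 1:
        return [areas]
    n_a, n_b = split_counts(n_districts)
    side_a, side_b = best_cut(areas, n_a, n_b)
    return splitline(side_a, n_a) + splitline(side_b, n_b)


# Hypothetical demo data: 400 random "dissemination areas" as (x, y, population).
rng = random.Random(1)
areas = [(rng.random() * 100, rng.random() * 100, rng.randint(400, 700)) for _ in range(400)]
districts = splitline(areas, 7)
pops = [sum(a[2] for a in d) for d in districts]
avg = sum(pops) / len(pops)
max_dev = max(abs(p - avg) / avg for p in pops)
print(f"district populations: {pops}, max deviation {max_dev:.2%}")
```

A real implementation would measure each candidate line against the actual boundary of the region and would work with area shapes rather than points, but the uneven 44/43 style splits work the same way.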

Of course, things can't always be perfect no matter how hard you try, so this is a solution for Alberta that has a maximum population deviation in each riding of 0.38%. The largest riding has 42,052 people in it, and the smallest has 41,752. This is a solution to split up Alberta that has a maximum voter variance that is 118 times smaller than we have now, and a coefficient of variation that is 68 times smaller.
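The two fairness measures mentioned here are simple to compute from a list of riding populations. A minimal sketch follows, with the caveat that the short population list is invented for illustration and that I am assuming the coefficient of variation uses the population (not sample) standard deviation.

```python
from statistics import mean, pstdev


def max_deviation(populations):
    """Largest relative gap between any riding and the average riding."""
    avg = mean(populations)
    return max(abs(p - avg) / avg for p in populations)


def coefficient_of_variation(populations):
    """Standard deviation of riding populations divided by their mean."""
    return pstdev(populations) / mean(populations)


# Hypothetical riding populations, not the actual results of the model.
ridings = [41_752, 41_850, 41_900, 41_980, 42_052]
print(f"max deviation: {max_deviation(ridings):.2%}")
print(f"coefficient of variation: {coefficient_of_variation(ridings):.4f}")
```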

Also, just because the map was drawn with straight lines doesn't mean it has to stay like that. If we go back to our census dissemination area shapes from Stats Canada, we can convert an Edmonton distribution from this:

To this:

Which is actually starting to look pretty reasonable. Neighborhoods are kept together, and the areas are looking relatively compact.

The shortest splitline method is an objective and fair way to distribute votes such that everyone's votes are counted equally. I was able to redistrict all of Alberta using Excel - no fancy programming skills are needed. There's no reason that we can't have redistricting being as boring as updating census data and having a computer spit out a single solution each time we need it.

That being said, there are still some objections people could have with it - for instance, it doesn't necessarily give a hoot about municipal boundaries. Take Red Deer for example: after applying the algorithm to Alberta, Red Deer got sort of unfortunately split into four districts, each of which includes substantial amounts of surrounding countryside:

Oh no.

This would have people in Innisfail voting alongside southwest Red Deer, and people in Saskatchewan River Crossing voting alongside west Red Deer. That's probably a bit messed up. In cases like this, I think it's probably fair to use the splitline algorithm to get to a starting point, and then massage the districts as needed to make sure they make a bit more sense. Because the algorithm got within a 0.38% maximum population variance to start with, it is likely quite straightforward afterwards to swap around some areas as needed while keeping the population variance small. For instance, Public Interest Alberta recommends a maximum population deviation of 5%, which I'd suggest is reasonably easy to achieve if we're starting from a point where our deviation is essentially 0%.

So if I've convinced you that using algorithms to redistrict our population can lead to fairer, objective, even distributions of our political districts, and that those are things worth having in our democracy, head on over to the Commission's website and leave them a submission before February 8th!

Thursday, October 27, 2016

Edmonton City Council Gender Equality

In the 2013 civic election, 79 candidates ran for mayor and city council, 17 of whom were women. The election resulted in one woman getting elected out of a council of 13. Though women represent 51% of the city's population, they represented only 22% of candidates and won only 8% of council seats. With results like this, it's little surprise that groups like Equal Voice are calling for improvements to our system, including encouraging a larger diversity of candidates and promoting a more balanced and representative city government.

Taking a deeper look at these results shows some interesting trends though. For instance, in 5 wards in 2013, there were no women running at all, and half of all female candidates were clustered in races in two highly contested wards. This suggests that, while 22% of all candidates were female, the distribution of female candidates may have already been predisposed to a lower number of female winners in the end. Let's take a look.

There were 12 wards and one mayoral race in 2013 for city council, and the proportion of female candidates per race ranged from 0-43%. Assuming any given candidate has an equal chance of winning any given race (an assumption we'll check later), this is the expected distribution of female winners:

As previously mentioned, there was absolutely no chance of there being anywhere from 9-13 women on council, as 5 races were contested solely by men. Based on the uneven distribution of candidates in the remaining races, the expected number of female councillors in 2013 was 2.01, or 15% of council. So while the number of women on council was still less than expected, it was closer than what we might have expected based on the total number of female candidates. Instead of being 14 percentage points below the overall share of female candidates (22%), we were about 8 points below what the race-by-race distribution would predict.

So does this mean that female candidates are 8% less likely to get elected than male candidates? That's really hard to say, and it turns out we don't have enough data yet. One way we can check is by looking at the p-value of our outcome - what's the chance that we could have gotten something as bad as the result we did, assuming our null hypothesis (that women are as likely to get elected as men) is true? In this case, the p-value is 0.37. Essentially, our data set is small enough that any result between 0-4 female councillors wouldn't have been all that much of a surprise (and in fact, 6-7 would have been an indication of an opposite effect). So let's not worry about significance yet, and instead look at more elections!
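A p-value of this kind can be sketched by building the full distribution of female winners under the equal-chance null hypothesis and adding up the probability of every outcome at or below the observed count. The race-by-race counts below are made up (chosen only so they total 17 women among 79 candidates across 13 races, with 5 all-male races), so the printed value will not reproduce the 0.37 quoted above. Treating the test as one-sided is also my reading of the description.

```python
def seat_distribution(share_by_race):
    """dist[k] = P(exactly k races are won by a woman) if all candidates are equally likely to win."""
    dist = [1.0]
    for p in share_by_race:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)
            new[k + 1] += prob * p
        dist = new
    return dist


def p_value_at_most(dist, observed):
    """One-sided p-value: chance of seeing `observed` or fewer female winners."""
    return sum(dist[: observed + 1])


# Hypothetical (women, total candidates) per race; NOT the real 2013 ballots.
races = [(0, 5), (0, 4), (0, 6), (0, 5), (0, 4),
         (4, 16), (4, 12), (3, 7), (2, 5), (1, 5), (1, 4), (1, 3), (1, 3)]
shares = [women / total for women, total in races]
dist = seat_distribution(shares)

print(f"expected female winners: {sum(shares):.2f}")
print(f"P(1 or fewer female winners): {p_value_at_most(dist, 1):.2f}")
```

With only 13 races the distribution is wide, which is why a result a seat or two away from the expected value does not count as evidence of bias on its own.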

Edmonton's civic elections elect people to mayor, council, and public and Catholic school boards. If we do the same analysis for all three bodies for the last four elections, we get a chart that looks like this:

This suggests a lot of things, including:

  • City Council results over the last 4 elections haven't shown more than a 10% deviation one way or another. More importantly, the p-values for each election have been totally reasonable.
  • There's a lot of variation in the Public School Board elections. This is partially explainable based on how small the Board is, so any variation will be magnified from a percentage basis. On the other hand, that level of variation isn't present in the (smaller) Catholic Board...
  • Catholic School Board elections haven't shown an anti-woman bias in this data set.

So, interestingly enough, of the 12 discrete votes that I looked at, 4 had a slight anti-woman bias, and 8 were perhaps slightly pro-women. Essentially, what this suggests is that female candidates are just as likely to get elected as male candidates. If we add up all the results since 2004 into one graph (of 402 candidates running for 114* positions over the last four elections), we get the following distribution:

Overall, 46 women have been elected to 114* electable seats, where the candidate distribution and chance would expect us to have elected 42.5. The p-value assuming an equal electability between women and men is 0.22, so no, meninists, there isn't a pro-woman bias either. These results are pretty much what we'd expect given the candidate distribution we've had. This general conclusion holds true across city council (p=0.29):

And Public Schools (p=0.19):

Though, intriguingly enough, it breaks down at Catholic Schools* (p=0.03):

*: Here it's worth mentioning that before 2010, the Catholic election system was really weird and had a wildcard winner from whoever had the most votes but wasn't elected in their ward. This was particularly silly seeing as not all wards were the same size, so I've ignored the wildcard seat and victor for the purposes of this analysis.

The fortunate summary of all of this is that there's no evidence that any system is rigged against female candidates. That being said, the proportion of women elected to civic office in recent years is just under a third of total offices filled, which isn't even remotely balanced. The best way to get a more representative council is to have more people from under-represented demographics put forward their candidacy, so if you know anyone who might be interested or qualified (of whichever underrepresented group you choose), I strongly encourage you to encourage them to run.

Wednesday, August 17, 2016

Edmonton Bike Safety

Bicycles in Edmonton have been in the news quite a bit recently, particularly given the success of new bicycle development in Calgary. Bicycle lanes in Edmonton have been proposed, installed, removed, illegally painted, removed again, and blocked in council quite a bit in the last few years. Frustrations between cyclists, city planners, and drivers have gotten to a boiling point recently, and I think it's safe to say that whichever side of the debate you're on you're likely sick of it all. But please keep reading!

With all that said, things have recently gotten a bit more interesting from a data point of view. A month ago I was made aware of a data set of cycling injuries and incidents from 2009-2014 from the nice folks at Spacing Edmonton, which were analyzed by them as well as (more recently) the group over at Slow Streets.

Specifically, the people at Slow Streets made the claim that injury hot spots indicate where more cyclists are travelling, showing cyclist 'desire lines' which would be prime targets for bicycle infrastructure. However, a quick look at the map suggests that the streets with supposedly lots of bicycle traffic are also the roads with lots of vehicle traffic. Hypothetically, even if all streets had the same bicycle traffic, we might expect a similar distribution, since more cars on a road would plausibly lead to more interactions between bicycles and cars.

So let's take a look and check this hypothesis. Fortunately, Edmonton has a nice map of average annual weekday traffic for major roadways. I combined the map data of all 1,070 cyclist injuries from 2009-2014 with the map of all streets that had traffic volume stats, and ended up with this result:

Error bars represent the 95% confidence interval for injury rate.

It looks as though there is a decisive link between vehicular traffic on a road and the number of cyclist injuries. As the city doesn't seem to have any specific information on bicycle ride distributions, it's hard to say with any certainty if the Slow Streets analysis is correct. Either way, it's clear that wherever cyclists mix with lots of cars, we get lots of injuries. This analysis ended up looking at 571 km of major roads with traffic data, which were responsible for 760 of the injuries recorded from 2009-2014.
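As a rough illustration of how an injury rate and its error bar can be computed, here is a sketch using the overall totals quoted above (760 injuries on 571 km of road). It assumes injury counts behave like a Poisson process and uses a normal approximation for the 95% interval, which may not be the method behind the chart. Counting 2009 through 2014 as six years of exposure is also my assumption.

```python
import math


def injury_rate_with_ci(injuries, road_km, years=6, z=1.96):
    """Injuries per km of road per year, with an approximate 95% interval."""
    exposure = road_km * years
    rate = injuries / exposure
    # Poisson counts have variance equal to their mean, so sd(count) is about sqrt(count).
    half_width = z * math.sqrt(injuries) / exposure
    return rate, max(rate - half_width, 0.0), rate + half_width


rate, low, high = injury_rate_with_ci(760, 571)
print(f"{rate:.3f} injuries per km per year (95% CI {low:.3f} to {high:.3f})")
```

The same function applied to each traffic-volume bin would give one bar and one error bar per bin. Bins with only a handful of injuries end up with very wide intervals.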

But hey, that's not all! Edmonton also has a map of everyone's favorite (or least favorite) things - bike lanes!

From the map, Edmonton's road bike-friendliness can be broken down into four different types. There are separated shared use pathways, painted on-road bike lanes, signed on-road bike lanes, and plain old normal streets. So what does my previous analysis look like if we split road types up by their bicycle infrastructure? Why, this:

Again, error bars are the 95% confidence interval. Basically, ignore the green bars...

What are some takeaways from this? Well, first of all, major roads very infrequently have signed on-road bike lanes, so there's far too much variability for a proper analysis of them (green on the graph). Far more common are roads with separated shared bike paths (red), or no infrastructure at all (grey). From this, we get the firm (and hopefully not unreasonable at all) conclusion that biking on separated, wide, shared pathways for bikes is safer than biking on a normal road with traffic, by a factor of about 2.

From the City of Edmonton bike map

However, an interesting conclusion from this is that it's extremely hard to make the argument that painted bike lanes are safer than normal roads. In fact, in some cases, it looks quite a bit safer to bike on non-bike-laned roads. Weird.

What might cause this? Well, first of all I'd say that this analysis is a few factors short of anything scientific. For instance, the bike lane map for Edmonton likely includes lanes and paths that haven't existed for the entirety of 2009-2014, or have since been removed, so some of the injuries from my analysis are likely classified inaccurately. As well, other researchers, when investigating bike lane safety, controlled for the presence of parked cars on the side of the road, which I did not. So while I wouldn't necessarily go so far as to say that my analysis shows Edmonton bike lanes are more dangerous than streets without bike lanes, I stand by the assertion that bike lanes aren't safer than streets without them. I embrace the subtlety of that distinction.

Regardless, the data is quite clear about the effects of vehicle traffic on bike incidents, and the effects of physically separating bike paths from roads. Namely, separating vehicle and bicycle traffic may reduce bicycle injuries by a factor of 2 on busy roads, and up to a factor of 6 on quieter roads.

Again, this is not surprising at all - I can't stress just how intuitive and likely boring the main finding here is. But this data set of cycling injuries from 2009-2014 does seem to show that painted bike lanes have not had the effect that was perhaps intended.

In my opinion, having decent bicycle infrastructure is absolutely important to having a vibrant and healthy city. Hopefully future bike lane decisions are made keeping injury prevention and statistics in mind, in such a way that we can expand our biking infrastructure as effectively as possible.

Sunday, July 3, 2016

Edmonton City Council Votes (Part 2)

A year ago I did a short piece looking at Edmonton city council voting patterns. It was pretty fun and showed some cool blocks in city council, but since then we've had a monster by-election, so it seemed like a good time to take a second look at this analysis.

Since council as a whole got elected in 2013, there have been 5763 votes performed, according to the city's Open Data catalogue. Of course, many of these are procedural matters, and the vast majority of them are unanimous. If we restrict the votes to non-unanimous votes to see how the councillors interact, we're actually only left with 358 votes to look at.

Of those 358 votes, we can come up with this result, showing how often each member of council agreed with each other member of council. I've colour-coded it to make the numbers seem a little less daunting:

The major update here, of course, is the addition of Councillor Banga to the mix. He seems to generally follow the Iveson/Esslinger/Walters group that we identified last year, though generally less so than his predecessor Amarjeet Sohi did. He also seems to disagree with Councillor Caterina disproportionately relative to anyone else. Again, much like a year ago, Councillor Nickel is a bit of an outsider, who agrees with his colleagues far less than anyone else does.
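Here's a rough sketch of how an agreement table like this can be built. The data layout (one dictionary per motion, mapping each member to "Yes" or "No") and the function names are assumptions for illustration, not the actual format of the Open Data catalogue, and the three members are made up.

```python
from itertools import combinations

def contested(votes):
    """Keep only motions where the recorded votes are not all the same."""
    return [v for v in votes if len(set(v.values())) > 1]

def agreement_rates(votes):
    """Fraction of contested motions on which each pair voted the same way.

    Only motions where both members of the pair cast a vote are counted.
    """
    same, total = {}, {}
    for motion in contested(votes):
        for a, b in combinations(sorted(motion), 2):
            total[(a, b)] = total.get((a, b), 0) + 1
            if motion[a] == motion[b]:
                same[(a, b)] = same.get((a, b), 0) + 1
    return {pair: same.get(pair, 0) / n for pair, n in total.items()}

# Made-up example: three motions, three hypothetical members.
votes = [
    {"A": "Yes", "B": "Yes", "C": "Yes"},  # unanimous, so it gets dropped
    {"A": "Yes", "B": "Yes", "C": "No"},
    {"A": "Yes", "B": "No", "C": "No"},
]
for pair, rate in agreement_rates(votes).items():
    print(pair, f"{rate:.0%}")
```

Members who were absent for a motion simply don't appear in that motion's dictionary, so each pair is only scored on votes they both took part in.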

Another way of looking at this is to make network graphs. This first one shows all connections with councillors that agree with each other at least 67% of the time (this number was chosen so that Councillor Nickel isn't left out). Feel free to play with it, it's rather fun!
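The idea behind a threshold graph like this is simple enough to sketch. The example below uses hypothetical agreement rates (not the real council numbers) and a made-up function name; it links two members whenever they agree at least as often as the cutoff, and shows how raising the cutoff can leave someone with no connections at all.

```python
def threshold_graph(rates, cutoff=0.67):
    """Adjacency sets linking members whose agreement rate is >= cutoff."""
    graph = {}
    for (a, b), rate in rates.items():
        # Make sure every member shows up, even with no connections.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if rate >= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def isolated(graph):
    """Members with no connections at the chosen cutoff."""
    return sorted(m for m, neighbours in graph.items() if not neighbours)

# Hypothetical agreement rates for three made-up members.
rates = {("A", "B"): 0.90, ("A", "C"): 0.70, ("B", "C"): 0.60}

print(isolated(threshold_graph(rates, cutoff=0.67)))  # nobody is left out
print(isolated(threshold_graph(rates, cutoff=0.80)))  # C drops off the graph
```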

Alternatively, we can generate a network graph based on who agrees with whom most frequently. Orange arrows (when you hover over them) indicate the most frequent agreements for each councillor, while blue arrows indicate that another councillor most frequently agrees with the first, but that it isn't reciprocated.

This shows a bit more clearly how potential groupings look at city council. Five councillors agree with Mayor Iveson more than anyone else, and two other councillors most frequently agree with two of those five. On the other hand, the remaining five councillors tend to spread out from Councillor Caterina.
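A "who agrees with whom the most" graph can be sketched along these lines. Again, the rates and member names are hypothetical and the function names are made up; the point is just to show how each member gets an arrow to their top match, and how to tell a mutual top match from a one-way one.

```python
def top_agreements(rates):
    """Map each member to the colleague(s) they agree with most often."""
    best = {}
    for (a, b), rate in rates.items():
        for me, other in ((a, b), (b, a)):
            top_rate, names = best.get(me, (-1.0, []))
            if rate > top_rate:
                best[me] = (rate, [other])
            elif rate == top_rate:
                names.append(other)  # keep ties
    return {me: names for me, (_, names) in best.items()}

def arrows(rates):
    """List (from, to, kind) arrows, where kind is 'mutual' or 'one-way'."""
    top = top_agreements(rates)
    result = []
    for me, names in top.items():
        for other in names:
            kind = "mutual" if me in top.get(other, []) else "one-way"
            result.append((me, other, kind))
    return sorted(result)

# Hypothetical agreement rates for three made-up members.
rates = {("A", "B"): 0.90, ("A", "C"): 0.70, ("B", "C"): 0.60}
for arrow in arrows(rates):
    print(arrow)
```

In this toy example A and B are each other's top match, while C agrees most with A without it being reciprocated.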

Of course, these two groups aren't all that different - Councillor Caterina and Mayor Iveson still vote the same on 75% of contested motions, so realistically they agree 98% of the time on all motions, but the above network graph is a nice way to dramatize it!
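The jump from 75% to 98% is just arithmetic, and a quick sketch makes it explicit. It assumes that every disagreement happens on one of the 358 contested votes and that both members voted on everything, which is a simplification; the function name is made up.

```python
def overall_agreement(contested_rate, contested_votes, total_votes):
    """Agreement rate across all votes.

    Assumption: every disagreement falls on a contested (non-unanimous)
    vote, since two members can't disagree on a unanimous one.
    """
    disagreements = (1 - contested_rate) * contested_votes
    return 1 - disagreements / total_votes

# 75% agreement on 358 contested votes, out of 5763 votes in total.
print(f"{overall_agreement(0.75, 358, 5763):.1%}")
```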

Finally, we can also take a look at how often each member of council ends up getting the result they voted for on each motion. Again, only looking at non-unanimous votes, we have:

Impressively, Mayor Iveson ends up on the winning side of a council vote 95% of the time. In fact, of all 5763 votes performed since 2013, Mayor Iveson has only been disappointed 17 times. There are certainly many conclusions that can be drawn from that, but at the very least nobody can say that Don Iveson has difficulties instituting the agenda he wants on council.
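As a rough sketch of how a "winning side" rate can be tallied: the record layout below (a `votes` dictionary plus a `passed` flag per motion) and the function name are assumptions for illustration, and the motions are made up. The last line is a sanity check on the real figures quoted above, where 17 losses out of 358 contested votes works out to roughly 95%.

```python
def winning_side_rate(motions, member):
    """Share of contested motions where `member` voted with the outcome."""
    wins = total = 0
    for motion in motions:
        ballots = motion["votes"]
        if member not in ballots or len(set(ballots.values())) < 2:
            continue  # member was absent, or the vote was unanimous
        total += 1
        # A "Yes" on a motion that passed, or a "No" on one that failed.
        wins += (ballots[member] == "Yes") == motion["passed"]
    return wins / total if total else float("nan")

# Made-up motions for three hypothetical members.
motions = [
    {"votes": {"A": "Yes", "B": "No", "C": "Yes"}, "passed": True},
    {"votes": {"A": "No", "B": "No", "C": "Yes"}, "passed": False},
    {"votes": {"A": "Yes", "B": "Yes", "C": "Yes"}, "passed": True},  # unanimous
]
for member in ("A", "B", "C"):
    print(member, f"{winning_side_rate(motions, member):.0%}")

# Sanity check on the figures in the post: 17 losses in 358 contested votes.
print(f"{1 - 17 / 358:.1%}")
```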

So there you go. I plan to do another analysis like this before the next election, so stay tuned for that one!