Wednesday, December 17, 2014

My Issues with the Broadbent Institute's Inequality Report


Apparently I have it in for think tanks or something. Every few months a think tank somewhere comes out with a report that means well, portrays a message with fundamentals I agree with, but manages to mess up some amount of the data handling in a way that gets me riled up.

This time it's the Broadbent Institute. They released a report on income inequality recently, and presented the data in a virtually identical format to an American video from two years ago. While I agree that income inequality is a big issue in Canada, and I'm sure that the average Canadian isn't clear on just how bad it is, I have a pretty big issue with the statistical rigor in their report.

This is a screenshot from their video:

Along the x-axis, they have different population percentiles in 10% chunks. The chunk on the far right represents the richest 10% of the population, the one to its left is the next 10% richest people, etc.

The problem with this chart is that it shows the 50-60th percentiles as being richer than the 60-70th and 70-80th. They're trying to tell us that the fifth-richest group is richer than the fourth- and third-richest groups.

What!? That doesn't make any sense by definition.

These values appear to come from this table in their report:

Somehow, Canadians apparently consistently think that the middle 20% of the population is supposed to have more money than the 2nd wealthiest 20%. That's not possible, and I can't believe that it got all the way into the report and the video without anyone hitting the emergency stop button. The income curve shown in the first figure ought to be a version of a Lorenz curve, and necessarily should increase from left to right. Even if that is the actual result from the survey, it shows that either the survey wasn't clear enough in its instructions, or that adequate controls weren't in place in the survey to ensure accurate results. 
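To see why a dip like that can't happen, here's a minimal Python sketch. It assumes a made-up population with log-normally distributed wealth (an illustrative choice, not anything taken from the Broadbent data), sorts it from poorest to richest, and splits it into ten equal groups. Whatever numbers you feed in, each decile's share comes out at least as large as the one before it.

```python
import random

def decile_shares(wealth):
    """Share of total wealth held by each decile, poorest first.

    Assumes len(wealth) is a multiple of 10 so every decile is the same size.
    """
    ordered = sorted(wealth)
    n = len(ordered)
    total = sum(ordered)
    shares = []
    for d in range(10):
        lo, hi = d * n // 10, (d + 1) * n // 10
        shares.append(sum(ordered[lo:hi]) / total)
    return shares

def is_non_decreasing(values):
    """True if no value is smaller than the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical population of 10,000 people (illustration only).
random.seed(0)
population = [random.lognormvariate(10, 1.5) for _ in range(10_000)]
shares = decile_shares(population)
print([round(s, 3) for s in shares])
print(is_non_decreasing(shares))
```

A survey result that fails this check is describing something other than a ranked population.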

When I brought this up to them on twitter, their response was:

Which is... silly. There's a logical distinction between a "strong middle class" and a "middle class that's stronger than the upper middle class." They've clearly decided to ignore this distinction.

Finally, a (more than average) nitpicky point. Take one more look at this graph (where the blue line was the original "ideal" values):

The blue line of ideal values has five data points in it, which is great, but they're nowhere near where they should be! Each point (or kink in the blue line) corresponds with a 20% chunk of data, so they might have shown the mid-point 10%, 30%, 50%, 70%, and 90% marks. But instead they've shown the 0%, 25%, 50%, 75%, and 100% marks. This implies that the 0th percentile Canadian (the absolute poorest person in Canada) ought to have the wealth value that the people surveyed thought belonged to the bottom 20% as a whole.

Anyway, the point of all this is that research into wealth inequality is really important, and doesn't deserve to be handled quite this badly. If you're going to be sharing this report, please do so with a grain of salt.

Monday, December 15, 2014

Winter Tires in Canada

Well there's snow on the ground and the temperature's pretty low, so we can pretty solidly declare that winter is upon us. And with wintry blizzards comes one of the great Canadian traditions: changing over your summer tires for winter tires.

If you're anything like me, you probably waited until just after the first major snowfall to remember to put them on. This often ends up with you driving around dangerously for a week waiting for your appointment, all the while dodging other summer-tire skidders. It's a fairly dangerous and unpredictable way to go about driving.

Recently I tried looking up recommendations for when to put on your tires and came to an interesting discovery: almost every single source recommends putting them on once the temperatures dip below 7 degrees Celsius. Everyone from the tire producers to the Tire and Rubber Association of Canada agrees with this fairly precise temperature recommendation.

Why? Turns out that summer tires are made of a different rubber that gets quite stiff below 7 degrees, and reduces the friction of the tires (the comparison that was used was that they approach the consistency of a hockey puck). Winter tires become more effective below 7C, even on dry clean pavement.

Not to scale. Probably.
If you're looking to drive as safely as possible (which you should, seeing as road injuries are the 9th leading cause of death worldwide), it might not be quite enough to just wait until the forecast predicts a temperature below 7, seeing as it often takes time to book an appointment and by that point it could be a bit late. Fortunately, Environment Canada has the daily temperature for various cities over the past several decades all neatly stored online.

So I decided to take a look. These are the average mean daily temperatures for Edmonton per day for the 30-year span between 1981-2010:

Since each day of the year has a decent variation to them, it's also possible to determine the expected probability that any given day will be below 7 degrees Celsius (using their averages and standard deviations). That might get you something like this:
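As a rough sketch of that calculation in Python: assuming each calendar day's mean temperature is roughly normally distributed (an assumption on my part), the chance of being below 7 degrees follows from that day's average and standard deviation. The sample numbers are hypothetical, not Environment Canada values.

```python
import math

def prob_below(threshold, mean, std_dev):
    """P(X < threshold) for X ~ Normal(mean, std_dev)."""
    z = (threshold - mean) / (std_dev * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))

# Hypothetical day: 30-year average of 5 C with a standard deviation of 4 C.
print(round(prob_below(7, 5, 4), 3))
```

Running this once per calendar day gives a probability curve for the whole year.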

Once you have this, it's fairly straightforward to choose when to put on your winter tires. If you were willing to accept a 50% risk of being ill-equipped for the weather, you'd be looking to put them on sometime around the beginning of October, and take them off around the beginning of May. That's vastly longer than I typically have mine on for, and I suspect that's the same for many people. In total, an Edmontonian ought to have their winter tires on by October 1st, and leave them on for 210 days (at least seven months of the year!).

Of course, a 50% risk of having the wrong tires might seem a bit high for some people. If you were only willing to accept a 10% risk, you'd be looking at 261 days of winter tires starting September 4.
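Turning a probability curve into tire dates is mostly a matter of picking a cut-off. The sketch below flags every day whose chance of being below 7C exceeds the risk you're willing to accept. The cosine-shaped curve is a stand-in I made up so the example runs on its own, and `midsummer=196` (mid-July) is just a convenient place to start looking for the fall changeover.

```python
import math

def tire_days(daily_prob_below_7, accepted_risk):
    """Day-of-year indices (0 = Jan 1) where P(below 7 C) exceeds accepted_risk."""
    return [day for day, p in enumerate(daily_prob_below_7) if p > accepted_risk]

def first_fall_day(days, midsummer=196):
    """First flagged day after mid-July, i.e. when to put the tires on."""
    later = [d for d in days if d >= midsummer]
    return later[0] if later else None

# Hypothetical probability curve: coldest in mid-January, warmest in mid-July.
probs = [
    min(1.0, max(0.0, 0.5 + 0.9 * math.cos(2 * math.pi * (d - 15) / 365)))
    for d in range(365)
]

for risk in (0.5, 0.1):
    days = tire_days(probs, risk)
    print(risk, len(days), first_fall_day(days))
```

Lowering the accepted risk can only add days, which is why the 10% season is longer and starts earlier than the 50% one.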

So that's all well and good for Edmonton, but how about the rest of the country? I decided to look at 30 stations' worth of data spanning 1981-2010 (~330,000 data points) to try to develop a map for winter tires in Canada. These stations included all major cities and a few select points to accurately represent the geographical differences. This is what I got:

Unsurprisingly, the northern territories tended to need winter tires more than the southern provinces (quite frankly, it's not worth taking winter tires off if you live in Iqaluit). What might be surprising to some is that even the warmest parts of the country, that hardly ever see snow, ought to have proper winter tires on for at least a third of the year.

Another way to represent the data is to show the probability of being below 7C on any given day like this:

Where green means 0% chance of being below 7C, and red means 100%.

The vast majority of Canadian cities have a high risk of being below 7C sometime in October, and it's important to know when exactly that will be in order to be sure you're driving with the best equipment available. In fact, the above graph can be summarized as follows:

One final thing to note: only the province of Quebec has legal requirements for winter tires, with the exception of some British Columbia highways. These legal requirements fall way outside of the 7 degree recommendation though. It's all well and good to have laws for additional safety when operating motor vehicles, but if they fail to capture the designed temperature ranges of the actual tires, it seems like a bit of a missed opportunity.

Monday, December 8, 2014

2014-2015 ski season

In case you haven't noticed, it's snowed a bit recently in town. And any time it snows in Alberta, I get excited that it's likely been snowing up in the mountains. And that means skiing!

As of December 7, the website OnTheSnow shows that Marmot Basin (the closest ski hill to Edmonton) has a snow depth of 90 cm. That sounds rather decent, and certainly right at the end of November it got a massive dump - but how does that actually compare to normal? I decided to find out.

Here is the cumulative snowfall of Marmot Basin for every ski season since 2007-08:

Alright, so there's quite a bit of variation in there. Maybe a better way of looking at it is like this:

For these graphs, the grey zone represents the maximum and minimum values over the last seven seasons, the light grey line is the average, and the black line is this season so far.

So there's good news and bad news here. The good news is that there's actually quite a bit more snow this year so far than normal! In fact, there's about as much snow at Marmot right now as there typically is by about January first. All in all, maybe not a bad time to go there!

The bad news is that, apart from two huge dumps, there really hasn't been any action in Marmot. It was way below any of the previous seasons measured until two weeks ago. Marmot looks like it's in a good position now, but if it hadn't gotten lucky at the end of November it would pretty much just be rocks. In fact, we can tell it *has* been lucky - Marmot Basin typically only gets two to three snowfalls exceeding 20 cm per day per season (actually 2.43 average), and has already had two this year. Lucky for it now, but it's hard to predict for the future of the season.

Marmot Basin is also relatively easy to predict - on average by the end of the season, its total snowfall has a coefficient of variation of 37.1%. It also has a reasonably early season, with 100 cm of snow fallen on average by December 31.
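For anyone wanting to reproduce these two measures, here's a minimal Python sketch. The coefficient of variation is the standard deviation of the season totals divided by their mean. I've used the sample standard deviation, which is an assumption since the post doesn't say which version was used. The season totals and daily snowfalls below are invented for illustration.

```python
import statistics

def coefficient_of_variation(season_totals):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100 * statistics.stdev(season_totals) / statistics.mean(season_totals)

def first_day_reaching(daily_snowfall_cm, target_cm=100):
    """Index of the first day cumulative snowfall reaches target_cm, or None."""
    running = 0.0
    for day, fall in enumerate(daily_snowfall_cm):
        running += fall
        if running >= target_cm:
            return day
    return None

# Hypothetical season totals in cm (not real Marmot Basin numbers).
totals = [250, 410, 320, 180, 390, 300, 460]
print(round(coefficient_of_variation(totals), 1))

# Hypothetical daily snowfall for the start of one season.
print(first_day_reaching([0, 12, 0, 35, 20, 0, 40, 5]))
```

Averaging `first_day_reaching` over several seasons gives the "100 cm by" date quoted for each hill.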

But how about other Alberta ski hills? Take Sunshine Village, for instance:

Sunshine has a similar situation to Marmot Basin. It's been lagging behind previous years until the end of November (though still within normal ranges), and is now pretty much back on track. Hard to say how that will hold up though. They typically reach 100 cm of snowfall a bit earlier than Marmot (average December 18), and tend to be more predictable (coefficient of variation of 23.8%). They also get far more snow in total than Marmot Basin does...

Lake Louise enjoys a base of 100 cm on average by December 16, but is raucously tricky to predict (coefficient of variation at end of season of 44.3%). Lake Louise has had the same problem as Marmot Basin - it had far less snow than previous years up until a sudden burst rather recently, but it's been flat since. Hopefully that isn't terrible news for the season...

Nakiska's almost doing the best for this time of year out of any of the last 7 years! Good for it. They tend to have more variation at this time of year than other Alberta hills too, so it's actually a bit tougher to say if they'll have a good season or not. They tend not to get a 100 cm base until around January 23rd, and have very unpredictable seasons, with a coefficient of variation of total snowfall of 48.2%.

Norquay's a bit sad. They're well within previous years' ranges, but it's still not looking nice. They'll get their first 100 cm on average by February 10 (yikes), and have a variation in total snowfall around 37.6%. Some years they don't even get 100 cm of snow, though.

Castle Mountain's another sort of sad mountain with a later season (100 cm average by January 11th) and high variability between seasons (43.2%). Both Castle and Norquay seem to have missed the awesome snow dump that the rest of Alberta had, but are tending to stick a bit better into where they'd be expected at this point in the season.

So overall for mountains in Alberta, it's looking like now is a great time to go to Marmot, Sunshine, Lake Louise, or Nakiska. They're certainly at least all doing much better than average for this time of year, and will likely continue to be above average for the rest of December.


Earliest decent season: Lake Louise (December 16)
Highest average snowfall: Sunshine Village (486 cm by May)
Most predictable: Sunshine Village (23.8% variation by season)

The sad thing is... BC mountains do way better on almost all counts. Take for example Fernie:

(100 cm by Dec 22, average snowfall 705 cm, COV 25.8%)

Or Whistler:

(100 cm by November 24, average snowfall 796 cm, COV 27.7%).

Both mountains consistently and reliably get far more snow than anything in Alberta. While that may make them sound great on paper, they still haven't had the trend-bucking dump that Alberta mountains have had, and are currently lagging quite far behind their Alberta peers. So while I can't guarantee that they'd have particularly good December skiing this year, you certainly ought to be able to rely on them for quality skiing in the mid- to late-season!

Monday, November 24, 2014

Foo Fighters Frenzy

This weekend, Edmontonians got the chance to stand in line for the Foo Fighters ticket pre-sale. The way the line-up system worked, though, left a few people upset - instead of a good old-fashioned first come first served system, Northlands used a lottery system where customers were given a number when they arrived, and then at a fixed time people were stopped from getting in line and a random number was chosen.

The person holding that number became the new front of the line, and everyone ahead of him was moved to the back in their original order. People were supposed to be given numbered tickets randomly, but instead they were given tickets in the same order that they arrived in.

It was billed as a process to avoid making people camp out early, as there was no benefit to being first in line, but sadly not everyone read the rules. This resulted in a system where some people had waited for hours to be at the front, only to find out they were likely going to be moved to the back of the line once tickets actually went on sale, and would have to wait even longer for less selection.

So this sounds like something that could be fun to analyze! Based on the information in the Journal article, and from a friend who was in line, I've come up with the following parameters for this problem:

  • 600 people join the line
  • They join the line at approximately a steady flow between 6:30 am and 9:30 am
  • My friend waited ~4 hours once the sale actually started, and had 490 people ahead of him at that point, so people spend an extra approximately 30 seconds in line for every person ahead of them after the lottery draw
  • There's some benefit to being the first person to pick a ticket, and people get less happy as less selection becomes available

Based on this, I came up with a utility function for every spot in line starting out. The utility function is 1.0 (perfectly happy) for someone who shows up right when the doors close, and by luck gets the first choice (so no waiting at all), and is 0.0 (unhappy) if they have to wait 8 hours and get last choice out of the lineup (for instance, the first person in line if the drawn number was #2). Whether this is realistic or not is up to you.
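One way to write down a utility function with those two endpoints is sketched below in Python. The exact shape is my assumption: waiting time and selection each count for half the score, and both scale linearly. Only the endpoints (1.0 for no wait and first pick, 0.0 for an 8-hour wait and last pick) come from the description above.

```python
def utility(wait_hours, pick_rank, n_people=600, max_wait_hours=8.0):
    """Score from 1.0 (no wait, first pick) down to 0.0 (8-hour wait, last pick).

    Assumption: waiting and selection each contribute half, both linearly.
    """
    wait_penalty = min(wait_hours / max_wait_hours, 1.0)
    pick_penalty = (pick_rank - 1) / (n_people - 1)
    return 1.0 - 0.5 * wait_penalty - 0.5 * pick_penalty

print(utility(0, 1))      # arrives at the cut-off, gets first pick
print(utility(8, 600))    # waits 8 hours, gets last pick
print(utility(3, 1))      # first come first served, arrived 3 hours early
```

A different weighting between waiting and selection would shift the curves, but not the endpoints.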

So what's the expected utility for any given position in line?

As a base case, here's the results for a first come, first served set-up:

Pretty straightforward - in the hypothetical scenario I've invented, showing up 3 hours early to guarantee first selection is much better than showing up right at the cut-off, waiting 5 hours while 599 people buy their tickets first, and then getting last selection out of anyone else in line. Your best bet here is to show up first and get your tickets as soon as possible, hands down.

But what actually happened for Foo Fighters was this:

In this system, the average wait time across everyone ends up being the same, but any one person's wait varies much more widely depending on which ticket is drawn. The variation in expected utility between positions, on the other hand, is WAY smaller. On average, though, the person who shows up last is better off than the person who shows up first - showing up last is the only position that guarantees you'll move up in line once the lottery starts, and your average time spent in line would only be about two and a half hours. The poor fella who came three hours early is likely to be moved fairly far back, and on average will have to wait in line twice as long as the person who showed up last. Brutal.

So if everyone had paid attention to how the system was going to work, they should have all shown up as close to the lottery draw as possible. Of course that wasn't what happened.

But what if you knew that wasn't what was going to happen? What if, say, you had a friend in line too, and the two of you wanted to know the best pair of positions to maximize your combined utility as a team? A bit more of a complicated analysis, but the results look something like this:
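The waiting times under the lottery can be checked with a short Python sketch using the parameters listed earlier: 600 people, steady arrivals over three hours, and about 30 seconds per person ahead of you after the draw. The one modelling assumption I've added is that the drawn number is equally likely to be any of the 600.

```python
N_PEOPLE = 600
PRE_DRAW_HOURS = 3.0            # line forms between 6:30 am and 9:30 am
HOURS_PER_PERSON = 30 / 3600    # roughly 30 seconds per person ahead

def people_ahead_after_draw(position, drawn, n=N_PEOPLE):
    """People ahead of `position` once ticket `drawn` becomes the new front."""
    return (position - drawn) % n

def expected_wait_hours(position, n=N_PEOPLE):
    """Average total time in line for a 1-based arrival position."""
    # Steady arrivals: position 1 arrives 3 hours early, position n at the cut-off.
    before = PRE_DRAW_HOURS * (n - position) / (n - 1)
    mean_ahead = sum(
        people_ahead_after_draw(position, k, n) for k in range(1, n + 1)
    ) / n
    return before + mean_ahead * HOURS_PER_PERSON

print(round(expected_wait_hours(600), 2))  # last to arrive
print(round(expected_wait_hours(1), 2))    # first to arrive
```

Everyone faces the same average wait after the draw, so the only thing arriving early changes is how long you stand around beforehand.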

The absolutely optimal place to have two people in line is to have one show up somewhat early and try to get in position 300, and one show up right at the end and end up with position 600. By spreading out this much, no matter which ticket is called to be the new front of the line, both members are likely to have no more than an hour and a half of waiting time, with one partner not having had to wait at all beforehand. Splitting it up like this is probably also nicer for the other people in line, because whoever was furthest from the front after the lottery draw could leave the line too!

What's really quite interesting is that the optimal positions aren't always off by 300 - in fact, so long as one member of the pair is in the first 210 positions, it's optimal to have the second be precisely 390 positions behind. Yay math.

So sure - if you had a really good idea of how the lottery system for ticket pre-sales was going to work, you could game it and do just as well as showing up early to a first come, first served system. This is completely ignoring the fact that, when it comes to lining up for tickets for a show, first come first served is actually a much better idea. I get that Northlands wants people to have an even chance at decent tickets, but people are going to line up early anyway, and they're just penalizing them by being inconsistent with their lottery systems. Keep it simple, and let people decide for themselves how long they want to wait in line.

Friday, October 31, 2014

Women's Inequality in Canada

About 5 months ago the Canadian Centre for Policy Alternatives released a report that compared the "best and worst place to be a woman in Canada." I wasn't a huge fan of the report - in fact, I thought the analysis wasn't too dissimilar from my zombie post, and disagreed with how strict rankings were compared across broad categories. I also thought that calling Edmonton the "worst place to be a woman" in Canada was a bit of a jump from the findings as reported - Edmonton was instead (by their standards) the lowest-ranked city out of 20 in terms of equality.

Just recently, the World Economic Forum released its very own report on worldwide gender inequality, and I liked it much better. It measures a similar number of quantitative results, but weighs factors appropriately based on statistical measures, and compares them based on scores instead of their rankings between categories. Though the merits of the specific measures used are open to interpretation, I'm satisfied that they're representative of inequality across the world.

These stats were so well prepared and presented, in fact, that I figured Canadian cities deserved the same treatment in their rankings. If we take (approximately*) the same approach as the World Economic Forum, and apply it to Statistics Canada results for our top 20 cities, this is what we get:

Rank City Score
1 Victoria 0.836
2 London 0.817
3 Sherbrooke 0.783
4 Ottawa-Gatineau 0.777
5 Vancouver 0.773
6 Québec 0.772
7 Toronto 0.766
8 Saskatoon 0.756
9 Montréal 0.747
10 Oshawa 0.746
11 Halifax 0.739
12 Winnipeg 0.730
13 Hamilton 0.727
14 St. Catharines-Niagara 0.726
15 St. John's 0.720
16 Kitchener-Cambridge-Waterloo 0.713
17 Regina 0.711
18 Windsor 0.708
19 Calgary 0.693
20 Edmonton 0.692

So there's good news and bad news here. Good news: some cities are pretty much in the same rank as the CCPA study (especially around the bottom of the list). Bad news: some cities moved over 10 spots (London and St. John's basically traded places). In general there's a weak correlation between the two analyses.

The World Economic Forum model looks at four major categories: Economic Participation, Health and Survival, Educational Attainment, and Political Empowerment, all weighted the same. Within each category are up to 5 differently weighted sub-categories - statistical measures that are typically converted into ratios of female:male success, with 1.00 being perfect equality, and 0.00 being complete inequality. Weights within each category are distributed based on the overall variance of that measurement, so that a 1% change within one sub-category is worth the same as a 1% change in another.
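Here's a minimal Python sketch of that scoring scheme. It assumes ratios are capped at 1.0 (parity) and that sub-category weights are proportional to one over the standard deviation of each measure across cities, which is my reading of the description above rather than a copy of the actual spreadsheet. The city numbers are hypothetical.

```python
import statistics

def capped_ratio(female, male):
    """Female-to-male ratio, capped at 1.0 (parity)."""
    return min(female / male, 1.0)

def inverse_std_weights(columns):
    """Weights proportional to 1 / std-dev of each sub-category, summing to 1.

    Each column holds one sub-category's ratios across all cities and must
    not be constant (a zero standard deviation would divide by zero).
    """
    raw = [1.0 / statistics.pstdev(col) for col in columns]
    total = sum(raw)
    return [w / total for w in raw]

def category_score(city_ratios, weights):
    """Weighted average of one city's sub-category ratios."""
    return sum(r * w for r, w in zip(city_ratios, weights))

def overall_score(category_scores):
    """The four categories are weighted equally."""
    return sum(category_scores) / len(category_scores)

# Hypothetical female:male ratios for three cities in two sub-categories.
labour_force = [0.90, 0.85, 0.80]
earned_income = [0.75, 0.60, 0.70]
weights = inverse_std_weights([labour_force, earned_income])
economic = [category_score(pair, weights) for pair in zip(labour_force, earned_income)]
print([round(w, 3) for w in weights])
print([round(s, 3) for s in economic])
```

The measure that varies less between cities gets the larger weight, so a one-point move counts about the same in either sub-category.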

There are a couple of advantages to using the World Economic Forum model when looking at Canadian cities. Comparing inequality scores to each other allows for a better overall picture of how the city is doing, compared to a strict rank of cities between each other. Also, by following an internationally accepted standard, we can compare these numbers directly to the results of other countries to get a better idea of exactly how good or bad any given city is.

Some fun findings:

Victoria, the best city in this ranking system, has an inequality index of about 0.836, which indicates that it is more or less as equal as the entirety of Norway. It mostly got this due to being very strong across all categories, but particularly for having the closest to balanced city council participation out of any city in the country.

Montréal was approximately equal in terms of female-to-male equality to Canada on average. Much like the country on average, women in Montréal do just as well or better than men in terms of health and education, but are still trailing behind from an economic point of view and are drastically far behind in terms of political representation.

Edmonton, sadly, still has a lot of ground to cover and I'm afraid there's no way of looking at the stats that doesn't rank it last in terms of equality. Our score puts us approximately equal to Russia in terms of inequality.

Edmonton suffers mostly from having had very little female political representation, as well as significantly fewer women working than men, while earning much less than they do.

None of this is saying that Edmonton is a particularly bad place to be a woman - certainly I'm sure that most women feel safer in Edmonton than they would in Russia, and the standard of living is likely much better on average. It is simply the case that the gap between Edmonton men and Edmonton women is about as wide as the gap between Russian men and Russian women.

As a country in general, we still have a long ways to go, and I'm certain Canada can climb from 20th in the world. But that change starts here in our cities, and an analysis like this is (in my mind) much more useful for telling us where we stand than the report by the CCPA five months ago.

*Minor changes from WEF report include:
-Political Empowerment: "Head of state" was changed to "mayor", ministerial positions were combined with members of parliament and replaced with current members of council.
-Wage equality for equal work data wasn't available for cities, so it was combined with estimated earned income.

Spreadsheet available upon request

Health 1, 2
Economy 1, 2, 3
Political: Individual city websites

Thursday, October 2, 2014

McDonald's Monopoly Stats

Man, was I ever excited when I saw that McDonald's Monopoly was back this year! I had a blast looking at the Roll Up the Rim stats last year, and hoped I could do the same for Monopoly this year.

Then I was pretty disappointed to realize that Business Insider had done their own analysis. I was all set to just read theirs contentedly, until I realized that they just copied and pasted their same article from the year before (hint: the prizes changed this year, dummies). So yay - I get to do my own!

(By the way, a less fun breakdown of the stats is done in the official McDonald's rules, so feel free to check my sources as we go along).

So how this all works is that whenever you buy certain food items (like a medium fountain drink, medium coffee, Big Mac, etc), you win two game stamps. Stamps either have a property, an instant win for food, or an instant win for some sort of other fancy prize. The breakdown for how this looks is approximately this:

A total of 1,303,683,256 stamps were printed. McDonald's claim that one in four purchases will result in an instant win is bang on then, which is nice. The reason these numbers may not line up 100% (unfortunately) is that some of their prizes are listed for the U.S. only, and they haven't indicated the exact distribution for Canada.

So one quarter of the time you buy anything, you'll win something. That's kinda nice. I mean, chances are it'll be medium fries (they're 50.2% of all food items, after all), but you could always hold out for that rare Royale with Cheese (you have a 2.2% chance any given time you go!).

The other three quarters of the time, you'll get a property, and if you collect all in a property group, you win big! The problem is that McDonald's doesn't distribute their properties in an even manner, instead they distribute them in such a way as to give you hope. Each colour group has a handful of very common properties (typically a 1 in 11 chance of getting them for any purchase), and one property that's very nearly impossible to get.

Bit tough to make out the really valuable prizes...
For instance, McDonald's has printed off approximately 60,670,043 Baltic Avenue stamps, so the chance of getting at least one any time you play (remember, 2 stamps per play) is 9.09%. Those are pretty good odds - it won't take you very long to get yourself a Baltic Avenue stamp. They only printed off 1,000 Mediterranean Avenues though - so your odds of getting one of those any time you play are only 0.00015%. A wee bit tougher - and don't forget, that's only for the cheapest property prize ($50).
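Those per-play odds can be reproduced with a few lines of Python. The sketch treats the two stamps in a play as independent draws from the full print run, which is a simplifying assumption. The stamp counts are the ones quoted above.

```python
TOTAL_STAMPS = 1_303_683_256
STAMPS_PER_PLAY = 2

def chance_per_play(stamps_printed, total=TOTAL_STAMPS, per_play=STAMPS_PER_PLAY):
    """Chance that at least one stamp from a single play is the one you want."""
    p_single = stamps_printed / total
    return 1 - (1 - p_single) ** per_play

print(f"Baltic Avenue: {chance_per_play(60_670_043):.2%}")
print(f"Mediterranean Avenue: {chance_per_play(1_000):.5%}")
```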

Similarly to how I looked at Roll up the Rim, we can take a look at just how many times you'd need to play Monopoly to have a reasonable chance (>50%) of winning any given prize. The price of eligible McDonald's items ranges from $1.00 (hashbrowns) to $4.49 (Bacon Clubhouse), but I'll use the average of all eligible items as $3.17 for calculations (because who wants to eat millions of hashbrowns?). If we start with the properties, we'd get:

Mediterranean Avenue: Need at least 451,666 plays to have a shot at getting... a $50 gift certificate! In the process you'll likely win about $170,000 worth of prizes, but it'll result in a net loss of $1,300,000. Maybe not worth it.

Vermont Avenue: Need at least 112,955,548 plays, but you just might win gas for a year! Never mind the net loss of $316,000,000.

Virginia Avenue: Better get ready for 90,364,438 hash browns (or equivalent), for your very own chance at $5,000! Fortunately, by that point you'll likely have had 4 million each of the other two properties for that group, so you're good to go!

Tennessee Avenue: Pretty easy with only 1,895,651 plays needed, and you just might get a Samsung Galaxy! Fun fact: the caloric value of 1,895,651 hashbrowns is enough to feed you for about 415 years in a row.

Kentucky Avenue: Enjoy the necessary 22,591,110 plays you're likely going to need to win your very own 5-night Delta Vacation for Two! If you got that in medium fountain drinks, it would fill five and a half Olympic swimming pools (everyone's favorite measurement of volume).

Ventnor Avenue: Another relatively easy one - only about 6,024,296 plays for a 50% shot at nabbing a Beaches Resort Vacation for your family! If you bought all your plays with bacon clubhouse sandwiches, that would only cost you about $27,049,089, which is actually a pretty good deal (sarcasm).

Pennsylvania Avenue: A bit tougher, but it'll take about 225,911,096 tries for a reasonable chance at a trip in a Cessna private jet! You might think that's tricky, but wait until you put effort towards:

Boardwalk: Approximately 451,822,158 plays needed, but you'd have a glorious chance at nabbing A MILLION DOLLARS (or really $817,572*)! Even at the lowest rate of $1.00 per play for hashbrowns, this effort would require your Monopoly opponents to land on your Boardwalk 225,911 times (in original Monopoly, assuming you had a hotel there).
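The play counts in this list come from asking how many tries it takes before the chance of at least one success passes 50%. A minimal Python sketch of that calculation is below. It assumes independent stamps and two stamps per play, and the single Boardwalk stamp in the example is my assumption about the print run, so the output may not match the figures above to the last digit.

```python
import math

TOTAL_STAMPS = 1_303_683_256

def plays_for_even_odds(stamps_printed, total=TOTAL_STAMPS, per_play=2, target=0.5):
    """Smallest number of plays giving at least `target` chance of one hit."""
    # Chance of missing on every stamp: (1 - p) ** (per_play * plays).
    log_miss_per_play = per_play * math.log1p(-stamps_printed / total)
    return math.ceil(math.log(1 - target) / log_miss_per_play)

print(plays_for_even_odds(1_000))  # Mediterranean Avenue, 1,000 stamps printed
print(plays_for_even_odds(1))      # a property with a single stamp (assumed)
```

Multiplying the result by the $3.17 average item price gives the cost of chasing each prize.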

(* Also - the million dollar prize is misleading. Because it's paid out in $50,000/year installments over 20 years, the present value of the money is much less than a million dollars once you take inflation into account.)
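As a sketch of that present-value calculation in Python: the 2% annual rate and the end-of-year payment timing are assumptions on my part, chosen because together they land on roughly the $817,572 figure.

```python
def present_value(payment, years, annual_rate):
    """Present value of equal payments made at the end of each year."""
    return sum(payment / (1 + annual_rate) ** t for t in range(1, years + 1))

# $50,000 a year for 20 years, discounted at an assumed 2% per year.
print(round(present_value(50_000, 20, 0.02)))
```

A higher assumed inflation rate shrinks the prize further.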

Adding everything up, the total value of all prizes offered in Monopoly is $488,423,499.28. With a total of 651,841,628 plays, that means that the expected return of any given play is about $0.75. This may be pretty reasonable if you're buying $1 hashbrowns (get a hashbrown, plus an expected loss of only about 25 cents), but at an average eligible food item cost of $3.17, your expected loss to McDonald's on each purchase is about $2.42 or 76%, giving it a worse house edge than Lotto 6/49.
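The arithmetic here is simple enough to check in a few lines of Python, using only the totals quoted above. As in the paragraph, the sketch sets the whole purchase price against the expected prize value.

```python
TOTAL_PRIZE_VALUE = 488_423_499.28
TOTAL_PLAYS = 651_841_628

def expected_return_per_play():
    """Average prize value attached to a single play, in dollars."""
    return TOTAL_PRIZE_VALUE / TOTAL_PLAYS

def expected_loss(price):
    """Dollar loss and fraction of the price lost on one purchase."""
    loss = price - expected_return_per_play()
    return loss, loss / price

print(round(expected_return_per_play(), 2))
for price in (1.00, 3.17):
    loss, fraction = expected_loss(price)
    print(f"${price:.2f} item: lose ${loss:.2f} ({fraction:.0%})")
```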

So, much like Tim Horton's Roll Up the Rim (or any large promotions, for that matter), I wouldn't recommend McDonald's Monopoly from an investment point of view. If you're already out there gobbling up hashbrowns, carry on though!

Friday, September 12, 2014

Edmonton's (by)Lawbreakers

Before I get to the fun stats, bear with me while I rant for a bit.

The Edmonton Police Service offers a handy-dandy little tool known as the Neighborhood Crime Map, that shows where crimes take place across the city, and gives some insight into past crime behavior. I was all set to do an analysis of the data, and was pretty excited about what I was going to be able to show, but then more carefully read the terms and conditions which include:

While it is acceptable to pass the website link on to others in your community, you will not share the information found on the website with others other than with members of the Edmonton Police Service or other law enforcement agencies
Seeing as others have explicitly asked for permission to share the data and were turned down, and the EPS has "dealt" with others who used it without permission, I've decided not to publish any of my analysis. But shame on them. The site isn't so user-friendly that it's perfectly optimized to give the best information possible, and offering data to the public but forbidding them from discussing it doesn't really count as "open data."

* * *

That being said, the City of Edmonton publishes open data that's actually open. This includes my new favorite data set: Bylaw Infractions!

The bylaw infraction data set includes heinous crimes like "Weeds" and "Unsightly Property", but also things people are actually concerned about like "Unlicensed Businesses," "Graffiti," and "Snow/Ice on Walk". So let's do everything I would have done for actual crimes, and instead look at people who let their grass grow too long.


Weather, unsurprisingly, has a decent effect on bylaw infractions that are reported. On the one hand, it's not at all surprising that people report weeds in the summer, and snow on sidewalks in the winter:

On the other hand, a similar graph for infractions for unlicensed businesses makes, as far as I know, no sense. For some reason, every January or February a lot of people get fined for unlicensed businesses. Most likely kids selling off their Christmas presents, I reckon:

At least the new year stings seem to be getting less intense over the years... yay?

Finally, we also have graffiti and unsightly property, which look something like this:

Apparently properties are more unsightly during the summer - who knew? More reasonably, the complaints probably are easier to make when there isn't snow covering a poorly-kept lawn (or whatever "unsightly" means...). As for graffiti, there's a fairly consistent double-peak pattern every year, where graffiti artists seem to take a bit of a break around September. Maybe it's something to do with them hooligans getting busy going back to school? Who knows...



Looks an awful lot like graffiti is concentrated downtown and in Strathcona. I'd comment on how that may or may not be associated with actual crimes, but I'm not allowed to by the EPS Terms and Conditions. Instead... it seems correlated with tall buildings...?

Snow/Ice on Walk

Unlike graffiti, snow and ice left on sidewalks seems to be a bit more spread out around the city. Apparently the southern suburbs are either better at shoveling, or better at hiding ice, than their northern neighbors.

Unlicensed Businesses

If you want to run a business but don't have a license, I wouldn't go downtown or to West Edmonton Mall. That's just what they'll be expecting.

Unsightly Property

Mirror, mirror, on the wall, where's the unsightliest property of them all? Well, the most violations happen in Alberta Avenue, and in general just north of downtown. Keep looking sharp, suburbs!


Not super surprisingly, the neighborhoods with high weed violations tend to correlate quite well with unsightly properties. It would really suck to get a double-whammy for both at once, wouldn't it? At least some of the oh-so-pretty suburbs (like Windermere and Summerside) are getting caught on weeds too!

Nosy Neighbors

Take a look at this:

When it comes to who's actually reporting these bylaw infractions, it's almost a perfectly even split between bylaw officers and everyday citizens for "tattle-tale" infractions like not shoveling, ugly houses, and weedy gardens. On the other hand, I'm solidly impressed that the vast majority of unlicensed businesses are reported by citizens. I guess people don't like being ripped off? Then again, nobody much seems to mind graffiti apart from the bylaw officers...

So there ya go. Not quite the crime post I wanted, but still fun to look at nonetheless. Thank you to the City of Edmonton for having actual open data, no thank you to the EPS Crime Map, and special thanks to my friend Daniel for suggesting the bylaw infractions as an alternative!

Tuesday, September 2, 2014

Edmonton's Census Correlations

Back in May, a lovely website went viral that listed a number of spurious correlations between unrelated sets of data. It was loads of fun to read, and a lovely reminder that correlation doesn't imply causation.

Edmonton's 2014 census data was released last week, in a glorious Christmas-like occasion for people like me who are into that sort of thing. The census asked a couple fun questions and broke the results down by neighborhood, and I originally figured it might be a fun idea to comb through the data for ridiculous correlations like the Spurious Correlations website.
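For the curious, combing through the data like this boils down to computing a correlation coefficient for every pair of per-neighbourhood columns and sorting by strength. Here's a minimal sketch of that idea, not the exact process I used: the column names and numbers below are made up for illustration, and `pearson` and `rank_correlations` are hypothetical helper names.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_correlations(columns):
    """Return (r, name_a, name_b) for every column pair, strongest first."""
    pairs = [
        (pearson(columns[a], columns[b]), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[0]), reverse=True)


# Made-up per-neighbourhood shares, purely illustrative (not census values).
census = {
    "pct_married": [0.62, 0.35, 0.48, 0.70, 0.28],
    "pct_renting": [0.20, 0.66, 0.45, 0.12, 0.75],
    "pct_walk_to_work": [0.03, 0.22, 0.08, 0.02, 0.30],
}

for r, a, b in rank_correlations(census):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

A strong r here only says two columns move together across neighbourhoods, which is exactly why the results are fun rather than causal.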

Unfortunately nothing super ridiculous stood out. Regardless, take a look at some of the more fun findings from the Census that maybe haven't been picked up on by other sources:

Married people don't like renting

I mean, really, nobody really likes renting, but it seems like married people especially don't.

Low apartments make you lazier

In general, living in an apartment correlates with transit alternatives that aren't driving, but people in high-rise apartments walk to work way more than people in shorter apartments. Sure, this is maybe because most of the people who walk to work live downtown and that's where the high-rises are, but it's more fun to think that short apartments compel people to bus...

This fun graph

Basically, as neighborhood populations change, people's jobs change too. For instance, the most common time to have a family member in preschool is when you have people in your house under age 5 (duh), but the second most common is when you have people aged 35-40. That double-peak pattern gets shifted over by 10 years and flattened out for grade 7 kids.

Other moderately interesting (but less pretty to graph) correlations include:

  • Full-time workers like driving their own cars, but only really post-secondary students bother consistently taking transit to work
  • People who've been in their house a long time tend to pay attention to the newspaper and radio more for their city info, but people who've been there for less than 3 years seem to prefer the city website
  • People who go to Catholic school seem to like driving more 
  • People working part-time are more likely to have lived in their house for more than 5 years than people working full-time (but less likely than if there are high school kids in the house!)
  • 25 to 40 year olds tend to move around the most; after that they seem to stick in the same house for a while

Wednesday, August 27, 2014

Population of Canada by Longitude

A couple of months ago a friend of mine on Facebook shared a post with me that graphed the population of Canada by latitude. They also challenged me to come up with a similar graph of Canada's population, but by longitude.

And I promptly forgot. Until now!

Like the original post mentions, finding geographical data that matches up with population data from Statistics Canada is quite tough, largely because the postal code data is intellectual property of Canada Post, and they don't much like sharing. I managed to find the 2011 Census data sorted by Forward Sortation Area (the first three characters of your postal code), and the geographical data for all postal codes (which was an unreasonably large file), and combine the two to get a fairly precise view of the data. To make sure what I had was close enough to the original graph, I redid his work by latitude first:

Close enough. Around the north some things get wacky because postal codes are so large and we likely used different ways to approximate the centers of each FSA, but I'm still reasonably satisfied with the result.
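If you want to try this at home, the combining step can be sketched roughly like this. It's one possible approach under stated assumptions, not a record of my actual spreadsheet wrangling: it assumes each FSA's centre is the plain average of its postal code coordinates (one of several ways to approximate it), the function names are hypothetical, and the sample rows are invented placeholders rather than real Canada Post or census values.

```python
import math
from collections import defaultdict


def fsa_centroids(postal_points):
    """Average the coordinates of every postal code sharing an FSA prefix.

    postal_points is an iterable of (postal_code, latitude, longitude).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for code, lat, lon in postal_points:
        entry = sums[code[:3]]  # the FSA is the first three characters
        entry[0] += lat
        entry[1] += lon
        entry[2] += 1
    return {fsa: (s[0] / s[2], s[1] / s[2]) for fsa, s in sums.items()}


def population_by_degree(fsa_population, centroids, axis):
    """Sum population into 1-degree bins; axis 0 is latitude, 1 is longitude."""
    bins = defaultdict(int)
    for fsa, people in fsa_population.items():
        if fsa in centroids:  # skip FSAs with no coordinates available
            bins[math.floor(centroids[fsa][axis])] += people
    return dict(bins)


# Invented placeholder rows, for illustration only.
postal_points = [
    ("T5A 0A1", 53.59, -113.40),
    ("T5A 0B2", 53.61, -113.42),
    ("M4B 1B3", 43.70, -79.31),
]
fsa_population = {"T5A": 1000, "M4B": 2500}

centroids = fsa_centroids(postal_points)
print(population_by_degree(fsa_population, centroids, axis=0))  # by latitude
print(population_by_degree(fsa_population, centroids, axis=1))  # by longitude
```

Swapping `axis` is the only change needed to go from the latitude graph to the longitude one, which is why redoing the original by latitude makes a handy sanity check.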

It's a fun graph, and deservedly the original got a nice amount of HuffPo press. It's pretty weird to think that about half of Canada lives below the northern suburbs of Montreal, and only 31% of the country lives above the 49th parallel section of the border.

Sure, Canada's tall, but let's talk about how wide it is. It's really wide. It stretches from 52°37'W at Cape Spear to 141°0'W at Boundary Peak, which covers nearly a quarter of the longitudinal values on Earth. Yay us.
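The "nearly a quarter" figure is just the span between those two longitudes divided by the full 360 degrees. A quick check, where `dms_to_degrees` is a hypothetical helper for converting degrees and minutes to decimal degrees:

```python
def dms_to_degrees(degrees, minutes=0.0):
    """Convert degrees and arc-minutes to decimal degrees."""
    return degrees + minutes / 60.0


east_edge = dms_to_degrees(52, 37)  # Cape Spear, degrees west
west_edge = dms_to_degrees(141, 0)  # Boundary Peak, degrees west

span = west_edge - east_edge  # width of Canada in degrees of longitude
share = span / 360.0          # fraction of all longitudes on the globe

print(f"{span:.2f} degrees wide, {share:.1%} of all longitudes")
```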

If we do the same analysis as the previous graph, but for Longitude, we get the following (you can click on the image to zoom and enhance, spy-movie style!):

So really, nothing too surprising. The majority of people tend to live somewhere between Toronto and Québec City, and in both British Columbia and Alberta the major cities tend to fall more or less along the same line north-south.

I was planning on combining both maps into a generic heatmap for Canada, but then I stumbled on this, and it's way cooler than anything I'd have been able to do, so I'll just share it with you instead. Try not to get too mesmerized...

Friday, August 15, 2014

The Great Oven Mitt Review of 2014

So I've got some cool friends.

A little background: when I first moved into my apartment (one year ago today exactly!), I only remembered to buy oven mitts at the last minute. I grabbed the quickest and cheapest one I could, which was a single oven glove appropriately called the "Ove' Glove." The night I moved in, we made pizzas and the Ove' Glove was the source of much entertainment and complaints, as it appeared to be a very good conductor of heat instead of an insulator. Whoops.

To remedy this, two weeks ago at my birthday party my friend Cassandra decided it would be a great gag gift idea for everyone to bring oven mitts for me. As a result, I am now in possession of 16 oven mitts, ranging in colours, sizes, and materials.

So on this, the anniversary of me moving in, I've decided to work with these oven mitts the way that I know best: test them and write a report.

My set-up was pretty basic - I made a system to hold the oven mitt a fixed distance away from a medium-heat element, stuck a meat thermometer inside, and took heat measurements for up to ten minutes. A check without oven mitts showed that this setup subjected the oven mitts to a temperature of approximately 70 degrees Celsius.

Test Setup (highly technical)
So without further ado, I present to you my rankings for oven mitts from worst to best. Starting with:

#9 Blue Mitts of Death

  • Brand: Dollar Store
  • Value: $3
I suppose at this point it's worth pointing out exactly how I'm ranking them. First and foremost, I'm looking at how long it takes the mitts to actually burn you. According to this source, 55 degrees is hot enough to give second-degree burns after 17 seconds. Holding onto a 70 degree heat source, this means you'd burn your hand in less than 5 minutes using this oven mitt. Sure, that's not how normal people use these, but hey - you gotta compare them somehow. Since these have the highest potential for burning, I rate them the worst.
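In code form, the "time to burn" number is roughly the moment the thermometer inside the mitt first reads 55 degrees. Here's a minimal sketch of how that could be pulled out of a list of readings. It assumes readings were logged as (seconds, degrees) pairs and interpolates linearly between them; the function name and the sample readings are hypothetical, not my actual measurements.

```python
BURN_THRESHOLD = 55.0  # degrees, the burn threshold quoted above


def time_to_threshold(readings, threshold=BURN_THRESHOLD):
    """Return seconds until the temperature first reaches threshold, or None.

    readings is a list of (seconds, degrees) pairs sorted by time. A crossing
    that falls between two readings is linearly interpolated.
    """
    if not readings:
        return None
    prev_t, prev_temp = readings[0]
    if prev_temp >= threshold:
        return float(prev_t)
    for t, temp in readings[1:]:
        if temp >= threshold:
            # Fraction of the way from the previous reading to this one.
            frac = (threshold - prev_temp) / (temp - prev_temp)
            return prev_t + frac * (t - prev_t)
        prev_t, prev_temp = t, temp
    return None  # never got hot enough during the test


# Invented readings for illustration: one per minute for five minutes.
sample = [(0, 22.0), (60, 31.0), (120, 39.5), (180, 46.0), (240, 52.0), (300, 56.0)]
print(time_to_threshold(sample))
```

A mitt that returns `None` here never hit the threshold within its test, so it needs a different yardstick.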

#8 Pink, Flowery, and Painful

  • Brand: Dollar Store
  • Value: $2.50
Though these are by far the prettiest, they're also quite deadly. If my hand had been in them for the experiment, I would have gotten a burn about 5:20 into the test. Not nice. It's also worth pointing out that throughout all tests, these oven mitts got the closest to 70 degrees (67.6 after 10 minutes). Yikes!

#7 Black Cuisinart

  • Brand: Cuisinart
  • Value: $15.95
Hilariously, I got these as a Christmas present from my parents before any of these birthday shenanigans went down. Sadly, they're also apparently the type of oven mitt that likes to burn your hand off. Their redeeming factor is that they only increased in temperature 1.2 degrees within the first 30 seconds of the test, which is more than enough time for most oven extractions. Would likely have burned my hand about 5:40 into the test.

#6 Green Silicone

  • Brand: Ming Wo (?)
  • Value: $9.99
Man, these silicone ones look so fancy, but really like burning your hands to crisps. This is very similar to #7, in that it has one of the lowest heat gradients at first, but by 7 minutes into the test would have made you very unhappy. I'm sure there's some materials science point to be made here, but that would involve actual science.

#5 Languages of Pain

  • Brand: Dollar Store
  • Value: $2.50
Awesomely enough, I got two pairs of these for my birthday. These were pretty decent for a language lesson, but woulda burned your hands at about 7:20 into the test. An excellent example of Dollar Store quality oven mitts holding their own against their expensive counterparts though...

Here are pretty graphs of the worst five oven mitts:

Again, the pink and blue oven mitts both had high initial rates of heat pickup, and ended up with the highest heats (the ranking order is a bit different from the graph because I tested the pink one on a colder day. I know, terribly unscientific of me...). The silicone mitts did much better for the first two minutes, but then took on heat at a similar rate to everyone else. Tsk tsk. 

The rest of the mitts happily didn't ever hit 55 degrees within their tests, so I'll rank them based on their total heat gain over the 10 minutes:

#4 The Ove' Glove

  • Brand: No clue
  • Value: $18.99
In a stunning come-from-behind near-podium finish, the Ove' Glove turns out to be a contender! And if you don't believe me, check out this totally awesome super cool consumer video (sarcasm). The Ove' Glove gained heat at an average rate of 2.95 degrees per minute - not shabby!

#3 The Alien

  • Brand: Dollar Store
  • Value: $2
Put this sucker on your hand and you've got a great alien chestburster puppet! Alternatively, use it to take hot things out of an oven and not burn yourself. By far the best bang for the buck, somehow it combines the silicone and fabric and makes a decent oven mitt, gaining an average 2.82 degrees per minute.

#2 Better Barbeque

  • Brand: CTG Brands
  • Value: Weight in gold?
Wowza. This one is hefty, basically goes up to my elbow, and can hold its heat, only gaining an average of 2.66 degrees per minute. Very nice. These also won the contest for lowest heat pick-up in the first minute, and didn't even register a temperature change until 30 seconds into the test. 

#1 President's Choice

  • Brand: PC
  • Value: 7 unicorn hairs?
These guys were the bomb, only gaining 2.61 degrees per minute. They're also flexible enough to use regularly, unlike the silicone ones. 
Graph of the top 4 oven mitts:

Again, some very smooth curves here. The CTG oven mitt had by far the steadiest heat increase, but lost out to the PC mitt over the full length of the test. I know that my ranking has been more-or-less arbitrary this whole time, but I'm comfortable with declaring the CTG mitt to be my favorite (because really, who uses a mitt for 10 minutes at a time?).
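For the top four, the ranking number is an average heating rate, which I'm treating here as total temperature gain divided by the length of the test. A small sketch under that assumption: the 22 degree starting temperature is invented, and the end temperatures are back-calculated so the rates match the ones quoted above, so treat the inputs as hypothetical rather than my logged readings.

```python
def average_rate(start_temp, end_temp, minutes=10.0):
    """Average heat gain in degrees per minute over the whole test."""
    return (end_temp - start_temp) / minutes


def rank_by_rate(mitts):
    """Return mitt names ordered from slowest to fastest heat gain."""
    return sorted(mitts, key=lambda name: average_rate(*mitts[name]))


# (start, end) temperatures: hypothetical values chosen to reproduce the
# quoted rates, assuming every test started at 22 degrees.
mitts = {
    "Ove' Glove": (22.0, 51.5),
    "The Alien": (22.0, 50.2),
    "Better Barbeque": (22.0, 48.6),
    "President's Choice": (22.0, 48.1),
}

for name in rank_by_rate(mitts):
    print(f"{name}: {average_rate(*mitts[name]):.2f} degrees per minute")
```

An average like this hides the shape of the curve, which is how a mitt with the lowest pick-up in the first minute can still lose over the full ten.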

Thanks again to everyone for pitching in on the oven mitt present. I hope I've used them in an appropriate manner! 

Wednesday, August 13, 2014

Mud Heroes Aren't Normal

Last weekend I went ahead and did something I never thought I'd do: the Mud Hero race down in Red Deer.

Mud Hero, for those of you who are blissfully unaware, is a crazy obstacle course/race/endurance sport/mud bath and spa/general day of chaos that follows in the ever-growing trend of mud runs for the athletically-inclined. There are dozens of similar events to this around Canada each summer, and the Mud Hero appears to be one of the most popular with over twelve thousand participants over the three days of heroing in Red Deer last weekend.

The event attracted people from all backgrounds and fitness levels, and has likely inspired wonderful stories of perseverance, raising money for charity, and teamwork through adversity. To all this, I say nonsense. The most interesting part of the Mud Hero is the statistics, and, as readers of this have likely figured out already, the inescapable conclusion that Mud Heroes just aren't normal.

In an attempt to ostensibly appear as much like a legitimate race as possible, all participants were given timing chips to track their racing - and all results are posted online for people to show to their friends and family and brag about just how slowly they trudged through the muck. Since this involves thousands of numbers, it's pretty easy to salivate over the possible statistics of said numbers. So I did. Here's a graph of everyone who ran on the last day of the Alberta Mud Hero:

Right off the bat that may look quite like a normally-distributed bell curve - there's certainly a lovely peak right around the middle, and it tends to taper off at either end. The reported average time for the course was 1:25:22 (85 minutes), and that seems to be reasonably around the middle of the peak.

It isn't enough just to assume that that's a normal distribution though - a normal distribution is a rather precisely defined curve that doesn't necessarily include all bell-like shapes. The results from Sunday's Mud Hero had a mean of 85.37 minutes and a standard deviation of 29.90 minutes - as these are the two parameters you need to develop a normal distribution curve, we can compare Sunday's results to the normal assumption and get the following:

That's not really all that close at all. These are two bell curves that have the same mean and standard deviation, but are not identical, leading to the fun conclusion that Mud Hero runners are not normal (well, normally-distributed at least). Mud Heroes tend to be positively skewed (the mean is higher than the median), and have shorter and bounded tails.

This isn't really all that surprising - in fact a normal distribution would have been surprising as there are necessarily cut-offs to the data (nobody can do the race faster than 0 minutes, for instance), and it was a relatively short race. Often people tend to view all bell-shaped curves as normally-distributed, even though there is an incredible number and diversity of probability distributions out there.
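To make the comparison concrete, here's a minimal sketch using Python's standard library. The mean and standard deviation are the ones from Sunday's results; everything else is an assumption for illustration, including the `skewness` helper and the handful of invented finishing times (the real data set has thousands).

```python
from statistics import NormalDist, mean, median, pstdev

MEAN_MINUTES = 85.37  # from Sunday's results
SD_MINUTES = 29.90    # from Sunday's results

# The normal curve that shares the race's mean and standard deviation.
normal_fit = NormalDist(mu=MEAN_MINUTES, sigma=SD_MINUTES)

# A true normal curve has unbounded tails, so it assigns some probability
# to finishing in under zero minutes, which no runner can do.
impossible_share = normal_fit.cdf(0.0)
print(f"Share of 'runners' faster than 0 minutes: {impossible_share:.2%}")


def skewness(times):
    """Population skewness: positive when the right-hand tail is longer."""
    m = mean(times)
    sd = pstdev(times)
    return sum((t - m) ** 3 for t in times) / (len(times) * sd ** 3)


# Invented finishing times in minutes, for illustration only.
sample_times = [55, 60, 65, 70, 75, 80, 90, 110, 150]
print(f"mean {mean(sample_times):.1f}, median {median(sample_times):.1f}, "
      f"skewness {skewness(sample_times):+.2f}")
```

A mean above the median together with positive skewness is the same signature described above, just on a toy sample.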

So Mud Heroes aren't normal. What else can we learn from the data? Fortunately the results are broken down into genders, ages, and hometowns, so let's look at those!

First of all, gender:

Fascinatingly enough (for an event whose purpose is explicitly to get dirty), women outnumbered men 2 to 1! That's pretty awesome. A quick Analysis of Variance test shows that the men were statistically significantly faster than the women were this time around though, which I suppose is the trend in races like this. Shame...

Bam. Age graph. I'm not entirely sure why the men aren't as consistent as they age, but then again, who is? (marriage joke)

And finally, home city:

Turns out there's no reasonable statistical difference between participants from Red Deer, Calgary, and Edmonton. These box plots for their results suggest they have almost identical distributions for time, and an ANOVA test suggests that they can all be considered to be drawn from the same population. So really, even though the average time for Calgarian Heroes was two minutes faster than Edmontonian heroes, it's not significant enough for them to brag. So ha!
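For anyone wondering what the ANOVA test is actually doing, it compares how far the group averages sit from the overall average against how spread out the runners are within each group. Below is a bare-bones sketch of the F statistic, written by hand so it runs without extra libraries; it is not the tool I used, the function name is hypothetical, and the times are invented. A library routine such as SciPy's `scipy.stats.f_oneway` would also give you the p-value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented finishing times in minutes for three cities, illustration only.
red_deer = [70, 82, 95, 101, 88]
calgary = [68, 80, 93, 99, 86]
edmonton = [71, 84, 96, 100, 90]

f_stat, df_b, df_w = one_way_anova_f([red_deer, calgary, edmonton])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A small F means the differences between city averages are no bigger than you'd expect from the scatter within each city, which is the "drawn from the same population" conclusion.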

All in all though, Mud Hero was definitely a fun experience. If you're looking for a good excuse to get tired and muddy, I'd highly recommend it for next year!