Tuesday, July 17, 2012

Watch out for thinking traps

Imagine that a new test for HIV has been developed, and that the powers that be have decided to pursue universal testing with it in order to improve public health. Let's assume that it's a very accurate test - it tests positive for 95% of people who actually have HIV, and it tests negative for 99% of people who don't.

Pretend that you get the blood work done, and get the dreaded bad news of a positive result. Now, you've probably been safe - stayed away from dirty needles, used protection when needed, et cetera, and you immediately go into denial. "This is outrageous," you might say, "it must have been a false positive!"

Taking a cursory look at the information provided about the test, what do you think your chance of a false positive is? Your first impression might be that it's 1%, which certainly seems sadly unlikely.

Let's do the math:

In 2009, the number of people in Canada with HIV was 65,000, and the population was 34 million. That leaves 33,935,000 Canadians who don't have HIV.

The test is positive 95% of the time if someone has the disease, so 61,750 of the people with the disease will get a true positive result. Similarly the test is positive 1% of the time for people without the disease, and would give 339,350 devastating false positives.

That means that of the approximately four hundred thousand Canadians who would get a positive result, only about sixty thousand actually have the disease. If you're not part of a particular risk group, your chance of being healthy with a positive result is not 1%, it's actually a whopping 85%.
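The arithmetic above can be reproduced in a few lines of Python. This is only a sketch using the figures quoted in the post; the function and variable names are made up for illustration.

```python
# Figures quoted in the post (Canada, 2009) and the hypothetical test's accuracy.
population = 34_000_000
infected = 65_000
sensitivity = 0.95   # chance of a positive result if you have HIV
specificity = 0.99   # chance of a negative result if you don't


def screening_outcome(population, infected, sensitivity, specificity):
    """Return (true positives, false positives, chance a positive is false)."""
    healthy = population - infected
    true_positives = infected * sensitivity
    false_positives = healthy * (1 - specificity)
    p_false = false_positives / (true_positives + false_positives)
    return true_positives, false_positives, p_false


tp, fp, p_false = screening_outcome(population, infected, sensitivity, specificity)
print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"chance a positive result is false: {p_false:.1%}")
```

Changing the number of infected people in the sketch shows how strongly the answer depends on prevalence: the rarer the disease, the larger the share of positives that are false.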

This type of approach is critical when dealing with science and technology. A politician could perhaps argue that, with such a seemingly accurate test, everyone should be screened for the good of the public. Due to the low prevalence of the disease, though, such universal testing would do far more damage than good. Actually sitting down and working out the numbers is crucial, and you should always be wary of blindly accepting the first figure you're given, or information that has been spun to support a particular position.

Now that you're primed, let's try a more famous example - the Monty Hall problem:

Congratulations! You're in the middle of the game show Let's Make a Deal, and you're standing in front of three doors. Behind one door is a car, and behind the other two are goats. You pick a door, say #1, but it isn't opened. The host, who knows what's behind all three doors, opens one of the other doors, say #2, and shows that it has a goat. The host then says to you, "Do you want to pick door 3?" Is it to your advantage to switch your choice, or does it matter?

Your first reaction may be that it doesn't matter - at this point you are faced with two doors, and you know that one has a goat and that the other has a car. Surely it must be a fifty-fifty decision, and it doesn't matter what you choose. This is, perhaps surprisingly, incorrect. In fact, you can double your chances of winning a car if you switch doors!

Because the host knows what's behind each door and always reveals a goat, if you switch you are guaranteed to be switching from a goat to a car or vice versa (never a goat to a goat, because that option has been eliminated by the host). Two thirds of the time you're going to pick a goat right off the bat, and switching would get you the car, and only one third of the time would you pick a car first and regret switching. Even though at first glance it would seem your choice doesn't matter, taking a second to think things through can definitely turn out to be profitable to you.
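If the argument still feels wrong, a quick simulation can back it up. The sketch below assumes the host always opens a goat door that you didn't pick (choosing at random when two are available); the function names, number of trials and seed are arbitrary choices for illustration.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the car is and opens a goat door you didn't pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

The two printed rates should land close to one third and two thirds respectively.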

One last example of critical thinking: in front of you are four cards, and you know that each card has a letter on one side and a number on the other. The four cards you can see have E, M, 4, and 8 written on the side facing you, and you can't see the reverse side.

Suppose you were told that the cards always obey the following rule: If a card has an E on one side, then it always has a 4 on the other side. If you wanted to figure out whether this rule is true or false by turning over some of the cards in front of you, which would you turn over to find out? Take a second and think it through. Go on.

Most people immediately choose the card with an E on the front, and rightly so - if it turned out to not have a 4 on the back, we would know the rule is false. Some people will also choose the card with a 4 on the front, but this isn't necessary - the rule only specifies what happens for cards with an E, not for all cards with a 4. Similarly, there's no sense in checking the card with the M.

But most people will stop there, and completely overlook checking the card with the 8 on the front. What if you checked it and there was an E on the back? Then the rule would be broken just as easily as if there was no 4 on the back of the card with the E on its front.
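One way to check this reasoning is to ask, for each card, whether anything that could be on its hidden side would break the rule. The sketch below does that by brute force; it assumes M stands in for any letter other than E and 8 for any number other than 4, and the function names are invented for illustration.

```python
LETTERS = "EM"   # assumption: M represents every non-E letter
NUMBERS = "48"   # assumption: 8 represents every non-4 number


def violates(letter, number):
    """The rule 'if E then 4' is broken only by an E paired with a non-4."""
    return letter == "E" and number != "4"


def worth_turning(visible):
    """A card is worth turning only if some hidden side could break the rule."""
    if visible.isalpha():
        return any(violates(visible, n) for n in NUMBERS)
    return any(violates(letter, visible) for letter in LETTERS)


cards = ["E", "M", "4", "8"]
to_turn = [c for c in cards if worth_turning(c)]
print(to_turn)  # ['E', '8']
```

The 4 never shows up in the list because no letter on its back can contradict the rule, which is exactly why turning it over tells you nothing.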

This is an example of confirmation bias: when trying to test theories, we often set up tests that are designed to prove something, rather than tests that are designed to disprove the same theory. Ancient scientists would often accumulate a massive amount of data that agreed with their hypothesis and then call it confirmed, only to be embarrassed when a simple test disproved their theory completely.

Proper critical analysis of probability and an awareness of cognitive biases are important to keep in mind in science and technology, but also in making everyday decisions. The next time you are shocked by something on the news, or need to make a major decision, it always helps to take a step back and really think about both what you're looking for and the information you've been given.

This post is also available at The Wanderer Online.

Tuesday, July 10, 2012

More NHL Streaks

Last week we took a look at the occurrence of streaks in the NHL and found that the rate of winning or losing streaks wasn't any different than what we'd expect from random chance. This was pretty cool in terms of modelling the NHL for next season, as it showed that individual games in the NHL could accurately be modelled as independent of the results of previous games.

A few people asked me to look into a couple of other possibilities for further investigation. The first thing I decided to look into, and the subject of this post, was the influence of rest time between games on the game outcome. It may make a certain amount of sense that teams playing back-to-back games are at a bit of a disadvantage, for instance, and longer rest times may improve performance. Again, I looked at all the games from the last season for all teams, and this is what I got:

I couldn't think of a great label for the y-axis, so let me describe the graph a little bit better. The x-axis is the number of days since the previous game played by that team. For each category of days since the previous game, the percentage of wins was calculated and then divided by the percentage of games that the team won over all 82 games; that ratio is presented on the y-axis. If a data point is above 1.0, that means that the team did better than average after that number of days off, and vice versa. The grey data points are the results from each team, and the blue squares are the average for each set.
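For anyone who wants to reproduce that y-axis value, here is a rough sketch of the calculation for a single team. The input format (a list of rest-days and win/loss pairs) and the sample results are made up for illustration, not real NHL data, and the sketch assumes the team won at least one game.

```python
from collections import defaultdict


def relative_win_rate_by_rest(games):
    """games: list of (rest_days, won) pairs for one team.

    Returns {rest_days: win rate after that rest / overall win rate}.
    """
    overall = sum(won for _, won in games) / len(games)
    by_rest = defaultdict(list)
    for rest_days, won in games:
        by_rest[rest_days].append(won)
    return {
        rest: (sum(results) / len(results)) / overall
        for rest, results in sorted(by_rest.items())
    }


# Invented results for illustration only.
sample_games = [
    (1, False), (1, True),
    (2, True), (2, False), (2, True), (2, False),
    (3, True), (3, True),
]
ratios = relative_win_rate_by_rest(sample_games)
for rest, ratio in ratios.items():
    print(f"{rest}-day break: {ratio:.2f}")
```

A value of 0.80 means the team won 20% less often than its season average after that length of break, and 1.60 means 60% more often.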

Not so surprisingly, teams tend to do worse when they've only had a day since their last game (on average, they do 20% worse than normal, in fact). Their results tend to improve from there so that they do about average when given a two-day break, and then teams tend to do about 10% better than average when given a three-day break. By the time that a team has had a four-day break, though, they tend to do as well as average again, though with a significantly larger variance between teams.

The results from the 5-, 6-, and 7-day breaks were much less consistent - only about two-thirds of teams had 5- or 7-day breaks, for instance, and only 10 teams had a 6-day break in their schedule. However, on average for the 5+ day breaks, teams tend to do exactly as well as average (no bonus or penalty either way).

Fortunately, the vast majority of games for each team take place after a two-day break, and that's fairly consistent between all the teams. Also, the number of one-day breaks is typically about the same as three-day breaks for individual teams, which I'm sure is planned deliberately.

Friday, July 6, 2012

NHL Streaks

Two statisticians get on an airplane to go to a conference. Once they've sat down on the plane, one notices that the other is carrying a bomb with him. When asked about it, the second statistician says, "Well, if the odds of having one bomb on a plane are tiny, then the odds of having two must surely be zero and we're guaranteed to be safe!"

This sort of thinking highlights the very popular gambler's fallacy. Often people will think that if a fair coin has been flipped heads four times in a row, it's more likely to be tails on the next flip because there should be an equal number of heads and tails. Though it is true that, after a very large number of flips, we should expect approximately 50% heads and 50% tails, the problem with the fallacy is that each coin toss is completely independent of any other coin toss. No matter how many heads we've flipped, the coin isn't keeping track and trying to balance itself out - we just know that after a while they ought to end up about even.
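A short simulation makes the point concrete: flip a fair coin many times, find every spot where the previous four flips were all heads, and see how often the next flip is heads too. This is just a sketch; the number of flips, streak length and seed are arbitrary.

```python
import random


def heads_rate_after_streak(n_flips=200_000, streak=4, seed=1):
    """Fraction of heads among flips that follow `streak` heads in a row."""
    rng = random.Random(seed)
    run = 0          # consecutive heads seen just before the current flip
    followed = 0     # flips that came right after a long enough run
    heads_next = 0   # how many of those flips were heads
    for _ in range(n_flips):
        flip = rng.random() < 0.5  # True = heads
        if run >= streak:
            followed += 1
            heads_next += flip
        run = run + 1 if flip else 0
    return heads_next / followed


rate = heads_rate_after_streak()
print(f"heads after four heads in a row: {rate:.3f}")
```

The result should sit very close to 0.5: the coin has no memory of the streak.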

Statistically independent events are important in probabilities. Coin flips, roulette wheels, dice tosses, and explosive suitcases are typically independent of each other, and it's important to know this if you're ever going to go to a casino. (Related: most betting strategies people will try to sell you count on you not knowing this - don't buy them!) On the other hand, some casino games actually have dependent probabilities. The best example is blackjack - if you know which cards have been played from a deck, then you should know roughly what distribution of cards is left to be played in the deck, as it isn't being reset after each hand (which is how card-counting works).

I recently read an article by The Wanderer magazine that examined, in part, a paper by the National Academy of Sciences that investigated the effects of randomness on performance. In the paper, a large-scale model was developed where past performance had an impact on future performance. This got me wondering whether or not this was something I should incorporate into my future NHL models.

Before I get into my results, ask yourself: do you think individual games in the NHL are dependent on previous results? Is a team on a three-game winning streak more likely to win the next game than they would be if they'd just come off a three-game losing streak? Or does it matter?

It's pretty easy to rationalize it either way - perhaps having lost a couple games in a row a team could be feeling depressed and be more likely to lose, or maybe they are more inspired/desperate and would be more likely to win. Similarly, having won a couple games in a row could perhaps make a team more confident and give them an edge, or too cocky and make them lose.

If there is an effect, it would definitely be worth incorporating into a model of the regular season, so I was interested in taking a look at the results of last season. I plotted 2,460 results from the last season (fun), and decided to compare these results to what would be expected if there wasn't any impact from previous games. In every season of 82 games, there are 81 sequences of two consecutive games, 80 sequences of three consecutive games, 79 sequences of four games, etc. On average, for a team that wins half its games, we should expect ~50% of two-game sequences to have the same result (win-win or loss-loss), 25% of three-game sequences to be a streak, etc. I looked at each sequence of up to 10 games in a row for each team, compared it to what I'd expect without any dependence between games, and this is what I got:

I have to admit I was pretty surprised. The results (red squares), averaged for all teams, almost perfectly matched the results that we'd expect if the games were independent. Crazy!
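For reference, the independence baseline used in this comparison can be sketched in a few lines of Python. This is not the exact code behind the graph, just an illustration with invented function names: for a team with win probability p, an n-game sequence is all wins or all losses with probability p^n + (1-p)^n, which gives the 50% and 25% figures when p is one half.

```python
def expected_streak_fraction(n, p_win=0.5):
    """Chance that n independent games are all wins or all losses."""
    return p_win ** n + (1 - p_win) ** n


def observed_streak_fraction(results, n):
    """results: wins (True) and losses (False) in schedule order.

    Returns the share of n-game sequences that are all wins or all losses.
    """
    windows = [results[i:i + n] for i in range(len(results) - n + 1)]
    streaks = sum(1 for w in windows if all(w) or not any(w))
    return streaks / len(windows)


for n in range(2, 11):
    print(f"{n}-game sequences: expect {expected_streak_fraction(n):.2%}")

# Tiny invented example: win, win, loss, loss has three two-game sequences,
# two of which (win-win and loss-loss) are streaks.
example = observed_streak_fraction([True, True, False, False], 2)
```

Feeding a team's real 82-game record into observed_streak_fraction and comparing it with expected_streak_fraction is the kind of comparison described above.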

The pink lines on the graph are the maximums and minimums for each streak length achieved by any team, and the blue lines are the range that we'd expect 90% of teams to fall inside naturally if games truly were independent. Again, most teams fall within this range (oddly enough, typically 27/30 which is our magical 90%).
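One way to get a 90% range like this is by simulation: generate a few thousand imaginary 82-game seasons with independent games, measure the streak fraction in each, and keep the middle 90% of the results. The sketch below does that; it is an assumption about how such a range could be produced rather than the method used for the graph, and it assumes every team wins half its games.

```python
import random


def streak_fraction(results, n):
    """Share of n-game sequences in `results` that are all wins or all losses."""
    windows = len(results) - n + 1
    same = sum(1 for i in range(windows) if len(set(results[i:i + n])) == 1)
    return same / windows


def simulated_band(n, seasons=2_000, games=82, p_win=0.5, seed=2):
    """Approximate 5th and 95th percentiles of the streak fraction."""
    rng = random.Random(seed)
    fractions = sorted(
        streak_fraction([rng.random() < p_win for _ in range(games)], n)
        for _ in range(seasons)
    )
    low = fractions[int(0.05 * seasons)]
    high = fractions[int(0.95 * seasons) - 1]
    return low, high


low, high = simulated_band(n=3)
print(f"90% of simulated seasons fall between {low:.2f} and {high:.2f}")
```

A real team whose three-game streak fraction falls outside that band would be a candidate for games that aren't independent, though with 30 teams we'd expect about three to land outside it by chance alone.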

What does this mean? Basically, for the vast majority of teams, and for the NHL as a whole averaged across all 30 teams, hot streaks or cold streaks from any team throughout the season happen almost precisely as often as we'd expect. There's no evidence here that previous games impact future performance in any significant way.

This is actually pretty good news for my model, because now it doesn't have to be quite as complicated as I was worried about. It's also humbling to know that, even though hockey games depend on the collective actions of a bunch of humans, they follow expected statistical patterns so closely.

So the next time you're betting on hockey games or worried about bombs on airplanes, just take a moment to consider that these events are statistically independent. A team on a hot streak is no more or less likely to win its next game than it would be otherwise, and there's really nothing you can do about other people bringing bombs on your flight.