Monday, August 20, 2012

Scientific Illiteracy

In 1998 British physician Andrew Wakefield was offered £400,000 to publish an article claiming the vaccine against Measles, Mumps, and Rubella could lead to autism. The article was funded by lawyers preparing for an anti-vaccine lawsuit. Despite the fact that his results could not be reproduced and were dismissed by an overwhelming majority of researchers, a massive and ultimately deadly panic swept across the world. Vaccination rates dropped across Europe, opening the door to completely preventable outbreaks; children were killed or permanently disabled, and an estimated $25 million in hospital bills accumulated from these avoidable outbreaks.

It took a long and unnecessary campaign by researchers and health professionals to re-prove the safety of the MMR vaccine, but the scientific agreement hasn't fully reached the public - in fact today about 48% of Americans either don't trust or are unsure about the safety of vaccines, largely based on that one fraudulent article.

Vaccines are absolutely crucial to public safety, and they are a dramatic example of the consequences of a public that doesn't trust or doesn't understand science. Not getting vaccinated is a hazard not only to yourself but also to everyone around you. Sadly, modern outbreaks of completely preventable diseases, such as the recent Whooping Cough outbreak, are reminders of the impact of ignoring science.

Studies have shown, though, that in general the public has a very poor understanding of some fairly basic concepts. Take for example these survey results on scientific literacy:
  • 6% of Americans don't believe smoking can cause lung cancer
  • 13% don't know that plants produce the oxygen we breathe
  • 20% aren't aware the center of the earth is very hot (and are presumably very confused by volcanoes)
  • 25% still think the sun goes around the earth
  • 46% don't know it takes the earth a year to orbit the sun (but were probably still stumped by the previous question), and
  • 52% fell hard for the Flintstones and believe dinosaurs and humans coexisted
Admittedly none of these specific misconceptions about science is likely to be dangerous to an individual, apart from perhaps the one about smoking and lung cancer. What's frightening instead is that these are all concepts taught during or before high school, and the results suggest a public that is largely ignorant of, or apathetic towards, some of the most fundamental concepts we rely on. Even more horrifying is that these percentages have tended to get worse between 2001 and 2010.

Regardless of whether a given misunderstanding of science is harmful on an individual basis, an attitude of either apathy or automatic distrust towards science is very dangerous for society. Distortion of science for personal or political benefit is very common, and has even been used explicitly to cause harm.

Two particular government exploitations or distortions of scientific understanding come to mind. The eugenics movement in the early 20th century claimed that human breeding needed to be controlled in order to advance our evolution, and was pushed ferociously by political groups and individuals around the world. Even countries like Canada and the United States got caught up in the movement: individuals like Tommy Douglas and Alexander Graham Bell advocated for restrictions on who could marry and have children, and certain provinces and states forcibly sterilized people who were considered unfit to breed. The movement ultimately led to the rationalization of murders in the name of cleansing in Nazi Germany. It took the combination of the end of World War II and a better understanding of genetics to bring about the end of the vast majority of eugenics-based programs, but not before massive personal and societal loss.

On the other hand, Soviet Russia took control of science and dismissed genetics entirely as a "bourgeois pseudoscience", instead adopting the practices of Lysenkoism for agricultural development. This explicit adoption of absolutely useless techniques held back Russian understanding of genetics for decades, and resulted in the firing, imprisonment, and execution of legitimate Russian scientists.

But misunderstanding of science still harms us daily. Despite court findings of fraud, millions of dollars' worth of "ion bracelets" are still sold every year on the strength of scientific-sounding claims. Fictionalized versions of polygraph tests have led to the belief that they're foolproof - and private polygraph examiners have likely been responsible for propagating actual lies - even though psychologists have determined that they really aren't any better than guessing. Some people turn to homeopathy at the expense of medically proven treatments, even though it has been consistently debunked by doctors.

Fortunately it's not all bad. The recent recognition of Canadian science journalists like Jay Ingram and Bob McDonald with the Order of Canada was an important step in supporting a field that is undeniably important for keeping people informed, and televised outreach through the Discovery Channel and shows like Mythbusters has done a lot to increase interest in science topics and critical thinking. Hopefully as science continues to advance we will see fewer opportunities for anyone to take advantage of the public's misunderstanding of it, and more opportunities to get people interested and engaged. We definitely need it.

This was also posted on TheWandererOnline with graphics by Michelle Weremczuk. Check it out!

Wednesday, August 8, 2012

Beat the Odds

If you're going to operate a profitable betting ring, there are two things that are important to know. An obvious one is the odds of a particular event happening: a good casino wouldn't make any money if it didn't know the odds associated with blackjack or roulette, and adjusting its payouts accordingly is what earns it a profit.

What's also important to know is the number of people who are likely to bet on a particular outcome. This is less important perhaps on a roulette table, but potentially very important when it comes to something I've been investigating a lot recently: sports betting. It makes a certain amount of sense that sports betting companies (such as SportSelect) track their ticket sales to consumers, and in fact the companies can profit just as much by adjusting their payout according to ticket sales as they can according to individual game odds. As it is likely substantially easier for a company to track their sales than it is to predict the future, this is quite likely a critical factor in their payout odds.

The profit made by a company in sports betting can be visualized like this:

Basically, the expected profit can be determined from the chance of each event occurring multiplied by the cost to the company if it does. An event (such as a home victory) with a chance p of occurring will cost the company the payout odds x multiplied by the fraction of tickets sold that chose that event (m). By considering both possible events at once, we get an 'expected' average profit that takes into account all possibilities - by tweaking the payout factors, a company can assure itself of continually profiting from sports betting.
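Since the original figure isn't reproduced here, the following is a minimal Python sketch of that calculation rather than the exact formula from the image. It assumes a game with only two outcomes (so q = 1 - p and n = 1 - m), that m and n are fractions of total ticket sales, and that profit is expressed per $1 of tickets sold. The function name `expected_profit` is my own.

```python
def expected_profit(p, m, x, y):
    """Expected company profit per $1 of ticket sales for a two-outcome game.

    p: chance the first outcome occurs (the other has chance q = 1 - p)
    m: fraction of tickets sold on the first outcome (the other has n = 1 - m)
    x, y: payout per $1 ticket on the first and second outcome
    """
    q = 1.0 - p
    n = 1.0 - m
    # Each outcome costs the company (chance) * (share of tickets) * (payout).
    expected_payout = p * m * x + q * n * y
    return 1.0 - expected_payout


# Evenly matched game, evenly split sales, payouts of 1.6 and 1.8.
print(round(expected_profit(0.5, 0.5, 1.6, 1.8), 3))  # 0.15
```

In this simplified picture, an evenly matched game with evenly split sales leaves the company with 15 cents of every dollar sold.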

Presumably SportSelect knows the fraction of tickets sold (m and n) very well. They must. It would be negligence on the part of the company not to know what they're selling, how much they're selling, and who they're selling to. Presumably also they have some sort of model that allows them to predict game odds (p and q) with a reasonable amount of accuracy. As they control the values for payout (x and y), they can then have a good sense of control over their profit.

Both p and m relate to the specific event being examined. The value of p is most likely tied intrinsically to the relative strength of the teams, and m accounts for whatever factors lead people to purchase lottery tickets betting on certain teams (perceived skill, popularity, etc.). Together, the factor pm more or less accounts for the expected amount of money SportSelect will lose should that event come to pass.

A quick look at the payout odds that SportSelect offers shows a strong trend - for the majority of their payout options, the two payouts average about 1.7, in combinations such as 1.6 and 1.8 or 1.5 and 1.95. Some more unusual payout combinations are 1.4 and 2.15, 1.3 and 2.45, and 1.25 and 2.65, and these tend to average a little higher, but still within the range of 1.7-1.95. There are very few combinations outside of that, so for the sake of this piece I'll take only these into account.
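As a quick check on those averages, here is a small Python snippet that works them out for the five payout pairs listed above. The variable names are my own.

```python
# The five payout pairs quoted above, as (lower payout, higher payout).
PAYOUT_PAIRS = [(1.6, 1.8), (1.5, 1.95), (1.4, 2.15), (1.3, 2.45), (1.25, 2.65)]

# Average of the two payouts in each pair, rounded to three decimals.
averages = [round((low + high) / 2, 3) for low, high in PAYOUT_PAIRS]

for (low, high), avg in zip(PAYOUT_PAIRS, averages):
    print(f"{low} and {high} average {avg}")
```

The averages come out to 1.7, 1.725, 1.775, 1.875, and 1.95, which is the 1.7-1.95 range mentioned above.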

In order to try to investigate just how SportSelect comes up with their odds, I set up a series of random p and m values to look at some trends. This is what I got at first:
OH MY GOD THAT'S UGLY. Whew. Jeeze. What I have here is the product of p and m on the x-axis, and then the expected profit percentage based on any of the five payout combinations as explained above (the legend lists the average of the two payout values for each of the five sets) after 1000 data points for each. This is really really ugly though.
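For anyone who wants to try something similar, here is a rough Python sketch of this kind of simulation. It is not the code behind the graph. It assumes a two-outcome game, p and m drawn uniformly between 0 and 1, profit measured per $1 of sales, and that the higher payout in each pair goes to whichever outcome has the smaller chance-times-sales product. The names `expected_profit` and `simulate` are hypothetical.

```python
import random

# The five payout pairs considered in this piece, as (lower, higher).
PAYOUT_PAIRS = [(1.6, 1.8), (1.5, 1.95), (1.4, 2.15), (1.3, 2.45), (1.25, 2.65)]


def expected_profit(p, m, low, high):
    """Expected profit per $1 of sales for one payout pair."""
    pm = p * m
    qn = (1.0 - p) * (1.0 - m)
    # Assumption: the higher payout is attached to the outcome with the
    # smaller chance-times-sales product, which minimizes expected payout.
    smaller, larger = sorted((pm, qn))
    return 1.0 - (smaller * high + larger * low)


def simulate(num_points=1000, seed=0):
    """Return a list of (pm, [profit for each payout pair]) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_points):
        p, m = rng.random(), rng.random()
        profits = [expected_profit(p, m, low, high) for low, high in PAYOUT_PAIRS]
        rows.append((p * m, profits))
    return rows


rows = simulate()
print(len(rows), "points generated")
```

Plotting each row's pm value against its five profit values should give a scatter in the same spirit as the one described here, though not necessarily the same shape.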

Part of the reason it's ugly is the relationship between pm and qn - the complementary factor for the alternate event. For extreme values of either p or m, we tend to get qn values that are tremendously different from pm, which gives the ugly values shown above. The cluster where the majority of points lie appears to form a series of curves; this is a cleaner version of that graph:

Much better. This is actually rather interesting, if I may say so myself. What we get is that different ranges of the pm factor call for different payout values (x and y from before) to give the company its largest profit. So for events with very large differences either in who people bet on (m) or in who is actually likely to win (p), larger payout odds are more likely to result in profits. Smart, eh? In fact, it's quite easy for SportSelect to guarantee a 14-16% profit by estimating (with not necessarily that much accuracy, even) the pm factor. As they ought to know the sales figures (m), they only need to be reasonably accurate on the actual game odds in order to make a killing.
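To make that idea concrete, here is a minimal Python sketch that picks the most profitable payout pair for a given p and m. It rests on my own assumptions rather than anything SportSelect has published: a two-outcome game, profit per $1 of sales, and the higher payout always placed on the outcome with the smaller chance-times-sales product. The function name `best_pair` is hypothetical.

```python
# The five payout pairs considered in this piece, as (lower, higher).
PAYOUT_PAIRS = [(1.6, 1.8), (1.5, 1.95), (1.4, 2.15), (1.3, 2.45), (1.25, 2.65)]


def best_pair(p, m):
    """Return (pair, profit) for the pair with the highest expected profit."""
    pm = p * m
    qn = (1.0 - p) * (1.0 - m)
    # Assumption: the higher payout goes to the outcome with the smaller product.
    smaller, larger = sorted((pm, qn))
    best = None
    for low, high in PAYOUT_PAIRS:
        profit = 1.0 - (smaller * high + larger * low)
        if best is None or profit > best[1]:
            best = ((low, high), profit)
    return best


for p, m in [(0.5, 0.5), (0.5, 0.32), (0.8, 0.8)]:
    pair, profit = best_pair(p, m)
    print(f"p={p}, m={m}: best pair {pair}, expected profit {profit:.3f}")
```

Under these assumptions an evenly matched, evenly sold game favours the 1.6 and 1.8 pair, while a lopsided game (p and m both 0.8) favours the 1.25 and 2.65 pair.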

Assuming that they do in fact take ticket sales into account, an opportunity to perhaps profit does then exist. Take a look at these tables:

This first one is just a representation of the graph above - each coloured zone represents a range where a new payout scheme becomes the most profitable, measured against values for p and m (the middle values are pm). If we change it to represent what those colours actually mean, we get:
In general this follows the trend mentioned before - for games where either the sales or the odds are anticipated to be close, we have lower payouts (an average of 1.7 is typically odds such as 1.6 and 1.8, remember), but with games with larger disparities we have higher payouts offered (an average payout of 1.95 would feature 1.25 and 2.65 for different teams, respectively).

If we look at the pm value with an m of 0.32 and a p of 0.5, for instance, we notice something interesting. The pm value is 0.16, so the most profitable payout distribution would be one with an average of 1.775, such as 1.4 for one team and 2.15 for the other. However, if we were really sure that the odds were truly 50-50, then betting on the team with the 2.15 payout would be profitable - 50% of the time on a $1 ticket we'd get $0, and 50% of the time we'd get $2.15, for an expected return of about $1.08. Quickly tabulating these results gives the last graph:

Here the green values are where there's money to be made, the gold values break even, and the red values lose money on average. These are the values of the absolute best bet that can be made for each combination.
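The table itself isn't reproduced here, but the idea behind it can be sketched in Python. This is an illustration under my own assumptions, not the spreadsheet used for the table: a two-outcome game, the company choosing whichever of the five payout pairs maximizes its expected profit, the higher payout placed on the outcome with the smaller chance-times-sales product, and a hypothetical 1-cent tolerance for what counts as breaking even. All function names are my own.

```python
# The five payout pairs considered in this piece, as (lower, higher).
PAYOUT_PAIRS = [(1.6, 1.8), (1.5, 1.95), (1.4, 2.15), (1.3, 2.45), (1.25, 2.65)]


def company_payouts(p, m):
    """Return (x, y): payouts on outcomes 1 and 2 under the most profitable pair."""
    pm = p * m
    qn = (1.0 - p) * (1.0 - m)
    best = None
    for low, high in PAYOUT_PAIRS:
        # Assumption: higher payout on the outcome with the smaller product.
        x, y = (high, low) if pm <= qn else (low, high)
        profit = 1.0 - (pm * x + qn * y)
        if best is None or profit > best[0]:
            best = (profit, x, y)
    return best[1], best[2]


def best_bet_return(p, m):
    """Expected return on a $1 ticket for the better of the two possible bets."""
    x, y = company_payouts(p, m)
    return max(p * x, (1.0 - p) * y)


def classify(p, m, tolerance=0.01):
    """Label a (p, m) cell: 'green' (profit), 'gold' (break even), 'red' (loss)."""
    expected = best_bet_return(p, m)
    if expected > 1.0 + tolerance:
        return "green"
    if expected < 1.0 - tolerance:
        return "red"
    return "gold"


print(classify(0.5, 0.32), round(best_bet_return(0.5, 0.32), 3))
print(classify(0.5, 0.5), round(best_bet_return(0.5, 0.5), 3))
```

With p = 0.5 and m = 0.32 the best bet returns about $1.08 per dollar, matching the example above, while an evenly sold 50-50 game returns about $0.90 and lands in the red.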

So what does this mean? It means that in certain cases, it could be possible to beat the SportSelect betting system. This would require a very high degree of certainty in the actual odds of a given team winning a game (a model at least as accurate as the one they use), and it would rely on them trying to capitalize on a fairly significant majority of the public purchasing tickets for one team over another (at least a 2:1 ratio would be required).

Still, though, numerically it's possible if you have a good enough model and are patient enough. Good luck!