Meta blog

“Meta blogging” translates to something like “blogging about blogging”. I don't like it. The reason is that, when I read a blog, I'm not interested in the blog itself; I'm interested in the topic of the blog. I don't care what the author has been up to recently, and I'm not interested in whatever adventures he pulls himself through in order to splash his letters onto my screen. I suspect that my readers share that indifference to information relating to my persona, and thus I'd like to keep any such uninteresting details to a minimum (I'm not doing a particularly good job here).

Objective meta-meta blog

Many meta blog entries are concerned with the responsibility of the author and the fact that the author hasn't managed to actually author anything for a while. The readers are supposed to have suffered great distress in the absence of quality content to digest off the pages of the blog in question, and therefore it lies within the author's responsibility to feed their hunger for information with a steady flow of well-thought-out entries.

Subjective meta blog

My readers are a sparse collection of nerds, friends, acquaintances, my brother and hopefully a few others (please say “hi” in the comments). I believe that my responsibility as a writer is proportional to the size of my reader base. Because that reader base is small, I believe my responsibility is limited to not lying and not deliberately spreading misinformation on these pages. I'm sure I don't have to produce a certain quantity of text in order to meet the expectations of my readers. And I haven't produced much of anything here lately. In the future, the updating of this blog will probably be just like my reader base – sparse. Maybe I'll post every two weeks or something along those lines.

After all, I write for my own pleasure. This blog fulfills a need to discuss some topics that I have no other medium for, even if it means I'll be discussing them with myself. And it helps me practice my English and my language skills in general.

So, hopefully you'll hear from me in the future and I'll be able to sense your presence from the slow ticking of the stat counter or from the precious few comments to my entries. Until then, have a nice day.

Oh, and I apologize for this entry.

Conditional Probability

In my last entry on probability theory, I promised to have a more detailed look at conditional probability. We need this for solving the disease test problem and for solving the Monty Hall problem mathematically.

P(A | B) reads "the probability of A given B", and this is referred to as conditional probability. The formula for calculating this probability is

P(A | B) = P(A AND B) / P(B)

Why?

We're trying to figure out the probability that the event A is also true, given that B is true. Sometimes A may be true even though B is not, but we're not interested in those instances.

Independent events

Now, sometimes the probability of A is independent of B. For example, suppose we flip two coins [A, B], and each event A and B is true if the corresponding coin comes up heads. If B is true (that is, coin B comes up heads), the probability of A is still 50% (1). This means that

P(A | B) = P(A)

where P(A) is the a-priori probability; the conditional probability is unchanged by the information we gained from flipping coin B.
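This can be checked by brute-force enumeration. Here's a minimal Python sketch (the `prob` helper and the event encodings are my own illustration, not from any particular library):

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of flipping two fair coins.
outcomes = list(product("HT", repeat=2))  # (coin A, coin B)

def prob(event):
    """Probability of an event (a predicate) over equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == "H"  # coin A comes up heads
B = lambda o: o[1] == "H"  # coin B comes up heads

# P(A | B) = P(A AND B) / P(B) equals the a-priori P(A) = 1/2.
print(prob(lambda o: A(o) and B(o)) / prob(B) == prob(A))  # True
```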

Dependent events

But what if

P(A | B) != P(A)

("!=" reads "does not equal".) In this case, A is dependent on B. What this means is that, as we gain information about B, the probability of A changes from the a-priori probability. We then need to consider all the cases where B is true:

P(A AND B) + P(NOT A AND B) = P(B)

Those are all of the instances where B is true. So we know that B is true. Out of all the instances where B is true [P(B)], some of them are instances where A is also true [P(A AND B)]:

P(A | B) = P(A AND B) / P(B)
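As a quick sanity check of the formula, here's a Python sketch over a different toy sample space, a single die roll (the two events are my own illustrative picks):

```python
from fractions import Fraction

# One fair die roll; A = "the roll is even", B = "the roll is greater than 3".
outcomes = [1, 2, 3, 4, 5, 6]

def prob(event):
    """Probability of an event (a predicate) over equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o % 2 == 0
B = lambda o: o > 3

# Of the three outcomes where B is true (4, 5, 6), two are even.
p_a_given_b = prob(lambda o: A(o) and B(o)) / prob(B)
print(p_a_given_b)  # 2/3
```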

An example

Let's say we have drawn three cards from a deck: an ace, a king and a queen [A, K, Q]. We shuffle those three cards and draw two of them, trying to draw the ace. Let's say that the first card is not an A. What's the probability that the second one is?

First, let's define two events:

Card1 is the event that the first card is an A
Card2 is the event that the second card is an A

P(Card1) = 1/3
P(NOT Card1) = 2/3
P(Card2 AND NOT Card1) = 1/3

The last probability is easy to see from an a-priori standpoint: the probability that any one of the two drawn cards is an A is 1/3. So the probability that one particular card is an A while the other one is not is also 1/3.

P(Card2 | NOT Card1) = P(Card2 AND NOT Card1) / P(NOT Card1)
P(Card2 | NOT Card1) = (1/3) / (2/3) = 1/2
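The same numbers fall out of a brute-force enumeration. A Python sketch (the naming is mine):

```python
from fractions import Fraction
from itertools import permutations

# All ordered two-card draws from [A, K, Q]: six equally likely cases.
draws = list(permutations(["A", "K", "Q"], 2))

not_card1 = [d for d in draws if d[0] != "A"]  # first card is not an ace
both = [d for d in not_card1 if d[1] == "A"]   # ...and the second one is

p = Fraction(len(both), len(not_card1))
print(p)  # 1/2
```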

More intuitively, this can be illustrated as follows:

            Card 2
            A     K     Q
Card 1  A   0    1/6   1/6
        K  1/6    0    1/6
        Q  1/6   1/6    0

As we can see in this matrix, there are six possible combinations of two cards. [AA, KK, QQ] are not possible, since there's only one card of each rank. Each possible combination has a 1/6 probability of occurring. [AK, AQ] are the possible combinations where the first card is an A. In the matrix, we can find the probabilities stated earlier:

P(Card1) = 2 * 1/6 = 1/3
P(NOT Card1) = 4 * 1/6 = 2/3
P(Card2 AND NOT Card1) = 2 * 1/6 = 1/3

Asking what P(Card2 | NOT Card1) is, is the same as asking "how big a fraction of the times that we don't pick an A as our first card do we pick an A as our second card?". We can easily see in the matrix that there are 4 cases (the four possible combinations where the first card is a K or a Q) where we don't pick an A as our first card. In two of those cases, our second card is an A. 2/4 = 1/2. But also, (2 * 1/6) / (4 * 1/6) = 1/2.

(2 * 1/6) / (4 * 1/6) = P(Card2 AND NOT Card1) / P(NOT Card1)

Stated in words, in 2 out of 4 cases when the first card is not an A (out of a total of 6 possible cases, which includes draws where the first card is an A), the second card is an A. So, knowing that the first card is not an A, we can narrow the situation down to those 4 cases, giving us a probability of 2/4 = 1/2.

I hope this entry has helped your understanding of how conditional probability works. It's not very formal, and it's not very extensive, but hopefully it's quite intuitive and at least free of any big gaps in its logic. However, it's late now, and I kind of just threw this one out there, because I haven't posted anything for a while.

__________
Notes:
(1) If you think otherwise, you've fallen for the gambler's fallacy, which we'll have a closer look at in a future post.

Soccer Penalty Kicks Article Flawed?

In my last post on this subject, I made a quick reference to an article on a mathematical examination of soccer penalty kicks. In that article, Tim Harford gives a brief survey of the findings of a paper by Ignacio Palacios-Huerta of Brown University. The paper (pdf) is quite an interesting read, and I really recommend it to anyone with some knowledge of statistics and game theory. However, I do believe I've detected a flaw in it. Before proceeding any further, I should include all the standard disclaimers, including, but not limited to, the fact that I'm neither an authority nor an expert in this area, and that there is a chance that I've misunderstood things. I have all due respect for Mr Palacios-Huerta as a scientist and for Mr Harford as a writer, and I'm merely a layman myself.

Anyways. After writing my first entry on the article by Tim Harford, I got to thinking. Quoting from Tim Harford's article:
Professionals such as the French superstar Zinédine Zidane and Italy's goalkeeper Gianluigi Buffon are apparently superb economists: Their strategies are absolutely unpredictable, and, as the theory demands, they are equally successful no matter what they do, indicating that they have found the perfect balance among the different options. These geniuses do not just think with their feet.
At first, this seemed to be a good indication that Zidane and Buffon are indeed playing optimal strategies. But what hit me after writing my first entry is that their playing optimal strategies doesn't make them indifferent between their strategy choices. That is, their playing optimally doesn't make them succeed equally often no matter what they do. It does, however, make their opponents indifferent between their strategy choices.

The optimal strategy is about making your opponent indifferent between his strategy choices. Recall, from my previous post on this subject, how the indifference equations for each player included the strategy choices for the other player, but not his own strategy choices. This relationship works two ways: Your playing optimally doesn't make you indifferent, and your indifference is not an indication that you're playing optimally.

So the Harford article is wrong. The fact that Zinédine Zidane and Gianluigi Buffon seem to be indifferent does not indicate that they play optimal strategies. It does, however, indicate that their opponents, on an aggregate level, are playing optimally.

Now, is this Tim Harford's or Ignacio Palacios-Huerta's mistake? In order to find out, I read the original paper by Mr Palacios-Huerta.

In the paper, Palacios-Huerta starts by formulating the hypothesis that professional players are indeed playing a minimax strategy. In order to test this hypothesis, he examines a sample of 1417 penalty kicks. I have no objections to his examination of all the players on an aggregate level. However, when testing the hypothesis for individual players, he seems to be looking at each individual player's strategy choices and their corresponding outcomes. Using Pearson statistics and p-values based on those figures, the hypothesis is rejected for five players.

On an aggregate level, we can look at the overall figures for both sides of the game. According to the hypothesis, both goalies and kickers should have equal success rates, no matter their choices. This can be tested, and the hypothesis rejected, with the tools used by Palacios-Huerta. But when testing the hypothesis for an individual player, we should look at that player's aggregated opponents' success rates for their strategy choices, which, it seems to me, is not what he's done.

So what hypothesis should we reject when the individual figures used by Palacios-Huerta don't give a good enough match with the hypothesis? Well, not the one that that particular player is playing minimax, but rather the one that his opponents, on an aggregate level, play minimax. This is not, in itself, an uninteresting hypothesis to examine, but, as far as I can see, it's not the one intended by Palacios-Huerta.

Unfortunately, the tables provided in the paper don't allow for the data to be rearranged so that we can perform this test on our own. There is no information on the strategy choices of the opponents of each individual player and their corresponding outcomes, so we can't examine the hypothesis that a specific individual player plays optimally, without accessing the underlying data.

So, what do we know about Zinédine Zidane and Gianluigi Buffon? Not much, but it seems they've been playing against superb economists.
__________
Notes:
Again, let me remind you of the disclaimers. I'm really a layman, and I may very well be wrong, either altogether or just in my interpretation of the paper.

External links in this post:
World Cup Game Theory - What economics tells us about penalty kicks by Tim Harford. The quoted article in Slate Magazine.
Ignacio Palacios-Huerta at the Brown University website
Professionals Play Minimax by Ignacio Palacios-Huerta of Brown University (pdf format)

Other resources:
Tim Harford - The Undercover Economist

The Monty Hall Problem, Part 4

Hopefully, we all agree that we should switch when faced with the problem in the basic formulation of the Monty Hall problem. However, in the alternative formulation given in my first entry, there's one major difference. In the original game, the host was obliged to reveal a second door after watching you pick one. In my version of the game, I had made no such commitment. So, why would I give you a chance to change your mind? Possibly out of generosity, sure, but most probably because I knew you had made the right choice and wanted you to switch to an empty cup.

Put in game-theory terms, switching is a dominated strategy. Your strategy choices are to switch when given the opportunity or to never switch (switch / stay). I have more strategy choices than you do. This is a full payoff matrix of the game for all possible strategy choices, with your choices represented as columns and my choices as rows. The outcome values are the probabilities of you winning the bill.

           Switch   Stay
No-No       1/3     1/3
No-Yes       1      1/3
Yes-No       0      1/3
Yes-Yes     2/3     1/3


My strategy choice "Yes-No", for example, means that I offer you an opportunity to switch if you choose the right cup initially, but I don't offer you that opportunity if you choose the wrong one. So the first Yes or No refers to whether I offer you that opportunity when you choose the right cup, and the second one to the case when you choose the wrong one.
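The full payoff matrix above can be reproduced with a short computation. Here's a Python sketch; the boolean encoding of my offer policy as an (offer-when-right, offer-when-wrong) pair follows the naming above, but the function itself is just my own illustration:

```python
from fractions import Fraction

def win_prob(offer_when_right, offer_when_wrong, player_switches):
    """Probability that you win the bill, given my offer policy and
    whether you switch whenever given the opportunity."""
    p_right, p_wrong = Fraction(1, 3), Fraction(2, 3)
    # If your initial pick is right, staying wins and switching loses.
    win_if_right = 0 if (offer_when_right and player_switches) else 1
    # If your initial pick is wrong, staying loses and switching
    # (to the remaining unrevealed cup) wins.
    win_if_wrong = 1 if (offer_when_wrong and player_switches) else 0
    return p_right * win_if_right + p_wrong * win_if_wrong

# One row per host strategy (No-No, No-Yes, Yes-No, Yes-Yes),
# one column per player strategy (Switch, Stay).
for host in [(False, False), (False, True), (True, False), (True, True)]:
    print(host, [str(win_prob(*host, s)) for s in (True, False)])
```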

Notice that all of my strategy choices but "Yes-No" (offering the opportunity to switch only when you've picked the right cup) are dominated. This means that they can never lead to better results, only worse, depending on your strategy choice. So there is no reason for me to choose any of those strategies. Thus removing those strategy choices, we get a much simpler payoff matrix:

           Switch   Stay
Yes-No       0      1/3


It should now be obvious that switching is a dominated strategy. So in a game-theory sense, switching is a bad strategy. However, game theory isn't everything. Maybe you have a "read" on me, making you believe that I want you to have the bill. Maybe you think that I intended to always give you the switching opportunity as in the original Monty Hall problem. So there may be reasons to deviate from game-theory optimal play. But lacking such guidance, you're probably better off resorting to game theory, in this case guaranteeing you a 1/3 chance to win the prize.

Quiz: The Probability of Having a Disease

A new super-resistant virus has emerged in Farawayistan. The Ministry of Health estimates that 1 out of 1000 people are infected. After immense efforts, a very accurate test has been developed. If the test subject is infected, the test will come out positive 99.9% of the time. But if the subject is not infected, 1 time in 1000 the test will yield a false positive result.

A random person is tested, and the test comes out positive. What's the probability that he's infected?

Please, answer in the comments.

The Maths of Soccer Penalty Kicks

In a very interesting article, titled World Cup Game Theory - what economics tells us about penalty kicks, Financial Times columnist Tim Harford gives a simple and intuitive introduction to how to apply game theory to soccer penalty kicks. I really recommend reading his article. (After writing this entry, I detected a flaw in this article. More on that here.)

In this entry, I'll examine the game theory of penalty kicks in some more detail, and we'll arrive at an actual formula for each player of the game. Due to the mathematical nature of this examination, there will be some math that may look dense and daunting, but it really looks more complicated than it actually is. In an attempt to make it easier to follow, I've used some color coding. Maybe that serves only to make it look messier. Please tell me what you think in the comments.

Now, let's get started

A penalty kick can be reduced into a simple grid game. There are two players: The penalty kicker, who has the choice of which direction to shoot, and the goalie, who has the choice of which way to throw himself.

As Tim Harford points out, there is not enough time for the goalie to see which way the ball is going and then choose to go in that direction to intercept it. He must guess, risking going in the completely wrong direction. So the goalie's choice of direction is independent of the shooter's choice (1).

Most shooters have a stronger and a weaker side, and should tend to favor their stronger side. But if they always choose to aim at the stronger side, the goalie can exploit this by always going that direction. The shooter can then counter-exploit this by shooting at the other side, where he will most certainly score a goal even though it's his weaker side, since the goalie will be going the other way. The goalie then reacts to this, and we have a never-ending loop of exploitations and counter-exploitations. To solve this problem, we need to find a game-theory optimal solution that offers an equilibrium to the game.

Game-theory optimal play

If both players play game-theory optimally, neither one can better his chances by altering his strategy. If he could, his strategy wouldn't have been optimal. In the same fashion, the optimal strategy is unexploitable, since the opponent can't gain an additional edge by deviating from his own optimal strategy. Remember how the shooter could elect to always shoot to his weaker side to exploit a strategy where the goalie always goes to the shooter's stronger side? That means the goalie's strategy wasn't optimal.

Now, if neither player can gain an edge by altering their strategy, then, by definition, they're indifferent between their choices. If they weren't indifferent, one choice would be better than the other, and the player would gain an edge by opting for that choice. So, in order to find the game-theory optimal strategies, we should look for indifference points.

Strategy values

This game can easily be summarized into a grid as follows:

        GS     GW
SS     50%    95%
SW     80%    30%

where the rows represent the shooter's strategy choices and the columns represent the goalie's strategy choices. Each cell corresponds to a combination of the choices of both players, and the figure is the corresponding chance of a goal. GS means that the goalie throws himself in the shooter's strong direction, and GW that he opts for the shooter's weak side. SS and SW denote the corresponding shooting strategies.

Now, those figures are just made up for the purpose of illustration. I don't claim that they're realistic in any way. Notice though, that I've taken into account the chance that the shooter misses the goal even if the goalie goes the wrong way, and that there is a bigger risk for this when he shoots to his weaker side. Notice, also, that the chance of a goal is greater if he opts for the stronger side and the goalie goes the right way, than if he opts for the weaker side and the goalie goes that way.

So, we have 2 strategies for each player: SS and SW for the shooter, and GS and GW for the goalie. We have 4 strategy pairs, SSGS, SSGW, SWGS and SWGW, with corresponding outcome values. (2)

Calculating the goalie's strategy

If the shooter chooses the strategy SS S% of the time, he will choose the strategy SW 1-S% of the time (3). Similarly, if the goalie chooses the strategy GS G% of the time, he will choose strategy GW 1-G% of the time. We should now solve for S and G, the strategy variables for the two players.

The expected value for the shooter of strategy SS is:

E(SS) = G * SSGS + (1 - G) * SSGW

In plain English, this means that the shooter will obtain an outcome value of SSGS (50% in our example) the G% of the time that the goalie chooses the strategy GS, and he'll obtain an outcome value of SSGW (95% in our example) the 1-G% of the time that the goalie chooses the strategy GW.

Similarly,

E(SW) = G * SWGS + (1 - G) * SWGW

Now, here comes the magic. The shooter is indifferent when E(SS) = E(SW), as explained above. To find this point, we'll insert the two equations above into that equation:

E(SS) = E(SW)
G * SSGS + (1 - G) * SSGW = G * SWGS + (1 - G) * SWGW
G * SSGS + SSGW - G * SSGW = G * SWGS + SWGW - G * SWGW
G * SSGS + G * SWGW - G * SSGW - G * SWGS = SWGW - SSGW
G * (SSGS + SWGW - SSGW - SWGS) = SWGW - SSGW
G = (SWGW - SSGW) / (SSGS + SWGW - SSGW - SWGS)

And that's the formula for the indifference point for the shooter. When the goalie chooses his actions according to this formula, the shooter will be indifferent to his strategy choices. Plugging our example figures into the formula:

G = (SWGW - SSGW) / (SSGS + SWGW - SSGW - SWGS)
G = (0.3 - 0.95) / (0.5 + 0.3 - 0.95 - 0.8)
G ~ 0.684

The goalie should go for the shooter's stronger side about 68.4% of the time.
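In code, the calculation looks like this (a sketch; the variable names mirror the strategy-pair notation above):

```python
# Outcome values from the example grid: the probability of a goal for each
# strategy pair (shooter strategy, goalie strategy).
SSGS, SSGW, SWGS, SWGW = 0.50, 0.95, 0.80, 0.30

# The goalie's optimal frequency for going to the shooter's strong side.
G = (SWGW - SSGW) / (SSGS + SWGW - SSGW - SWGS)
print(round(G, 3))  # 0.684
```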

Calculating the shooter's strategy

Similarly, we calculate the expected values of the goalie's strategy choices:

E(GS) = S * SSGS + (1 - S) * SWGS = S * SSGS - S * SWGS + SWGS
E(GW) = S * SSGW + (1 - S) * SWGW = S * SSGW - S * SWGW + SWGW

Notice that high outcome values are bad for the goalie, as that means a large probability of a goal. But that doesn't matter to us. We don't care about the exact values of the strategies, as long as they're equal. Now for the indifference equation:

E(GS) = E(GW)
S * SSGS - S * SWGS + SWGS = S * SSGW - S * SWGW + SWGW
S * (SSGS + SWGW - SSGW - SWGS) = SWGW - SWGS
S = (SWGW - SWGS) / (SSGS + SWGW - SSGW - SWGS)

Again, plugging in the example figures:

S = (SWGW - SWGS) / (SSGS + SWGW - SSGW - SWGS)
S = (0.3 - 0.8) / (0.5 + 0.3 - 0.95 - 0.8)
S ~ 0.526

Conclusion

The shooter should opt for his stronger side only 52.6% of the time, while the goalie goes that way 68.4% of the time. Both players are then indifferent between their choices. This means that either one of them could choose any action and still obtain exactly the same result. But biasing towards one choice opens the door for the opponent to exploit that bias, and that's why they should stick with the frequencies prescribed by the solution (4). They can't do better by changing, but they can do worse, if their opponent catches on.
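For the skeptical reader, the indifference claim is easy to verify numerically. A Python sketch, using the example outcome values from the grid above:

```python
# Outcome values from the example grid.
SSGS, SSGW, SWGS, SWGW = 0.50, 0.95, 0.80, 0.30

# Optimal mixing frequencies derived above.
S = (SWGW - SWGS) / (SSGS + SWGW - SSGW - SWGS)
G = (SWGW - SSGW) / (SSGS + SWGW - SSGW - SWGS)

# Expected goal probability for each of the shooter's pure strategies...
E_SS = G * SSGS + (1 - G) * SSGW
E_SW = G * SWGS + (1 - G) * SWGW
# ...and for each of the goalie's pure strategies.
E_GS = S * SSGS + (1 - S) * SWGS
E_GW = S * SSGW + (1 - S) * SWGW

# All four expectations coincide: both players are indifferent.
print(round(E_SS, 3), round(E_SW, 3), round(E_GS, 3), round(E_GW, 3))
```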

Next entry on this topic: Soccer Penalty Kicks Article Flawed?
__________

Notes:
(1) Actually, some goalies have developed an ability to anticipate, based on the shooter's movements prior to the impact with the ball, which direction the shooter will aim. This gives the goalie more time to decide which way to go. On the other hand, some shooters have developed a counter-technique of bluffing about which way they're going to aim, so this boils down to a game of its own. For simplicity, we'll assume that there are no prior indications as to which way the shooter will aim.
(2) Here, we consider the probability of a goal, for each strategy pair, an outcome. Even if the eventual, actual outcome of the kick is still uncertain, the probability of a goal, given a specific strategy pair, serves as a value of that strategy pair for each player.
(3) As explained in the entry Probability Theory for Dummies.
(4) If the opponent isn't playing the optimal strategy, however, one might consider making exploitative adjustments to the optimal strategy. Bear in mind, though, that this opens the door to counter-exploitation, and should thus only be done if one doesn't expect the opponent to catch on.

External links in this post:
World Cup Game Theory - What economics tells us about penalty kicks by Tim Harford

The Monty Hall Problem, Part 3

Previous entries in this series:
Part 1 | Part 2

In this post, we'll examine the Monty Hall problem intuitively. First, let's start with the problem in its basic formulation:

You're on a game show. Here are the rules of the game: There are three doors, and behind one of them there's a brand new car. Behind the two other doors, there are goats. You'll choose one of the doors, and subsequently, the game show host, who knows where the car is, will open one of the other doors, revealing a goat. Then you are given a choice: Will you stay with your initial choice or will you switch to the other unopened door?

Many initially answer that it doesn't matter. They reason that, as there are only two doors left, one of which veils the car and one of which does not, the chance that any one of them veils the car is 1/2. Sensible as this may seem, it's incorrect. In fact, the probability that the other door veils the car is 2/3, so you should switch.

How can this be? Of course, when we first picked a door, there was a 1/3 chance that we picked the right one. But when the game show host opens another door, doesn't that additional information change the situation? In fact, it doesn't. We don't gain any additional information from his opening a door, since he'd open a door to reveal a goat whether or not we picked the right door to start with. We knew from the beginning that the probability was 1/3, and we haven't gained any information to change things, so we're still looking at a 1/3 probability.

Here's another way of looking at it: 1/3 of the time, you'll pick the right door to start with. Staying will grant you the car, and switching gives you nothing. But 2/3 of the time, you'll pick the wrong door, and the host will open the other door that also veils a goat. In those cases, staying gives you nothing, and switching grants you the car. So switching gives you the car 2/3 of the time.
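If you don't trust the argument, simulation settles it. Here's a minimal Python sketch of the basic game (the encoding of doors as 0-2 is my own):

```python
import random

def play(switch, trials=100_000):
    """Simulate the basic Monty Hall game; returns the fraction of wins."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # The host opens a door that is neither your pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining unopened door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(play(switch=True))   # about 2/3
print(play(switch=False))  # about 1/3
```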

In the next part of this series, we'll have a look at the mathematics of this game, but for now, let's leave it at this. As a side note, if the host didn't know where the car is, but he still opened a door at random and just happened to reveal a goat, then the chances would, indeed, be 1/2 and you'd be indifferent to switching.

In the alternative formulation that I gave in my first entry, I said that the correct decision is to stay with your initial choice. The formulation was slightly different, with only one significant difference. But what? I've had a few friends asking me about this on MSN, and one of them actually came up with the correct answer by himself. Can anyone figure out what that difference is, and why it changes things so drastically? Please comment.