Testing the fairness of a coin.

One of the examples most often used when introducing probability is the toss of a coin: we toss a coin, say, 1000 times, and then calculate the relative frequency of one outcome, such as heads. We define a fair coin as one with \Pr(Heads)=\Pr(Tails)=0.5. We then often perform certain procedures to “test” whether the coin is fair; technically, we compute the probability, under the assumed model (“the fair coin”), of a proportion at least as extreme as the one observed.

However, consider a coin that behaves as follows: on odd-numbered throws it comes up heads, and on even-numbered throws it comes up tails. This coin has a long-run frequency of heads of exactly one-half, so we find no reason to reject the null hypothesis of a “fair coin”.
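To make this concrete, here is a minimal sketch of an exact two-sided binomial test applied to 1000 throws of the alternating coin. The function name `two_sided_binomial_p` and the choice of the "sum all outcomes no more likely than the observed one" definition of a two-sided p-value are assumptions made for illustration; other definitions exist.

```python
from math import comb

def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k heads in n throws under Pr(Heads) = 0.5.

    Sums the probabilities of all head counts that are no more likely
    than the observed count k (one common two-sided convention).
    """
    observed = comb(n, k)
    # Under p = 0.5 every sequence has probability 2**-n,
    # so comparing and summing binomial coefficients is enough.
    as_extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return as_extreme / 2**n

# Hypothetical deterministic coin: heads on odd throws, tails on even throws.
throws = ["H" if t % 2 == 1 else "T" for t in range(1, 1001)]
heads = throws.count("H")
p_value = two_sided_binomial_p(heads, len(throws))
print(heads, p_value)  # 500 heads, p-value 1.0
```

With exactly 500 heads in 1000 throws, no outcome is more compatible with the fair-coin model, so the test gives the largest possible p-value.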

Clearly, though, this coin doesn’t produce random results. While it is fair with regard to the proportion of heads and tails, it displays strong (negative) autocorrelation: given the result of throw n, we can predict throw n+1 with great confidence (here, in fact, with certainty). We should therefore take one more aspect into consideration when determining whether or not a coin is fair: independence of the throws.
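The dependence can be measured directly. The sketch below codes heads as 1 and tails as 0 and computes the usual sample lag-1 autocorrelation; the helper name `lag1_autocorrelation` and the 0/1 coding are assumptions for illustration, not part of any particular testing procedure.

```python
def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a numeric sequence."""
    n = len(xs)
    mean = sum(xs) / n
    # Products of neighbouring deviations from the mean, over the total variation.
    num = sum((xs[t] - mean) * (xs[t + 1] - mean) for t in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

# Hypothetical alternating coin: heads (1) on odd throws, tails (0) on even throws.
alternating = [1 if t % 2 == 1 else 0 for t in range(1, 1001)]
r1 = lag1_autocorrelation(alternating)

# Rule "the next throw is the opposite of the current one": count correct predictions.
hits = sum(alternating[t + 1] == 1 - alternating[t] for t in range(len(alternating) - 1))

print(r1)    # close to -1
print(hits)  # 999 correct predictions out of 999
```

For independent throws of a fair coin the same statistic would fluctuate around zero, and the prediction rule would be right only about half the time.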


Now, consider series of 8 throws (I guess a mathematician would call them 8-tuples). What about finding the series “HHHTHT” within these 8-tuples? There are 2^8 = 256 possible 8-tuples, and those containing the series take one of the following forms, where x stands for either H or T:

  1. xxHHHTHT
  2. xHHHTHTx
  3. HHHTHTxx

Each x can be either H or T, so there are 2^2 = 4 8-tuples of each type, and since no 8-tuple can contain the series at two different positions, 12 8-tuples contain it in total. With independent throws of a fair coin, we should therefore expect about 12/256 ≈ 4.7% of the 8-tuples to contain this series. However, it is possible to construct a collection of observed 8-tuples that makes the coin look fair by the standards above even though it contains this series far more often than one would expect, which leads me to my point:
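The count can be checked by brute force. This short sketch enumerates all 8-tuples and counts those containing the series; the variable names are illustrative only.

```python
from itertools import product

PATTERN = "HHHTHT"

# All 2**8 possible sequences of 8 throws, written as strings such as "HTHHHTHT".
tuples = ["".join(t) for t in product("HT", repeat=8)]

# Keep the 8-tuples in which the pattern appears as a contiguous run.
containing = [s for s in tuples if PATTERN in s]

share = len(containing) / len(tuples)
print(len(tuples), len(containing), share)  # 256 12 0.046875
```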

We often call statistics the science of uncertainty (or randomness), but randomness is not a concept unique to statistics. Consider the information-theoretic approach to uncertainty: the two rules above for a fair coin (equal proportions and independent throws) are equivalent to maximizing the Shannon entropy over the parameter space (p,\rho), with p the probability of heads and \rho the autocorrelation of the throws. When moving ahead in statistics, we should always ponder what we really mean by “the null model”.
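One way to illustrate the entropy claim is to assume a specific model that the text does not spell out: a stationary two-state Markov chain with marginal probability of heads p and lag-1 correlation \rho. Under that assumption, the entropy per throw can be computed and maximized over a grid. The function names and the grid are hypothetical choices for this sketch.

```python
from math import log2

def binary_entropy(q):
    """Shannon entropy (in bits) of a single event with probability q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * log2(q) - (1 - q) * log2(1 - q)

def entropy_rate(p, rho):
    """Entropy per throw of the ASSUMED two-state Markov chain.

    Returns None when (p, rho) does not give valid transition probabilities.
    """
    q_hh = p + rho * (1 - p)  # Pr(heads | previous throw was heads)
    q_ht = p * (1 - rho)      # Pr(heads | previous throw was tails)
    if not (0 <= q_hh <= 1 and 0 <= q_ht <= 1):
        return None
    return p * binary_entropy(q_hh) + (1 - p) * binary_entropy(q_ht)

# Coarse grid: p in {0.05, ..., 0.95}, rho in {-0.9, ..., 0.9}.
grid = [(i / 20, j / 10) for i in range(1, 20) for j in range(-9, 10)]
valid = [(p, rho) for p, rho in grid if entropy_rate(p, rho) is not None]
best = max(valid, key=lambda pr: entropy_rate(*pr))

print(best, entropy_rate(*best))  # maximum at p = 0.5, rho = 0.0 with 1 bit per throw
print(entropy_rate(0.5, -1.0))    # the alternating coin: 0 bits per throw
```

In this model the alternating coin has p = 0.5 but \rho = -1, so it passes the proportion rule while carrying no uncertainty at all.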