In this post, I’d like to discuss a simple sense in which statistical reasoning refutes itself. My reasoning is almost trivial and certainly familiar to statisticians. But I think that the way I frame it constitutes an argument against a certain kind of philosophical overreach: against an attempt to view statistical reasoning as a branch of logic, rather than an activity that looks more like rhetoric.

To make my argument I’d like to mash up two books which I’ve talked about before on this blog. The first is Ian Hacking’s Logic of Statistical Inference (I wrote here about its wonderful chapter on fiducial inference). The other is an interesting section in St. Augustine’s Confessions, which I discussed here. Ian Hacking’s ambition is, as the title of the book suggests, to describe the basis of a logic of statistical inference. His primary tool is the comparison of the likelihoods of what he calls “chance outcomes” (implicitly he seems to mean aleatoric gambling devices, but he is uncharacteristically imprecise, implying, I think, that we simply know a chance setup when we see it).

St. Augustine, as I discuss in my earlier post, has a worldview stripped of what modern thinkers would call randomness. In St. Augustine’s vision of the world, an unknowable and all-powerful God guides to His own ends the outcome even of aleatoric devices, such as the drawing of lots and, presumably, the flipping of coins. Many people in the modern age do not think like St. Augustine. So it is reasonable to ask what I will call “St. Augustine’s question”: is St. Augustine’s deterministic worldview correct?

# Hacking on St. Augustine’s question

I would like to attempt, using Hacking’s methods, to bring the outcome of a coin flip to bear on St. Augustine’s question. One might reasonably doubt that this is fair to Hacking. However, the first sentence of Hacking’s book articulates the scope of his project:

“The problem of the foundation of statistics is to state a set of principles which entail the validity of all correct statistical inference, and which do not imply that any fallacious inference is valid.”

Hacking’s goal is ambitious (my argument here is essentially that it is over-ambitious). However, to his credit, it is clear: if we can formulate the St. Augustine question as a statistical one about a chance outcome, then we should expect Hacking’s logic to come to the correct epistemic conclusion. Furthermore, Hacking himself states (when arguing against Neyman-Pearson tests in Chapter 7) that “the best way to refute a principle [is] not general metaphysics but concrete example.”

Finally, lest it seem too esoteric to argue with St. Augustine, or lest this example seem too contrived to be meaningful, at the end of this post I will draw connections between my argument and some shortcomings of likelihood-based model comparison that are well known to statisticians but largely ignored by Hacking’s book.

# Hacking’s law of likelihood

Hacking’s principle of inference is embodied in his “law of likelihood,” which is introduced in Chapter 5. The goal is to justifiably connect aleatoric statements to degrees of logical belief (without going through subjective probability). Stripping away some of Hacking’s notation, his law of likelihood states in brief that

“If two joint propositions are consistent with the statistical data, the better supported is that with the greater likelihood.”

Here I should clarify some of Hacking’s terminology. By “statistical data” he means everything you know before conducting a chance experiment, including the nature of how you get the data. A “joint proposition” is some statement about the world, possibly including things you don’t know, e.g., future unobserved data, or some unknown aspect of the real world. Hacking spends a lot of time defining and discussing his terms.

For the present purpose, it suffices to describe some of Hacking’s own examples from Chapter 5 of how the law of likelihood is to be used. Suppose that a biased coin has P(H) = 0.9 and P(T) = 0.1. Then, by the law of likelihood, the proposition \(\pi_H\) that a yet-unseen flip will be H is better supported than the proposition \(\pi_T\) that it will be T, since P(H) > P(T). Similarly, if we observe K heads out of N flips, by the law of likelihood, the proposition \(\pi_{K/N}\) that P(H) = K / N is better supported than the proposition \(\pi_{(K-1)/N}\) that P(H) = (K - 1) / N.
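The second example can be checked numerically. The following is a minimal sketch, assuming the standard binomial likelihood for K heads in N independent flips. The function name `binomial_likelihood` and the values N = 10 and K = 7 are my own illustrative choices, not Hacking’s.

```python
from math import comb


def binomial_likelihood(p, k, n):
    """Likelihood of heads-probability p given k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical data, chosen only for illustration: 7 heads in 10 flips.
N, K = 10, 7

lik_k = binomial_likelihood(K / N, K, N)                 # proposition P(H) = K / N
lik_k_minus_1 = binomial_likelihood((K - 1) / N, K, N)   # proposition P(H) = (K - 1) / N

# The law of likelihood ranks the two propositions by these numbers.
print(lik_k, lik_k_minus_1, lik_k > lik_k_minus_1)
```

Any other K and N with 0 < K ≤ N give the same ordering, since K / N maximizes the binomial likelihood.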

Are these assertions trivial? Hacking spends the first part of the book arguing that they are not, and the latter part of the book demonstrating important differences, both conceptual and practical, with decision theory and subjective probability. Suffice it to say that these arguments are beyond the scope of the present post.

# Asking St. Augustine’s question with the law of likelihood

Let us suppose that we have made a single coin flip which came up H. The coin was designed and flipped symmetrically to the best of our abilities. St. Augustine’s question can be expressed in terms of these two simple propositions:

- \(\pi_{R}\) (Randomness): P(H) = 0.5, and we observed H
- \(\pi_{A}\) (Augustine): P(H) = 1.0 (God wills it), and we observed H

Obviously, the law of likelihood supports \(\pi_{A}\), answering St. Augustine’s question in the affirmative, i.e., that St. Augustine’s worldview is better supported than randomness.
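For concreteness, here is a small sketch of the comparison just made. The helper `likelihood_single_flip` is a hypothetical name of my own; it simply returns the probability each proposition assigns to the observed outcome.

```python
def likelihood_single_flip(p_heads, outcome):
    """Probability that a single flip with P(H) = p_heads gives `outcome`."""
    return p_heads if outcome == "H" else 1.0 - p_heads


observed = "H"

lik_random = likelihood_single_flip(0.5, observed)     # Randomness: P(H) = 0.5
lik_augustine = likelihood_single_flip(1.0, observed)  # Augustine: P(H) = 1.0

# Likelihood ratio in favor of the Augustine proposition.
print(lik_augustine / lik_random)
```

Had the flip come up T, the Augustine proposition would be restated as P(T) = 1.0 and would win by the same factor of two, which is the sense in which the deterministic proposition can always be fitted to whatever was observed.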

Let me be the first to admit that this is pretty trivial. Perhaps you are disappointed, and sorry you bothered to read this far! Let me try to bring you back in.

First, observe that the same reasoning applies to any number of coin flips. You might ask whether the sequence HTHTTH was pre-ordained or random, and the law of likelihood always supports that it was pre-ordained. The same reasoning can be applied to whether some small number of flips in a particular sequence were pre-ordained: when asking whether every flip in the sequence HTHTTH was random, or whether at least one of them was pre-ordained, the law of likelihood supports that at least one of them was pre-ordained. The same reasoning applies to degrees of probability as well: when asking whether every flip in the sequence HTHTTH was fair, or whether instead P(H) = 0.6 when H came up and P(T) = 0.6 when T came up, the law of likelihood supports that the sequence was not fair.
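These comparisons for HTHTTH can be sketched as follows, assuming independent flips so that the likelihood of a sequence is the product of the probabilities each proposition assigns to the outcomes actually observed. The variable names are my own, and for “at least one flip pre-ordained” I take the specific case in which only the first flip was pre-ordained.

```python
from math import prod

flips = "HTHTTH"
n = len(flips)

# Each list holds, flip by flip, the probability the proposition
# assigns to the outcome that was actually observed.
all_random = [0.5] * n                    # every flip fair
one_preordained = [1.0] + [0.5] * (n - 1) # first flip pre-ordained, rest fair
tilted = [0.6] * n                        # 0.6 on whichever side came up
all_preordained = [1.0] * n               # whole sequence pre-ordained

likelihoods = {
    "all random": prod(all_random),
    "one pre-ordained": prod(one_preordained),
    "tilted to 0.6": prod(tilted),
    "all pre-ordained": prod(all_preordained),
}

# Rank the propositions from least to most supported by likelihood.
for name, lik in sorted(likelihoods.items(), key=lambda item: item[1]):
    print(f"{name}: {lik:.6f}")
```

The ranking runs from the fully random proposition up to the fully pre-ordained one, whatever the particular sequence happens to be.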

In short, the law of likelihood always supports the most deterministic proposition. In this sense, the law of likelihood does not support its own applicability. Without randomness, there is no need or use for a logic of statistical inference. When given the opportunity to ask whether or not there is randomness in a particular setting, the law of likelihood always militates against randomness, and eats its own tail.

# Statisticians know this, and so does Hacking

This phenomenon is no surprise to statisticians, of course. Model selection based on likelihood — whether Bayesian or frequentist in design and use — favors the more complex models unless some corrective factor is used, such as regularization or priors. The answer given by the law of likelihood to St. Augustine’s question is just an extreme end of this phenomenon.
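To illustrate what such a corrective factor does in the single-flip example, here is a minimal sketch in which each proposition’s likelihood is multiplied by a weight before comparison. The weights 0.9 and 0.1 are invented purely for illustration; they are extra-statistical inputs, not values proposed by Hacking or derived from the data.

```python
def weighted_support(likelihood, weight):
    """Likelihood scaled by an extra-statistical weight (e.g. a prior)."""
    return likelihood * weight


# Likelihoods of the single observed H under each proposition.
lik_random, lik_augustine = 0.5, 1.0

# Hypothetical weights, assumed only for this example.
w_random, w_augustine = 0.9, 0.1

support_random = weighted_support(lik_random, w_random)
support_augustine = weighted_support(lik_augustine, w_augustine)

# With these weights the ordering given by raw likelihood is reversed.
print(support_random, support_augustine, support_random > support_augustine)
```

Because the raw likelihood ratio here is two, any weighting that favors randomness by more than two to one reverses the conclusion, and the choice of weights cannot itself come from the coin flip.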

Is Hacking aware of this problem? Of course; Hacking is aware of most things. For example, in Chapter 7, he discusses very briefly the importance of weighting likelihoods in some cases (“One author has suggested that a number be assigned to each hypothesis, which will represent the ‘seriousness’ of rejecting it … In the theory of likelihood testing, one would use weighted tests.”) Unfortunately, Hacking’s discussion of Bayesianism in Chapters 12 and 13 does not take up this point, focusing instead on arguing against uniform priors and dogmatic subjectivism. Probably most damningly, Hacking does not shrink from using the law of likelihood to reason between a large number of expressive propositions and a single less expressive one, as, for example, in his comparison of unbiased tests in Chapter 7 (page 89 in the Cambridge Philosophy Classics 2016 edition). In summary, Hacking does not appear to take very seriously the fundamental role that extra-statistical evidence must play in applications of the law of likelihood if it is to avoid refuting itself.

# We must deliberately choose the statistical analogy

The point is that describing the world with randomness is a choice we make, and we make it because it is sometimes useful to us. In the course of doing something like statistical inference, we *must* posit *a priori* the existence of randomness as well as explanatory mechanisms of limited complexity. At the core of statistical reasoning is the *discarding* of information: viewing a set of voters, each entirely unique, as equivalent to balls drawn from an urn, or viewing the day’s weather, which is fixed from yesterday’s by deterministic laws of physics, as something exchangeable with some hypothetical population of other days, conceptually detached from contingency and their own pasts. Failure to remember this can lead to silly arguments about whether phenomena are “really random.” In other words, we must choose to make the statistical analogy, and accept that its applicability may not be indisputable.

From this perspective, Hacking’s ambition — a logic of statistical inference — seems hopeless, not because of some inevitably subjective nature of probability itself, but because of the subjective nature of analogy. How can you form a logic which will give correct conclusions in every application of an analogy? The affairs of statistics are inevitably human and not purely computational, and the field is more exciting and fruitful for it.