2025-11-01
What are statistics for?
I was reading Causal Inference in Statistics by Pearl, Glymour, and Jewell this morning,1 and the following stopped me in my tracks:
“In Ronald Fisher’s influential manifesto, he pronounced that ‘the object of statistics is the reduction of data’ (Fisher 1922). In keeping with that aim, the traditional task of making sense of data, often referred to generically as ‘inference,’ became that of finding a parsimonious mathematical description of the joint distribution of a set of variables of interest, or of specific parameters of such a distribution.”
Pearl et al. use this to emphasize that the traditional role of statistics is not to articulate causal relationships. But the passage jumped out at me because it quotes something written 100 years ago by someone whose approach to statistics deeply influences our work today.2 It’s striking if we pause to think about the information technology of Fisher’s day: spreadsheets were paper. So being able to reduce a column to a few numbers (moments, if you will) was a huge task and valuable in its own right.
I think there is something a little more general here: it’s easy for us to forget the influence of technology on our current practices. Part of why so much work went into closed-form calculations for clustered, and other, standard errors was simply that bootstrapping was time-consuming.3 The same is true for why we use OLS for everything (I’m not knocking OLS here; I use it as much as possible): maximum likelihood estimates were hard to compute when most of the foundational work in many fields was done. I suspect that this is part of the reason why Fama-MacBeth regressions were so popular: they are a clever solution to the particular problem at hand, but also a very attractive solution when you had to run your regressions overnight using punch cards.
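To make the closed-form versus bootstrap trade-off concrete, here is a minimal sketch for the simplest case I can think of, the standard error of a sample mean. The data are simulated, and the function names, the seeds, and the choice of 2,000 resamples are my own assumptions for illustration. Nothing here is clustered, so treat it as a toy rather than a stand-in for the estimators mentioned above.

```python
import random
import statistics


def analytic_se(sample):
    """Closed-form standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5


def bootstrap_se(sample, n_resamples=2000, seed=0):
    """Bootstrap standard error of the mean.

    Resample the data with replacement, recompute the mean each time,
    and report the standard deviation of those resampled means.
    """
    rng = random.Random(seed)
    n = len(sample)
    means = [
        statistics.fmean(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)


# Simulated data, for illustration only.
data_rng = random.Random(42)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

se_formula = analytic_se(sample)  # a single pass over the data
se_boot = bootstrap_se(sample)    # n_resamples passes over the data

print(f"analytic:  {se_formula:.4f}")
print(f"bootstrap: {se_boot:.4f}")
```

The two numbers should land close to each other. The difference is in the work: the formula touches the data once, while the bootstrap recomputes the statistic thousands of times, which is trivial on a laptop today and was not when every pass was expensive.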
1. (because my job is amazing and I can do exactly what I want when I want)
2. e.g. our current, though hopefully soon to be outmoded, preoccupation with statistical significance is mostly due to (a misreading of) Fisher.
3. In 2006 my Econometrics professor had a separate computer for bootstrapping. It would run for days.