Originally published in The Reasoner Volume 10, Number 3 – March 2016
Hard-to-quantify (aka “severe”, “deep”, “Knightian”, etc.) uncertainty is a major concern for both theorists and practitioners of decision-making. Over the past few years our urgent need to understand and manage this kind of uncertainty has been argued for mainly with reference to the disastrous consequences of the financial crisis, natural disasters, terrorism, and similarly dreadful phenomena which often make the global headlines. David Spiegelhalter, in his 2015 book Sex by Numbers: What Statistics Can Tell Us About Sexual Behaviour, tackles the problems of reasoning, decision-making, and policy-making under typically unreliable statistical data from a decidedly less negative angle. Spiegelhalter is Winton Professor of the Public Understanding of Risk in the Statistical Laboratory at the University of Cambridge, and author of the Understanding Uncertainty blog. The book was commissioned by Wellcome Collection and published by Profile Books. It is based on three large surveys, carried out in 1990, 2000 and 2010 as part of the British National Surveys of Sexual Attitudes and Lifestyles (NATSAL), which are believed to constitute the largest scientific study of sex in the world to date. The volume is complemented by the interactive Sex by Numbers Infographic, which sums up, in style, some of the central findings of the study.
The link between this wonderfully entertaining volume and severe uncertainty is easily explained by the author:
“A strictly scientific approach might install CCTV in a randomly selected set of bedrooms. This would not only make staggeringly dull viewing for most of the time, but would also miss those sudden bursts of passion in the shower or the shed.”
Spiegelhalter then adds that “head-cams” on some volunteers would not be a sensible fix, as the information gathered in this way would hardly be representative of the population. Thus the only way to obtain data in this field is through surveys. However, it turns out that sex is one topic about which people tend not to be very open. In some cases respondents are simply reluctant to speak; in others they tend to exaggerate (often unconsciously) in their responses. All this undermines the reliability of the resulting statistics, which nonetheless provide vital input for public debate and policy-making.
The similarities between reasoning and decision-making about sexual behaviour and about graver problems like climate change are striking. And similar problems often lead to similar solutions. So Spiegelhalter puts forward a “star rating” system for probabilistic statements. Intuitively, this is a way to qualify statistical statements according to the reliability of the information which supports them. The author considers a five-valued rating system. Four stars are attached to “numbers that we can believe”. This essentially means data obtained through methodologically accurate random sampling. This is the rating given, for instance, to the statement that for every 20 girls, 21 boys are born. “Reasonably accurate” statistics give rise to three-star statements. Most of the NATSAL survey falls into this category, which is less accurate than the previous one primarily as a consequence of the respondents’ uneasiness about the topic. Two-star statements are those which may be significantly unreliable; this is the class of statistical data which does not rely on random sampling. One star is awarded to plainly unreliable numbers which, despite possibly having a rationale, are “useless” for statistical purposes. Finally, zero-star statements are those which report “made-up numbers”: for example, the Bishop of Exeter’s estimate that there were 80,000 prostitutes in London in the 1850s.
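To fix intuitions, here is a minimal sketch (in Python, and not from the book) of how the five-valued scale might be encoded; the labels merely paraphrase the descriptions above, and the function name is purely illustrative.

```python
# A minimal sketch (not from the book) of Spiegelhalter's 0-4 star scale;
# the labels paraphrase the review's descriptions and are illustrative only.
STAR_RATING = {
    4: "numbers we can believe (methodologically accurate random sampling)",
    3: "reasonably accurate (e.g. most of the NATSAL survey)",
    2: "could be significantly unreliable (no random sampling)",
    1: "plainly unreliable, 'useless' for statistical purposes",
    0: "made-up numbers",
}

def describe(stars: int) -> str:
    """Return the reliability gloss attached to a star rating."""
    return f"{stars} star(s): {STAR_RATING[stars]}"

# The sex-ratio-at-birth statistic (21 boys born for every 20 girls)
# gets the top rating.
print(describe(4))
```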
This rating system is quite reminiscent of the two-dimensional way in which the Intergovernmental Panel on Climate Change (IPCC) recommends expressing uncertainties in the “Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties”. As I briefly recalled in the July 2015 issue of The Reasoner, one dimension is a (seven-valued) probabilistic scale, which ranges from the “virtually certain” to the “exceptionally unlikely”. The other dimension is a five-valued confidence scale ranging from “very high” to “very low”.
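Again purely for illustration, the two dimensions can be thought of as independent qualifiers attached to a single claim; the scale values below follow the Guidance Note, while the way they are combined into a sentence is my own sketch.

```python
# An illustrative sketch of the IPCC's two-dimensional scheme: a calibrated
# likelihood term plus a separate confidence qualifier. The scale values
# follow the Guidance Note; the combining function is hypothetical.
LIKELIHOOD = (  # seven-valued probabilistic scale
    "virtually certain", "very likely", "likely", "about as likely as not",
    "unlikely", "very unlikely", "exceptionally unlikely",
)
CONFIDENCE = (  # five-valued confidence scale
    "very high", "high", "medium", "low", "very low",
)

def ipcc_statement(claim: str, likelihood: str, confidence: str) -> str:
    """Attach likelihood and confidence qualifiers to a claim."""
    assert likelihood in LIKELIHOOD and confidence in CONFIDENCE
    return f"It is {likelihood} that {claim} ({confidence} confidence)."

# A generic placeholder claim, not an actual IPCC finding.
print(ipcc_statement("the claim holds", "very likely", "high"))
```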
Comparing the characterisations of uncertainty given by these two rating schemes, one has the impression that Spiegelhalter’s is much easier to understand, and presumably to communicate. This certainly owes much to the fact that the uncertainties involved there are quantified objectively. In other words, probabilities need not depart from (finite) relative frequencies. This is clearly not the case in climate science, where probabilities arise from incomparably more complex data. Be that as it may, this may give climate scientists one special additional reason to take a peek into Sex by Numbers!