The Law of Anomalous Numbers

One of my goals for this year is to improve my understanding of risk, probability and statistics. In pursuit of this I’ve been doing an online course, ‘An Intuitive Introduction to Probability’.

A really interesting concept which the course introduces is Benford’s law, also known as the law of anomalous numbers.

Benford’s law gives the probability that the leading digit of a number is $d$:

$$P[d] = \log_{10}\left(1+\frac{1}{d}\right), \quad d\in\{1,2,\ldots,9\}$$

Benford’s law tells us about the expected frequency distribution of the leading digit in collections of naturally occurring numbers.

What’s most remarkable is that Benford’s law predicts a logarithmic distribution! The probability of the first digit being a one is over 30%, whilst the probability that the first digit is a nine is less than 5%. In other words, in naturally occurring collections of numbers, the first digit is likely to be small.
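To make these percentages concrete, here is a minimal Python sketch that evaluates the formula for each digit. The function name `benford_probability` is just my own label for illustration; it doesn’t come from the course or from Benford’s paper.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading digit is d, according to Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Expected frequency of each possible leading digit.
expected = {d: benford_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

Running this shows roughly 30.1% for a leading one and 4.6% for a leading nine, and the nine probabilities sum to one, as they should.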

Digging a little deeper I found the original paper written by Frank Benford in 1937. The paper explains that groups of numbers, individually without any relationship and each composed of four or more digits, when considered collectively, are in good agreement with a logarithmic distribution law.

The more unrelated a group of numbers, the more they tend to be in agreement with Benford’s law. For example, population density data is likely to be in agreement, whilst molecular weights or physical constants are not.
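Checking a collection against the law comes down to tallying leading digits and comparing the observed frequencies with the predicted ones. The sketch below does this for the first 1000 powers of two. That dataset is my own stand-in, chosen because the sequence is well known to follow Benford’s law; it isn’t one of the datasets from the paper, and the helper names are hypothetical.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first decimal digit of a non-zero integer."""
    if n == 0:
        raise ValueError("zero has no leading digit")
    return int(str(abs(n))[0])


# Stand-in dataset (an assumption for illustration): 2**1 .. 2**1000.
values = [2 ** n for n in range(1, 1001)]

counts = Counter(leading_digit(v) for v in values)
observed = {d: counts[d] / len(values) for d in range(1, 10)}

# Print observed frequency next to the Benford prediction for each digit.
for d in range(1, 10):
    predicted = math.log10(1 + 1 / d)
    print(f"{d}: observed {observed[d]:.3f}, predicted {predicted:.3f}")
```

Swapping `values` for any other list of non-zero integers (populations, invoice amounts in pence, and so on) gives the same comparison for that data.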

A particularly clever use of Benford’s law is detecting accountancy fraud. A 2011 paper used Benford’s law to analyse EU governmental economic data and found that:

The data reported by Greece shows the greatest deviation from Benford’s law among all euro states.

This was, in particular, true during the period leading up to Greece’s admission to the eurozone. Greece was followed by Belgium, and then Austria. The country with the least deviation from Benford’s law was the Netherlands.
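To rank datasets by how far they deviate from Benford’s law, you need a single number measuring the deviation. One common choice is Pearson’s chi-squared statistic, sketched below. I’m assuming this statistic purely for illustration and am not claiming it is exactly what the 2011 paper computed. The two datasets are synthetic stand-ins of my own: powers of two (which follow the law) and the integers 100 to 999 (whose leading digits are uniform).

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first decimal digit of a non-zero integer."""
    if n == 0:
        raise ValueError("zero has no leading digit")
    return int(str(abs(n))[0])


def chi_squared_vs_benford(values) -> float:
    """Pearson chi-squared distance between leading-digit counts and Benford's law."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts[d] - expected) ** 2 / expected
    return stat


# Synthetic stand-ins (assumptions for illustration, not real economic data).
benford_like = [2 ** n for n in range(1, 1001)]  # follows Benford's law
uniform_like = list(range(100, 1000))            # uniform leading digits

chi2_benford_like = chi_squared_vs_benford(benford_like)
chi2_uniform_like = chi_squared_vs_benford(uniform_like)

print(f"powers of two: {chi2_benford_like:.2f}")
print(f"100..999:      {chi2_uniform_like:.2f}")
```

A larger value means a greater deviation. For independent samples, a value above about 15.5 (the 5% critical value for 8 degrees of freedom) would usually be treated as a significant departure. Powers of two aren’t independent samples, so here the statistic is best read as a plain distance, not a formal test.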

Coming from a cryptography background, I’m used to the word ‘random’ being used to refer to uniformly distributed data. I therefore found it particularly surprising that a logarithmic distribution emerges from the leading digits of what are essentially random numbers. Benford hints in his paper that this pattern seems to extend to all human affairs, and I think that’s something worth thinking about.

In the future I’d like to try applying Benford’s law to the financial reports of public companies, and then back-testing the resulting deviation against the normalised share price.