# Jeffreys' scale

Jeffreys' scale of evidence is a subdivision of the possible values of the Bayes factor into categories (or grades).

The Bayes factor quantifies the strength of the evidence in favor of a model, as compared to another model.

Jeffreys' scale is used to translate the value of the Bayes factor into a qualitative judgement on the evidence (substantial, strong, very strong, etc.).

## More than one scale

Statisticians use several different versions of Jeffreys' scale.

Below we report three versions that are widely cited in the literature: Jeffreys' original scale, Lee and Wagenmakers' scale, and Kass and Raftery's scale.

## The Bayes factor

Let $x$ be some data and $M_1$ and $M_2$ two models (i.e., two sets of probability distributions that could have generated the data).

Remember that the posterior odds ratio between $M_1$ and $M_2$ is

$$\frac{P(M_1 \mid x)}{P(M_2 \mid x)} = B_{1,2} \, \frac{P(M_1)}{P(M_2)}$$

where $B_{1,2} = \dfrac{p(x \mid M_1)}{p(x \mid M_2)}$ is the Bayes factor.

Values of the Bayes factor larger than $1$ are interpreted as evidence in favor of model $M_1$ (relative to $M_2$). The larger the value, the stronger the evidence.

Conversely, values smaller than $1$ are interpreted as evidence in favor of $M_2$.
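The relation between the Bayes factor and the posterior odds can be checked numerically. The following Python snippet is a minimal sketch: the function names and the numerical inputs (marginal likelihoods of 0.5 and 0.125, equal prior probabilities) are hypothetical and chosen only for illustration.

```python
def bayes_factor(marginal_lik_1: float, marginal_lik_2: float) -> float:
    """Ratio of the marginal likelihoods of the data under models 1 and 2."""
    return marginal_lik_1 / marginal_lik_2


def posterior_odds(b12: float, prior_1: float, prior_2: float) -> float:
    """Posterior odds = Bayes factor times prior odds."""
    return b12 * (prior_1 / prior_2)


# Hypothetical inputs, for illustration only.
b12 = bayes_factor(0.5, 0.125)        # larger than 1: the evidence favors model 1
odds = posterior_odds(b12, 0.5, 0.5)  # equal priors: the prior odds are 1
prob_1 = odds / (1 + odds)            # posterior probability of model 1

print(b12, odds, prob_1)
```

With equal prior probabilities the posterior odds coincide with the Bayes factor, so in this case the Bayes factor alone determines which model is favored a posteriori.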

## Jeffreys' original scale

Here is Jeffreys' (1939) original scale.

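| $B_{1,2}$ | $\log_{10}(B_{1,2})$ | Grades of evidence |
|---|---|---|
| $1$ to $10^{1/2}$ | $0$ to $1/2$ | Barely worth mentioning |
| $10^{1/2}$ to $10$ | $1/2$ to $1$ | Substantial |
| $10$ to $10^{3/2}$ | $1$ to $3/2$ | Strong |
| $10^{3/2}$ to $10^{2}$ | $3/2$ to $2$ | Very strong |
| $> 10^{2}$ | $> 2$ | Decisive |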

Thus, the boundaries of the categories in Jeffreys' scale are $10^{k/2}$ for $k = 0, 1, \ldots, 4$.

The grades above are for the evidence in favor of $M_1$ (values of the Bayes factor larger than $1$). When $B_{1,2} < 1$, we have categories of evidence in favor of $M_2$, whose boundaries are the reciprocals of the boundaries in the previous table.

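| $B_{1,2}$ | $\log_{10}(B_{1,2})$ | Grades of evidence |
|---|---|---|
| $1$ to $10^{-1/2}$ | $0$ to $-1/2$ | Barely worth mentioning |
| $10^{-1/2}$ to $10^{-1}$ | $-1/2$ to $-1$ | Substantial |
| $10^{-1}$ to $10^{-3/2}$ | $-1$ to $-3/2$ | Strong |
| $10^{-3/2}$ to $10^{-2}$ | $-3/2$ to $-2$ | Very strong |
| $< 10^{-2}$ | $< -2$ | Decisive |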

For the sake of historical accuracy, here is the relevant excerpt from Jeffreys' book.

[Image: excerpt from Jeffreys' book]

Since the categories of evidence in favor of $M_2$ can be constructed mechanically from those in favor of $M_1$, we do not report them for the other scales shown below.
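The original scale, including the reciprocal categories, can be turned into a small lookup on the base-10 logarithm of the Bayes factor. The Python code below is a minimal sketch rather than a reference implementation: the function name `jeffreys_grade` is hypothetical, and assigning each boundary value to the lower grade is an assumption, since the tables do not say which grade a boundary belongs to.

```python
import math

# Upper bounds on |log10(B)| and the corresponding labels of Jeffreys' scale.
_JEFFREYS_GRADES = [
    (0.5, "Barely worth mentioning"),
    (1.0, "Substantial"),
    (1.5, "Strong"),
    (2.0, "Very strong"),
]


def jeffreys_grade(b12: float) -> tuple[str, str]:
    """Return (favored model, grade of evidence) for a Bayes factor B_{1,2}."""
    if b12 <= 0:
        raise ValueError("The Bayes factor must be strictly positive.")
    log_b = math.log10(b12)
    if log_b > 0:
        favored = "M1"
    elif log_b < 0:
        favored = "M2"
    else:
        favored = "neither"  # B = 1: the data do not discriminate
    # Reciprocal boundaries on B become symmetric boundaries on log10(B).
    strength = abs(log_b)
    for upper, label in _JEFFREYS_GRADES:
        if strength <= upper:  # assumption: a boundary belongs to the lower grade
            return favored, label
    return favored, "Decisive"


for b in (2, 5, 0.02, 250):
    print(b, jeffreys_grade(b))
```

Working with the absolute value of the logarithm handles both tables at once, because taking reciprocals of the boundaries only changes the sign of their logarithms.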

## Lee and Wagenmakers' scale

Lee and Wagenmakers (2014) made two minor modifications to Jeffreys' scale:

1. they rounded the two boundaries $10^{1/2}$ and $10^{3/2}$ to $3$ and $30$ respectively;

2. they changed the labels of the categories; in particular, they changed "substantial" to "moderate", as they thought that the original label sounded too decisive.

Here is the result of their changes.

| $B_{1,2}$ | Grades of evidence |
|---|---|
| $1$ to $3$ | Anecdotal |
| $3$ to $10$ | Moderate |
| $10$ to $30$ | Strong |
| $30$ to $100$ | Very strong |
| $> 100$ | Extreme |
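Because the boundaries of this scale are round numbers, the grade can be read off with a sorted-list search. The snippet below is an illustrative Python sketch under two assumptions that are not in the source: the function name `lee_wagenmakers_grade` is hypothetical, and a value that falls exactly on a boundary is assigned to the lower grade. Bayes factors smaller than 1 are graded through their reciprocal, as described for the original scale.

```python
from bisect import bisect_left

LW_BOUNDS = [3, 10, 30, 100]
LW_LABELS = ["Anecdotal", "Moderate", "Strong", "Very strong", "Extreme"]


def lee_wagenmakers_grade(b12: float) -> str:
    """Grade of evidence on Lee and Wagenmakers' scale (for either model)."""
    if b12 <= 0:
        raise ValueError("The Bayes factor must be strictly positive.")
    # Evidence in favor of the second model is graded via the reciprocal.
    ratio = b12 if b12 >= 1 else 1 / b12
    # bisect_left counts the boundaries strictly below `ratio`.
    return LW_LABELS[bisect_left(LW_BOUNDS, ratio)]


# The unrounded boundaries of the original scale, for comparison.
print(round(10 ** 0.5, 2), round(10 ** 1.5, 2))

for b in (2, 3.1, 20, 0.02, 500):
    print(b, lee_wagenmakers_grade(b))
```

A Bayes factor of 3.1 illustrates the effect of the rounding: it lies above the rounded boundary 3 but below the original boundary $10^{1/2} \approx 3.16$.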

## Kass and Raftery's scale

Kass and Raftery (1995) simplified the scale by eliminating a category. Moreover, they raised the thresholds of strong and very strong evidence.

They also provided a logarithmic scale, based on $2 \ln(B_{1,2})$, that is approximately equivalent to the ordinary scale.

| $B_{1,2}$ | $2 \ln(B_{1,2})$ | Grades of evidence |
|---|---|---|
| $1$ to $3$ | $0$ to $2$ | Barely worth mentioning |
| $3$ to $20$ | $2$ to $6$ | Positive |
| $20$ to $150$ | $6$ to $10$ | Strong |
| $> 150$ | $> 10$ | Very strong |
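To see in what sense the two columns are only approximately equivalent, one can compute $2 \ln$ of each boundary of the ordinary scale and compare it with the corresponding boundary of the logarithmic scale. The Python sketch below does this and also grades a Bayes factor on either scale. The function name `kass_raftery_grade`, the `use_log_scale` flag, and the convention that a boundary value belongs to the lower grade are assumptions made for illustration.

```python
import math
from bisect import bisect_left

KR_BOUNDS_B = [3, 20, 150]    # boundaries on the Bayes factor
KR_BOUNDS_2LN = [2, 6, 10]    # boundaries on 2 ln(Bayes factor)
KR_LABELS = ["Barely worth mentioning", "Positive", "Strong", "Very strong"]


def kass_raftery_grade(b12: float, use_log_scale: bool = False) -> str:
    """Grade of evidence on Kass and Raftery's scale (for either model)."""
    if b12 <= 0:
        raise ValueError("The Bayes factor must be strictly positive.")
    ratio = b12 if b12 >= 1 else 1 / b12  # reciprocal for evidence against M1
    if use_log_scale:
        return KR_LABELS[bisect_left(KR_BOUNDS_2LN, 2 * math.log(ratio))]
    return KR_LABELS[bisect_left(KR_BOUNDS_B, ratio)]


# Compare 2 ln(boundary) with the boundary of the logarithmic scale.
for b, t in zip(KR_BOUNDS_B, KR_BOUNDS_2LN):
    print(b, round(2 * math.log(b), 2), t)

# Near a boundary the two scales can disagree.
print(kass_raftery_grade(2.9), "|", kass_raftery_grade(2.9, use_log_scale=True))
```

Since $2 \ln 3 \approx 2.20$, $2 \ln 20 \approx 5.99$ and $2 \ln 150 \approx 10.02$, the two sets of boundaries match closely but not exactly, so a Bayes factor slightly below 3 can receive different grades depending on the column used.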

## References

Jeffreys, H. (1939) Theory of Probability, Oxford University Press (3rd edition published in 1961).

Lee, M. D. and Wagenmakers, E.-J. (2014) Bayesian cognitive modeling: A practical course, Cambridge University Press.

Kass, R. E. and Raftery, A. E. (1995) Bayes factors, Journal of the American Statistical Association, 90, 773-795.