What is the probability of obtaining a rating increase of 400 points or more in one year, purely by chance, with no real improvement? There are perhaps 7.5 million registered players worldwide, see:
Given such a large number of players, we might expect someone, somewhere, at some time to achieve an apparent 400 point improvement purely by chance. To estimate the probability of this happening, we need a simple model. The model need not be completely accurate; we only need a rough answer here.
For simplicity, I will consider a player whose true rating is the same as the average true rating of his opponents. His score after an infinite number of games should then be 50%. I will assume that for every game, a coin is tossed twice. If we get two heads, our player wins. If we get two tails, he loses. If we get one head and one tail, the game is a draw. If our player plays N games, we toss the coin 2*N times, and his fractional score is the number of times that the coin comes up heads, divided by 2*N. For a simple account of the mathematics of coin tossing see:
My simple model has some limitations. The proportion of draws may not be 50%. A higher proportion of draws will reduce the variability of the score, and vice versa. (N.B. An increased variability of the score increases the chances of an inaccurate rating.) Our player’s opponents may not all be of roughly equal strength. Some may be very strong and some very weak, in which case the very strong players will nearly always win, and the very weak players will nearly always lose. This will reduce the statistical variability of the score. Nonetheless, the simple model should be good enough for making rough estimates. These problems are, in any case, ignored by the Elo rating system.
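The coin model and the effect of the draw rate can be sketched in a few lines of Python. This is only an illustration under the assumptions stated above: evenly matched players, fair coins, and wins and losses equally likely. The function names are my own and are not taken from any chess or rating library.

```python
from math import comb

def score_distribution(n_games):
    """Exact distribution of the fractional score under the two-coin model.

    Each game contributes two fair coin tosses, so the number of heads in
    2 * n_games tosses follows a Binomial(2 * n_games, 0.5) distribution.
    Returns a dict mapping fractional score -> probability.
    """
    tosses = 2 * n_games
    total = 2 ** tosses
    return {heads / tosses: comb(tosses, heads) / total
            for heads in range(tosses + 1)}

def per_game_variance(draw_rate):
    """Variance of one game's score (1, 0.5 or 0) for evenly matched players.

    Assumption: wins and losses are equally likely, so each has probability
    (1 - draw_rate) / 2 and the mean score is 0.5. A draw contributes nothing
    to the variance because it lands exactly on the mean.
    """
    win = (1 - draw_rate) / 2
    return win * (1 - 0.5) ** 2 + win * (0 - 0.5) ** 2

dist = score_distribution(12)
print(sum(dist.values()))        # should be 1, up to floating-point rounding
print(per_game_variance(0.5))    # 0.125, the value implied by the coin model
print(per_game_variance(0.7))    # smaller: more draws, less variability
```

With a 50% draw rate the per-game variance matches the coin model, and raising the draw rate lowers it, which is the sense in which more draws make the score less variable.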
For the USCF version of the Elo rating system, the expected fractional score s is given by:
s = 1 / (1 + 10 ^ -(d/400))
Where d is the player’s rating minus that of his opponent. See:
The table below gives the expected percentage score for rating differences of 100, 200 and 400 Elo points:
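These expected scores can be reproduced from the formula above with a short Python sketch. The function name expected_score is my own choice; only the formula itself comes from the text.

```python
def expected_score(d):
    """Expected fractional score for a rating difference d (player minus opponent)."""
    return 1 / (1 + 10 ** (-d / 400))

print("diff  expected score")
for d in (100, 200, 400):
    print(f"{d:>4}  {100 * expected_score(d):.2f}%")
# prints:
#  100  64.01%
#  200  75.97%
#  400  90.91%
```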
For N = 12 and d = 200, a performance 200 points above our true rating corresponds to a score of 75.97%. To the nearest integer, that is 18 heads in 24 tosses of the coin. The probability of scoring 18 or more half points can be found using the binomial distribution calculator:
The probability that we will receive a rating that is 200 or more points higher than our true rating is 0.01133, i.e. about 1 in 88. (N.B. The probability that we will receive a rating that is 200 or more points less than our true rating is the same, because of the symmetry between wins and losses. This is easily verified using the calculator.)
For N = 12 and d = 100, we expect to score 64.01%. We expect 15 heads when we toss the coin 24 times. The probability of scoring 15 or more half points is 0.1537, i.e. about 1 in 6.5.
For N = 12 and d = 400, we expect to score 90.91%. We expect 22 heads when we toss the coin 24 times. The probability of scoring 22 or more half points is 0.00001794, i.e. about 1 in 55,741.
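For readers without access to a binomial calculator, the 12-game figures can be checked with a minimal Python sketch. It assumes the coin model described earlier (two fair tosses per game) and rounds the target number of heads to the nearest integer, as in the text. The helper names are hypothetical.

```python
from math import comb

def expected_score(d):
    """Expected fractional score for a rating difference d."""
    return 1 / (1 + 10 ** (-d / 400))

def tail_probability(n_games, d):
    """Chance of a performance at least d points above the true rating.

    Under the coin model the number of heads in 2 * n_games fair tosses is
    binomial, so we sum the upper tail from the rounded target upwards.
    Returns (target number of heads, probability).
    """
    tosses = 2 * n_games
    threshold = round(expected_score(d) * tosses)
    favourable = sum(comb(tosses, k) for k in range(threshold, tosses + 1))
    return threshold, favourable / 2 ** tosses

for d in (100, 200, 400):
    threshold, p = tail_probability(12, d)
    print(f"d = {d}: {threshold}+ heads, p = {p:.4g}, about 1 in {1 / p:,.1f}")
```

The targets come out as 15, 18 and 22 heads, and the probabilities agree with the values quoted above to the precision given.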
The probability that we will receive a rating that is 200 or more points lower than our true rating from the results of 12 games, and a rating that is 200 or more points higher than our true rating over another 12 games, is about 1 in 88^2 = 7,744, assuming the two sets of results are independent.
For N = 24 and d = 100, we expect 31 heads when we toss the coin 48 times. The probability of scoring 31 or more half points is 0.02973, i.e. about 1 in 34.
For N = 24 and d = 200, we expect 36 heads when we toss the coin 48 times. The probability of scoring 36 or more half points is 0.0003586, i.e. about 1 in 2,789.
The probability that we will receive a rating that is 200 or more points lower than our true rating from the results of 24 games, and a rating that is 200 or more points higher than our true rating over another 24 games is about 1 in 2,789^2 = 7,778,521.
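The 24-game figures can be checked in the same way. The sketch below takes the numbers of tosses and heads directly from the text (48 tosses, with 31 and 36 heads as the targets) and assumes that the two 24-game blocks are independent, which is what squaring the probability implies. The names are my own.

```python
from math import comb

def upper_tail(tosses, threshold):
    """P(at least `threshold` heads in `tosses` fair coin tosses)."""
    favourable = sum(comb(tosses, k) for k in range(threshold, tosses + 1))
    return favourable / 2 ** tosses

p_100 = upper_tail(48, 31)   # 24 games, performance 100+ points too high
p_200 = upper_tail(48, 36)   # 24 games, performance 200+ points too high

# Assumption: the two 24-game blocks are independent, and by the symmetry
# between wins and losses the chance of a result 200+ points too low
# equals p_200.
p_swing = p_200 ** 2

print(f"p_100 = {p_100:.4g}, p_200 = {p_200:.4g}, swing: 1 in {1 / p_swing:,.0f}")
```

The exact odds for the 400 point swing may differ slightly from 7,778,521, because the text rounds to 1 in 2,789 before squaring.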
I have ignored the effect of statistical variations in the opponents’ ratings in these calculations. These variations will increase the chances of a freak result. Nonetheless, they tend to average out, and it turns out that we can ignore them to a first approximation (unless the opponents are playing far fewer games).
In my experience, most competitive players play fewer than 24 rated games per season. For the USCF rating system, 25 games are needed for a full rating. For 24 games, the odds against a spurious 400 point rating improvement in one year are about the same as the number of competitive players (according to the estimate quoted above). On that basis, we would expect this feat to be achieved by someone somewhere about once a year. Of course, this achievement would not be at all persuasive, even to the statistically naive, unless the player happened to have no previous track record, and promptly retired from chess. That increases the odds, but a lower number of games reduces them.
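The "about once a year" estimate is simple arithmetic, sketched here in Python. It rests on strong assumptions that are mine for the purpose of illustration: every one of the roughly 7.5 million registered players plays two independent blocks of 24 games in the year, against opponents of equal strength, and the odds for each player are the 1 in 7,778,521 worked out above.

```python
import math

registered_players = 7_500_000   # rough estimate quoted at the start of the article
odds_against = 2_789 ** 2        # 1 in 7,778,521, from the 24-game calculation
p = 1 / odds_against             # chance for one player (assumed equal for all)

# Expected number of players with a spurious 400 point gain in a year.
expected_count = registered_players * p

# Chance that at least one player manages it: 1 - (1 - p)^n,
# evaluated with log1p/expm1 to avoid rounding problems for tiny p.
p_at_least_one = -math.expm1(registered_players * math.log1p(-p))

print(f"expected number of players: {expected_count:.2f}")
print(f"chance of at least one:     {p_at_least_one:.2f}")
```

The expected count comes out a little under one per year, which is the basis for the rough estimate in the text.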
In this article, I have been looking at purely random variations in the player’s results. In practice the probability of a freak result will be greatly increased by any systematic inaccuracies in the opponents’ ratings (e.g. as a result of geographic variation, rapidly improving juniors, or the treatment of unrated players). There are also a variety of personal factors that can depress a player’s performance.
Michael de la Maza’s 400 points in 400 days may not be quite what it appears to be. [Michael de la Maza turns out to have played a large number of rated games. See my next article for an analysis of his results.] Others report suspiciously rapid training results too. Jeremy Silman said:
“I get hundreds of letters from students worldwide that gain hundreds of points in a few months from reading my strategically oriented books.” See:
Perhaps these results are not what they appear to be.