Hanging in there after a hundred

It is well known that some batsmen are better than others at going on to very big scores after getting a start. The differences between individuals can be surprising; for an extreme recent example, look at two of today’s top opening batsmen, Matthew Hayden and Virender Sehwag. A comparison of the last 10 Test centuries for each batsman shows a remarkable contrast.

Test hundreds by Hayden and Sehwag
Hayden Sehwag
138 130
111 195
118 309*
110 155
137 164
102 173
153 201
124 254
123 180
103 151

In this table, Sehwag has scored 912 runs after reaching 100, while Hayden has mustered only 219. In fact, Hayden has converted only one of his last 15 Test centuries into a 150, whereas Sehwag has clocked up nine conversions in a row (a world record; not even Bradman managed this).
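For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the ten scores per batsman from the table above. The helper names `runs_beyond` and `conversions` are my own, purely for illustration.

```python
# Last ten Test centuries for each batsman, as listed in the table above.
hayden = [138, 111, 118, 110, 137, 102, 153, 124, 123, 103]
sehwag = [130, 195, 309, 155, 164, 173, 201, 254, 180, 151]

def runs_beyond(scores, threshold=100):
    """Total runs scored after reaching the threshold."""
    return sum(s - threshold for s in scores if s >= threshold)

def conversions(scores, target=150):
    """Number of scores that reached the target."""
    return sum(1 for s in scores if s >= target)

print(runs_beyond(hayden), conversions(hayden))  # 219 1
print(runs_beyond(sehwag), conversions(sehwag))  # 912 9
```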

The contrast might be more understandable if Sehwag were by far the superior batsman, but of course this is not the case. Hayden scored his last ten centuries in the space of just 45 innings, whereas Sehwag needed 68 innings; Hayden averaged 60.0 in that time to Sehwag’s 54.2. Sehwag even spent some time on the Indian reserves bench during that period.

A deeper understanding of this might require an excursion into psychology; it’s better for the moment to leave it simply as an intriguing difference between two great players.

A wider examination of such differences is quite straightforward; just calculate the “century average” of all players. One way is to take a simple average of all Test centuries (ignoring the effect of not-outs); the leaderboard looks like this:

Average size of all scores over 100 (at least 10 Test hundreds)
Batsman 100s Average
Don Bradman 29 186.0
Kumar Sangakkara 16 180.9
Zaheer Abbas 12 179.8
Virender Sehwag 13 174.6
Brian Lara 34 173.2
Dennis Amiss 11 170.8
Sanath Jayasuriya 14 168.3
Wally Hammond 22 167.5
Bob Simpson 10 164.6
Marvan Atapattu 16 161.5
Herschelle Gibbs 14 159.0
Graeme Smith 13 158.6
Mahela Jayawardene 21 157.6

Now any measure of scoring that puts Don Bradman on top is all right by me, but there are better ways of doing this. Bradman, after all, made some very big scores in “timeless” Tests; under modern conditions those innings would be curtailed, which would bring down the average size. An alternative is to take a standard batting average of the centuries, accounting for not-outs.

Some care is required. For a proper comparison of the ability to progress beyond 100, the first 100 runs of each century must be set aside, otherwise anomalies occur. (For example, a batsman scoring 100 not out, 100, and 100 not out would end up with a century average of 300 even though he has never scored a single run past 100.) By ignoring the first 100 runs in each century, a score of exactly 100 becomes equivalent to a duck in a normal batting average, while a score of 100 not-out will have no effect on the average, equivalent to a score of 0 not-out. This is fair enough, since a score of 100 not-out tells us nothing about a player’s ability to score after reaching 100.
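The adjustment described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the tables; the function names and the `(score, not_out)` data layout are my own assumptions.

```python
def naive_average(innings, threshold=100):
    """Batting average over qualifying scores, counting ALL runs.

    innings: list of (score, not_out) pairs (a layout assumed for this sketch).
    """
    qualifying = [(s, no) for s, no in innings if s >= threshold]
    dismissals = sum(1 for _, no in qualifying if not no)
    if dismissals == 0:
        return None  # average is undefined without a dismissal
    return sum(s for s, _ in qualifying) / dismissals

def excess_average(innings, threshold=100):
    """Batting average of the runs scored BEYOND the threshold only."""
    qualifying = [(s, no) for s, no in innings if s >= threshold]
    dismissals = sum(1 for _, no in qualifying if not no)
    if dismissals == 0:
        return None
    return sum(s - threshold for s, _ in qualifying) / dismissals

# The anomaly from the text: 100 not out, 100, 100 not out.
example = [(100, True), (100, False), (100, True)]
print(naive_average(example))   # 300.0
print(excess_average(example))  # 0.0
```

With the first 100 runs set aside, a dismissal for exactly 100 counts like a duck and 100 not out contributes nothing, as described. Passing `threshold=50` gives the equivalent calculation for half-centuries.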

It is interesting that, when you calculate such averages, many batsmen come up with a century average similar to, or just a little higher than, their ordinary batting average (for example, Jacques Kallis 57.4, Greg Chappell 56.1, Allan Border 55.0, Sunil Gavaskar 51.9, Adam Gilchrist 49.6, Marcus Trescothick 45.1; this applies even to Bradman, 108.0). However, there are notable exceptions, and Sehwag is among them.

Highest century averages (batting average of runs beyond the hundred)
Batsman 100s Average
Kumar Sangakkara 16 129.4
Don Bradman 29 108.4
Andy Flower 12 100.0
Wally Hammond 22 99.0
Dennis Amiss 11 97.4
Zaheer Abbas 12 95.8
Javed Miandad 23 85.6
Dean Jones 11 82.7
Marvan Atapattu 16 82.0
Brian Lara 34 77.8
Garry Sobers 26 77.5
Virender Sehwag 13 74.6
Sachin Tendulkar 39 71.7
Len Hutton 19 71.1

When it comes to converting hundreds into giant scores, Kumar Sangakkara is a phenomenon. In his last thirteen Test centuries, he has been dismissed below 150 only once, while scoring six double-centuries plus that umpire-truncated 192 against Australia. It is also quite curious that, in addition to Sangakkara and Marvan Atapattu, the Sri Lankans Sanath Jayasuriya (68.3) and Mahela Jayawardene (67.2) are also in the all-time top 20.

Lowest century averages (batting average of runs beyond the hundred)
Batsman 100s Average
Allan Lamb 14 22.0
Mohinder Amarnath 11 25.3
Mark Waugh 20 25.8
Mushtaq Mohammad 10 25.8
Andrew Strauss 10 26.5
Alvin Kallicharran 12 26.7
Damien Martyn 13 26.7
John Wright 12 28.0
Nasser Hussain 14 28.7
Colin Cowdrey 22 28.9
Michael Atherton 16 28.9

At the other end of the scale, while it is not surprising to see Mark Waugh (highest score 153) near the extreme, it is intriguing to compare his century average with that of his brother Steve, who averaged 67.2. Honourable mention should go to Graeme Wood, who, with only nine centuries, did not qualify for the list, but whose century average was only 17.4. Wood was out for exactly 100 in three of his nine tons.

And what of Matt Hayden? His century average is 39.0, quite low, but it would be much lower still without his 380 against Zimbabwe. In fact, imagine if Hayden’s 380 had never happened, and we were to try to predict the major Australian batsmen most likely to ever make such a score. Hayden would have to be just about the least likely, with the exception of Mark Waugh.

Finally, here is a similar list for half-century averages, showing the batsmen most likely to go on to big scores after reaching 50.

Highest half-century averages (batting average of runs beyond the 50)
Batsman 50+ scores Average
Don Bradman 42 123.4
Dennis Amiss 22 86.1
Wally Hammond 46 85.3
Jimmy Adams 20 82.6
Virender Sehwag 26 77.0
Kumar Sangakkara 40 75.5
Marvan Atapattu 33 74.4
Garry Sobers 56 72.7
Dean Jones 25 69.8
Steve Waugh 82 69.2
Zaheer Abbas 32 68.6
Sachin Tendulkar 88 67.1
Brian Lara 82 65.8
Ricky Ponting 73 65.5


My previous post on the fastest and slowest innings attracted some lively comments. Some thought that the calculation was too complex, others thought that it needed more sophistication. The numbers of these comments seemed about equal, so perhaps I was doing something right.

Some pointed out that, because the distributions are skewed, comparing scores of different sizes could be unreliable. This is valid, up to a point. One could probably normalise the distributions, perhaps by taking the logarithms of the balls faced. This is a nuance that must await some future day; this is a cricket blog, not a statistics journal. My gut feel is that the results would not be significantly different if a fancier approach were taken.
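For the curious, the log idea might look something like the Python sketch below. It is not the method used in the previous post: the function name `log_z_score` is my own, the comparison figures are invented purely for illustration, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev

def log_z_score(balls_faced, comparison):
    """z-score of one innings, computed on log(balls faced).

    comparison: balls faced in innings of similar size (needs at least two).
    Taking logs pulls in the long right tail of a skewed distribution.
    """
    logs = [math.log(b) for b in comparison]
    return (math.log(balls_faced) - mean(logs)) / stdev(logs)

# Invented balls-faced figures for a group of similar-sized innings.
hypothetical_group = [150, 180, 200, 220, 260, 400]
print(round(log_z_score(400, hypothetical_group), 2))
```

On the log scale, the "typical" innings is the geometric mean of the comparison group rather than the arithmetic mean, so a single marathon innings drags the baseline around less.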

Someone asked about Hanif Mohammad’s epic 337 against the West Indies. This is tricky, firstly because we don’t know the balls faced, and secondly because there are so few innings of similar size to compare it with. However, the z-score can be estimated at 6.42.

If you like, check out a detailed analysis of this innings on my blog. Scroll down to 23 June 2007.

Inevitably, there are those who come onto blogs like this to cleverly inform us that “statistics don’t tell the whole story” (or words to that effect). I have been following cricket stats for 40 years or so, and I have never heard anyone, statistician or otherwise, claim that stats DID tell the whole story. Just enjoy stats for what they are, an important dimension of the game.