Hanging in there after a hundred
It is well known that some batsmen are better than others when it comes to going on to very big scores after getting a start. The differences between individuals can be surprising; for an extreme recent example, look at two of today’s top opening batsmen, Matthew Hayden and Virender Sehwag. A comparison of the last 10 Test centuries for each batsman shows a remarkable contrast.
In this table, Sehwag has scored 912 runs after reaching 100, while Hayden has mustered only 219. In fact, Hayden has converted only one of his last 15 Test centuries into a 150, whereas Sehwag has clocked up nine conversions in a row (a world record; not even Bradman managed this).
The contrast might be more understandable if Sehwag were by far the superior batsman, but of course this is not the case. Hayden scored his last ten centuries in the space of just 45 innings, whereas Sehwag needed 68 innings; Hayden averaged 60.0 over that period to Sehwag’s 54.2. Sehwag even spent some time on the Indian reserves bench along the way.
A deeper understanding of this might require an excursion into psychology; it’s better for the moment to leave it simply as an intriguing difference between two great players.
A wider examination of such differences is quite straightforward; just calculate the “century average” of all players. One way is to take a simple average of all Test centuries (ignoring the effect of not-outs); the leaderboard looks like this:
Now any measure of scoring that puts Don Bradman on top is all right by me, but there are better ways of doing this. Bradman, after all, made some very big scores in “timeless” Tests that would be curtailed under modern conditions, and that would bring down the average size. An alternative is to take a standard batting average of the centuries, accounting for not-outs.
Some care is required. For a proper comparison of the ability to progress beyond 100, the first 100 runs of each century must be set aside, otherwise anomalies occur. (For example, a batsman scoring 100 not out, 100, and 100 not out would end up with a century average of 300 even though he has never scored a single run past 100.) By ignoring the first 100 runs in each century, a score of exactly 100 becomes equivalent to a duck in a normal batting average, while a score of 100 not-out will have no effect on the average, equivalent to a score of 0 not-out. This is fair enough, since a score of 100 not-out tells us nothing about a player’s ability to score after reaching 100.
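For readers who like to see the arithmetic, here is a minimal sketch of the adjustment described above. It is an illustration only, not the actual code behind the lists in this post; the function names and the (runs, not_out) layout are assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical layout for one innings: (runs scored, True if not out).
Innings = Tuple[int, bool]


def naive_century_average(innings: Iterable[Innings]) -> Optional[float]:
    """Standard batting average taken over centuries only (no adjustment)."""
    tons = [(runs, not_out) for runs, not_out in innings if runs >= 100]
    dismissals = sum(1 for _, not_out in tons if not not_out)
    if dismissals == 0:
        return None  # average is undefined without a dismissal
    return sum(runs for runs, _ in tons) / dismissals


def century_average(innings: Iterable[Innings]) -> Optional[float]:
    """Average of runs scored beyond 100, per dismissal, over centuries only."""
    tons = [(runs, not_out) for runs, not_out in innings if runs >= 100]
    dismissals = sum(1 for _, not_out in tons if not not_out)
    if dismissals == 0:
        return None  # e.g. a lone 100 not out tells us nothing
    # Set aside the first 100 runs of every century before averaging.
    return sum(runs - 100 for runs, _ in tons) / dismissals


# The anomaly from the text: 100 not out, 100, and 100 not out.
example = [(100, True), (100, False), (100, True)]
print(naive_century_average(example))  # 300.0
print(century_average(example))        # 0.0
```

With the first 100 runs set aside, the three-innings example drops from 300 to zero, which is the sensible answer for a batsman who has never scored a run past 100.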
It is interesting that, when you calculate such averages, many batsmen come up with a century average similar to, or just a little higher than, their ordinary batting average (for example, Jacques Kallis 57.4, Greg Chappell 56.1, Allan Border 55.0, Sunil Gavaskar 51.9, Adam Gilchrist 49.6, Marcus Trescothick 45.1; this applies even to Bradman, 108.0). However, there are notable exceptions, and Sehwag is among them.
When it comes to converting hundreds into giant scores, Kumar Sangakkara is a phenomenon. In his last thirteen Test centuries, he has been dismissed below 150 only once, while scoring six double-centuries plus that umpire-truncated 192 against Australia. It is also quite curious that, in addition to Sangakkara and Marvan Atapattu, the Sri Lankans Sanath Jayasuriya (68.3) and Mahela Jayawardene (67.2) are also in the all-time top 20.
At the other end of the scale, while it is not surprising to see Mark Waugh (highest score 153) near the extreme, it is intriguing to compare his century average with that of his brother, who averaged 67.2. Honourable mention should go to Graeme Wood, who, with only nine centuries, did not qualify for the list, but whose century average was only 17.4. Wood was out for exactly 100 in three of his nine tons.
And what of Matt Hayden? His century average is 39.0, quite low, but it would be much lower still without his 380 against Zimbabwe. In fact, imagine if Hayden’s 380 had never happened, and we were to try to predict the major Australian batsmen most likely to ever make such a score. Hayden would have to be just about the least likely, with the exception of Mark Waugh.
Finally, here is a similar list for half-century averages, the batsmen most likely to go on to big scores after reaching 50.
My previous post on the fastest and slowest innings attracted some lively comments. Some thought that the calculation was too complex, others thought that it needed more sophistication. The numbers of these comments seemed about equal, so perhaps I was doing something right.
Some pointed out that, because the distributions are skewed, comparing scores of different sizes could be unreliable. This is valid, up to a point. One could probably normalise the distributions, perhaps by taking the logarithms of the balls faced. This is a nuance that must await some future day; this is a cricket blog, not a statistics journal. My gut feel is that the results would not be significantly different if a fancier approach was taken.
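For the curious, the log idea might look something like the sketch below. This is only a guess at one reasonable approach, not the calculation used in the earlier post: it assumes the z-score compares an innings' balls faced with a group of similar-sized innings, and it uses the sample standard deviation. The function name and the numbers in the usage lines are invented for illustration.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def log_z_score(balls_faced: float, comparison: Sequence[float]) -> float:
    """z-score of log(balls faced) against a comparison group of innings.

    Assumption: taking logs makes the skewed balls-faced distribution
    closer to symmetric before the z-score is computed.
    """
    if len(comparison) < 2:
        raise ValueError("need at least two comparison innings")
    logs = [math.log(b) for b in comparison]
    mu = mean(logs)
    sigma = stdev(logs)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("comparison innings all have the same balls faced")
    return (math.log(balls_faced) - mu) / sigma


# Made-up balls-faced figures, purely to show the call.
similar_innings = [100, 200, 400]
print(log_z_score(400, similar_innings))
```

A positive value here means a slower innings than the comparison group, a negative value a faster one.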
Someone asked about Hanif Mohammad’s epic 337 against the West Indies. This is tricky, firstly because we don’t know the balls faced, and secondly because there are so few innings of similar size to compare it with. However, the z-score can be estimated at 6.42.
If you like, check out a detailed analysis of this innings on my blog. Scroll down to 23 June 2007.
Inevitably, there are those who come onto blogs like this to cleverly inform us that “statistics don’t tell the whole story” (or words to that effect). I have been following cricket stats for 40 years or so, and I have never heard anyone, statistician or otherwise, claim that stats DID tell the whole story. Just enjoy stats for what they are, an important dimension of the game.