Tackling not-outs, and answering reader queries
First let me explain the reasons for undertaking this whole exercise of extended batting averages:
Let me start by replacing the first para of my article with the following, just to put to bed the Tendulkar v Lara arguments. Consider the following two outstanding batsmen, among the best of their generation.
Richards’ average is nearly eight runs behind Kallis’, but is he really that far behind? One of the main reasons for the difference is the wide disparity in not-outs between the two: 12 against 31. This may be partly because of the way Richards played, almost always in attacking mode. The two have similar Batting Position Index values (the index is the average position at which a batsman has batted): 4.16 for Richards and 3.77 for Kallis. This analysis seeks a way to normalise such situations.
Now to respond to some of the comments that came in:
The 1500 runs cut-off wasn’t meant to exclude Vinod Kambli, as someone suggested (Kambli is incidentally one of my favourite players). The overall runs per Test for a top-order batsman was found to be around 75, so 1500 runs means that a batsman would have played about 20 Tests, which is a fair number of games. It also allowed me to include Hussey, which ensured further discussion on this phenomenal cricketer. The top 25 batsmen were selected so that I could include Lara and Pietersen, who were two of the five batsmen whose EBA was greater than their batting average.
The average of the last ten innings could be construed as an arbitrary choice. Come to think of it, if I had taken five innings, it would have seemed too few, while 20 might have seemed too many. Ten innings represents about seven Tests, which in turn is a minimum of two Test series.
Chris made a valid point about the order of the first table, stating that it should have been ordered by batting average rather than by EBA. I apologise for overlooking the significance. I had split the wide EBA-ordered table into two smaller ones and should have re-ordered the first of them.
A number of people have commented that this exercise was not needed since the final EBA table is more or less the same as the batting average table. My argument is that the result does not invalidate the analysis process.
The question of not-outs
The extension of not-out innings has attracted the most comments, and rightly so. The approach I have taken can be construed as arbitrary. However, it must be remembered that what has been done is neither a statistical extension nor a simulation-based computation. It is a fourth-dimension prediction and should be taken as it is. I can only repeat that the EBA should be taken to complement the current and much better understood batting average. The EBA can never be a substitute for the batting average, since the common man can neither compute it on his own nor understand it easily.
When the concept was first created, the batting average was added to the not-out innings. It was only when I reworked the concept for this blog that I changed it slightly to include current form.
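For readers who want to see the mechanics, here is a minimal Python sketch of the kind of not-out extension described above. It is an illustration under assumptions, not the exact computation behind the published tables. The function name, the `(runs, not_out)` representation, taking current form as the simple mean of runs in up to ten preceding innings, capping an extended innings at the career highest score, and dividing the extended total by all innings are all assumptions made for the example.

```python
def extended_batting_average(innings, window=10):
    """Sketch of an extended batting average (assumed method, see lead-in).

    innings: chronological list of (runs, not_out) tuples.
    window:  how many preceding innings define "current form" (assumed 10).
    """
    if not innings:
        raise ValueError("at least one innings is required")

    # Assumed cap: an extended innings may not exceed the career highest score.
    highest = max(runs for runs, _ in innings)

    total = 0.0
    for i, (runs, not_out) in enumerate(innings):
        if not_out:
            # Current form: mean of runs in up to `window` preceding innings.
            recent = [r for r, _ in innings[max(0, i - window):i]]
            form = sum(recent) / len(recent) if recent else 0.0
            runs = min(runs + form, highest)
        total += runs

    # Every innings is now treated as completed, so divide by all innings.
    return total / len(innings)


# Hypothetical career of four innings; the third is a not-out 20.
career = [(40, False), (60, False), (20, True), (100, False)]
print(extended_batting_average(career))
```

In this made-up career the not-out 20 is extended by the mean of the two earlier scores (50) to 70, so the extended total of 270 is spread over four innings.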
Some of the responses to the not-out issue have been interesting. Stuart says:
A batting average measures the number of runs between dismissals. If you get 20* and 27, that is equivalent to a single innings of 47 for your batting average. It also means you cobbled together 47 runs before you got out, whether it was over two innings or one. As it stands, interpreted correctly, a batting average is a perfect measure and needs no adjustments or fiddling.
That’s a fine analysis, and we could take this as an additional measure.
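Stuart's arithmetic is easy to check. The short Python sketch below uses a hypothetical helper, `runs_per_dismissal`, and a made-up `(runs, not_out)` representation to show that 20* followed by 27 gives the same conventional average as a single completed innings of 47.

```python
def runs_per_dismissal(innings):
    """Conventional batting average: total runs divided by dismissals.

    innings: list of (runs, not_out) tuples (illustrative representation).
    """
    total_runs = sum(runs for runs, _ in innings)
    dismissals = sum(1 for _, not_out in innings if not not_out)
    if dismissals == 0:
        raise ValueError("average is undefined without a dismissal")
    return total_runs / dismissals


two_innings = [(20, True), (27, False)]  # 20* followed by 27
one_innings = [(47, False)]              # a single completed innings of 47

print(runs_per_dismissal(two_innings), runs_per_dismissal(one_innings))
```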
One of the best alternatives, and one that is also quite simple to implement, was provided by Arvind Agarwal. It is given below.
EBA = Batting Average x (1 - (Not Out Inngs / Total Inngs) ^ 2). The computed values are:
Lara = 52.80 (0.998 x Average)
Sachin = 53.82 (0.980 x Average)
Bradman = 97.93 (0.980 x Average)
Ponting = 58.08 (0.977 x Average)
M Hussey = 82.04 (0.945 x Average)
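Arvind's formula takes only a line or two to implement. The Python sketch below is one reading of it: the function name `agarwal_eba` is made up for the example, and I am assuming that "Total Inngs" counts every innings batted, not-outs included. The figures in the usage line are hypothetical and are not taken from any batsman's record.

```python
def agarwal_eba(batting_average, not_outs, total_innings):
    """Batting average scaled by 1 - (not-outs / total innings) squared.

    Assumes total_innings includes the not-out innings.
    """
    if total_innings <= 0:
        raise ValueError("total_innings must be positive")
    if not 0 <= not_outs <= total_innings:
        raise ValueError("not_outs must lie between 0 and total_innings")
    not_out_ratio = not_outs / total_innings
    return batting_average * (1 - not_out_ratio ** 2)


# Hypothetical batsman: average 50.0, with 10 not-outs in 100 innings.
print(round(agarwal_eba(50.0, 10, 100), 2))
```

Because the not-out ratio is squared, a batsman with few not-outs is barely adjusted, while the reduction grows quickly as the ratio rises.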
My gut feel is that Arvind's computations match mine almost completely, without getting into any of the not-out extension complications, and they are very easy to compute. Again, this has to be taken as an additional measure rather than a replacement for the batting average.
There have been suggestions to take into account the match conditions, bowling attack and so on, but that would be too complicated an exercise for this simple task. Similarly, the idea of using weighted averages instead of the average of the last ten innings is a good one, but it makes the process more difficult and the results harder to comprehend for people who are not statistically oriented.
Glossus has suggested considering only those innings in which the batsman was dismissed, and ignoring the not-out innings. The table below has the results for this exercise.
| Batsman | Tests | Career average | Out batting average | Extended batting average |
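As a rough illustration of what the "out batting average" column measures, the Python sketch below averages only the innings in which the batsman was dismissed. The function name and the `(runs, not_out)` representation are assumptions made for the example, and the sample innings are invented.

```python
def out_batting_average(innings):
    """Average of the scores in dismissed innings only; not-outs are ignored.

    innings: list of (runs, not_out) tuples (illustrative representation).
    """
    out_scores = [runs for runs, not_out in innings if not not_out]
    if not out_scores:
        raise ValueError("no dismissed innings to average")
    return sum(out_scores) / len(out_scores)


# Invented sample: 20*, 27 and 53.
sample = [(20, True), (27, False), (53, False)]
print(out_batting_average(sample))
```

For this sample the conventional average would be 100 runs over two dismissals, that is 50, while the out-only figure drops the 20 not-out runs and gives 40.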
Charles Davis, in his blog, has commented on this computation. Some of the answers to Charles can be found elsewhere in this article. Our first basis was the career average, and it would probably have been more apt. I must point out to Charles that the "not exceeding the highest score" idea was only introduced to prevent extremely high extended scores, especially when batsmen (like Sangakkara, Yousuf and Kallis) are going through an outstanding run of form. That restriction may not be needed if the career average is used. I should also note that the standard deviation differential between the career average and the last 10 innings, according to Charles himself, is less than 10%. Charles, many thanks for your comments.
Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems