First let me explain the reasons for undertaking this whole exercise of extended batting averages:
Let me start by replacing the first para of my article with the following, just to put to bed the Tendulkar v Lara arguments. Consider the following two outstanding batsmen, among the best of their generation.
Batsman  Tests  Innings  Notouts  Runs  Average 

Viv Richards  105  182  12  8540  50.24 
Jacques Kallis  111  189  31  9197  58.21 
Richards’ average is nearly eight behind Kallis', but is he that far behind? One of the main reasons for the difference in average is the wide disparity in not-outs between the two: 12 against 31. This might be partly because of the way Richards played, almost always in an attacking mode. The two have similar Batting Position Index values (the index being the average position at which a batsman has batted) of 4.16 (Richards) and 3.77 (Kallis), indicating that they batted in almost the same positions. This analysis seeks a way to normalise such situations.
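The arithmetic behind the two averages can be checked with a short Python sketch. It uses only the figures from the table above; the function names are illustrative and are not part of the original analysis.

```python
def batting_average(runs, innings, not_outs):
    """Conventional batting average: runs divided by dismissals."""
    dismissals = innings - not_outs
    return runs / dismissals


def not_out_share(innings, not_outs):
    """Fraction of innings in which the batsman remained not out."""
    return not_outs / innings


# Career figures copied from the table above: (runs, innings, not-outs).
careers = {
    "Viv Richards": (8540, 182, 12),
    "Jacques Kallis": (9197, 189, 31),
}

for name, (runs, innings, not_outs) in careers.items():
    avg = batting_average(runs, innings, not_outs)
    share = not_out_share(innings, not_outs)
    print(f"{name}: average {avg:.2f}, not-out share {share:.1%}")
```

On these figures Richards remained not out in roughly 7% of his innings, against roughly 16% for Kallis, which is the disparity the paragraph above refers to.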
Now to respond to some of the comments that came in:
The 1500 runs cutoff wasn’t meant to exclude Vinod Kambli, as someone suggested (Kambli is incidentally one of my favourite players). It was determined that the overall runs per Test for a top-order batsman was around 75. The 1500 runs cutoff meant that one would have played about 20 Tests, which is a fair number of games. It also allowed me to include Hussey, which ensured further discussion on this phenomenal cricketer. Selecting the top 25 batsmen was again done to allow me to include Lara and Pietersen, who were two of the 5 batsmen whose EBA was greater than their batting average.
The average of last ten innings could be construed as an arbitrary decision. Come to think of it, if I had taken five innings, it would have seemed too few, while 20 might have seemed too many. Ten innings represents about seven Tests, which in turn is a minimum of two Test series.
Chris made a valid point about the order of the first table, stating that it should have been ordered by batting average rather than by EBA. I apologise for overlooking the significance. Unfortunately I had split the EBA-ordered wide table into two smaller ones and should have reordered them.
A number of people have commented that this exercise was not needed since the final EBA table is more or less the same as the batting average table. My argument is that the result does not invalidate the analysis process.
The question of not-outs
The extension of not-out innings has attracted the most comments, and rightly so. The approach I have taken can be construed as arbitrary. However, it must be remembered that what has been done is neither a statistical extension nor a simulation-based computation. It is a fourth-dimension prediction and should be taken as it is. I can only repeat that the EBA should be taken to complement the current and much better understood batting average. The EBA can never be a substitute for the batting average, since the common man can neither compute it on his own nor understand it easily.
When the concept was first created, the batting average was added to the not-out innings. It was only when I reworked the concept for this blog that I changed it slightly to include current form.
Some of the responses to the not-out issue have been interesting. Stuart says:
A batting average measures the number of runs between dismissals. If you get 20* and 27, that is equivalent to a single innings of 47 for your batting average. It also means you cobbled together 47 runs before you got out, whether it was over two innings or one. As it stands, interpreted correctly, a batting average is a perfect measure and needs no adjustments or fiddling.
That’s a fine analysis, and we could take this as an additional measure.
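Stuart's reading can be illustrated with a small Python sketch that joins each not-out score to the innings that follow it until a dismissal occurs. The function name and the (runs, dismissed) input layout are assumptions made for this illustration only.

```python
def runs_between_dismissals(innings):
    """Group scores into stints that each end with a dismissal.

    `innings` is a list of (runs, dismissed) pairs in playing order.
    Returns the completed stints and any runs left over from trailing
    not-out innings that have not yet ended in a dismissal.
    """
    stints = []
    carried = 0
    for runs, dismissed in innings:
        carried += runs
        if dismissed:
            stints.append(carried)
            carried = 0
    return stints, carried


# Stuart's example: 20 not out followed by 27 (out).
stints, carried = runs_between_dismissals([(20, False), (27, True)])
print(stints, carried)

# Batting average = all runs scored / number of dismissals.
average = (sum(stints) + carried) / len(stints)
print(average)
```

The two innings collapse into a single stint of 47, so the batting average of 47 is simply the mean number of runs scored between dismissals.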
One of the best alternatives, and quite simple to implement also, was provided by Arvind Agarwal. It is given below.
EBA = Batting Average x (1 - (Not Out Inngs / Total Inngs) ^ 2). The computed values are:
Lara = 52.80 (0.998 x Average)
Sachin = 53.82 (0.980 x Average)
Bradman = 97.93 (0.980 x Average)
Ponting = 58.08 (0.977 x Average)
M Hussey = 82.04 (0.945 x Average)
My gut feel is that Arvind's computations match mine almost completely, without getting into any of the not-out extension complications, and are very easy to compute. Again, this has to be taken as an additional measure rather than a replacement for the batting average.
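Arvind's adjustment is easy to sketch in Python. The version below assumes the formula reads Batting Average x (1 - (Not Out Inngs / Total Inngs) ^ 2), that is, the not-out ratio is squared and then subtracted from one. The function name is hypothetical, and the inputs are the Richards and Kallis career figures quoted earlier in this article rather than those of the batsmen listed above.

```python
def arvind_eba(runs, innings, not_outs):
    """Batting average scaled down by the squared not-out ratio.

    Assumed reading of the formula:
    EBA = average * (1 - (not_outs / innings) ** 2)
    Returns the adjusted average and the scaling factor.
    """
    average = runs / (innings - not_outs)
    factor = 1 - (not_outs / innings) ** 2
    return average * factor, factor


# (name, runs, innings, not-outs) taken from the Richards/Kallis table.
for name, runs, innings, not_outs in [
    ("Viv Richards", 8540, 182, 12),
    ("Jacques Kallis", 9197, 189, 31),
]:
    eba, factor = arvind_eba(runs, innings, not_outs)
    print(f"{name}: EBA {eba:.2f} ({factor:.3f} x Average)")
```

Under this reading, a batsman with few not-outs, such as Richards, keeps almost all of his average, while one with many, such as Kallis, gives up a little under 3%.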
There have been suggestions to take into account the match conditions, bowling attack etc., but that would be too complicated an exercise for this simple task. Similarly, the idea of using weighted averages instead of the average of the last ten innings is a good one, but it makes the process more difficult and the results harder to comprehend for non-statistically oriented people.
Glossus has suggested considering only those innings in which the batsman was dismissed, and ignoring the not-out innings. The table below has the results for this exercise.
Batsman  Tests  Career average  Out batting average  Extended batting average 

Don Bradman  52  99.94  83.83  97.81 
Michael Hussey  18  86.18  69.05  81.34 
George Headley  22  60.83  45.61  61.33 
Herbert Sutcliffe  54  60.73  54.64  60.54 
Graeme Pollock  23  60.97  54.43  59.68 
Everton Weekes  48  58.62  54.88  58.53 
Ricky Ponting  112  59.40  49.46  58.52 
Wally Hammond  85  58.46  46.19  58.43 
Garry Sobers  93  57.78  44.06  58.16 
Ken Barrington  82  58.67  50.37  58.11 
Eddie Paynter  20  59.23  48.31  57.71 
Jack Hobbs  61  56.95  53.34  56.52 
Jacques Kallis  111  58.21  42.42  56.43 
Len Hutton  79  56.67  47.89  56.41 
Kumar Sangakkara  68  55.74  46.16  56.26 
Clyde Walcott  44  56.69  51.03  56.14 
Rahul Dravid  113  56.26  47.60  55.54 
Mohammad Yousuf  77  55.72  48.84  55.28 
Sachin Tendulkar  141  54.94  44.33  53.90 
Dudley Nourse  34  53.82  47.49  53.40 
Brian Lara  131  52.89  49.76  52.97 
Kevin Pietersen  30  52.69  50.44  52.84 
Greg Chappell  87  53.86  44.57  52.79 
Matthew Hayden  91  52.57  49.19  52.50 
Javed Miandad  124  52.57  41.97  51.62 
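Glossus's measure is straightforward to sketch in Python: keep only the innings that ended in a dismissal and average those. The function names and the sample scores are invented purely for illustration; they are not taken from any batsman's record.

```python
def out_batting_average(innings):
    """Average runs over dismissed innings only; not-outs are ignored.

    `innings` is a list of (runs, dismissed) pairs.
    """
    out_scores = [runs for runs, dismissed in innings if dismissed]
    if not out_scores:
        raise ValueError("no dismissed innings to average")
    return sum(out_scores) / len(out_scores)


def career_average(innings):
    """Conventional average: all runs divided by dismissals."""
    dismissals = sum(1 for _, dismissed in innings if dismissed)
    return sum(runs for runs, _ in innings) / dismissals


# Made-up scores: three dismissals and one not-out innings.
sample = [(40, True), (60, False), (10, True), (70, True)]
print(out_batting_average(sample))  # (40 + 10 + 70) / 3
print(career_average(sample))       # (40 + 60 + 10 + 70) / 3
```

Because the runs from not-out innings are dropped while the number of dismissals stays the same, the out batting average can never exceed the career average, which matches the pattern in the table.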
Charles Davis, in his blog, has commented on this computation. Some of the answers to Charles can be found elsewhere in this article. Our first basis was the career average, which would probably have been more apt. However, I must point out to Charles that the "not exceeding the highest score" idea was only introduced to prevent extremely high scores, especially when batsmen (like Sangakkara/Yousuf/Kallis) are going through an outstanding run of form. That restriction may not be needed if the career average is used. I must also point out that the standard deviation differential between the career average and the last 10 innings, according to Charles himself, is less than 10%. Charles, many thanks for your comments.
Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems.
© ESPN Sports Media Ltd.
Anantha Narayanan
Anantha spent the first half of his four-decade working career with corporates like IBM, Shaw Wallace, NCR, Sime Darby and the Spinneys group in IT-related positions. In the second half, he has worked on cricket simulation, ratings, data mining, analysis and writing, amongst other things. He was the creator of the Wisden 100 lists, released in 2001. He has written for ESPNcricinfo and CastrolCricket, and worked extensively with Maruti Motors, Idea Cellular and Castrol on their performance ratings-related systems. He is an armchair connoisseur of most sports. His other passion is tennis, and he thinks Roger Federer is the greatest sportsman to have walked on earth.
I don't think the EBA can be more than the normal batting average, as shown for Sir Sobers and the other following batsmen:
Garry Sobers  93  57.78  44.06  58.16
George Headley  22  60.83  45.61  61.33
Brian Lara  131  52.89  49.76  52.97
Kevin Pietersen  30  52.69  50.44  52.84
Kumar Sangakkara  68  55.74  46.16  56.26
Kindly rectify.
There is nothing in the calculation methodology to conclude that the EBA cannot be greater than the normal Average. In fact it can be seen that I have referred to these 5 batsmen in my article.
Ananth
Posted by Don on (January 14, 2008, 13:06 GMT) The problem with not-outs is that the batsman's mission is to score as many runs as possible without getting out. So, if a batsman scores 70 runs and isn't out by the end of the innings, it should be worth more than a batsman who scores 70 and is out. Normally the rationale for not counting the innings for averages purposes is "who knows how many more runs he would have scored". I would argue that a better system would value each run in a not-out score as 1.5 times that of a completed score (or any other multiplier). So, that 70 not out may become 105 for statistical purposes. This would give greater weight to the accomplishment of both batting tasks: scoring while protecting your wicket.
Posted by Joel on (December 19, 2007, 20:54 GMT) Hmm, good point Malcolm. I wonder if anyone out there would care to perform a statistical analysis of the top 25 players' MODE, to determine their most likely score on walking out to the middle? It may raise a few eyebrows, not to mention the ire of millions!!
Posted by Malcolm on (December 19, 2007, 10:59 GMT) The runs per innings is also deceptive. If you are a number 5 or 6 batsman coming in after four really good batsmen, you could realistically be called on to get small scores or have a large number of your innings cut off by declarations. You would then end up with a low average runs per innings. It would not be a true reflection of your talent, which is what the average is supposed to be. Obviously there are more sophisticated statistical techniques that could be used to analyse the performance of a player, but the average, strike rate and conversion rate that you get are an excellent indication of the quality of the player, remembering, of course, that the accuracy of any statistic increases as the number of observations increases.
Posted by Sriram Subramanian on (December 19, 2007, 9:19 GMT) Precisely my point, Malcolm. In terms of expectation from an innings when a batsman walks in, you should be expecting a modal value, which in all batsmen's cases, whether a Lara or a Harmison, is less than 20.
Batting averages are higher or lower depending on 3 factors: a) the shape of the distribution (while the U shape holds in general, for the better batsmen the % of cases in the 10s, 20s and 30s tends to be higher); b) the really high scores, and I find this is a huge influencer, as the really great batsmen tend to run up very high scores, which drives their average; and c) the proportion of not-out innings.
Of the three, only a) the shape of the distribution really influences the modal value. My point being that if you compare someone with an average of 40 with someone with a 55, your start expectations may be significantly different; but between 55 and 57, your start expectations should be no different.
Posted by Malcolm on (December 19, 2007, 7:05 GMT) An average is also the sum of observations multiplied by the probability of that observation occurring, i.e. the sum of all (x*P(x)). So while the probability of scoring on or around the average is probably very low, the average does take into consideration that you might, as the fielding team, spend a few days watching a Lara compile a 400-run innings. To determine the plausibility of the statistic, you should ask yourself: if you were fielding captain and Lara walked in, would you be expecting the mode (the value with the highest probability) or the average?
Posted by joel on (December 18, 2007, 21:52 GMT) The average should suggest to the spectator how many runs a particular player is likely to make before he leaves the field. After all, having made 25 not out twice is less likely to help his team win than a battling fifty, followed by a 0 not out. Both players would have an average of fifty, but everyone would know that once player A got into the twenties, he would be on shaky, unfamiliar ground. I think therefore that Runs Per Innings is fairer to spectators than the deceitful Average that is currently employed.
Posted by Sriram Subramanian on (December 18, 2007, 16:58 GMT) In many distributions the average, as a measure of central tendency, not only provides the 'average value', i.e. the area under the curve divided by the number of instances, but is also a good predictor of the most likely value.
Batting averages, though, mean nothing of the sort. Batsmen's scores invariably have an inverted bell (or U-shaped) distribution: the greatest # of outs is in single digits, followed by a large number before the batsman crosses twenty. Then you get a few instances around the actual 'batting average', and again the # of instances goes up towards the higher scores. This holds for every batsman, and Lara's and Dravid's distributions are not as dissimilar as you would expect. So the least probability is actually around the batting average. So if Ponting has an 'average' of 59, he is actually least likely to score between 50 and 60. So I am not really sure what the average indicates for a given innings. At best it can be used the way we use it: comparing greats across time...
Posted by Usman on (December 18, 2007, 13:02 GMT) Too complicated for me; I prefer runs/innings. A nice and simple average, and what really matters in the match.
Not whether he got out or not, but how many he made.
Whether a team gets 300/0 or 300 all out in an ODI does not really matter in the game; when trying to restrict the chase it's just 300 runs.