Tackling not-outs, and answering reader queries

Readers have written in with replies and suggestions on batting averages, and the extended version

25-Feb-2013

First let me explain the reasons for undertaking this whole exercise of extended batting averages:

The purpose was not to replace the conventional batting average. It was a suggestion to complement the batting average.

It was not a Tendulkar v Lara article. Their figures were just used for comparison.

Let me start by replacing the first para of my article with the following, just to put to bed the Tendulkar v Lara arguments. Consider the following two outstanding batsmen, among the best of their generation.

Richards and Kallis in Tests
Batsman	Tests	Innings	Not-outs	Runs	Average
Viv Richards	105	182	12	8540	50.24
Jacques Kallis	111	189	31	9197	58.21

Richards' average is nearly eight runs behind Kallis', but is he really that far behind? One of the main reasons for the difference is the wide disparity in not-outs between the two: 12 against 31. That may be partly because of the way Richards played, almost always in attacking mode. The two have similar Batting Position Index values (the BPI is the average position at which a batsman has batted): 4.16 for Richards and 3.77 for Kallis. This analysis seeks a way to normalise such situations.
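The averages in the table follow the conventional definition: runs divided by dismissals, where dismissals are innings minus not-outs. A minimal Python sketch shows how strongly the not-out count moves the figure. The function name `batting_average` and the runs-per-innings comparison are illustrative additions, not part of the EBA computation itself.

```python
def batting_average(runs: int, innings: int, not_outs: int) -> float:
    """Conventional batting average: runs per dismissal."""
    dismissals = innings - not_outs
    if dismissals <= 0:
        raise ValueError("average is undefined without a dismissal")
    return runs / dismissals


# Career figures taken from the table above.
richards = batting_average(runs=8540, innings=182, not_outs=12)
kallis = batting_average(runs=9197, innings=189, not_outs=31)

# Runs per innings, ignoring not-outs altogether, for comparison.
richards_rpi = 8540 / 182
kallis_rpi = 9197 / 189

print(f"Richards: average {richards:.2f}, runs per innings {richards_rpi:.2f}")
print(f"Kallis:   average {kallis:.2f}, runs per innings {kallis_rpi:.2f}")
```

Measured per innings rather than per dismissal, the gap between the two shrinks to under two runs, which is the kind of disparity this exercise tries to normalise.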

Now to respond to some of the comments that came in:

The 1500-run cut-off wasn’t meant to exclude Vinod Kambli, as someone suggested (Kambli is incidentally one of my favourite players). The overall runs per Test for a top-order batsman is around 75, so 1500 runs means that a batsman would have played about 20 Tests, which is a fair number of games. It also allowed me to include Hussey, which ensured further discussion on this phenomenal cricketer. The top 25 batsmen were selected so that I could include Lara and Pietersen, who were two of the five batsmen whose EBA was greater than their batting average.

Taking the average of the last ten innings could be construed as an arbitrary decision. Come to think of it, if I had taken five innings, it would have seemed too few, while 20 might have seemed too many. Ten innings represents about seven Tests, which in turn is a minimum of two Test series.

Chris made a valid point about the first table, stating that it should have been ordered by batting average rather than by EBA. I apologise for overlooking the significance of this. I had split the wide EBA-ordered table into two smaller ones and should have re-ordered the first of them.

A number of people have commented that this exercise was not needed since the final EBA table is more or less the same as the batting average table. My argument is that the result does not invalidate the analysis process.

The question of not-outs

The extension of not-out innings has attracted the most comments, and rightly so. The approach I have taken can be construed as arbitrary. However, it must be remembered that what has been done is neither a statistical extension nor a simulation-based computation. It is a fourth-dimension prediction and should be taken as it is. I can only repeat that the EBA should be taken to complement the current and much better understood batting average. The EBA can never be a substitute for the batting average since the common man can neither compute it on his own nor understand it easily.

When the concept was first created, the batting average was added to the not-out innings. It was only when I reworked the concept for this blog that I changed it slightly to include current form.

Some of the responses to the not-out issue have been interesting. Stuart says:

A batting average measures the number of runs between dismissals. If you get 20* and 27, that is equivalent to a single innings of 47 for your batting average. It also means you cobbled together 47 runs before you got out, whether it was over two innings or one. As it stands, interpreted correctly, a batting average is a perfect measure and needs no adjustments or fiddling.

That’s a fine analysis, and we could take this as an additional measure.
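Stuart's reading can be sketched in a few lines of Python: consecutive not-out scores are carried forward until the next dismissal, and the batting average is the mean of those stints. The `(runs, not_out)` pair layout and the function name `runs_between_dismissals` are assumptions made for this illustration.

```python
def runs_between_dismissals(innings):
    """Group scores into stints that each end with a dismissal.

    `innings` is a list of (runs, not_out) pairs in chronological order.
    Returns the completed stints plus any runs from trailing not-outs.
    """
    stints, carried = [], 0
    for runs, not_out in innings:
        carried += runs
        if not not_out:
            # A dismissal closes the current stint.
            stints.append(carried)
            carried = 0
    return stints, carried


# Stuart's example: 20* followed by 27 is one stint of 47.
example = [(20, True), (27, False)]
stints, carried = runs_between_dismissals(example)

# Total runs over dismissals, which is the conventional batting average.
average = (sum(stints) + carried) / len(stints)
print(stints, average)
```

Any runs left in `carried` come from unbeaten innings at the end of the sequence; they still count towards the total, exactly as they do in the conventional average.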

One of the best alternatives, and also quite simple to implement, was provided by Arvind Agarwal. It is given below.

EBA = Batting Average x (1 - (Not-out Innings / Total Innings) ^ 2)

The computed values are:

Lara = 52.80 (0.998 x Average)
Sachin = 53.82 (0.980 x Average)
Bradman = 97.93 (0.980 x Average)
Ponting = 58.08 (0.977 x Average)
M Hussey = 82.04 (0.945 x Average)

My gut feel is that Arvind's computations match mine almost completely, without getting into any of the not-out extension complications, and they are very easy to compute. Again, this has to be taken as an additional measure rather than a replacement for the batting average.
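Arvind's formula is simple enough to sketch directly. The Python below assumes the formula is read as the batting average multiplied by one minus the squared not-out ratio, with the ratio taken over all innings. The function name `arvind_eba` is made up for this illustration, and the inputs are the Richards and Kallis career figures quoted at the top of the article.

```python
def arvind_eba(runs: int, innings: int, not_outs: int) -> float:
    """Batting average scaled down by the squared share of not-out innings.

    Assumed reading of the formula:
    average x (1 - (not_outs / innings) ** 2)
    """
    dismissals = innings - not_outs
    if innings <= 0 or dismissals <= 0:
        raise ValueError("need at least one innings and one dismissal")
    average = runs / dismissals
    return average * (1 - (not_outs / innings) ** 2)


# Career figures from the Richards and Kallis table.
for name, runs, innings, not_outs in [
    ("Viv Richards", 8540, 182, 12),
    ("Jacques Kallis", 9197, 189, 31),
]:
    print(f"{name}: {arvind_eba(runs, innings, not_outs):.2f}")
```

Because the ratio is squared, a batsman with few not-outs is barely affected, while one with many not-outs sees a larger reduction.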

There have been suggestions to take into account the match conditions, the bowling attack and so on, but that would be too complicated an exercise for this simple task. Similarly, the idea of using weighted averages instead of the average of the last ten innings is a good one, but it makes the process more difficult and the results harder to comprehend for people who are not statistically oriented.

Glossus has suggested considering only those innings in which the batsman was dismissed, and ignoring the not-out innings. The table below has the results for this exercise.

Out batting average, and extended batting averages
Batsman	Tests	Career average	Out batting average	Extended batting average
Don Bradman	52	99.94	83.83	97.81
Michael Hussey	18	86.18	69.05	81.34
George Headley	22	60.83	45.61	61.33
Herbert Sutcliffe	54	60.73	54.64	60.54
Graeme Pollock	23	60.97	54.43	59.68
Everton Weekes	48	58.62	54.88	58.53
Ricky Ponting	112	59.40	49.46	58.52
Wally Hammond	85	58.46	46.19	58.43
Garry Sobers	93	57.78	44.06	58.16
Ken Barrington	82	58.67	50.37	58.11
Eddie Paynter	20	59.23	48.31	57.71
Jack Hobbs	61	56.95	53.34	56.52
Jacques Kallis	111	58.21	42.42	56.43
Len Hutton	79	56.67	47.89	56.41
Kumar Sangakkara	68	55.74	46.16	56.26
Clyde Walcott	44	56.69	51.03	56.14
Rahul Dravid	113	56.26	47.60	55.54
Mohammad Yousuf	77	55.72	48.84	55.28
Sachin Tendulkar	141	54.94	44.33	53.90
Dudley Nourse	34	53.82	47.49	53.40
Brian Lara	131	52.89	49.76	52.97
Kevin Pietersen	30	52.69	50.44	52.84
Greg Chappell	87	53.86	44.57	52.79
Matthew Hayden	91	52.57	49.19	52.50
Javed Miandad	124	52.57	41.97	51.62
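The out batting average in the table follows Glossus's suggestion: only innings that ended in a dismissal are counted. A minimal Python sketch of that idea follows. The sample scores are invented for illustration and do not belong to any batsman in the table, and the function names are assumptions.

```python
def out_batting_average(innings):
    """Average over dismissed innings only; not-out innings are ignored."""
    dismissed = [runs for runs, not_out in innings if not not_out]
    if not dismissed:
        raise ValueError("no dismissed innings to average")
    return sum(dismissed) / len(dismissed)


def career_average(innings):
    """Conventional average: all runs divided by the number of dismissals."""
    dismissals = sum(1 for _, not_out in innings if not not_out)
    if dismissals == 0:
        raise ValueError("average is undefined without a dismissal")
    return sum(runs for runs, _ in innings) / dismissals


# Hypothetical (runs, not_out) pairs, purely for illustration.
sample = [(20, True), (27, False), (100, False), (64, True), (3, False)]
print(f"career average: {career_average(sample):.2f}")
print(f"out average:    {out_batting_average(sample):.2f}")
```

Both measures divide by the same number of dismissals, but the out average leaves the not-out runs out of the numerator, so it can never exceed the career average.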

Charles Davis, in his blog, has commented on this computation. Some of the answers to Charles can be found elsewhere in this article. Our first basis was the career average, which would probably have been more apt. The "not exceeding the highest score" restriction was only introduced to prevent extremely high scores, especially when batsmen (like Sangakkara, Yousuf and Kallis) are going through an outstanding run of form. That restriction may not be needed if the career average is used. However, I must point out that the standard deviation differential between the career average and the last 10 innings, according to Charles himself, is less than 10%. Charles, many thanks for your comments.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems