November 28, 2007

# The new, improved batting average

The batting average is a simple and fairly convenient way of putting a number to a player’s batting ability, but it doesn’t give the entire picture.

The batting average is a simple and convenient way of putting a number to a player’s ability with the bat, but often it doesn’t give the entire picture. One major problem with the conventional average – which is calculated by dividing the total number of runs scored by the number of completed innings – is the way it deals with not-outs. Consider the stats for two of the greatest batsmen in the modern era:

Brian Lara and Sachin Tendulkar in Tests

| Batsman | Tests | Innings | Not-outs | Runs | Average | Runs per Test |
| --- | --- | --- | --- | --- | --- | --- |
| Brian Lara | 131 | 232 | 6 | 11,953 | 52.89 | 91.2 |
| Sachin Tendulkar | 141 | 228 | 24 | 11,207 | 54.94 | 79.5 |

Lara has scored nearly 750 more runs in ten fewer Tests than Tendulkar, and his runs per Test is nearly 12 higher. Yet his average is about two runs lower, primarily because Tendulkar has far more not-outs (24 against Lara's 6). This may be partly because of the way Lara played, almost always in attacking mode. It may also be because Tendulkar bats slightly lower in the order: his Batting Position Index (the average position at which a batsman has batted) is 4.30 against Lara's 3.78, which probably gives him a slightly higher chance of remaining not out.
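The arithmetic behind the table can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and not part of the original analysis.

```python
def conventional_average(runs, innings, not_outs):
    """Runs divided by completed innings (innings in which the batsman was dismissed)."""
    completed = innings - not_outs
    if completed == 0:
        raise ValueError("average is undefined without a completed innings")
    return runs / completed


def runs_per_test(runs, tests):
    """Total runs divided by Tests played."""
    return runs / tests


# Figures taken from the table above.
lara_avg = conventional_average(11953, 232, 6)
tendulkar_avg = conventional_average(11207, 228, 24)
print(round(lara_avg, 2), round(tendulkar_avg, 2))
print(round(runs_per_test(11953, 131), 1), round(runs_per_test(11207, 141), 1))
```

Tendulkar's 24 not-outs shrink his divisor from 228 to 204, whereas Lara's divisor drops only from 232 to 226, which is why the averages and the runs-per-Test figures rank the two batsmen in opposite order.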

I’ve developed a new measure, which I’ve named the extended batting average, that offers a solution to the problem created by not-outs in the batting average. It is determined by allowing a batsman to complete his not-out innings in the fourth dimension, so to speak, and then dividing the new total of runs (current aggregate plus the additional runs deemed to have been scored) by the total number of innings played. This gives a fairer measure of a batsman's average.

The extension of an innings is done in a logical manner, taking into account the batsman's form at the time he played the not-out innings. During the first 10 innings of his career, when he has played too few innings for his form to be judged, a not-out innings is extended by his OBA (Out Bat Average, derived by dividing the total number of runs in completed innings by the number of completed innings).

Afterwards, recent form takes over. The not-out innings is extended by a rolling innings average of his last 10 played innings. In this case even the not-outs are included so that a big not-out innings, indicating very good current form, is not ignored. Of course, a batsman might remain not-out on 10 and this will lower his recent form computation. However, that is more acceptable than ignoring an unbeaten 200.

Two examples illustrate this concept. Kumar Sangakkara, currently in outstanding form, has scored 984 runs in his last 10 innings at an innings average of 98.4. If he remains not out with, say, 32 in his next innings, it is fair to assume that he would extend his innings by another 98 runs, to 130. A similar situation exists with Mohammad Yousuf and Kallis.

On the other hand, Sehwag is in the most wretched form of his career, having scored 189 runs in his last 10 innings at an innings average of 18.9. It is reasonable to expect that if he remained not out on 32, his innings would be extended by only 19 runs, to 51.
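The two examples reduce to a one-line calculation. The sketch below assumes the extension is the last-10-innings average rounded to the nearest whole run, which is what the figures of 98 and 19 suggest; the function name is illustrative.

```python
def extend_not_out(score, last_ten_runs):
    """Extend a not-out score by the batsman's rolling 10-innings average.

    Assumption: the extension is rounded to the nearest whole run.
    """
    form = last_ten_runs / 10           # innings average over the last 10 innings
    return score + int(form + 0.5)      # round half up, then add to the not-out score


print(extend_not_out(32, 984))  # Sangakkara: 32 not out, 984 runs in last 10 innings
print(extend_not_out(32, 189))  # Sehwag: 32 not out, 189 runs in last 10 innings
```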

This is applied to each and every innings played by all the batsmen. Care is taken to ensure that the adjusted innings total does not exceed the batsman’s highest score. In other words, Lara's 375 will not be allowed to go past 400. However, if the highest score by a batsman is itself a not-out innings, for example Lara's 400 not out and Tendulkar's unbeaten 248, that specific innings will be allowed to be extended. This, I think, is common sense.

Now the new total aggregate of runs is divided, this time with justification, by the total number of innings played.
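The whole procedure can be sketched in Python. This is an illustration under stated assumptions, not the program actually used for the tables in this article. The `Innings` class and the function names are hypothetical. The sketch also assumes that extensions are rounded to the nearest run, that a not-out with no earlier completed innings is left unextended, that form is computed from actual (not extended) scores, and that the cap is the batsman's career-best actual score.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Innings:
    runs: int
    not_out: bool = False


def extension(previous: List[Innings], window: int = 10) -> int:
    """Runs to add to a not-out innings, given the innings played before it."""
    if len(previous) < window:
        # Early career: use the Out Bat Average (completed innings only).
        completed = [i.runs for i in previous if not i.not_out]
        if not completed:
            return 0  # assumption: nothing to go on, so no extension
        average = sum(completed) / len(completed)
    else:
        # Afterwards: rolling average of the last `window` innings, not-outs included.
        average = sum(i.runs for i in previous[-window:]) / window
    return int(average + 0.5)  # assumption: round to the nearest run


def extended_batting_average(career: List[Innings]) -> float:
    """Total of actual and extended runs divided by all innings played."""
    if not career:
        raise ValueError("no innings played")
    career_high = max(i.runs for i in career)
    total = 0
    for n, innings in enumerate(career):
        score = innings.runs
        if innings.not_out:
            extended = score + extension(career[:n])
            # Cap at the highest score unless this innings is itself the highest.
            if score < career_high:
                extended = min(extended, career_high)
            score = extended
        total += score
    return total / len(career)


# Ten completed innings totalling 500 runs, then 32 not out: extended by 50 to 82.
career = [Innings(40)] * 9 + [Innings(140), Innings(32, not_out=True)]
print(extended_batting_average(career))  # (500 + 82) / 11
```

Because the form calculation looks only at innings played before the not-out, the result depends on the order of the innings, so each career has to be processed chronologically.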

Since this is a clear "what if", imagination-driven computation, practical factors such as the match getting over, the innings getting over, or a batsman running out of partners etc are ignored.

This is no mean task, and there is no way it can be done manually, since the "current form" computation has to be done for each and every innings played by a batsman.

The table for the top 25 batsmen (minimum 1500 Test runs), in order of extended batting average, is shown below. The figures are current up to the Delhi Test between India and Pakistan.

Top 25 batsmen in terms of averages

| Batsman | Tests | Innings | Not-outs | Runs | Average |
| --- | --- | --- | --- | --- | --- |
| Don Bradman | 52 | 80 | 10 | 6996 | 99.94 |
| Michael Hussey | 18 | 29 | 7 | 1896 | 86.16 |
| George Headley | 22 | 40 | 4 | 2190 | 60.83 |
| Herbert Sutcliffe | 54 | 84 | 9 | 4555 | 60.73 |
| Graeme Pollock | 23 | 41 | 4 | 2256 | 60.97 |
| Everton Weekes | 48 | 81 | 5 | 4455 | 58.62 |
| Ricky Ponting | 112 | 186 | 26 | 9504 | 59.40 |
| Wally Hammond | 85 | 140 | 16 | 7249 | 58.46 |
| Garry Sobers | 93 | 160 | 21 | 8032 | 57.78 |
| Ken Barrington | 82 | 131 | 15 | 6806 | 58.67 |
| Eddie Paynter | 20 | 31 | 5 | 1540 | 59.23 |
| Jack Hobbs | 61 | 102 | 7 | 5410 | 56.95 |
| Jacques Kallis | 111 | 189 | 31 | 9197 | 58.21 |
| Len Hutton | 79 | 138 | 15 | 6971 | 56.67 |
| Kumar Sangakkara | 68 | 112 | 9 | 5741 | 55.74 |
| Clyde Walcott | 44 | 74 | 7 | 3798 | 56.69 |
| Rahul Dravid | 113 | 193 | 23 | 9564 | 56.26 |
| Mohammad Yousuf | 77 | 130 | 10 | 6686 | 55.72 |
| Sachin Tendulkar | 141 | 228 | 24 | 11,207 | 54.94 |
| Dudley Nourse | 34 | 62 | 7 | 2960 | 53.82 |
| Brian Lara | 131 | 232 | 6 | 11,953 | 52.89 |
| Kevin Pietersen | 30 | 57 | 2 | 2898 | 52.69 |
| Greg Chappell | 87 | 151 | 19 | 7110 | 53.86 |
| Matthew Hayden | 91 | 162 | 13 | 7833 | 52.57 |
| Javed Miandad | 124 | 189 | 21 | 8832 | 52.57 |

Now let’s apply the adjustments for not-out innings, and then take another look at the averages.

Extended batting averages: top 25

| Batsman | ORuns | NRuns | ARuns | TRuns | EBA | % of ave | Last 10 inngs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Don Bradman | 5868 | 1128 | 829 | 7825 | 97.81 | 97.87 | 565 |
| Michael Hussey | 1519 | 377 | 463 | 2359 | 81.34 | 94.39 | 757 |
| George Headley | 1642 | 548 | 263 | 2453 | 61.33 | 100.81 | 389 |
| Herbert Sutcliffe | 4098 | 457 | 530 | 5085 | 60.54 | 99.67 | 406 |
| Graeme Pollock | 2014 | 242 | 191 | 2447 | 59.68 | 97.88 | 677 |
| Everton Weekes | 4171 | 284 | 286 | 4741 | 58.53 | 99.85 | 455 |
| Ricky Ponting | 7913 | 1591 | 1381 | 10,885 | 58.52 | 98.52 | 520 |
| Wally Hammond | 5728 | 1521 | 931 | 8180 | 58.43 | 99.95 | 256 |
| Garry Sobers | 6124 | 1908 | 1273 | 9305 | 58.16 | 100.64 | 406 |
| Ken Barrington | 5843 | 963 | 807 | 7613 | 58.11 | 99.05 | 315 |
| Eddie Paynter | 1256 | 284 | 249 | 1789 | 57.71 | 97.43 | 511 |
| Jack Hobbs | 5067 | 343 | 355 | 5765 | 56.52 | 99.25 | 353 |
| Jacques Kallis | 6703 | 2494 | 1468 | 10,665 | 56.43 | 96.94 | 937 |
| Len Hutton | 5890 | 1081 | 813 | 7784 | 56.41 | 99.53 | 270 |
| Kumar Sangakkara | 4754 | 987 | 560 | 6301 | 56.26 | 100.93 | 984 |
| Clyde Walcott | 3419 | 379 | 356 | 4154 | 56.14 | 99.03 | 493 |
| Rahul Dravid | 8092 | 1472 | 1156 | 10,720 | 55.54 | 98.73 | 329 |
| Mohammad Yousuf | 5861 | 825 | 500 | 7186 | 55.28 | 99.21 | 510 |
| Sachin Tendulkar | 9044 | 2163 | 1082 | 12,289 | 53.90 | 98.11 | 438 |
| Dudley Nourse | 2612 | 348 | 351 | 3311 | 53.40 | 99.23 | 393 |
| Brian Lara | 11,245 | 708 | 337 | 12,290 | 52.97 | 100.16 | 634 |
| Kevin Pietersen | 2774 | 124 | 114 | 3012 | 52.84 | 100.29 | 450 |
| Greg Chappell | 5883 | 1227 | 862 | 7972 | 52.79 | 98.02 | 478 |
| Matthew Hayden | 7329 | 504 | 672 | 8505 | 52.50 | 99.87 | 448 |
| Javed Miandad | 7051 | 1781 | 925 | 9757 | 51.62 | 98.20 | 263 |

"ORuns" are the runs scored in the innings in which the batsman was dismissed. "NRuns" are the runs scored in the not-out innings. "ARuns" are the runs added to the not-out innings by extending them. "TRuns" are the new total runs, obtained by adding the runs in the previous three columns. "EBA" is the extended batting average, computed by dividing TRuns by the total number of innings played. "% of ave" is the EBA expressed as a percentage of the conventional average, and "Last 10 inngs" is the number of runs scored in the batsman's last 10 innings.

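The derived columns can be reproduced from the raw ones. The sketch below is an illustration with a hypothetical function name; it assumes "% of ave" is the unrounded EBA divided by the unrounded conventional average, which matches the rows checked here.

```python
def eba_columns(o_runs, n_runs, a_runs, innings, not_outs):
    """Return (TRuns, EBA, % of ave) for one row of the table."""
    actual_runs = o_runs + n_runs                  # runs actually scored
    t_runs = actual_runs + a_runs                  # plus runs added by extension
    eba = t_runs / innings                         # divide by all innings played
    average = actual_runs / (innings - not_outs)   # conventional average
    return t_runs, round(eba, 2), round(100 * eba / average, 2)


# Innings and not-out counts come from the earlier table of conventional averages.
print(eba_columns(5868, 1128, 829, 80, 10))    # Don Bradman
print(eba_columns(11245, 708, 337, 232, 6))    # Brian Lara
print(eba_columns(4754, 987, 560, 112, 9))     # Kumar Sangakkara
```

A batsman's percentage exceeds 100 when the runs added by extension are large relative to the number of not-outs they cover, that is, when ARuns divided by not-outs is higher than his conventional average.
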
## A few observations

In general, the EBA benefits batsmen with fewer not-outs. Only five batsmen in this group (Headley, Sobers, Sangakkara, Lara and Pietersen) have an extended batting average higher than their conventional average, and in each case the increase is under 1%. Sangakkara has gained the most, because of his recent form. The other batsmen have extended batting averages lower than their conventional averages, by up to nearly 6%. Hussey has lost the most (his EBA is 94.39% of his average), which is understandable since he has seven not-outs in the 29 innings he has played. Kallis has also lost ground, which is explained by the fact that he has remained not out a whopping 31 times. Note, however, Kallis' recent form: 937 runs in his last 10 innings.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems