January 30, 2009

A consistency index for batsmen

A stats analysis to measure the consistency of Test batsmen

One thing we admire in our cricketers is consistency. Full marks to the gritty player who scores 50 on a minefield, even though he gets out for 50 when well set on a featherbed. But do we admire so much his team-mate who gets a duck in the first instance, but makes amends by crashing an impressive 100 in the second? They have the same average - but do they provide the same value?

Consistency can be measured by calculating the standard deviation, which, in simple terms, measures how far the scores typically lie from the overall mean. The lower the standard deviation, the lower the variation in the scores. We can obviously apply this to cricket scores, but a couple of issues need to be resolved: what to do with "not out" scores, and how to compare the consistency of players with different averages.

To resolve the first, I elected to add any uncompleted innings to the next innings, so that effectively, I was calculating the standard deviation of the runs made between dismissals. If the last innings was a "red ink", it was ignored.

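To allow comparison of consistency between different players, I simply divided the calculated standard deviation by the batting average (ignoring the last innings if it was "not out"). This ratio is the Consistency Index (CI) shown in the tables below: the lower the CI, the more consistent the batsman.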
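
The method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the tables: the function names and the (runs, dismissed) input format are hypothetical, and because the article does not say whether the population or the sample standard deviation was used, the population form is assumed here.

```python
from statistics import mean, pstdev


def runs_between_dismissals(innings):
    """Collapse a career into runs scored between dismissals.

    `innings` is a chronological list of (runs, dismissed) pairs.
    A not-out score is carried forward into the next innings; a trailing
    not-out score at the end of the career is dropped.
    """
    totals = []
    carried = 0
    for runs, dismissed in innings:
        carried += runs
        if dismissed:
            totals.append(carried)
            carried = 0
    # Anything still in `carried` is a final "red ink" innings and is ignored.
    return totals


def consistency_index(innings):
    """Standard deviation of runs between dismissals divided by their mean.

    Assumption: population standard deviation (pstdev).
    """
    totals = runs_between_dismissals(innings)
    if not totals:
        raise ValueError("no completed innings")
    average = mean(totals)
    if average == 0:
        raise ValueError("average is zero, index undefined")
    return pstdev(totals) / average


# The two batsmen from the opening paragraph, both averaging 50:
steady = [(50, True), (50, True)]
erratic = [(0, True), (100, True)]
print(consistency_index(steady))   # 0.0
print(consistency_index(erratic))  # 1.0
```

The two hypothetical batsmen share an average of 50, but the index separates them cleanly: no variation at all for the first, and a standard deviation equal to the average for the second.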

I performed this exercise three times for Test cricketers: for those who scored at least 1000 runs, for those who scored at least 5000 runs, and for those who scored at least 10000 runs.

The first table lists the most consistent Test batsmen who have scored at least 1000 runs. Australia's Bruce Laird, who scored with such consistency without scoring a century in his brief late-70s career, heads the list, and is followed by the admirable Sutcliffe, whose consistency is astounding given the extent of his career. Alastair Cook and MS Dhoni are notable current players in this list.

Table 1: Consistency Index: Most Consistent (Minimum 1000 runs)
Batsman Team CI SD Average Matches Innings Not Out Runs
Bruce Laird Australia 0.75 26.48 35.29 21 40 2 1341
Herbert Sutcliffe England 0.78 47.22 60.73 54 84 9 4555
Douglas Jardine England 0.79 37.08 46.70 22 33 6 1296
Ashley Giles England 0.80 16.81 20.90 54 81 13 1421
Alastair Cook England 0.81 34.24 42.09 36 66 2 2694
Maurice Tate England 0.82 20.96 25.49 39 52 5 1198
Rusi Surti India 0.83 23.72 28.70 26 48 4 1263
Jock Cameron South Africa 0.83 25.05 30.22 26 45 4 1239
George Gunn England 0.83 33.39 40.00 15 29 1 1120
Chandika Hathurusingha Sri Lanka 0.84 24.74 29.63 26 44 1 1274
Ian Redpath Australia 0.84 36.62 43.46 66 120 11 4737
Sid Barnes Australia 0.85 53.39 63.06 13 19 2 1072
Mark Richardson New Zealand 0.86 38.33 44.77 38 65 3 2776
Taufeeq Umar Pakistan 0.87 34.22 39.30 25 46 2 1729
Imran Farhat Pakistan 0.88 29.02 33.10 27 51 1 1655
Charles Kelleway Australia 0.88 32.83 37.42 26 42 4 1422
Dwayne Bravo West Indies 0.88 28.74 32.73 31 57 1 1833
Peter Richardson England 0.88 33.08 37.47 34 56 1 2061
Chetan Chauhan India 0.89 28.07 31.58 40 68 2 2084
Colin Bland South Africa 0.89 43.67 49.09 21 39 5 1669
Trevor Goddard South Africa 0.89 30.67 34.47 41 78 5 2516
Deryck Murray West Indies 0.89 20.40 22.91 62 96 9 1993
Mahendra Singh Dhoni India 0.89 32.20 36.14 35 56 6 1807
David Sheppard England 0.89 33.70 37.81 22 33 2 1172
Alan Davidson Australia 0.89 21.97 24.59 44 61 7 1328

At the other end, we also have some current players in the least consistent category, notably Sinclair, Taibu, and until recently, Atapattu, who mixed a dreadful sequence of low scores early in his career with some heavy scoring later on:

Table 2: Consistency Index: Least Consistent (Minimum 1000 runs)
Batsman Team CI SD Average Matches Innings Not out Runs
Matthew Sinclair New Zealand 1.62 52.70 32.55 32 54 5 1595
Vinoo Mankad India 1.51 47.57 31.48 44 72 5 2109
Jacques Rudolph South Africa 1.49 53.81 36.21 35 63 7 2028
Guy Whittall Zimbabwe 1.48 43.65 29.43 46 82 7 2207
Tatenda Taibu Zimbabwe 1.45 42.94 29.60 24 46 3 1273
Wasim Akram Pakistan 1.44 32.57 22.63 104 147 19 2898
Mohammad Ashraful Bangladesh 1.43 34.10 23.88 48 93 4 2125
Javagal Srinath India 1.43 20.31 14.21 67 92 21 1009
Wasim Jaffer India 1.42 48.30 34.11 31 58 1 1944
Vic Pollard New Zealand 1.41 34.35 24.35 32 59 7 1266
Dilip Sardesai India 1.40 55.10 39.24 30 55 4 2001
Sidath Wettimuny Sri Lanka 1.39 40.31 29.07 23 43 1 1221
Marvan Atapattu Sri Lanka 1.39 54.40 39.02 90 156 15 5502
Matthew Elliott Australia 1.38 46.20 33.49 21 36 1 1172
Madan Lal India 1.38 31.27 22.65 39 62 16 1042
Ridley Jacobs West Indies 1.37 38.70 28.32 65 112 21 2577
Tim Robinson England 1.36 49.34 36.39 29 49 5 1601
Bill Ponsford Australia 1.35 65.00 48.23 29 48 4 2122
John Bracewell New Zealand 1.35 27.56 20.43 41 60 11 1001
Jimmy Adams West Indies 1.35 55.72 41.26 54 90 17 3012

Now for the serious Test batsmen:

Table 3: Consistency Index: Most Consistent (Minimum 5000 runs)
Batsman Team CI SD Average Matches Innings Not Out Runs
Jack Hobbs England 0.92 52.33 56.95 61 102 7 5410
Don Bradman Australia 0.94 93.49 99.94 52 80 10 6996
Arjuna Ranatunga Sri Lanka 0.94 33.48 35.50 93 155 12 5105
John Wright New Zealand 0.97 36.58 37.83 82 148 7 5334
Mark Waugh Australia 0.97 40.58 41.82 128 209 17 8029
Graham Thorpe England 0.98 43.25 44.23 100 179 28 6744
Rohan Kanhai West Indies 0.98 46.58 47.53 79 137 6 6227
Clive Lloyd West Indies 0.99 46.44 46.68 110 175 14 7515
Denis Compton England 1.00 49.90 50.06 78 131 15 5807
Sourav Ganguly India 1.00 42.22 42.18 113 188 17 7212
Bill Lawry Australia 1.03 48.43 47.15 67 123 12 5234
Ken Barrington England 1.03 59.90 58.28 82 131 15 6806
Matthew Hayden Australia 1.04 52.77 50.74 103 184 14 8625
Ricky Ponting Australia 1.05 59.47 56.88 128 215 26 10750
Michael Slater Australia 1.05 45.09 42.84 74 131 7 5312
Doug Walters Australia 1.06 50.86 48.10 74 125 14 5357
Marcus Trescothick England 1.06 46.34 43.80 76 143 10 5825
Sunil Gavaskar India 1.06 54.42 51.12 125 214 16 10122
David Gower England 1.07 47.29 44.25 117 204 18 8231
Vivian Richards West Indies 1.07 53.69 50.24 121 182 12 8540
Michael Atherton England 1.07 40.41 37.70 115 212 7 7728
Len Hutton England 1.07 60.86 56.67 79 138 15 6971

The higher Consistency Indices show that it is much harder to maintain consistency over a longer career. It is interesting to observe that the two most consistent batsmen are two "old-timers", Hobbs and Bradman - class will out! And who would have thought that the most consistent Australian after Bradman in this category was Mark Waugh!

At the other end of the scale for this category, we find Waugh's twin brother prominently placed:

Table 4: Consistency Index: Least consistent (Min 5000 runs)
Player For CI SD Ave M I NO Runs
Marvan Atapattu SL 1.39 54.40 39.02 90 156 15 5502
Zaheer Abbas Pak 1.32 59.29 44.80 78 124 11 5062
Kumar Sangakkara SL 1.31 71.23 54.38 78 129 9 6525
Virender Sehwag Ind 1.27 64.81 51.06 66 114 4 5617
Steve Waugh Aus 1.26 64.16 51.06 168 260 46 10927
Shivnarine Chanderpaul WI 1.25 62.37 49.72 114 196 31 8203
Brian Lara WI 1.24 65.33 52.89 131 232 6 11953
Herschelle Gibbs SA 1.24 51.85 41.95 90 154 7 6167
Ian Botham Eng 1.24 41.69 33.55 102 161 6 5200
Sanath Jayasuriya SL 1.23 49.15 40.07 110 188 14 6973
VVS Laxman Ind 1.22 54.24 44.46 102 169 24 6446
Aravinda de Silva SL 1.21 52.21 42.98 93 159 11 6361
Mark Taylor Aus 1.19 51.55 43.50 104 186 13 7525
Wally Hammond Eng 1.19 69.46 58.46 85 140 16 7249
Jacques Kallis SA 1.19 64.91 54.58 128 216 33 9988
Mahela Jayawardene SL 1.18 61.73 52.36 100 164 12 7959
Carl Hooper WI 1.18 43.09 36.47 102 173 15 5762
Sachin Tendulkar Ind 1.18 64.28 54.28 156 256 27 12429
Rahul Dravid Ind 1.17 61.07 52.28 131 227 26 10509
Stephen Fleming NZ 1.17 47.05 40.07 111 189 10 7172

The case of Chanderpaul is interesting. Ten years ago, he was heading towards being one of the most consistent batsmen ever, with a CI of 0.82. Over the last decade, while he has been one of the Windies' few shining lights, there has also been much greater variation in his scoring.

This group also contains a few batsmen who play more aggressively than most: Sehwag, Jayasuriya and Botham are notable here. One would expect, naturally, their consistency to suffer as a result of their aggression.

Finally, a table just for the mega-stars, those who have scored 10000 Test runs, plus Kallis, who will surely join them the next time he goes to bat:

Table 5: Consistency Index: Top eight run-scorers
Player For CI SD Ave M I NO Runs
Ricky Ponting Aus 1.05 59.47 56.88 128 215 26 10750
Sunil Gavaskar Ind 1.06 54.42 51.12 125 214 16 10122
Allan Border Aus 1.08 54.45 50.37 156 265 44 11174
Rahul Dravid Ind 1.17 61.07 52.28 131 227 26 10509
Sachin Tendulkar Ind 1.18 64.28 54.28 156 256 27 12429
Jacques Kallis SA 1.19 64.91 54.58 128 216 33 9988
Brian Lara WI 1.24 65.33 52.89 131 232 6 11953
Steve Waugh Aus 1.26 64.16 51.06 168 260 46 10927

I for one was surprised to find the Aussie captain heading this list, and Tendulkar so far down the table. And perhaps Gavaskar was a better player than he is given credit for.

I hope the browsers of this site find this a worthwhile exercise. I would value their comments.

Comments have now been closed for this article

  • Harsh Thakor on February 1, 2010, 11:57 GMT

    Jack Hobbs deserves his ranking considering his brilliant consistency, scoring over half his first-class centuries after the age of 40.

    I was impressed with the choice of Rohan Kanhai, who to me was the most consummate batsman of all, scoring a fifty every 3 innings. Clive Lloyd was a more consistent batsman after gaining the captaincy, being a champion in a crisis.

    Overall, I feel Geoff Boycott should have at least been in the top dozen for his remarkable consistency all over the world, including the Caribbean.

    Rahul Dravid should have at least made the top dozen, considering his overseas record, while Steve Waugh and Javed Miandad were the best batsmen of their day in a crisis - the ultimate batsmen to bat for your life. I would place them above Mark Waugh.

    To me Gavaskar and Hutton should have been ranked much higher, as well as Sachin Tendulkar. Imagine Sachin retaining a 55+ average after crossing 13000 Test runs! Gavaskar faced the best bowling of all and broke batting records.

  • Harsh Thakor on January 29, 2010, 12:15 GMT

    I find it hard to understand the omission of Javed Miandad, Everton Weekes, George Headley and Geoff Boycott.

    No opener was more difficult to dismiss than Boycott, who very rarely lost his form and displayed consistency all over the world, particularly in the West Indies, standing up against the great battery of pacemen.

    Javed Miandad was brilliantly consistent at home, and at his peak was more consistent away than any other Pakistani batsman, as in 1987-1988 in England, India and the West Indies. He was neck and neck with Allan Border when the chips were down - perhaps the best batsman in a crisis.

    Everton Weekes in his era played more like Sir Don than any other batsman, at one stage scoring 5 consecutive hundreds! His Test scores at his peak were like a Black Bradman batting.

    Mark Waugh, no doubt a class act on his day and possessing talent in the Lara class, was nowhere near as consistent as his brother Steve.

    George Headley was an epitome of consistency - imagine, Bradman was called the White Headley!

  • Harsh Thakor on January 29, 2010, 12:00 GMT

    Jack Hobbs deservingly wins his place because of his outstanding consistency. No batsman displayed such consistency over such a long period - imagine scoring more than half of his first-class centuries after the First World War! Another deserving selection is Rohan Kanhai, perhaps the most complete batsman of all, who has often not been given his true worth. Clive Lloyd, considering the great consistency he displayed after becoming skipper, deserves his ranking. Considering they opened the batting, Len Hutton and Gavaskar should have been ranked even higher, close to Sir Jack Hobbs. However, it is ridiculous that batsmen like Rahul Dravid and Steve Waugh are ranked so low in consistency. Both batsmen were champions in a crisis and were rarely out of form. Sachin Tendulkar should at least have made the top dozen. Imagine retaining a 54+ average after crossing 13000 runs! In fact Brian Lara, in comparison, suffered from bouts of loss of form. Modern-day greats played much more cricket than yesteryear players.

  • shaphysics on March 6, 2009, 21:07 GMT

    I like the spirit of the analysis. However, the CI number proposed here is meaningless. If we were to come up with a number independent of pitch and weather conditions, then I would consider the suggestions by TomC and Koos van Zyl above as serious suggestions. Personally, I think the average only plays a role in defining whether a batsman is a recognized specialist or not. I could settle for a simpler criterion such as the one shown below:

    consistency index = [number of times a batsman crosses a score of 40] / [total innings played]

    I am assuming that a batsman should at least cross a score of 40 to be considered a recognized batsman.

  • Andrew on February 26, 2009, 22:51 GMT

    Mark "Audi" Waugh ranked consistent despite the Sri Lanka debacle, but Atapattu has not been able to shrug off his horror start. Presumably he was of middling consistency once he actually got going.

    This tends to punish batsmen for going on with big scores - M.E. Waugh never made the massive scores that drag down Bradman to only 2nd place.

    Maybe too technical, but how would a "semi-variance" measure go, penalising only for downside scores below the average?
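
The "semi-variance" idea raised in this comment can be sketched as follows. This is an illustrative assumption about what such a measure might look like, not something computed in the article: the function name `downside_index` is hypothetical, the input is assumed to be a list of runs between dismissals, and dividing the squared shortfalls by the total number of scores (rather than only the below-average ones) is one of several possible conventions.

```python
from math import sqrt


def downside_index(scores):
    """Downside semi-deviation divided by the mean.

    Only scores below the mean contribute; big scores are not penalised.
    Assumption: squared shortfalls are averaged over ALL scores.
    """
    if not scores:
        raise ValueError("no scores")
    average = sum(scores) / len(scores)
    if average == 0:
        raise ValueError("average is zero, index undefined")
    squared_shortfalls = [(average - s) ** 2 for s in scores if s < average]
    semi_deviation = sqrt(sum(squared_shortfalls) / len(scores))
    return semi_deviation / average


print(downside_index([50, 50]))   # 0.0
print(round(downside_index([0, 100]), 3))
```

Under this convention a batsman is marked down for failures but not for the occasional very large score, which is the effect the comment asks about.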

  • Jeremy Gilling on February 17, 2009, 3:09 GMT

    Two surprising omissions from the least consistent (1000 run minimum) table are Ken Rutherford (NZ) and Bill Edrich (England), both of whom had Atapattu-like horror starts to their careers.

    Ric's comment: Edrich had an index of 1.18, reasonably high, and Rutherford only 1.06. The latter scored quite consistently in the second half of his career, and with only 3 centuries, had little at the top end to stretch his standard deviation.

  • Tom on February 7, 2009, 17:07 GMT

    Doug Walters on the list of consistent batsmen surprised me; as much as I loved to watch Doug bat, consistency was not one of the traits he was known for. Therefore, I think some of the criticism regarding exactly what is being measured here deserves a closer look.

  • keyur on February 3, 2009, 5:50 GMT

    Good analysis, but I have a few queries regarding it. Firstly, I believe not-out innings must be excluded. Even if a batsman made an unbeaten 50 last time, he does not start his next innings on 50 but on a fresh zero, and the batting conditions and opposition bowling also vary. Further, this method punishes those who have a streak of not-outs, and hence not-outs should be excluded. Admittedly, leaving out not-outs will reduce the average as well, but as we are measuring consistency it shouldn't make a difference. Secondly, as has been noted in the very first comment, the CI tends to rise with the number of innings played or the number of dismissals. I believe it is easier to maintain consistency over, say, 50 innings or dismissals than over 200. So to correct this error, the consistency index as obtained by you should be further divided by the square root of the number of dismissals (I think this is how variance is measured in stats). This will allow fair comparison of all players irrespective of the number of innings played.

    Ric's comment: I did actually do it ignoring incomplete innings - Hobbs was still top of the 5000+ group, but Bradman dropped, Clive Lloyd was second - Laird dropped to 3rd in the 1000+ group, David Hookes was top - Tendulkar and Gavaskar came down to equal Ponting, Lara was still high in the 10000+ group, Kallis was even lower. But I question the validity of doing this - whole slabs of players' careers are ignored, and anyway, does not the common old batting average measure the runs scored between dismissals, rather than in each innings? Given that, I still reckon the way I presented it is the best methodology. Thanks for your input, Keyur!

  • Ambuj Saxena on February 2, 2009, 23:41 GMT

    Interesting analysis, though I do not agree with the procedure. If I were to analyze consistency, I would have used the following formula:

    C(X)=Percent of times a batsman scores less than X% of his average.

    As you might have noticed, this formula allows for a lot of different standards of consistency measurements. For example, one can measure the C(50) consistency of all batsman, which would be calculating the percent of times each batsman scores less than half of his average. I have not been able to figure out what will be a good value of X for the most objective analysis and I believe this will remain subjective. What I like most about this formula is that it doesn't punish the batsman for scoring big in a few innings. It also doesn't punish the batsman within a few runs of his average, and is universal enough to compare most batsman in a single statistics pool.

    Can you please do an analysis based on this formula and share the results.

    Ric's comment: I've quickly used your method to determine the percentage of scores below 50% of the Test averages of Ranatunga (consistent in my analysis) and Atapattu (inconsistent), two players who had similar overall averages and aggregates. Ranatunga had 39% of his scores below 50% of his average, while Atapattu had 53%. I suppose this means (on the basis of a very small sample!) that both methods are going to produce more or less the same outcome. With my method though, every single score counts, whereas with yours, all scores below 50% of the average carry the same weight (e.g., a duck carries the same weight as a score of 18) while those above it are not taken into account at all. I think your method, if applied to all players, would produce an interesting point of discussion, but I'm not sure it is as effective in determining overall consistency. Thanks for your input.
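
The C(X) measure discussed in this exchange is simple enough to sketch. The function name `c_index` and its parameters are hypothetical, and the sketch assumes that "his average" can be supplied separately (for example, the official batting average); when it is omitted, the plain mean of the listed scores is used instead, which is an assumption rather than anything stated in the thread.

```python
def c_index(scores, x_percent=50, average=None):
    """Percentage of scores that fall below x_percent of the average.

    `average` defaults to the plain mean of `scores` (an assumption);
    pass the batting average explicitly to match the comment more closely.
    """
    if not scores:
        raise ValueError("no scores")
    if average is None:
        average = sum(scores) / len(scores)
    threshold = average * x_percent / 100
    below = sum(1 for s in scores if s < threshold)
    return 100 * below / len(scores)


# Four made-up scores with a mean of 50: two of them are below 25.
print(c_index([0, 10, 40, 150]))  # 50.0
```

As the reply notes, every score under the threshold counts equally here, so a duck and an 18 are indistinguishable, and scores above the threshold have no influence at all.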

  • Charles Davis on February 2, 2009, 23:15 GMT

    Interesting analysis, Ric, and interesting that some seem to miss the point. I suppose it is because cricket commentators (mis)use the word 'consistent' when they mean 'consistently good'.

    The question of whether consistency is a desirable quality is still open: I don't know the answer, although it appears that batsmen who are predictable and those who are not can both be rated very highly by judges of the game. But perhaps someone like Lara would be less well remembered if he had traded some of those double centuries for a whole bunch of fifties, so in his case 'inconsistency' was a positive thing. Given the plight of the West Indies, 50s from Lara were almost never match-winning innings.

    In Chanderpaul's case, his 'inconsistency' seems to derive from his tendency to string together consecutive not outs.

  • Henry on February 2, 2009, 13:46 GMT

    "The whole exercise is about the presence or otherwise of outliers - why would you want to ignore them? Those high scores contribute to the mean; they therefore have to be considered when measuring consistency."

    Well yes, but it would be nice to get more of a feel for the distribution. Two batsmen with the same CV/S.D. might have differently shaped distributions (i.e. narrow with a few large outliers, fat with no outliers). The intuition for this analysis is as follows - the utility of runs is rarely linear - obviously big hundreds are better than small hundreds, but when the score is very high, there is always a suspicion of a weak/understrength bowling attack and/or a flat pitch, and the consequent result is often a draw. Clearly you want to be as model-free as possible, but it could be acknowledged that there are multiple ways to increase one's CV/S.D.

  • Sankar Vasudevan on January 31, 2009, 22:34 GMT

    It's quite surprising that many who have posted are not able to understand the exact meaning of this analysis, in spite of being well informed about statistics. This analysis is not exclusively about pitch state, crisis situations or the quality of the attack. The more matches one plays (the runs being representative of that), the more these factors even out amongst players. So by combining the standard deviation and the average together (don't look at SD alone..), we can determine to a reasonable extent who is more consistent.

    Of course, there could be possible exceptions, like when a batsman scores a string of 3 zeroes in matches heading for a draw and then comes up with a double hundred to clinch a game. His S.D. is going to be high of course, and statistics cannot judge these things... but then, these are exceptions and not the rule. I would, though, agree with one of the posts here about providing the median and mode for a better analysis. Good work all in all.

  • waterbuffalo on January 31, 2009, 19:49 GMT

    Interesting to see Mark Waugh at 5 in the Most Consistent list and Steve Waugh at 5 in the Least Consistent list. I have always thought that Mark Waugh was underrated whilst Steve was overrated. One only has to look at how difficult Steve made batting look and how easy Mark made it look. Add to that the dozens of LBWs not given against Steve by Aussie umpires, and you can see that the reason Steve Waugh is held in such high esteem is that he simply had a far bigger mouth.

  • KarachiFrog on January 31, 2009, 18:59 GMT

    After looking at these stats the only conclusion I can come to is SO WHAT!. When Ashley Giles figures (4th) in the list of the world's most consistent batsmen, then the list has no relevance to much at all - and given Giles 13 'not outs' thanks to batting at the tail, it suggests that the whole concept is skewed. This is not a list of history's most valuable or useful players (though there are a number of quality batsmen in the list). Apart from Herbie Sutcliffe & Sid Barnes none of the 'consistent' list averaged over 50 and fall into the 'high quality' bracket. Your comment "One thing we admire in our cricketers is consistency" is not necessarily shared by all - don't speak for me Ric. I'd sooner see Afridi go out there and belt 75 off 45 balls, though he doesn't do it every innings, rather than watch some dude who plods out 25 or 30 runs time after time. The guys that are going to create cricket excitement in the coming years are Duminy & Warner & their ilk. I look foward to them doing it in tests as well as the shorter games. Go back to work Ric.Your 'early retirement' from mathematics teaching seems to have suckered you into the inane. Take up golf & get a life.

    Ric's comment: I'm not sure I ever claimed I was measuring "history's most valuable or useful players" - it's simply a ranking of those whose scores varied little from their average (or varied widely, depending which table you look at). Nor is it a list of cricket's excitement machines.

    The golf's fine, thanks!

  • Vinay on January 31, 2009, 18:57 GMT

    Well, this is just another statistic that doesn't actually let you know anything. For example, the stats say that Sachin is not as consistent as we think, but Sachin Tendulkar has received the maximum number of wrong decisions. What about them? If the decisions had gone his way he would have had at least 2000 runs more than what he actually has.

  • GMnorm on January 31, 2009, 18:29 GMT

    yes Ponting is very consistent vs Harbhajan

  • Harish Raj on January 31, 2009, 18:20 GMT

    I just glanced at the article. And I have to say I am most pleasantly surprised to see Ganguly is the most consistent of the Indian batting greats including the golden generation. Especially considering he's valued to be the least in terms of batting talent among them all. But then, numbers don't tell the real story..!!

  • Sumit Sanghai on January 31, 2009, 14:14 GMT

    I think the simple notion of % of N+ scores, where N could be 30, 50, 70, etc., would do the trick. As many have already said, standard deviation isn't the best performance indicator. Also, one should take the best consecutive N years (N could be 10) of a batsman to do this analysis. The reason being that some batsmen really lose form as they age and some are blooded too early, which will lead to less consistency.
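
    The share-of-scores idea above is easy to try. What follows is a minimal sketch, not part of the original analysis: the function name `share_at_least` and the sample scores are made up for illustration, and the scores are assumed to be completed innings.

```python
def share_at_least(scores, threshold):
    """Fraction of scores that reach `threshold` or more."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(1 for s in scores if s >= threshold) / len(scores)

# Hypothetical scores, for illustration only (not a real player's record).
scores = [0, 12, 35, 48, 51, 77, 104, 8, 63, 30]
for n in (30, 50, 70):
    print(n, share_at_least(scores, n))
```

    A higher share at a given N would suggest a batsman who gets past that mark more reliably; which N to pick is a judgment call.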

  • Shriram on January 31, 2009, 12:27 GMT

    I think the word "consistent" is generally associated with consistently good scores only. Therefore, this analysis is more about "predictability" of a batsman. While certainly interesting, I think this analysis is geared towards answering the question "What's the confidence with which we can predict that a batsman would score his average score in a particular innings"

    Another suggestion for the similar analysis for ODIs...throw strike rates into the mix as well and see what you get!

    Ric's comment: Good comment, Shriram. Not sure how the strike rates suggestion would work, though....

  • Aaron on January 31, 2009, 9:31 GMT

    Ha! No one from New Zealand will be surprised to see Mathew Sinclair waaaay ahead of the pack when it comes to inconsistency. With a double century on debut and another one soon after he was looking likely to be a linchpin of the side for years to come. Since then however he's been consistently bad. The good news is he's got a few more years left to reinforce his position on the number one spot.

    Although I wonder, if he stays consistently bad for another 5 years (which is quite likely), and the selectors keep giving him another go, might that actually bring his inconsistency rating down a little?

  • Raghu on January 31, 2009, 5:36 GMT

    Nice.

    Can you do a Sharpe Ratio as well? Which is pretty much what you've done I'd think, but not quite.

    Cheers!

  • Saurabh Somani on January 31, 2009, 5:21 GMT

    I have often thought of doing the same exercise myself, but there is a problem with the not-out scores. When you add them to the next out score, what you're effectively doing is increasing the batsman's standard deviation. E.g. Sir Gary Sobers's scores have actually varied from 0 to 365, but in the analysis his scores would have a range of 0 to 490 (this is 365 + 125 that he made in the next innings). Increasing the range would perforce increase the standard deviation.

    I would be very grateful if you could respond to this with your thoughts.

    Ric's comment: You are absolutely right - what, in effect, I have done is to measure the variation of scores made between dismissals rather than between innings. Mostly these coincide, but as you point out, sometimes they don't. The Sobers example you refer to treats him as having one score of 490.

    Other options would be to ignore not outs completely, or to treat them as dismissed scores (eg 0 not out becomes 0). I believe what I have done is better than those options.
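
    A rough sketch of the between-dismissals calculation described in this reply. The function names and the `(runs, not_out)` tuple layout are assumptions for illustration, and so is the use of the population standard deviation (`statistics.pstdev`); the article does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev

def runs_between_dismissals(innings):
    """Merge each not-out score into the next innings.

    `innings` is a chronological list of (runs, not_out) tuples.
    A trailing not-out is dropped, as the article describes.
    """
    totals = []
    carried = 0
    for runs, not_out in innings:
        carried += runs
        if not not_out:
            totals.append(carried)  # a dismissal closes the running total
            carried = 0
    return totals  # runs still in `carried` belong to a final not-out

def consistency_index(innings):
    """Standard deviation of runs between dismissals divided by their mean."""
    totals = runs_between_dismissals(innings)
    return pstdev(totals) / mean(totals)

# The Sobers case from the comment above: 365 not out, then 125.
print(runs_between_dismissals([(365, True), (125, False)]))

# The two team-mates from the article's opening paragraph.
print(consistency_index([(50, False), (50, False)]))
print(consistency_index([(0, False), (100, False)]))
```

    Under these assumptions the batsman with two 50s has an index of 0 and the one with a duck and a hundred has an index of 1, even though both average 50. The sketch does not guard against a player with no dismissals or a zero average.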

  • Anand on January 31, 2009, 5:09 GMT

    Ric, you made a good point. But if someone is consistently making 40-50s in Test matches, it's of no use. You need someone to score big hundreds to win matches. Those who consistently do that are matchwinners & great players. Saurav, Thorpe, Atherton, Arjuna Ranatunga, John Wright might be more consistent. But they never produced big hundreds consistently to win matches. So this, while an interesting analysis, does not hold much water. Yes, you can look at the 10K+ club to see who is more consistent. Maybe another way is to look at folks who are averaging between 40-45, 45-50, 50-55 with more than a certain number of runs and then look at the CI. That for me would be a real assessment of consistency.

    I still liked the different analysis.

    Ric's comment: I think you are underestimating the value of someone who reliably scores 40-50 in Tests. Put two of them together, and you have a partnership of 100 runs nearly every time - I think most teams would take that. As I commented before, you are right in looking at certain groups based on similar averages and runs - I chose not to present it that way, because I wanted to show the whole range of results without a long ponderous post with a squillion tables. But you can pick players of similar experience out of the tables I have presented - eg Kanhai, Lawry, Walters.

    If this exercise has made you think deeply about this modelling, then I am happy - that's all I want! Thanks.

  • Rajin on January 31, 2009, 3:52 GMT

    Well, personally, consistency depends simply on a high avg. hovering on or about 50 and your conversion rate, i.e. the ability to convert 50s to 100s and 100s into big scores. You see, to me a batsman can score many hundreds but have an avg. of under 40, so that's not consistent; also he can have an avg. close to 50 but not many 100s, meaning he might have many not outs to push his avg. up. So on my take it needs to be a combination of the two. To me Lara and Tendulkar are the most consistent because they have very high conversion rates and great avgs., and carry on to turn 100s into big ones on a fairly regular basis.

  • Marcus on January 31, 2009, 3:46 GMT

    I've always just judged a batsman's consistency by their innings/50 ratio (ie. No. of innings/no. of 50+ scores). This is a little more sophisticated! But it is interesting that Mark Waugh has a higher CI than Steve, because his ratio is 3.12, whereas Steve's is 3.17 - so I'm glad I'm not completely on the wrong track!

    Ric's comment: Well done! The method I have used allows us to measure the consistency of players who have never scored a 50.

  • TropicalSky on January 31, 2009, 1:50 GMT

    Anyways, on the whole I'm buying your "Consistency Index: Top eight run-scorers" part of the analysis; given that they are all good batsmen who have played same amount of cricket and scored similar amount of runs; and all have averages above 50. Thanks.

  • TropicalSky on January 31, 2009, 1:46 GMT

    OK, how about 50 to 75 Tests with averages of 40 to 45, 45 to 50 and >50; 75 to 100 Tests with averages of 40 to 45, 45 to 50 and so on? That should pretty much do it, I guess? Then we can figure out who the consistently good batsmen are within a particular group.

  • TropicalSky on January 31, 2009, 1:18 GMT

    Ric, Thanks for your earlier response. I think your analysis will look much more reasonable if you compare batsmen with similar averages. (above 50, 45 to 50, 40 to 45 etc).What do you think?

    Ric's comment: I agree, but you need to be careful that the players you are lumping together in the one average range (eg 45-50) have also played around the same amount of cricket, because as someone else noted, the CIs tend to sneak up as they play more. That's why I prepared the tables as I did, on runs, rather than average, since there is more likelihood that two players with a similar number of runs have played a similar amount of cricket.

  • Azfar Alam on January 31, 2009, 0:39 GMT

    Ric, the reactions you are getting from the readers, including me, are due to the fact that the word 'consistent' in cricket has an entirely different meaning than in mathematics or statistics. It is just unacceptable to find Steve Waugh as the most 'inconsistent' among the top run-getters. Any such statistical analysis is done with a purpose. But from your analysis, no logical conclusions can be drawn. Don't get me wrong, I am a cricket stat buff myself. I love doing and reading this kind of analysis... and am quite saddened by the news of Bill Frindall's death.

    Ric's comment: Someone has to be last in any list, Azfar, and in the list you speak of, it happens to be Steve Waugh. There is a good mathematical reason for this - his scores varied more widely from his mean than did the others from theirs. I don't necessarily agree that the meaning of "consistent" in cricket is different from that in other contexts. A batsman who scores 7 ducks in a row is surely consistent - he certainly couldn't be branded as inconsistent! I think you are taking the word consistency to mean a run of what we would consider to be "good" scores. My meaning of the word is wider than that.

    Bill Frindall was certainly a legend in his own right. I loved his spirit of independence that didn't tie him to bureaucratic decisions of a statistical nature that he regarded as ridiculous.

  • David Barry on January 31, 2009, 0:13 GMT

    sdr, using the co-efficient of variation sort-of assumes an exponential or geometric distribution, rather than a Gaussian distribution. Cricket scores are reasonably close to a geometric distribution, though it's skewed towards zero, and there are more large scores. So this should be a reasonable measure for a batsman's consistency.

    Whether or not it's as good as, say, percentage of scores greater than half the average, I don't know. But I agree with Ric, those two measures would probably be well correlated.
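
    To spell out the point about distributions: for the two textbook distributions mentioned, the ratio SD/mean can be written down directly. This is a standard result rather than anything from the article, and the symbols λ, p and μ are introduced here for illustration.

```latex
% Exponential with rate \lambda: mean and SD are both 1/\lambda
\mathrm{CV}_{\text{exp}} = \frac{1/\lambda}{1/\lambda} = 1

% Geometric on 0, 1, 2, \dots with P(X = k) = p(1-p)^k and mean \mu = (1-p)/p
\sigma = \frac{\sqrt{1-p}}{p}, \qquad
\mathrm{CV}_{\text{geom}} = \frac{\sigma}{\mu} = \frac{1}{\sqrt{1-p}} = \sqrt{1 + \frac{1}{\mu}}
```

    So a batsman whose scores were exactly geometric with an average of 40 would have a ratio of about 1.01, which would be consistent with the CIs close to 1 quoted elsewhere in this thread.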

  • Azfar Alam on January 30, 2009, 22:58 GMT

    Ric, I am sorry to say I am not at all impressed with this analysis. This proves nothing and just confuses people. Consistency in cricket is not about standard deviation as in mathematics. No wonder the 'most consistent' batsmen in your analysis are mostly non-batsmen or players who played little Test cricket. You could have defined consistency as players who most consistently cross a score of (say) 40, which means every time that player goes out to bat, the team can bank upon him making a good contribution. By your yardstick, perhaps the most consistent players of the modern era, Steve Waugh & Dravid, come out as the least consistent.

    Ric's comment: I think you will find that players "who consistently cross a score of (say) 40" do well in this analysis. Dravid certainly doesn't come out as being the least consistent in the tables above, while I invite you to check out the number of ducks Steve Waugh made compared with the others in the 10000+ club.

  • deep on January 30, 2009, 21:04 GMT

    ha ha ganguly is in the 'most consistent' list while dravid joins tendulkar in the 'least consistent' list. now i am the biggest fan of our dada, but this is certainly news to those who have followed indian cricket the last 2 decades, and the only reason is the large number of scores in the 30-70 range that ganguly seems to have (and only 16 centuries in 113 tests, which is quite low for a batsman of his class). while dravid and tendlya both have careers liberally interspersed with purple patches and mammoth innings which lead to a lot of deviations from the mean - thus, ironically, somehow making them 'inconsistent' by the same token as it makes them consistent (in turning out big scores regularly). Thus the really brilliant batsmen who score 100s and 200s often will always be considered less 'consistent' (except the Don, because he scored a 100 in less than every two tests). So in your study, consistency is synonymous with average - read "score around your average".

  • Rommel Ramotar on January 30, 2009, 20:27 GMT

    I was pleasantly surprised to see the great Rohan Kanhai listed in your stats. Rohan was simply among the top 5 greatest batsmen of all time. Stats were never a part of his game. His daring, breathtaking strokeplay has said it all. He was always so correct and proper while beating an attack to the dust. He invented the famous falling sweep and I saw him with my own eyes playing a back-handed sweep for four against Neville McCoy of Jamaica (he went on to score 187 retired in that game). Many players including Zaheer, Gavaskar and Bob Taylor (and a famous umpire, can't remember his name) have claimed that he was the best batsman they have seen. Can you imagine if Rohan had played in more Tests than he did between the age of 25-35? It is time that someone writes something substantive on this man, so that others can judge him properly... without his use of helmets or padding to every square inch of the body (except for the eyes, and that is debatable as well) while facing the music of fiery leather.

  • TropicalSky on January 30, 2009, 20:24 GMT

    Consistency Index calculated via Standard Deviation is never going to be a good index. For example, a player who is consistently bad will have an excellent CI. If you still want to use Standard Deviation, then you have to categorise batsmen on the basis of the number of innings they have played (with a minimum of 50, minimum of 100 and so on) rather than the number of runs scored. Simply speaking, with your analysis a batsman who has scored 20 to 25 runs in every innings and went on to score a total of 1000 runs in 50 innings will look like a much more consistent batsman than another who scored "at least" 20 to 25 runs in every innings, but has also converted a few of them into centuries and scored 1000 runs in far fewer innings (30). You know what I mean?

    Ric's comment: Yes, the CI I used only measures the variation of each player around their own mean. No account is taken of consistently bad v consistently good. In other words, it doesn't seek to show how good players are - just how much or little variation there is in their scoring.

  • Chitraj Singh on January 30, 2009, 18:55 GMT

    I think this is a fantastic approach to measuring a player's caliber. Oddly enough, as a project between friends, I devised the exact same model of dividing the standard deviation by the mean - however called it the "risk" of a player.

    I then benchmarked this index with the overall "risk" of current batsmen who bat in the similar positions as the batsman being compared.

    The problem I noticed with this approach was that an erratically poor player, i.e. high SD but low average, can get the same score as someone who is "consistent" with a high average. And as Koos van Zyl said, a triple hundred can in fact adversely affect your index, since standard deviation pays no attention to the direction of the deviation. I haven't found a solution to this, but I was considering a formula along the lines of Risk = [1 - (player's score / team total)] * SD / Mean. Although this does not necessarily solve the problem, it adds credibility to scoring more. Obviously this is after you adjust for not outs like you mentioned.
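
    One possible reading of the Risk formula above, as a hedged sketch. The comment does not say whether "player's score/team total" is taken per innings or over a career; the code below assumes career aggregates, and the names `risk`, `scores` and `team_runs` are hypothetical. The population standard deviation is also an assumption.

```python
from statistics import mean, pstdev

def risk(scores, team_runs):
    """Risk = [1 - (player's runs / team runs)] * SD / mean.

    `scores` are the player's runs between dismissals; `team_runs` is the
    team's aggregate over the same innings (an assumed interpretation).
    """
    if team_runs <= 0:
        raise ValueError("team_runs must be positive")
    share = sum(scores) / team_runs
    return (1 - share) * pstdev(scores) / mean(scores)

# Hypothetical numbers: same scores, different share of the team's runs.
print(risk([0, 100], team_runs=500))
print(risk([0, 100], team_runs=250))
```

    With the same scores, the player who contributes a larger share of the team's runs gets a lower risk figure, which seems to be the intent of the extra factor.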

  • Raghav Bihani on January 30, 2009, 18:21 GMT

    If you remove 2 scores of 375 and 400* from Lara's career, it will make him a much more consistent batsman, still with over 11000 runs. But it will not make him a better player.

    Once you are looking at a list of greats, the number of runs, hundreds, consistency, longevity are nothing but mere stats. All that matters is your ability to win matches for your team especially when under pressure.

  • Mohamed Z. Rahaman on January 30, 2009, 17:34 GMT

    This is useless stats. It's impossible to make such lists because you have to take each innings in the context of the playing conditions, how the rest of the batsmen performed, what kind of support the player had, etc. How do you take Wasim Akram for example... 1. he bats relatively late in the innings, if his team fields first, then he must have spent enormous energy bowling.. etc. And as for Chanderpaul, I don't need stats to tell me that over the last 5 years he's been the most consistent batsman in the WI if not the world. His innings must be taken in context of the team he's playing for and the role he's asked to play. E.g., opening the batting to batting at #5 or 6.

  • V Prabhu on January 30, 2009, 17:26 GMT

    I think that this analysis is flawed. Good batsmen are those who keep on scoring when they go above 20. Sehwag will always have a high standard deviation because all of his last 100s are above 150. I think that a better measure of consistency would be to estimate a probability that a person scores 50. Although you can do curve fitting, and estimate the area of distribution of scores below 50, a simple measure would be to compute the ratio of the number of 50-plus scores or 30-plus scores to all scores, or the gaps between 50-plus scores, and take the mean or standard deviation of that. That way we can say a batsman has consistently scored above 30, and so on. By your logic, a lower order batsman with a highest score of 40 will never have a standard deviation higher than 40. That does mean that he is consistent, but is that what we are looking for? Or rather, is he more consistent than Atapattu in scoring 30 runs?

  • aj on January 30, 2009, 17:13 GMT

    i think that this is a result of ponting's consistency over the last 8 years or so. he has been exceptionally consistent. dravid wld be higher if it wasn't for last couple of years. sachin hasn't been as consistent for last 8 years, in contrast to ponting.

  • Siddhartha on January 30, 2009, 16:57 GMT

    I agree with Koos above. The way you have measured it is appropriate to measure 'consistency'. However, a far more meaningful analysis would probably come out of measuring how the standard deviation of scores below the median compares with the median. (can use any other percentile instead of median also)

    After all, no one has a problem with batsmen scoring more runs than the expectation. Why not redesign the measure to disregard outliers in that direction?

    Ric's comment: The whole exercise is about the presence or otherwise of outliers - why would you want to ignore them? Those high scores contribute to the mean; they therefore have to be considered when measuring consistency.

  • knight on January 30, 2009, 16:40 GMT

    Another question is: does consistency necessarily mean a better batsman? If a batsman fails in one Test but makes up with a huge hundred in the next Test, then chances are that he might win the second Test for his team. But someone who regularly makes around 40-50 runs cannot be called a match winner.

    Ric's comment: ...unless everyone else is only scoring 20!

  • Pratik on January 30, 2009, 16:37 GMT

    Nice analysis of consistency, but you might want to look at cases where a batsman performs well in a crisis, but not when scoring is easy. Some players (like Steve Waugh) have played consistently well in tough situations, but have got out without scoring much in easier situations. Hence, they have a relatively poor consistency record, from a mathematical viewpoint. But, from the opening paragraph of this article I had the impression that one of the motives behind this exercise was "who brings more value" to the side. Despite Mark Waugh showing higher consistency, almost everyone would rather have Steve instead of Mark in a crisis situation.

    By the way, it is interesting to see the much maligned Saurav Ganguly leading the list of Indian batsmen with more than 5000 Test runs. Perhaps dada is a better player than many would make him out to be? It is also a bit surprising to find Dravid so far down. Is this mainly due to his recent form slump? Any idea about his CI pre-2007?

    Ric's comment: Dravid's CI prior to 2005 was 1.16. By the end of the 2006-07 season, it was 1.14. It is now 1.17!

    I think Mark Waugh performed pretty well in crisis situations, and his low CI bears this out. I can recall some crucial innings he played in tight situations. Steve certainly did have a better reputation in this respect though, deservedly or otherwise. His CI is possibly a bit higher than expected by the high number of not outs he had, a result, some cynics would say, of often batting through the tail from a relatively low batting position.

  • Nishith Prabhakar on January 30, 2009, 15:50 GMT

    Not convincing at all. What you have calculated is simply called the coefficient of variation in statistics. Again, as long as you were calling it a consistency index, it probably was acceptable. But the moment you equated that to "greatness" or the measure of how good a batsman was, the analysis became quite unnecessary.

    Anyway, the coefficient of variation only measures the dispersion of the data, but should ONLY be used with ratio scales. Cricket batting scores are ordinal scale.

    Ric's comment: I was measuring consistency, not greatness. The two, however, are not mutually exclusive in my view. Bradman was both great and consistent. I don't think that was an accident.

  • sdr on January 30, 2009, 14:50 GMT

    What follows is a bit statsy:

    Interesting idea. However, by computing mean and sd in this manner, you are implicitly assuming that scores can be modeled as a Gaussian distribution. The biggest problem with this is that you are not taking into account the fact that scores cannot be negative. A Gamma distribution might be more appropriate? Once fitted, you could still compute the variance. It would also better model the occasional high scores which will be punished by the mean, sd method.
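
    A small sketch of the Gamma idea, fitted by the method of moments rather than maximum likelihood. That choice is an assumption here, since the comment does not specify a fitting method, and the function name is hypothetical. One reason for choosing moments is that a maximum-likelihood fit involves the logarithm of each score, which is undefined for ducks.

```python
from math import sqrt
from statistics import mean, pvariance

def gamma_by_moments(scores):
    """Return (shape, scale) of the Gamma distribution matching the
    sample mean and population variance of `scores`."""
    m = mean(scores)
    v = pvariance(scores)
    if m <= 0 or v <= 0:
        raise ValueError("need a positive mean and variance")
    return m * m / v, v / m

# Hypothetical scores, for illustration only.
scores = [0, 4, 17, 23, 45, 52, 88, 131]
shape, scale = gamma_by_moments(scores)
print(shape, scale)
print(1 / sqrt(shape))  # SD/mean implied by the fitted Gamma
```

    Fitted this way, the implied SD/mean is 1/sqrt(shape), which is algebraically the same ratio as the CI, so this particular fit would reproduce the CI rather than soften the effect of big scores. A shape of 1 corresponds to an exponential distribution.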

  • Waqas on January 30, 2009, 14:29 GMT

    The definition of the consistency index is a little biased. Extreme values, for example a score of zero and a score of 220, will create a lot of discrepancy in the standard deviation. In my opinion, the best way to check consistency is by the crossing rate around a particular value or values. In plain words, the higher the crossing rate, the lower the consistency.

  • A G on January 30, 2009, 14:13 GMT

    Would be interesting to see something similar for ODI's.

    Ric's comment: I can certainly do that!

  • David brennan on January 30, 2009, 14:01 GMT

    Good article. I am surprised Steve Waugh does not feature more prominently. From 1995 to the 5th Test at the Oval in 2001 he averaged 70 in first innings, and any time Australia struggled he always seemed to score at least a half century.

  • Anonymous on January 30, 2009, 13:59 GMT

    Actually it's not surprising to find Tendulkar down on that list. Out of the 10000 run plus club Tendulkar has more injuries than all the rest. I think if you take Tendulkar as preinjury ,say before 2001 and post injuries after 01/02 that's where you will get some difference. In 2003 I think he scored some 150 runs at an avg of 15!! I remember jokes going around that time about how even Murli had more runs and a better avg. than Tendulkar!!

  • Koos van Zyl on January 30, 2009, 13:55 GMT

    I've thought about a consistency index for batsmen before, but I do think the simple StDev/Ave calculation is too naive. For one, scoring a triple hundred, say, will seriously affect your Consistency Index in a bad way.

    When I think of consistent, I think of someone who makes the runs, every time. Someone who will always cross the 40-run mark, almost guaranteed.

    I don't have the resources to check which indicator would be a good one, but I have a few suggestions which might work better?

    * Use the median score instead of the average score in the standard deviation calculation.
    * Calculate the standard deviation only for scores less than the player's average/median.
    * Maybe even a simple [#scores>50 between dismissals]/[#dismissals] will do the trick, where 50 can maybe be replaced by some other number, like average-5 or something.
    * Perhaps a study of the type of distribution scores take (I don't think it's the bell curve...) and using some parameter as the index?

    Ric's Comment: Sure, Koos, there are many ways to do this, probably, but my gut feeling is that doing it this way gives a reasonably accurate reflection of actuality - as may do the ways you suggest.

    Scoring a triple hundred will only affect your CI adversely if you don't do it regularly. If you don't, then surely you are being inconsistent! Bradman's CI is quite low, despite his high scores, because he did it regularly.
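
    For readers who want to try Koos's suggestion of looking only at scores below the player's median, here is a rough sketch. It takes the scores below the median and measures their root-mean-square distance from the median, divided by the median. That exact definition, and the name `downside_index`, are assumptions for illustration; other readings of the suggestion are possible.

```python
from math import sqrt
from statistics import mean, median, pstdev

def downside_index(scores):
    """RMS shortfall of below-median scores, relative to the median."""
    centre = median(scores)
    below = [s for s in scores if s < centre]
    if not below:
        return 0.0  # no score falls short of the median
    rms = sqrt(sum((centre - s) ** 2 for s in below) / len(below))
    return rms / centre

def consistency_index(scores):
    """The article's measure: SD divided by mean (population SD assumed)."""
    return pstdev(scores) / mean(scores)

# Hypothetical scores, with and without a single triple hundred.
steady = [20, 30, 40, 50, 60]
with_triple = steady + [300]
for label, s in (("steady", steady), ("with 300", with_triple)):
    print(label, round(consistency_index(s), 2), round(downside_index(s), 2))
```

    On these made-up numbers the one-off 300 lifts SD/mean from about 0.35 to about 1.17, while the downside measure barely moves (roughly 0.40 to 0.38). That illustrates both Koos's concern and Ric's reply: the CI treats a rare big score as inconsistency, and a one-sided measure does not.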

  • Henry on January 30, 2009, 13:26 GMT

    Wouldn't skew have a dis-proportionate effect on the s.d. rather than the mean? Did you assess the distributions of the data? Perhaps a transform might reduce the impact of outliers?

  • PMG on January 30, 2009, 13:22 GMT

    Seems that a lot of the emotion and excitement of batting is clustered around those in the high level inconsistency block.

  • Ashutosh Sinha on January 30, 2009, 13:11 GMT

    Just one question? How do you take into account the conditions of the pitch as you had mentioned this in your problem statement at the beginning of the article?

  • TomC on January 30, 2009, 13:10 GMT

    In the originally proposed formula the correspondent suggested using SD (standard deviation) to measure consistency. Shouldn't SD be used only for normal distributions? Batsmen's runs follow a positively skewed distribution, i.e. where the distribution curve tapers further on the right of the curve than on the left. Is SD then the appropriate statistic?

  • Pak on January 30, 2009, 13:07 GMT

    I feel that Ricky Ponting is underrated, as shown by this list. He is one of the most consistent batsmen and as good as Tendulkar and Lara.

  • ram on January 30, 2009, 13:02 GMT

    This was really insane... these statistics are a mere description of one's class.

  • Jeff on January 30, 2009, 12:49 GMT

    Hi Ric,

    Thanks for the analysis.

    On Chanderpaul, until 1998 i'd say he was consistent but average. Then from 1999 to 2006 i'd say he was good but inconsistent. But from 2007 onwards he has been consistently brilliant - I had a quick look at his scores over the past 2 years and he's averaging over 100 with a CI of about 1.

    As for whether it's a good thing to be consistent, I think this is partly related to your average... the lower the player's average, the more valuable the inconsistency becomes.

    At the extreme end, i'd rather a player who averages 1 in 100 inns scores all of those runs in a single innings rather than 1 run in every inns.

    However, at the other extreme, if a player averages 100 over 4 innings, i'd rather they were 4 innings of 100 rather than one Lara-esque inns of 400 and 3 ducks.

    I hope this makes sense.

    Ric's comment: I can see what you are getting at! You make a good point! You are saying it is good to be consistently good, but bad to be consistently bad!
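
    Jeff's second example is easy to put numbers on. The working below uses the population standard deviation, which is an assumption; the symbols are introduced here for illustration.

```latex
% Four innings of 100: every score equals the mean
\mu = 100, \quad \sigma = 0, \quad \mathrm{CI} = \frac{\sigma}{\mu} = 0

% One innings of 400 and three ducks: same mean, very different spread
\mu = \frac{400 + 0 + 0 + 0}{4} = 100, \qquad
\sigma = \sqrt{\frac{(400-100)^2 + 3\,(0-100)^2}{4}} = \sqrt{30000} \approx 173.2, \qquad
\mathrm{CI} \approx 1.73
```

    Both batsmen average 100, but the index separates them sharply.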

  • Ross on January 30, 2009, 12:49 GMT

    That's quite surprising - particularly Jacques Kallis, who I've always thought is so consistent is quite prominently on the least consistent list?

  • Ajay Nair on January 30, 2009, 12:43 GMT

    Why would a batsman more consistent according to the formula defined here be better than one who is less consistent, especially among the serious batsmen (5000+ runs)? Most of these guys, with one notable exception, average between 45-60. In that case someone who has less standard deviation has more scores in that range, I'd guess; while a Lara or Tendulkar will have more extremes. However, a 45-60 score is of almost no use in a Test match - it's certainly unlikely to have a match-winning impact. Meanwhile, a batsman who'd make substantial scores (80+) more often than the 'more consistent' batsmen is likely to have a match-winning/saving impact in these games.

  • D.V.C. on January 30, 2009, 12:42 GMT

    Mark Waugh makes perfect sense. His highest score was just 143 and his average was about 5 runs less than his team mates as a result of him never going on to big hundreds. He needed to be consistent to hold his spot, and he was, he almost never looked out of form in Tests.

    Ric's comment: It's good when the stats match the perceptions!

  • smilingbuddha on January 30, 2009, 12:37 GMT

    Great post, found it very satisfying.

  • Anand on January 30, 2009, 12:31 GMT

    I find these things a little confusing. Maybe stats reveal a part of cricketing ability. If you look at players who have scored more than 10,000 runs, none of them has a CI less than 1, which shows that you will have fluctuations when you play over a longer period. So you can't really take these CIs seriously. This never brings out anything unique! All it does is get people confused.

    Increase the number of runs by 1000 from 5000. It will reveal that the lowest CI is consistently increasing. Well Don being Don is an exception.

    Ric's comment: The idea is not to view the CI's themselves, but to see how batsmen with similar records compare in their consistency. It doesn't make sense to compare the CI of someone who has played three Tests with someone who has played 150. That's why I have grouped them as I have. The fact that the CIs increase as they play more is interesting, but irrelevant.

  • aartist on January 30, 2009, 12:09 GMT

    Have country wise listing.
