April 13, 2012

Consistency in Test batsmen: a new look

A statistical analysis of consistency among Test batsmen

Allan Border has the lowest standard deviation among the top-20 batsmen Adrian Murrell / © Getty Images

This is based on an idea given by Prashanth. After giving the idea and participating in a discussion or two, he disappeared off the radar. However I thank him for providing the spark.

This follows the article on "Consistency in Test bowlers: a new look" (click here). The relevant points are explained below.

1. I had used 5 Tests as the basis for bowling. However there are many Tests in which a batsman does not get a chance to bat, because of heavy top-order batting, innings wins, big wicket wins et al. Hence I have taken 10-Innings slices as the basis for batsmen analysis. This is a reasonable number and normally covers 2-3 months of Test cricket. This is normally 5-6 Tests.
2. 10 innings means that a batsman can go through a Test or two with limited opportunities to bat, or none at all because of emphatic wins etc. There will be enough opportunities within the 10-innings slice to catch up.
3. There is enough time to get over short duration loss of form.
4. To measure consistency, only runs scored will be used. The fundamental cricket dictum that batsmen should score runs and bowlers should take wickets is followed. Averages are important mainly over a career and for comparisons across players.
5. Why not average? Let us take an example to understand why not. Sehwag and Younis Khan have career averages just over 50 and RpT (Runs per Test) values of around 85. In a 10-innings period, match context being comparable, Younis scores 330 at 55 and Sehwag scores 450 at 45. Who has performed closer to his career figures and, for that matter, better? Certainly Sehwag, despite the lower slice average.
6. Let us not forget that we remember numbers like 974 (Bradman), 774 (Gavaskar) and 688 (Lara) rather than the averages.
7. The career slices should be non-overlapping and equal, other than the last one. Gooch's 333 should be part of one career slice only. Hence the concept of rolling number of innings is not valid.
8. 10 innings might seem arbitrary but represents a long enough career slice. It represents a long 5/6 Test series.
9. The keyword is consistency with reference to the player's own career performance levels.
10. We are not looking at high and low values in absolute terms but only relative to the concerned player's career figures. Over a 10-innings stretch Graeme Smith is expected to score 462 runs (his career RpI is 46.2) and Habibul Bashar is expected to score 300 runs. This will be the basis. If Smith scored 350 runs, it is a below-average performance and if Bashar scored 350 runs, it is an above-average performance.
11. Adjustment is made for the last career slice if the same is fewer than 10 innings.
12. The criterion for selection is 3000 or more Test runs. 162 batsmen qualify. It is unfortunate that a few top batsmen like Graeme Pollock and George Headley do not make the cut.
13. The Standard Deviation (SD) of the slice ratios is used to determine consistency.
14. There were suggestions that I should use more Tests/innings as the basis. I have resisted that idea mainly because I want to be hard on the players. If English batsmen had a great five-Test stint in summer and a poor five-Test sojourn in winter, I want these to be treated as two out-of-the-normal occurrences. I do not want to put the 10 Tests together and get a nice, middle-level performance which papers over the cracks. Same with all teams. Let us also agree on this: if a batsman scores 180 runs in 10 innings, it is a major cause for concern and should not be covered up by 600 runs in the 10 innings before or after this barren period.
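
The slicing described in points 7 and 11 can be sketched in a few lines of Python. This is only an illustration: the names `career_slices` and `expected_runs` and the 23-innings score list are hypothetical, and scaling the expected runs by the number of innings in a short last slice is an assumption about how the adjustment works.

```python
def career_slices(scores, size=10):
    """Split innings-by-innings scores into non-overlapping slices.

    Every slice has `size` innings except possibly the last one.
    """
    return [scores[i:i + size] for i in range(0, len(scores), size)]


def expected_runs(career_rpi, innings):
    """Runs expected over `innings` innings at the career runs-per-innings rate."""
    return career_rpi * innings


# Hypothetical 23-innings career: two full slices and a 3-innings remainder.
scores = [35, 0, 102, 17, 48, 5, 61, 22, 9, 74,
          12, 88, 3, 40, 56, 0, 29, 131, 7, 18,
          44, 6, 50]
slices = career_slices(scores)
slice_lengths = [len(s) for s in slices]   # 10, 10 and 3 innings
slice_runs = [sum(s) for s in slices]      # runs scored in each slice
```

Because the slices do not overlap, a single big innings counts in exactly one slice, which is the point made about Gooch's 333.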

The following 5 groups are formed for purposes of determining consistency. For each 10-innings career slice, a ratio is formed between that slice's runs and the career-average runs for 10 innings (10 times the career RpI). This ratio is called SPF (Slice Performance Factor). Suppose the batsman has scored 284 runs in a 10-innings slice and his career RpI value is 40: the SPF value is 284/400 = 0.71. If he scored 501 runs, the SPF is 1.25.

A. SPF  below 0.67:  Well below average - Falls into the inconsistent bracket.
B. SPF 0.67 - 0.90:  Below average
C. SPF 0.90 - 1.10:  Around average
D. SPF 1.10 - 1.33:  Above average
E. SPF  above 1.33:  Well above average - Falls into the inconsistent bracket.

Groups B, C and D are considered to be well within the average levels. Standard Deviation is also used to determine the consistency.
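
The SPF calculation and the five groups can be sketched as below. This is a minimal Python illustration, not the program used for the analysis: the function names `spf` and `group` are hypothetical, and the treatment of values falling exactly on a boundary (0.67, 0.90, 1.10, 1.33) is an assumption, since the group ranges as listed share their end points.

```python
def spf(slice_runs, career_rpi, innings=10):
    """Slice Performance Factor: slice runs over the runs expected at career RpI."""
    return slice_runs / (career_rpi * innings)


def group(spf_value):
    """Map an SPF value to groups A to E (boundary handling is an assumption)."""
    if spf_value < 0.67:
        return "A"   # well below average
    if spf_value < 0.90:
        return "B"   # below average
    if spf_value < 1.10:
        return "C"   # around average
    if spf_value <= 1.33:
        return "D"   # above average
    return "E"       # well above average


# The two examples from the text: career RpI of 40, slices of 284 and 501 runs.
low = spf(284, 40)    # 284 / 400
high = spf(501, 40)   # 501 / 400
groups = (group(low), group(high))
```

For a last slice of fewer than 10 innings, the same function can be called with the actual number of innings, for example `spf(150, 40, innings=5)`.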

First some data tables. The complete table is available for download. The tables and graphs are presented with minimal comments. Let me allow the erudite readers to come up with their own comments.

Batsman            Team  Innings   Runs   Avge   RpI  Mean  StdDev  Mid3%  Grps  Grp A  Grp B  Grp C  Grp D  Grp E
Tendulkar S.R      Ind       311  15470  55.45  49.7  0.99   0.325   68.8    32      4     10      7      5      6
Dravid R           Ind       286  13288  52.31  46.5  0.99   0.332   72.4    29      4      9      5      7      4
Ponting R.T        Aus       276  13196  53.43  47.8  1.01   0.412   60.7    28      5      5      8      4      6
Kallis J.H         Saf       257  12379  56.78  48.2  1.00   0.348   69.2    26      4      5      6      7      4
Lara B.C           Win       232  11953  52.89  51.5  0.99   0.278   75.0    24      3      8      3      7      3
Border A.R         Aus       265  11174  50.56  42.2  0.99   0.241   81.5    27      2      6     10      6      3
Waugh S.R          Aus       260  10927  51.06  42.0  1.00   0.333   61.5    26      4      7      5      4      6
Jayawardene D.P    Slk       217  10443  51.19  48.1  1.00   0.352   59.1    22      5      4      4      5      4
Gavaskar S.M       Ind       214  10122  51.12  47.3  1.00   0.350   68.2    22      4      6      4      5      3
Chanderpaul S      Win       234   9709  49.28  41.5  1.01   0.304   66.7    24      4      5      5      6      4
Sangakkara K.C     Slk       183   9382  54.87  51.3  0.99   0.425   57.9    19      4      4      5      2      4
Gooch G.A          Eng       215   8900  42.58  41.4  0.99   0.363   72.7    22      4      5      6      5      2
Javed Miandad      Pak       189   8832  52.57  46.7  1.00   0.384   57.9    19      3      6      4      1      5
Laxman V.V.S       Ind       225   8781  45.97  39.0  0.99   0.307   69.6    23      3      7      6      3      4
Hayden M.L         Aus       184   8626  50.74  46.9  0.99   0.385   47.4    19      6      1      6      2      4
Richards I.V.A     Win       182   8540  50.24  46.9  0.99   0.406   73.7    19      2      7      6      1      3
Stewart A.J        Eng       235   8465  39.56  36.0  1.00   0.368   66.7    24      4      6      6      4      4
Gower D.I          Eng       204   8231  44.25  40.3  0.99   0.297   85.7    21      2      7      6      5      1
Sehwag V           Ind       167   8178  50.80  49.0  0.99   0.428   52.9    17      4      5      3      1      4
Boycott G          Eng       193   8114  47.73  42.0  0.99   0.332   70.0    20      3      4      6      4      3
Smith G.C          Saf       174   8043  49.65  46.2  0.99   0.350   66.7    18      2      8      2      2      4
Sobers G.St.A      Win       160   8032  57.78  50.2  1.00   0.307   68.8    16      3      3      4      4      2
Waugh M.E          Aus       209   8029  41.82  38.4  1.00   0.283   76.2    21      2      6      6      4      3
Fleming S.P        Nzl       189   7172  40.07  37.9  1.00   0.247   84.2    19      1      8      4      4      2
Chappell G.S       Aus       151   7110  53.86  47.1  1.00   0.255   81.2    16      2      4      2      7      1
Bradman D.G        Aus        80   6996  99.94  87.5  1.00   0.272   75.0     8      1      2      1      3      1
Flower A           Zim       112   4794  51.55  42.8  0.98   0.436   66.7    12      2      4      4      0      2

To clarify the table contents: RpI means Runs per innings. Mean is the mean of the SPF values and is close to 1.0 for all batsmen. StdDev is the Standard Deviation of all the SPF values. Mid3% is the number of slices in Groups B, C and D as a % of the total number of career slices, which is the next column: Grps. Grp A to Grp E are self-explanatory. The complete file is available for downloading. The link is provided at the end. The first one is the core table of batsmen who have scored over 8000 runs in their Test career. In addition, Don Bradman (no need to explain), Greg Chappell (a modern great), Stephen Fleming (New Zealand) and Andy Flower (Zimbabwe) are included.
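
The derived columns can be recomputed with a short Python sketch. The function names are hypothetical, the three SPF values are made up purely for illustration, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the exact SD formula is not stated here. The group counts are taken from the table above.

```python
from statistics import mean, pstdev


def mid3_pct(group_counts):
    """Share of slices in groups B, C and D, as a percentage of all slices."""
    a, b, c, d, e = group_counts
    return 100.0 * (b + c + d) / (a + b + c + d + e)


def spf_summary(spf_values):
    """Return (mean, SD, coefficient of variation) of a list of SPF values."""
    m = mean(spf_values)
    sd = pstdev(spf_values)   # population SD: an assumption
    return m, sd, sd / m


# Group counts (A, B, C, D, E) from the table above.
lara = mid3_pct((3, 8, 3, 7, 3))      # 18 of 24 slices: 75.0
border = mid3_pct((2, 6, 10, 6, 3))   # 22 of 27 slices: about 81.5

# Made-up SPF values centred on 1.0, only to show the shape of the output.
m, sd, cv = spf_summary([0.8, 1.0, 1.2])
```

With the mean of the SPF values at or very near 1.0, the coefficient of variation (SD divided by mean) is practically the same number as the SD itself.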

Contrary to what all of us may have perceived, Lara is remarkably consistent on this 10-innings basis. His SD of 0.278 is second only to Border amongst the top-20 batsmen. Just to confirm that this is not a fluke, look at his Mid3%, which is quite high at 75.0. Again, bettered only by Border and Gower.

Consistency is determined in two ways. The first is statistical. The Standard Deviation (SD) is determined for all the ratios. Low SD values indicate consistent players and high SD values indicate inconsistent players. The usual method of using the Coefficient of Variation is not required since the means for almost all players are around 1.00. Shown below is the SD table with the 20 lowest SDs, indicating very consistent batsmen.

Batsman            Team  Innings   Runs   Avge   RpI  Mean  StdDev  Mid3%  Grps  Grp A  Grp B  Grp C  Grp D  Grp E
Greig A.W          Eng        93   3599  40.44  38.7  0.99   0.171  100.0    10      0      2      5      3      0
Redpath I.R        Aus       120   4737  43.46  39.5  1.00   0.195   91.7    12      0      4      4      3      1
Ranatunga A        Slk       155   5105  35.70  32.9  1.01   0.202   93.8    16      0      7      4      4      1
Hassett A.L        Aus        69   3073  46.56  44.5  0.99   0.204   85.7     7      1      1      2      3      0
Fredericks R.C     Win       109   4334  42.49  39.8  1.00   0.205  100.0    11      0      3      4      4      0
Pietersen K.P      Eng       143   6654  49.29  46.5  1.01   0.210   86.7    15      1      3      7      3      1
Knott A.P.E        Eng       149   4389  32.75  29.5  1.00   0.228   86.7    15      1      4      5      4      1
Saeed Anwar        Pak        91   4052  45.53  44.5  1.02   0.230  100.0    10      0      4      1      5      0
Smith R.A          Eng       112   4236  43.67  37.8  1.00   0.236   83.3    12      1      4      4      2      1
Hutton L           Eng       138   6971  56.67  50.5  0.99   0.237   85.7    14      1      5      2      5      1
Wright J.G         Nzl       148   5334  37.83  36.0  1.00   0.238   80.0    15      1      6      2      4      2
Border A.R         Aus       265  11174  50.56  42.2  0.99   0.241   81.5    27      2      6     10      6      3
Ijaz Ahmed         Pak        92   3315  37.67  36.0  0.98   0.246   90.0    10      0      4      2      3      1
Fleming S.P        Nzl       189   7172  40.07  37.9  1.00   0.247   84.2    19      1      8      4      4      2
Mushtaq Mohammad   Pak       100   3643  39.17  36.4  1.00   0.248   70.0    10      1      3      2      2      2
Hunte C.C          Win        78   3245  45.07  41.6  1.00   0.248   87.5     8      0      3      3      1      1
Collingwood P.D    Eng       115   4260  40.57  37.0  0.98   0.249   91.7    12      1      3      4      4      0
Strauss A.J        Eng       167   6604  41.02  39.5  1.00   0.250   82.4    17      0      9      2      3      3
Sutcliffe H        Eng        84   4555  60.73  54.2  0.98   0.252   77.8     9      1      2      4      1      1
Chappell G.S       Aus       151   7110  53.86  47.1  1.00   0.255   81.2    16      2      4      2      7      1

Tony Greig is the surprise leader in this table, with a low SD value of 0.171. The most notable modern batsman in this table is Pietersen, with an excellent SD value of 0.210. Other than Pietersen there is no current batsman in this list. Like Lara, he has certainly surprised us. Maybe there is a lot of substance behind that exaggerated swagger. He talked about the many hours of practice put in while talking of his Colombo classic. Maybe that is paying off. It is also possible that, unlike what one associates with him, he has neither extensive bad patches nor purple patches. I also wish he would stop making silly statements.

The alternate method is based on common sense rather than on a statistical measure. The two extreme groups, A and E, are considered significant departures from the career levels. The counts for the middle three groups are added and divided by the total number of slices to get the Mid3%. This reflects the consistency of the players. Shown below is the table of the highest Mid3% values.

Batsman            Team  Innings   Runs   Avge   RpI  Mean  StdDev  Mid3%  Grps  Grp A  Grp B  Grp C  Grp D  Grp E
Fredericks R.C     Win       109   4334  42.49  39.8  1.00   0.205  100.0    11      0      3      4      4      0
Saeed Anwar        Pak        91   4052  45.53  44.5  1.02   0.230  100.0    10      0      4      1      5      0
Greig A.W          Eng        93   3599  40.44  38.7  0.99   0.171  100.0    10      0      2      5      3      0
Ranatunga A        Slk       155   5105  35.70  32.9  1.01   0.202   93.8    16      0      7      4      4      1
Redpath I.R        Aus       120   4737  43.46  39.5  1.00   0.195   91.7    12      0      4      4      3      1
Collingwood P.D    Eng       115   4260  40.57  37.0  0.98   0.249   91.7    12      1      3      4      4      0
Ijaz Ahmed         Pak        92   3315  37.67  36.0  0.98   0.246   90.0    10      0      4      2      3      1
Hunte C.C          Win        78   3245  45.07  41.6  1.00   0.248   87.5     8      0      3      3      1      1
Pietersen K.P      Eng       143   6654  49.29  46.5  1.01   0.210   86.7    15      1      3      7      3      1
Knott A.P.E        Eng       149   4389  32.75  29.5  1.00   0.228   86.7    15      1      4      5      4      1
Gower D.I          Eng       204   8231  44.25  40.3  0.99   0.297   85.7    21      2      7      6      5      1
Cook A.N           Eng       135   6184  48.69  45.8  1.00   0.291   85.7    14      1      5      5      2      1
Hutton L           Eng       138   6971  56.67  50.5  0.99   0.237   85.7    14      1      5      2      5      1
Slater M.J         Aus       131   5312  42.84  40.5  0.98   0.263   85.7    14      1      4      4      4      1
Hassett A.L        Aus        69   3073  46.56  44.5  0.99   0.204   85.7     7      1      1      2      3      0

These are the batsmen with high middle-three-group % values, indicating a high degree of consistency. In the bowler tables, there were six bowlers with 100% of their slices in the middle three groups. It seems like batting is slightly more difficult since there are only three batsmen. These all belong to the 70s/80s/90s. Roy Fredericks, the attacking West Indian batsman, leads the threesome, followed by Saeed Anwar and Tony Greig. Collingwood is there, as are Pietersen and Cook. Possible reason for England's pre-eminence.

Now for some special graphs.

Top run-scoring batsmen

Top run-getters in Test careers
© Anantha Narayanan

The top-9 batsmen, who have crossed 10000 Test runs, are featured. It can be clearly seen that most of these batsmen do not exhibit a high level of consistency. The only exceptions seem to be Allan Border and for the first two-thirds of his career, Jayawardene.

Most consistent: Based on low SD values

batsmen with low standard deviation values
© Anantha Narayanan

As already discussed this table is led by Tony Greig. A fairly low SD of 0.171 indicates a very consistent career. This is borne out by his placement in the next graph also. However it should be noted that the lowest SD value for bowlers is a much lower 0.124. Pietersen finds a place in both the consistency graphs.

Most consistent: Based on high Middle-3-group % values

Batsmen with high middle-3 group % values
© Anantha Narayanan

Unlike bowlers where there were six with 100% in the middle categories, amongst batsmen, there are only three: namely Fredericks, Saeed Anwar and Greig.

Least consistent: Based on high SD values

Batsmen with high standard deviation values
© Anantha Narayanan

These graphs look like a dying person's cardiograph. These batsmen have had moves up and down throughout their careers. This is exemplified by Gambhir, who had a poor start, a great move up and then fell off equally badly. Vettori has had such a Jekyll and Hyde career that it is not surprising to see him here. In the first 70 innings Vettori averaged 18. In the next 100 innings he averaged well over 35.

Least consistent: Based on low Middle-3-group % values

Batsmen with low middle-3 % values
© Anantha Narayanan

It is clear that these two methods of determining consistency are quite different. There are different sets of batsmen in the two graphs.

Batsmen with top RpI figures

Batsmen with highest RPI figures
© Anantha Narayanan

Just to complete the analysis I have given here the charts for the top batsmen - by Runs per innings, since most of them would have missed the first chart: by career runs scored. Again inconsistency seems to be the trend here.

I think mention must be made of two batsmen, Tony Greig and Kevin Pietersen. Tony Greig never went off the middle three groups. That is some level of consistency. Pietersen, amongst the modern batsmen, has surprised us with his high degree of consistency.

To download/view the Excel sheet containing the complete data for 162 batsmen please click/right-click here. I have strengthened the Excel sheet by colour coding the individual SPF values through dynamic formatting.

Ed Smith's thought-provoking piece on randomness and form "When is poor form just randomness?" (click here) made me realize that this particular measure I have created can be applied to Ed Smith's axiom. Suppose I summed the SPF values of the top six batsmen or top four bowlers for every Test/innings, we would know what are the lowest SPF averages (very poor form, as a group of six/four players) and the highest SPF averages (very rich form, as a group of six/four players). That, for a later article.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems

Comments have now been closed for this article

  • testli5504537 on April 25, 2012, 8:40 GMT

    Consistency against individual career average itself is inconsistent. There should be a benchmark just like your 3000-run mark, either a particular number for career average or a particular number of runs per innings. What is the purpose of analysing whether a batsman is consistently scoring 30 runs or 25 runs if his career average is 30 or 25? First of all, what is consistency? Whether he is consistently 'an average batsman' or consistently 'a good batsman' or consistently 'a weak batsman'? There should be some purpose. Secondly, for sure a slice of 10 is too big. Better to have slices of 4 and 6 and 8 etc., and then average them. By doing all these, we can excel in mathematics, charting, analysing or deriving something, but what is the purpose? We should have some target to derive something.

  • testli5504537 on April 23, 2012, 16:25 GMT

    @Anant, @Prashanth: Ananth, Prashanth is only trying to take you back to his unanswered original qn, which was quite relevant. The qn: how often did a batsman let down his team vs how often he delivered? I suggest the following answer. Let's make 3 cases.

    Case 1 = only 1 innings; Case 2 = he played in both innings & upto 6 wkts fell in 2nd; Case 3 = both innings completed.

    Let's say fail = 1 if

    1. less than A1= min of (30 runs & 25% team runs) if Case 1. 2. less than A2= min of (50 runs & 25% team runs) if Case 2. 3. less than A3= min of (70 runs & 20% team runs) if Case 3.

    I use "minimum of" function to pass smaller scores in low scoring matches. Then, success = 1- fail. In this binary rating, fail = 1 or 0, & success = 0 or 1. This can be fine tuned to

    fail =

    1. B1 = runs scored/ A1 (if Case 1); 2. B2 = runs scored/A2 (if Case 2); 3. B3 = runs scored/A3 (if Case 3).

    This lets fail be between 0 & 1. We may still rate success as 1 or 0, as per Prashanth. [[ Might be, but there is a way of doing things especially when I have said that I will look at it later. I really do not want to look at it now. Thanks, Alex. Ananth: ]]

  • testli5504537 on April 23, 2012, 13:14 GMT

    Ananth, ???! OK- my last post on the matter. Complete logical inconsistency and breakdown on your part I fear. Your argument seems to be that since this has turned out to be a statistician’s delight we must ignore reality. A Test “series” is a “collection” or a “series” of INDIVIDUAL matches. It is not ONE single match, it is a collection of individual matches. The “end” result of a series depends on the results again of INDIVIDUAL matches. Again ,to use an extension of my previous example if we have a 3 Test series: If a batsman “lets his team down” in say the first 2 and then scores heavily in the 3rd…then overall the batsman may actually be labelled “matchlosing” ! Say a string of 30,30,25,35, 130,60 The 2 matches which may have been “saved” or won have been lost…and higher scores in the 3rd do NOT impact them. The summing up of the entire “string” does NOT have any real cricketing meaning. Re. all the various statistical delights you point out – that is all very well and good fun. But they unfortunately do not have any connection to cricketing reality.

    Refusing to accept this fact is actually colouring reality in favour of statistical abstractions. [[ Thank you for your contributions so far. Your negative views, the disdain you are showing for me and the other readers and the words you have used make me think that you should find better blogs to visit. Ananth: ]]

  • testli5504537 on April 23, 2012, 10:28 GMT

    Milpand Thank you for taking time out for your clear and lucid explanation. All well appreciated. I only wish I was capable of such clarity. But I will try one last time. My point is that the statistical methods and theory used (however valid and fundamentally acceptable in themselves) must actually find parallels in “real life” if they are to be of any value. There exists no such thing as a “slice” (of more than a single match) in cricket. It is a pure fabrication. So, straight off, we have committed a cardinal sin. We are immediately indulging in fantasy. When we use multiple “strings” or “slices” of innings we immediately lose contact with “real life”. A real life Test match comprises of maximum 2 innings. The longest string we can use is 2 innings and that too ONLY in the match in question. Anything else is artificial and un-real. As mentioned if a batsman is credited with a “matchwinning” performance in a particular match that performance is perforce restricted to that particular match alone. The scores in that particular match cannot be spread out, dispersed, or clumped together with other innings in other matches- which is precisely what is being done when clumping together various strings/slices. This is non-sensical in cricketing terms. Perhaps the easiest way of looking at it would be from the team’s point of view. If a batsman underperforms significantly in any given match this may lead to a team losing that particular match. Subsequent or prior innings on either side have absolutely NO impact to the match in question. An eg. would be scores of 20, 30 (in one match), and then 100 in the first innings of the next match. If the 20, 30 from a main batsman “leads” to a team losing, the result cannot be reversed by the subsequent 100 in the NEXT (or prior) match. This is simply not how things transpire in real life. A slice/string comprising of 8/10 (or such) innings is a purely artificial construct with no parallels whatsoever in real life cricket. 
A poor match by a batsman may well contribute to his team losing. Scores on either side simply do not matter. The only slice we ACTUALLY ever encounter in cricket is that of a single, solitary match. I full well know I am being repetitive – but hope that I am getting at least some of my argument across. [[ So, according to you, a Test Series is a fabrication and analysis based on a Test Series a cardinal sin. For that matter, an ODI tournament. Okay a Test series has the benefit of same opponent and same location. The slice, however may be against different opponents and in different locations but has the advantage of uniformity in size for comparison purposes. I agree that I did not address your specific point. I have also told that I would look at it later. But why are you pulling down this analysis. It has a major weakness in the impact of starting point. But how many people have derived good insights and revealing facets of player careers. And how many valuable aspects of individual innings distribution it has led us into. And what are all the types of analysis of basic statistical measures it has taken us into. Coloured glasses are great in the sun, not inside where they should be taken off to get a clear view. Ananth: ]]

  • testli5504537 on April 23, 2012, 10:09 GMT

    Thanks Milind for your lucid example.

    Ananth: As Milind pointed out, Individual scores suffer from the asymmetric distribution syndrome. But samples of averages/RpIs from a population (career innings) are normally distributed, and the method that I wrote to you about in detail could be evaluated further.

    Also, an analysis based on averages/RpIs in various innings, against various countries, across various results etc. with S.D. as a consistency measure can be used in conjunction with the consistency measure derived using the boundary values of Median, HQ and LQ, either for combination or for comparison? [[ Yes, all this makes sense. At a later date, let us re-visit this. You guys don't forget what you wrote so that if I miss something you could remind me. Ananth: ]]

  • testli5504537 on April 23, 2012, 8:36 GMT

    Oh! and BTW, may be a similar analysis for bowlers. I suspect that Hadlee, Murali and Kapil may do rather well [[ Samir, I think you must have had a tough day today !!!. The article before the Test batsmen consistency analysis was the Test bowlers consistency one. Ananth: ]]

  • testli5504537 on April 23, 2012, 8:34 GMT


    I would like to know who are the Atlases of cricket. How well does a batsman do when others have failed? I know that you have done analysis of individual innings that have stood out in a match. Have you done an analysis over a career? An interesting stat is how many times in his career (as a ratio of the total number of innings) has a batsman faced more than 33% of the balls bowled or scored more than 40% of the runs. It may be a good idea to give a higher weightage to a fourth innings score as compared to a first innings score. A match saving innings is again more valuable than a match winning innings (because we are looking at Atlases). My suspicion is that batsmen like Dravid, Border, Gavaskar, Headley and Lara will do well as compared to Tendulkar, Ponting and Richards. Not sure about Bradman. He was so far ahead of everyone else, it may be a good idea to exclude him from this analysis. [[ Samir The article starts with a reference to the Batsmen peer comparison analysis which was done a few months back. Ananth: ]]

  • testli5504537 on April 23, 2012, 4:23 GMT

    @Ananth & @milpand: Nice observation on Martin. Pommie Mbangwa was even more consistent, I think. He too has a median of zero and trumps Martin on both ave (2 vs 2.16) and SR (17 vs 20). [[ I am not going to take this nonsense lying down !!!. I admit, when I did the worst Test batsmen piece a few years back, Pommie upstaged Martin and many Kiwis were miffed, justifiably so. No, sir, not now. One day I will revisit that theme and this time it WILL be Chris Martin since I will set the cut-off sufficiently high for poor Pommy to stay out of the analysis and stay in the commentator's box and lament. I may compromise on Lara but never on Martin. Ananth: ]]

  • testli5504537 on April 22, 2012, 23:21 GMT

    Ananth, a simple measure that ranks these selected players on the basis of their likelihood to post a score within a range closer to their own middle values (Mean, Median, RpI, Average) as below:

    Greig: 1.38 Border: 1.41 Dravid: 1.51 Lara: 1.56 SRT: 1.66 Ponting: 1.8 Bradman: 1.9 Gambhir: 2.11

    makes sense to me.

    [[ This is the (HQ-Median)/(Median-LQ) value. Looks to be a very sensible methodology. Anyhow we have more time for the follow-up work. Ananth: ]]

    I am a great fan of Border. If this measure rates him highly then it must be good.

    I like Chris Martin for his batting. This measure is rubbish if he does not fare well. [[ I am not going to take this from you. I am a greater fan of Martin than you are. He is the only one I will pay to watch bat. Between 1 and 10 balls of fun. So much so, I created Martin's career file and that is summarized below. 12, 7,7,7, 5,5,5,5, 4,4,4,4,4,4,4,4, 3,3, 2,2,2,2,2,2,2, 1 (14 times) and 0 (59 times). HQ=2; Median=0; LQ=0. The above ratio is (2-0)/(0-0) ??? "What do I do when the Median is 0 ???". You solve the problem. I would give him an honorary 1.00. Ananth: ]]

  • testli5504537 on April 22, 2012, 23:18 GMT

    I have deliberately included a number of innings to get past the familiar figures of 334, 309, 299, 254 and 0. Let us revisit LQ, Median, Mean, HQ at the end of 40th and 80th innings. After 40: 17, 46, 88.2, 125 After 80: 17.5, 56.5, 87.5, 133.5

    Stdev can be used in a normal distribution. Hence statements like: "About 68% of values drawn from a normal distribution are within one standard deviation away from the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations." are not unfamiliar.

    But skewed distributions are asymmetric, not normal. Skewness is a measure of asymmetry. If a distribution is symmetric then the mean equals the median. Therefore there is some value in looking at the ratio of median to mean to understand asymmetry. Also HQ & LQ are the best available alternatives where stdev can't be used.

    Box plot is the pictorial view for a skewed distribution that combines 6 important values other than famous cricket average. [[ Excellent explanation, Milind. I am sure readers would benefit a lot. I think there is a lot of value in looking at the three key figures, HQ, Median, LQ and the Mean. For cricket distributions the High score is useless. What does it matter if Lara's HS is 300 or 400. And almost all will have a string of low scores, at 0. Ananth: ]]
