January 3, 2012

Pitch quality analysis across all Tests

A detailed analysis of pitches using Runs-per-wicket and Balls-per-wicket values

Shane Bond in the Hamilton Test match in 2002 when 36 wickets fell for just 507 runs Chris Skelton / © Photosport

This is the second of three very important and significant articles on batting performances against differing conditions and players. The first did a revised take on the Bowling Quality Index. This one covers the Pitch Quality and the third one will combine both and do an analysis of runs scored by batsmen.

Before I get into the article I have to report a very significant move forward in my database contents. Most readers would know that I had the ball data for only around 740 matches, a meagre 37%. In order to redress this situation, I had approached a few readers and five of them, Raghav, Boll, Rameshkumar, Ranga and Anshu, responded magnificently.

Over the past two weeks, the six of us have shared the work and downloaded over 600 scorecards. I have incorporated the balls played information for all these and also took the opportunity to post the 4s/6s information. Now my database is looking wonderful with 1374 scorecards (68% - a far cry from 37%) containing balls played and 4s/6s data. From match no 1070 (1987), I have an unbroken sequence of 957 scorecards with complete data. This opens up many new avenues of analysis, especially in the analysis of boundaries hit.

Once again my heartfelt thanks to Raghav, Boll, Rameshkumar, Ranga and Anshu, who spent hours during the holiday season. May their tribe flourish.

There is nothing to be gained by looking at history to determine the quality of a pitch. The following example will convince anyone of the futility of such a view. Let us look at happenings at the same ground, Hamilton, in two matches played within 12 months of each other.

Tale of two Tests at Hamilton

2002: Ind 99 ao & 154 ao. Nzl 94 ao and 160 for 6. 36 wkts at 14.1.
2003: Nzl 563 ao & 96/8. Pak 463 ao. 28 wkts at 40.1, despite the last innings.

Let us move north for a few thousand kilometres from the Waikato river to the Sabarmathi river and into dusty Ahmedabad. Again two matches, this time within about 18 months of each other.

India at Ahmedabad within an 18-month period

2008: India 76 ao against South Africa. RpW: 33.2.
2009: India 760/7 against Sri Lanka. RpW: 76.1.

If I averaged these two figures and came out with 400 or so, I would be rightly laughed off the pitch. That is the type of mistake uninformed and superficial analysts would make. So I am not going to look at history. There are hundreds of such examples. There is nothing worse than averaging such widely varying values.

Instead I am going to look only at the specific match. I am also not going to make the mistake of going down to innings. The recent match from "The Twilight Zone" at Cape Town is enough reason to stay away from this. 284 ao, 96 ao, 47 ao, 236/2 indicates neither a horror pitch with an RpW (Runs per Wicket) of 7.2, considering the middle two innings, nor a very comfortable pitch with an RpW of 43.3, looking at the first and fourth innings. This is a match with an RpW value of 20.7. A difficult pitch but not an impossible one to bat on.

Thus it is clear that the overall match RpW seems to be closer to the actual pitch condition. This will take care of many a match in which the innings scores change significantly. Four innings and five days present us with a chance to come very close to determining the way the match went.
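The Cape Town arithmetic can be reproduced in a few lines. This is only an illustrative sketch: the function name `runs_per_wicket` and the (runs, wickets) tuple layout are assumptions made for the example and are not taken from the database behind this article.

```python
def runs_per_wicket(innings):
    """Runs per wicket for a list of (runs, wickets) pairs (hypothetical helper)."""
    total_runs = sum(runs for runs, _ in innings)
    total_wickets = sum(wickets for _, wickets in innings)
    return total_runs / total_wickets

# Cape Town: 284 ao, 96 ao, 47 ao, 236/2
cape_town = [(284, 10), (96, 10), (47, 10), (236, 2)]

middle_two = runs_per_wicket(cape_town[1:3])                       # 143 / 20
first_and_fourth = runs_per_wicket([cape_town[0], cape_town[3]])   # 520 / 12
whole_match = runs_per_wicket(cape_town)                           # 663 / 32

print(middle_two, first_and_fourth, whole_match)
```

The two partial views bracket the whole-match figure, which is why only the match-level value is carried forward.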

I will only consider the first 7 batsmen. The last four batsmen will distort the picture. Thus I am going to look at a maximum of 28 innings and determine the average. This is called T7-RpW.

Now we come to a different problem. The RpW works well in most cases. However there are some matches in which the extremely low rate of scoring means that very few runs were scored but quite a few balls were faced. This indicates not a diabolical pitch but only a difficult pitch. To a considerable extent the ultra-defensive approach of the batsmen would have contributed to this situation.

First let us see two examples, at either end, where the RpW and BpW (Balls per wicket) are in sync.

Match  216: Scores 36 ao, 153 ao & 45 ao. RpW: 8.5. BpW: 22.3.
Match 1374: Scores 537/8 & 952/6.         RpW: 113. BpW: 215.

These represent the two extremes. Whatever measures are taken the pitches remain where they are: diabolical to the nth degree in either direction. Hence in these and hundreds of other matches the RpW values are sufficient.

Now let us take a look at three matches, taken as two comparisons.

Match 1037: Saf 109 & 101,  Aus 230.      RpW: 14.1. BpW: 33.7.  RpO: 2.50
Match 0438: Saf 164 & 134, Eng 110 & 130. RpW: 14.3. BpW: 60.8. RpO: 1.40

These two matches have similar RpWs of around 14. If we consider only the RpW, we may conclude that the batsmen in both matches faced the same level of difficulty. However that is far from the truth. In the second match the BpW is nearly twice that in the first match. If it took the bowlers nearly 61 balls to claim a wicket, which is worse than the career strike rates of Walsh, Swann, Caddick, Zaheer Khan and Snow, how can the pitch be that difficult? It is clear that the pitch is far from diabolical but the batsmen have made a meal and a half of it. The scoring rate has been abysmal. The RpW value has to be adjusted slightly upwards.

Match 0438: Saf 164 & 134, Eng 110 & 130.    RpW: 14.3. BpW: 60.8. RpO: 1.40
Match 0782: Pak 417 & 105/4, Nzl 157 & 360.  RpW: 33.8. BpW: 60.7. RpO: 3.23

In this pair of matches, one being repeated, the RpW values show nearly 150% variation. However the BpW values are similar. In the second match the batsmen have been positive and achieved a very respectable RpW value. In the first match the batsmen have gone on the defensive and got only a sub-15 RpW value.

I am sure the readers are going to say that the period, viz., the 1950s, when Test no 438 was played, was a defensive one and teams did not think twice about scoring very slowly, irrespective of the situation. The slowest Test innings ever is New Zealand scoring 69 for 6, during 1955, against Pakistan in 90 overs. Yes, that is correct, not a misprint. In fact it is this very trend which has to be taken care of. New Zealand lost only 6 wickets in 90 overs, leading to an innings BpW of 90. Was it an 11.5-RpW quagmire of a pitch? No, certainly not. It was probably a 30-RpW wicket. It is this adjustment I am referring to.
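To make the three measures concrete, here is a small sketch applied to the 69 for 6 in 90 overs innings. The helper name `innings_rates` is hypothetical, and six-ball overs are assumed when converting overs to balls.

```python
def innings_rates(runs, wickets, balls):
    """Return (RpW, BpW, RpO) for one innings or match (hypothetical helper)."""
    rpw = runs / wickets        # runs per wicket
    bpw = balls / wickets       # balls per wicket
    rpo = 6 * runs / balls      # runs per six-ball over
    return rpw, bpw, rpo

# New Zealand, 1955: 69 for 6 in 90 overs (assumed to be six-ball overs)
rpw, bpw, rpo = innings_rates(runs=69, wickets=6, balls=90 * 6)
print(rpw, bpw, round(rpo, 2))
```

The RpW of 11.5 on its own suggests a minefield, while the BpW of 90 shows that wickets were actually very hard to take.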

After trying out a few scenarios I have hit upon a simple method. One which would be clearly understood and accepted by all. I have realized that both RpW and BpW are important. Hence I have determined the Pitch Quality Index by adding the two values together. However I feel that the RpW value is more important. Hence the RpW gets three-fourths weight and the BpW value, one-fourth weight.

This is not as arbitrary as it looks. With an overall RpO of 3.0 across all matches, the BpW value is around twice that of the RpW value and 75-25 looks perfect. 67-33 will make this skewed too much towards the BpW value. As an example, take RpW of 30 and BpW of 60. 67-33 takes PQI to 40 (20 + 20). 75-25 comes out with a PQI of 37.5 (22.5 + 15). The final PQI values do not matter. The almost equal weight, as shown in the first case, negates my requirement that RpW should have a higher impact.
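The weighting comparison can be checked with a short sketch. The function name `pqi` and the `rpw_weight` parameter are assumptions made for illustration; only the 75-25 and 67-33 splits come from the text.

```python
def pqi(rpw, bpw, rpw_weight=0.75):
    """Weighted sum of RpW and BpW; the BpW gets the remaining weight."""
    return rpw_weight * rpw + (1 - rpw_weight) * bpw

# The example from the text: RpW of 30 and BpW of 60
print(pqi(30, 60))                    # 75-25 split: 22.5 + 15
print(pqi(30, 60, rpw_weight=2 / 3))  # 67-33 split: roughly 20 + 20

# Match 1374 from the extremes above, using its Top-7 RpW and BpW
print(round(pqi(113.8, 215.4), 1))
```

With the 67-33 split the two components contribute equally whenever the BpW is twice the RpW, which is the typical case at 3.0 runs per over.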

Again readers should realize that the seemingly arbitrary nature of this does not matter since it is only a derived interim index value. It is also applied across all matches. I will call this composite value PQI (Pitch Quality Index).

I can anticipate a question: why has the BpW been taken and not the BpI, since that seems to make more sense? However since I am adding two disparate figures, it is not correct to have one based on wickets and the other based on innings. If a top-7 batsman could not be dismissed, the pitch has to be credited to that extent. If Lara's 582 balls are included in the computation of the BpW (as against the BpI) figure at Antigua, that would correctly increase the BpW value by about 35 (since 16 top-order wickets fell in that match), bringing to light the true dead nature of the pitch.
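The size of the Antigua effect is simple division. In this sketch the variable names are hypothetical; the 582 balls and the 16 top-order wickets are the figures quoted above.

```python
# An unbeaten innings adds balls to the numerator of BpW
# without adding a wicket to the denominator.
lara_balls = 582
top_order_wickets = 16

bpw_contribution = lara_balls / top_order_wickets
print(round(bpw_contribution, 1))  # the "about 35" quoted in the text
```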

It can be seen that the period in which the match is played does not have any relevance since only what happened in the match is taken into account. For instance take two Tests in 2011: Test# 2021 (Aus vs Nzl) has a PQI of 27.2 (PG-5), Test# 2016 (Saf-Aus) has a PQI of 27.7 (PG-5). Now look at Test# 2008 (Slk vs Aus) with a PQI of 77.6 (PG-1) and Test# 2009 (Pak-Slk) with a PQI of 77.6 (PG-1). Such examples abound even during the notorious batting-friendly periods. Readers should never forget that this is a post-match actual measure.

Given below are the matches with extreme PQI values. Teams are all out if not mentioned otherwise.

The top-10 and bottom-10 Tests in PQI table


                   <------ Top-7 Batsmen ------>
MtId Year Hme Awy  Ins No  Runs    RpW    BpW    PQI Grp  Innings scores

0028 1888 Eng Aus   28  1   190    7.0   19.1   10.1  5   (116, 53, 60, 62)
0216 1932 Aus Saf   21  0   178    8.5   22.3   11.9  5   (36, 153, 45)
0238 1935 Win Eng   28  3   248    9.9   20.6   12.6  5   (102, 81/7, 51/6, 75/6)
0030 1888 Eng Aus   21  0   199    9.5   23.1   12.9  5   (172, 81, 70)
0027 1888 Aus Eng   28  0   229    8.2   27.5   13.0  5   (113, 42, 137, 82)
...
0878 1980 Pak Aus   11  2   931  103.4  216.9  131.8  0   (612, 382/2)
1374 1997 Slk Ind   14  2  1366  113.8  215.4  139.2  0   (537/8, 952/6)
0696 1972 Win Nzl   14  5   903  100.3  270.1  142.8  0   (365/7, 543/3, 86/0)
0418 1955 Ind Nzl   14  5  1012  112.4  292.3  157.4  0   (450/2, 531/7, 112/1)
1781 2006 Pak Ind   10  3  1025  146.4  190.7  157.5  0   (679/7, 410/1)


Three of the low PQI matches were played before WW1 and two during the 1930s. On the other hand the high PQI Tests have been distributed over the years. The PQI runs from 10.1 for the 1888 Test, through 11.9 during 1932 at the MCG, up to 157.4 at New Delhi during 1955 and finally 157.5 during the run-deluge at Lahore during 2006.

A brief referral back to the three matches we had considered earlier.

1037: 14.1 & 33.7 lead to 20.6 (Group 5 - but lower)
0438: 14.3 & 60.8 lead to 29.8 (Group 5)
0782: 33.8 & 60.7 lead to 42.8 (Group 3).

It can be seen that the differing BpW figures have certainly separated these three matches in a clear manner.

The distribution is quite skewed. This is confirmed by the statistical measures. The distribution has a mean of 48.5 and a Standard Deviation of 16.6 which means the Coefficient of Variation is a rather high 0.34.
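The Coefficient of Variation quoted above is just the standard deviation divided by the mean. A minimal check, with hypothetical variable names:

```python
mean_pqi = 48.5   # mean of the PQI distribution, from the text
sd_pqi = 16.6     # standard deviation, from the text

coefficient_of_variation = sd_pqi / mean_pqi
print(round(coefficient_of_variation, 2))
```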

The range of the PQI is so wide that all calculations go awry. Remember that this is a post-match measure, unlike the BQI which is a pre-match expectation. Hence it was possible to put limits on BQI. Here nothing like that can be done. The distribution is also not a Normal one like the BQI. There are only 80 matches between 86 (half of 172) and 172. So the PQI cannot be allocated blindly or by making standard assumptions. Hence I have done the following. The 27 is a starting point determined by looking at low RpW and BpW values. Then reasonable gaps are allowed for subsequent groups. It is possible that some fine-tuning will be done when I do the Batsmen analysis, especially in the formation of groups.

Summary of PQI Grouping


Below 27.0:  Group 5 - 101 ( 5.0%) A nightmare for the batsmen
27.0 - 37.0: Group 4 - 388 (19.2%) Difficult to bat on
37.0 - 47.0: Group 3 - 563 (27.8%) Good pitch - slightly favouring bowlers
47.0 - 60.0: Group 2 - 576 (28.4%) Good pitch - slightly favouring batsmen
62.0 - 80.0: Group 1 - 313 (15.4%) A belter (I have had too much of Shastri!!!)
Above 80.0:  Group 0 -  85 ( 4.2%) Where "open season" is declared on bowlers.

Finally while the memory is fresh, let me show the relevant values for the two Tests which finished a few days back.

Boxing Day Tests

Match 2025: Scores 333, 282, 240 & 169. RpW: 25.6. BpW: 52.9. PQI: 32.5
Match 2026: Scores 338, 168, 279 & 241. RpW: 29.0. BpW: 57.6. PQI: 36.1

Note how similar the matches have been. If the sequence is ignored, the innings are virtually identical. The total runs scored in the two matches are 1024 and 1026 respectively. The only significant difference is that the Top-7 in the Melbourne Test have performed poorly. This, and the slightly higher BpW value, have pushed the PQI for the Kingsmead Test slightly higher. However both are in Pitch Group 4, the second toughest one. I think very few will disagree with this. It is of interest to note that the 8-11 Batting average for the Melbourne Test is 21.3.
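The grouping of the two Boxing Day Tests can be sketched as below. The names `BOUNDS`, `pqi` and `pitch_group` are hypothetical. The cut-offs are read off the grouping summary, with two loud assumptions: the Group 2 / Group 1 boundary is taken as 60.0 (the summary prints both 60.0 and 62.0), and a value exactly on a boundary is placed in the easier group.

```python
from bisect import bisect_right

# Upper cut-offs for Groups 5, 4, 3, 2 and 1, in that order.
# Assumption: 60.0 is used for the Group 2 / Group 1 boundary.
BOUNDS = [27.0, 37.0, 47.0, 60.0, 80.0]

def pqi(rpw, bpw):
    """75-25 weighted sum of RpW and BpW."""
    return 0.75 * rpw + 0.25 * bpw

def pitch_group(value):
    """Map a PQI value to a pitch group: 5 is the toughest, 0 the flattest."""
    return 5 - bisect_right(BOUNDS, value)

# (match number, RpW, BpW) as quoted for the Boxing Day Tests
for match, rpw, bpw in [(2025, 25.6, 52.9), (2026, 29.0, 57.6)]:
    value = pqi(rpw, bpw)
    print(match, round(value, 1), pitch_group(value))
```

Because the inputs here are already rounded to one decimal, the last digit of the computed PQI can differ slightly from the published figure, but both matches land in Group 4 either way.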

To download/view the document containing the complete PQI tables please click/right-click here.

In the next article I will form a composite of BQI and PQI for each innings and arrive at a Bowling-Pitch Group. This would be a true indicator of the conditions the batsmen played in and the attack they faced. I will do an analysis of the runs scored by batsmen against different combination groups.

Revision of PQI calculation based on Arjun's alternative method.


Given below is the revised Pitch Group allocation based on MT10 (Top 10 innings of match) values. This was suggested by Arjun Hemnani. This has a lot of pluses going for it, mainly the inclusion of all performances irrespective of the batting position.

Match# 684 is a perfect example of why we should not take Average but Runs per Innings.

Win 363 ao. Ind 376 ao. Win 307/3. Ind 123/0.

The third and fourth innings had two high-scoring not outs each. The first two innings had one high-scoring not out each. The net result is 754 runs in 4 (10-6) innings, leading to a MT10-Avge of 188.5, the third best. A total farce. This is the only Test with 6 MT10 not outs. There are two matches with 5 not outs: one seems okay, the other does not. Then come the matches with 4 not outs, led by two matches with the highest MT10-Avge of 191.0 and 190.3. Only the latter, Match# 1426, truly deserves this number. This is the Taylor-334 match.

So taking the average is truly out. The current match, with 3 top innings already as not outs, would also go that way. One possibility is to limit the number of not outs to 3. Works well but rather artificial.

Finally the simplest and most elegant solution is to take the MT10-RpI. After all these are the top-10 innings of the match. So remaining unbeaten does not mean that much of a difference. What does it matter whether we take 400*/329* or 400/329. The RpI has worked out very well.
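The difference between the two measures for Match# 684 can be sketched as follows. The function names are hypothetical; the 754 runs, 10 innings and 6 not outs are the figures quoted above.

```python
def mt10_average(runs, innings, not_outs):
    """Batting-average style measure: runs divided by completed innings."""
    return runs / (innings - not_outs)

def mt10_rpi(runs, innings):
    """Runs per innings: not outs are treated like any other innings."""
    return runs / innings

# Match# 684: top-10 innings total 754 runs, six of them not out
print(mt10_average(754, 10, 6))  # inflated by the six not outs
print(mt10_rpi(754, 10))         # the measure finally adopted
```

Dividing by all ten innings removes the distortion without needing an artificial cap on the number of not outs.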

For the T7-PQI I used the BpW as an additional measure. However here there is no need to do that, for the reason given below.

The 28 innings used to determine the T7-RpW had a number of small innings, with varying balls played associations. Hence I used the BpW measure to smoothen these wide variations within a match and across matches. However in this case we would select only the top-10 innings of the match. As such I have found that the MT10 Runs have a strong correlation to the MT10 Balls played. Hence there is no need to incorporate the Balls played information, which anyhow gets determined on a pro-rata basis for a third of the matches.

Summary of PQI Grouping - Based on Top-10 innings in match


Below 40.0:  Group 5 - 112 ( 5.5%) A nightmare for the batsmen
40.0 - 52.5: Group 4 - 358 (17.7%) Difficult to bat on
52.5 - 65.0: Group 3 - 490 (24.2%) Good pitch - slightly favouring bowlers
65.0 - 80.0: Group 2 - 599 (29.5%) Good pitch - slightly favouring batsmen
80.0 - 95.0: Group 1 - 342 (16.9%) A belter (I have had too much of Shastri!!!)
Above 95.0:  Group 0 - 125 ( 6.2%) Where "open season" is declared on bowlers.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems


  • Surya on January 20, 2012, 5:05 GMT

    Brilliant! I read your Jan 17th piece first (batsmen numbers) and I was extremely curious about the PQI metric. Needless to say, there were lot of debates on this metric in various circles. This is a very objective way of assigning a numerical value to a metric which often remains the realm of specialized knowledge of the match.

  • Ramesh Kumar on January 11, 2012, 7:53 GMT

    Ananth,

    I still feel that the methodology favours batsmen in strong bowling teams. If the team has only 1/2 good batsmen and many good bowlers, all the more better.

    The pitch difficulty is also a function of bowlers being available to exploit the pitches. If you take 1974 India tour of Eng, right thru the series, England scored runs and India could not. This type has happened many times. You may argue that the pitch was not devilish as it was made out to be and not as placid as England scores would show. But if you start extrapolating batsmen individual performance based on averaging match scores and then going further saying these are stand out performances or otherwise, it might be a twist to the evaluation. My humble submission is that this methodology is good for pitch valuation from overall match point of view and cannot be drilled into individual performance evaluation. [[ Ramesh, This is not an Innings Ratings analysis. It is to look at what proportion of runs are scored in which circumstances based on two key components, bowling quality and pitch type. Combining these into a single factor lets us get a handle of what the batsman was against. I suggest you wait for the article. Ananth: ]]

  • Gerry_the_Merry on January 10, 2012, 16:19 GMT

    The team I and team II numbers given in your comments on Jan 7, 8:09 AM match exactly with cricinfo, not that i ever doubted.

    One last question remains before you pull the trigger...in your previous article, i had mentioned that weighting the home and away BQI by taking average of BQI yielded a delta between home and away of ~4.3%, whereas this delta, using sigma runs / sigma wickets from cricinfo yielded a delta of ~10%.

    The difference in the sigma-runs/sigma-wickets and the average of BQI was shown by you to be around 3.9% (i think) but this impact would have been similar for home and away.

    Hence my query remains: When raw cricinfo data suggests 10% delta for home/away, your adjustments not being home/away specific compress this delta to around 4.3%. Does that not bother you? [[ No, not at all. But what bothers me more is your sticking to a single inflexible doctrine and not letting go of that. I explained, using Philander's career, how the macro calculations differ widely with micro calculations. One is done at match level. The balls are used to do the weighting. The balls can vary between 1 and 2000+. These values are then added for all matches. You have derived a mean. The other is two totals across 70000 innings and then a final division. How can you expect these to be in sync especially when the average values are multiplied by a factor between 1 and 2012. A great bowling team could dismiss the opponents in 75 balls while a poor bowling team might bowl 250 overs or, theoretically, vice versa. Ananth: ]]

  • Gerry_the_Merry on January 10, 2012, 9:06 GMT

    Ananth, is the current XL file on the MT-10 method? Also the group 5 cutoffs seem very harsh if one takes "modern cricket" starting from Don Bradman's time. Very few tests make it. I suspect pre-1930 tests drag the cutoff down. If the idea is relative ranking, then the toughest conditions that apply to modern cricket should be classified group 5 i feel. [[ I have worked on the basis that the 5 and 0 groups are the real extremes and should not happen often. The 4 and 1 groups are the ones strongly favouring bowlers and batsmen. The 3 and 2 are the middle groups tilting slightly either side. From that point of view the 5 and 0 values of 5.5% and 6.2% seem to be okay. The key is what do 5/4/3 combine to. That is 47.4% and 2/1/0 come to 52.6%. I can bring these two closer to 50.0. That is a welcome tweak. Also please understand that these are post match values and relate to actual figures. When I combine the RSI and BQI I get good groupings. Although let me say there are going to be fireworks. Ananth: ]]

  • Prashant on January 10, 2012, 3:23 GMT

    It seems to me that there may be a certain amount of double counting when taking into account both bowling and pitch qualities combined. (Though RSI : Run scoring index is probably the better term)

    For eg.: 1)Good bowling attacks will reduce the total amount of runs scored. [[ Not necessarily. The great batsman will bide his time and score runs. Maybe that is what we are looking for here. Double counting is in a measure like ODI Batting index which is Runs * S/R which works to Runs*Runs/Balls. Here one may lead to other but not necessarily double counting. The idea is to recognize batting performances in difficult situations, i.e., let us say both indices 3 and above. Ananth: ]]

    This in turn will make it appear that the pitch is worse than it may be. But ,then again, credit is separately already given for the good bowling attack. Chicken and egg.

    2)Similarly, good batsmen may make a pitch appear better than relatively poor batsmen.

    So,seems to me that a combined (bowling/pitch) quality index would be accurate only when also factoring in the quality of batting involved (batting quality index).

    As a rough eg. In general 50+ avg batsmen would be expected to put up a better score than 40+ avg batsmen. [[ Don't forget that this is not an Innings Rating analysis. As such many match-contextual factors do not come in. The BQI is a pre-match prediction of how good the bowling attack is or expected to be. The RSI is a post-match determination of how the pitch turned out to be. The combination of the two should put things in place. Almost all the 300s are in Group 0 or 1. Only one, # 226, Hammond's 336 match is in Group 3. That too because after 336 and 83 comes 24. But then Hammond is put in place because he faced college students and part-timers. Bradman, Inzamam and Edrich have their 300s in Group 2. Looking at the bowling side, three performances likely to make anybody's top-10, Hadlee's 9-wkt haul and Murali's 9-wkt haul were in matches with RSI of 1. Holding's is the only 8-wkt haul in RSI group 0. Calcutta 2001 is in Group 0 (RSI-98.9). However Laxman gets some credit because the expected bowling was good. Ananth: ]]

  • Gerry_the_Merry on January 7, 2012, 12:56 GMT

    Your numbers are very intriguing. The overall 1.14 is outside the other two numbers 1.17 and 1.19? Is 1.14 bowling and the other two batting? Even then, for batting the overall should work out to something like 1.185 (closer to 1.19 than 1.17). Still quite different from 1.14 for bowling. [[ These are Batting averages extracted from Cricinfo. The first line is 1-11 batsmen. "All" was a bit confusing. The 1.14 is the ratio between 1/2 and 3/4. It only shows that there is a greater delta between the "first" and "second" innings so far as the top-order is concerned. Probably the late order does not really care. As shown by the relative freedom with which the late order batsmen have added runs at both SCG and MCG in the "second" innings. I would suggest that you validate these numbers. Ananth: ]]

    Agree that peer-group burning deck can be only for innings rating, not career.

  • Gerry_the_Merry on January 7, 2012, 8:09 GMT

    BQI takes care of the quality of opposition. PQI takes care of scoring runs in the context of RPI of the match. What about peer-group outperformance from within own team. Let us take Lara's example again. Examples of 2005-2006 matches. Folks scored truckloads of runs against them. Lara scored alone against some good quality opposition. But in BQI, the own team peer outperformance will not come in. In PQI, the opposite team scoring against weak WI bowling dilutes Lara's achievement, and in any case it is meant to be PQI. But what is the metric to assign value to the "burning deck" metric. This was a valuable feature of your Wisden 100. There are several methods one can propose, but I am sure you already have an adequate inventory. I am not talking in an innings rating context at all (in this and in all future comments). I am saying that if you ever did a second version of "Batsmen Across Bowler Groups" such items also should ideally feature. [[ The minute Peer comparisons come in these relate not to a specific match/innings but to the entire career. How will this be relevant in an analysis which has as its base a match/team-innings. I accept your point about 1/2 and 3/4 innings but this does not seem relevant. Ananth: ]] Also the team Ist team IInd inn delta... [[ Since this analysis has the match/team-innings as the basis and the 1/2 and 3/4 inning separation occurs strictly within the match, you want a delta applied. This is validated by the following summaries

    Class  1-2   3-4
    All:  31.71  27.68 1.14
    1-7:  38.30  32.29 1.19
    8-11: 15.98  13.62 1.17
    I can do that but probably only at the weighted average determination level. The problem is that I combine the PQI and RSI together and form a composite index. This index is used to determine a BRI. I cannot apply the delta at this stage. It will confuse everyone. However I am also working on the overall Sum (Runs x BRI) / Sum(Runs). I will apply this delta at the summing time. Ananth: ]]

  • Arjun on January 6, 2012, 11:46 GMT

    Ananth,

    Now pqi looks simple and easy to understand. As you have mentioned earlier bqi is a pre-match calculation and pqi is post-match calculation so instead of 'pitch quality', the better term could be 'Level of Run Scoring' in the match. [[ I agree. Also Pitch Quality indicates that at one level there is lack of quality. In reality quality exists at the middle levels with assistance to both batsmen and bowlers. I would probably combine both ideas and call it RSI: Run Scoring Index Ananth: ]]

  • Ananth on January 6, 2012, 10:09 GMT

    Arjun, the revised PQI table has been uploaded. The SCG Test came under PQI 0 (MT10-RpI=103.7). As per the Top-7 BpW it was 1. However considering the way it played for most of the match, it was a 0 or 1 level. So either seems okay. As I have mentioned in Ranga's comment, 622 for 1 certainly points to the flattest of pitches.

  • Ranga on January 6, 2012, 8:29 GMT

    An interesting observation of the summary of 2026 Tests ( both methods throw a sort of normal distribution), is that Arjun's method threw a more normalized distribution, ie., more identical numbers of 0&5, 1&4, 2&3, than Ananth's method. Whether any population, whatever be the data points is something that I dont know. Ananth's PQI supports batsmen (ie., the number of PQI >3 is greater than the PQI<3) - Whereas Arjun's is a mirror image of that!!! Ananth: 1052/974 vs Arjun 960/1086). . . . This is a significant difference. While one method shows that batting is slightly tougher, the other says batting is slightly easier. Is this because of MT10, which would include some aberrations like No.8 and 9s scoring 50s more often than we think they would? Or is it not that simple as I think? [[ Ranga, it must be recognized that my earlier method considered only 64% of the population. As such Arjun's method is more correct. Even today, Ashwin's 62 and Zaheer Khan's 35 come into the Top-10 adding credence to the easy paced nature of the pitch. In the T7 method these two innings would have been ignored and the top order failures highlighted. Even then you will see that the Pitch Group is either 1 or 0. That will necessarily and correctly lower the value of all innings, including Clarke's. If 622 for 1 is not an absolutely flat track I don't know what is. Ananth: ]]

  • Surya on January 20, 2012, 5:05 GMT

    Brilliant! I read your Jan 17th piece first (batsmen numbers) and I was extremely curious about the PQI metric. Needless to say, there were lot of debates on this metric in various circles. This is a very objective way of assigning a numerical value to a metric which often remains the realm of specialized knowledge of the match.

  • Ramesh Kumar on January 11, 2012, 7:53 GMT

    Ananth,

    I still feel that the methodolgy favours batsmen in strong bowling teams. If the team has only 1/2 good batsmen and many good bowlers, all the more better.

    The pich difficulty is also a function of bowlers being available to exploit the pitches. If you take 1974 India tour of Eng, right thru the series, England scored runs and India could not. This type has happened many times.You may argue that the pitch was not devilish as it was made out to be and not as placid as England scores would show. But if you start extrapolating batsmen individual performance based on averaging match scores and then going further saying these are stand out performances or otherwise, it might be a twist to the evaluation. My humble submission is that this methodology is good for pitch valuation from overall match point of view and cannot be drilled into individual performance evaluation. [[ Ramesh, This is not an Innings Ratings analysis. It is to look at what proportion of runs are scored in which circumstances based on two key components, bowling quality and pitch type. Combining these into a single factor lets us get a handle of what the batsman was against.I suggest you wait for the article. Ananth: ]]

  • Gerry_the_Merry on January 10, 2012, 16:19 GMT

    The team I and team II numbers given in your comments on Jan 7, 8:09 AM match exactly with cricinfo, not that i ever doubted.

    One last question remains before you pull the trigger...in your previous article, i had mentioned that weighting the home and away BQI by taking average of BQI yielded a delta between home and away of ~4.3%, whereas this delta, using sigma runs / sigma wickets from cricinfo yielded a delta of ~10%.

    The difference in the sigma-runs/sigma-wickets and the average of BQI was shown by you to be around 3.9% (i think) but this impact would have been similar for home and away.

    Hence my query remains: When raw cricinfo data suggests 10% delta for home/away, your adjustments not being home/away specific compress this delta to around 4.3%. Does that not bother you? [[ No, not at all. But what bothers me more is your sticking to a single inflexible doctrine and not letting go of that. I explained, using Philander's career, how the macro calculations differe widely with micro calculations. One is done at match level. The balls are used to do the weighting. The balls can vary between 1 and 2000+. These values are then added for all matches. You have derived a mean.. The other is two totals across 70000 innings and then a final division. How can you expect these to be in sync especially when the average values are multiplied by a factor between 1 and 2012. A great bowling team could dismiss the opponents in 75 balls while a poor bowling team might bowl 250 overs or, theoretivally, vice versa. Ananth: ]]

  • Gerry_the_Merry on January 10, 2012, 9:06 GMT

    Ananth, is the current XL file on the MT-10 method? Also the group 5 cutoffs seem very harsh if one takes "modern cricket" starting from Don Bradman's time. Very few tests make it. I suspect pre-1930 tests drag the cutoff down. If the idea is relative ranking, then the toughest conditions that apply to modern cricket should be classified group 5 i feel. [[ I have worked on the basis that the 5 and 0 groups are the real extremes and should not happen often. The 4 and 1 groups are the ones strongly fabouring bowlers and batsmen. The 3 and 2 are the middle groups tilting slightly either side. From that point of view the 5 and 0 values of 5.5% and 6.2% seem to be okay. The key is what do 5/4/3 combine to. That is 47.4% and 3/2/1 come to 52.6%. I can bring these two closer to 50.0. That is a welcome tweak. Also please understand that these are post match values and relate to actual figures. When I combine the RSI and BQI I get good groupings. Although let me say there are going to be fireworks. Ananth: ]]

  • Prashant on January 10, 2012, 3:23 GMT

    It seems to me that there may be a certain amount of double counting when taking into account both bowling and pitch qualities combined. (Though RSI : Run scoring index is probably the better term)

    For eg.: 1) Good bowling attacks will reduce the total amount of runs scored. [[ Not necessarily. A great batsman will bide his time and score runs. Maybe that is what we are looking for here. Double counting is in a measure like ODI Batting index which is Runs * S/R which works to Runs*Runs/Balls. Here one may lead to the other but it is not necessarily double counting. The idea is to recognize batting performances in difficult situations, i.e., let us say both indices 3 and above. Ananth: ]]

    This in turn will make it appear that the pitch is worse than it may be. But ,then again, credit is separately already given for the good bowling attack. Chicken and egg.

    2)Similarly, good batsmen may make a pitch appear better than relatively poor batsmen.

    So,seems to me that a combined (bowling/pitch) quality index would be accurate only when also factoring in the quality of batting involved (batting quality index).

    As a rough eg. In general 50+ avg batsmen would be expected to put up a better score than 40+ avg batsmen. [[ Don't forget that this is not an Innings Rating analysis. As such many match-contextual factors do not come in. The BQI is a pre-match prediction of how good the bowling attack is or is expected to be. The RSI is a post-match determination of how the pitch turned out to be. The combination of the two should put things in place. Almost all the 300s are in Group 0 or 1. Only one, # 226, Hammond's 336 match is in Group 3. That too because after 336 and 83 comes 24. But then Hammond is put in place because he faced college students and part-timers. Bradman, Inzamam and Edrich have their 300s in Group 2. Looking at the bowling side, three performances likely to make anybody's top-10, Hadlee's 9-wkt haul and Murali's 9-wkt haul were in matches with RSI of 1. Holding's is the only 8-wkt haul in RSI group 0. Calcutta 2001 is in Group 0 (RSI-98.9). However Laxman gets some credit because the expected bowling was good. Ananth: ]]

  • Gerry_the_Merry on January 7, 2012, 12:56 GMT

    Your numbers are very intriguing. The overall 1.14 is outside the other two numbers 1.17 and 1.19? Is 1.14 bowling and the other two batting? Even then, for batting the overall should work out to something like 1.185 (closer to 1.19 than 1.17). Still quite different from 1.14 for bowling. [[ These are Batting averages extracted from Cricinfo. The first line is 1-11 batsmen. "All" was a bit confusing. The 1.14 is the ratio between 1/2 and 3/4. It only shows that there is a greater delta between the "first" and "second" innings so far as the top-order is concerned. Probably the late order does not really care. As shown by the relative freedom with which the late order batsmen have added runs at both SCG and MCG in the "second" innings. I would suggest that you validate these numbers. Ananth: ]]

    Agree that peer-group burning deck can be only for innings rating, not career.

  • Gerry_the_Merry on January 7, 2012, 8:09 GMT

    BQI takes care of the quality of opposition. PQI takes care of scoring runs in the context of RPI of the match. What about peer-group outperformance from within own team. Let us take Lara's example again. Examples of 2005-2006 matches. Folks scored truckloads of runs against them. Lara scored alone against some good quality opposition. But in BQI, the own team peer outperformance will not come in. In PQI, the opposite team scoring against weak WI bowling dilutes Lara's achievement, and in any case it is meant to be PQI. But what is the metric to assign value to the "burning deck" metric. This was a valuable feature of your Wisden 100. There are several methods one can propose, but I am sure you already have an adequate inventory. I am not talking in an innings rating context at all (in this and in all future comments). I am saying that if you ever did a second version of "Batsmen Across Bowler Groups" such items also should ideally feature. [[ The minute Peer comparisons come in these relate not to a specific match/innings but to the entire career. How will this be relevant in an analysis which has as its base a match/team-innings. I accept your point about 1/2 and 3/4 innings but this does not seem relevant. Ananth: ]] Also the team Ist team IInd inn delta... [[ Since this analysis has the match/team-innings as the basis and the 1/2 and 3/4 inning separation occurs strictly within the match, you want a delta applied. This is validated by the following summaries:

    Class   1-2    3-4    Ratio
    All:    31.71  27.68  1.14
    1-7:    38.30  32.29  1.19
    8-11:   15.98  13.62  1.17
    I can do that but probably only at the weighted average determination level. The problem is that I combine the PQI and RSI together and form a composite index. This index is used to determine a BRI. I cannot apply the delta at this stage. It will confuse everyone. However I am also working on the overall Sum (Runs x BRI) / Sum(Runs). I will apply this delta at the summing time. Ananth: ]]
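    The run-weighted figure mentioned above, Sum(Runs x BRI) / Sum(Runs), can be sketched as follows. This is only a rough illustration: the function name `run_weighted_bri`, the `late_factor` parameter and the sample innings are invented, and multiplying the 3/4-innings runs by a single factor in the numerator is just one assumed way of "applying the delta at the summing time".

```python
def run_weighted_bri(innings, late_factor=1.0):
    """Return Sum(Runs x BRI) / Sum(Runs) for a batsman.

    innings: list of (runs, bri, is_late) tuples, where is_late marks a
    team 3rd/4th innings. late_factor is a hypothetical multiplier that
    credits those innings when the products are summed.
    """
    total_runs = sum(runs for runs, _, _ in innings)
    if total_runs == 0:
        return 0.0
    weighted = sum(
        runs * bri * (late_factor if is_late else 1.0)
        for runs, bri, is_late in innings
    )
    return weighted / total_runs


# Invented sample: 100 runs at BRI 1.0 (first innings), 50 runs at BRI 1.2 (late innings)
sample = [(100, 1.0, False), (50, 1.2, True)]
plain = run_weighted_bri(sample)                       # no delta applied
adjusted = run_weighted_bri(sample, late_factor=1.10)  # assumed 10% credit for late innings
```

    With no factor the sample works out to 160/150; with the assumed 10% factor it becomes 166/150, so only the late-innings runs move the final figure.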

  • Arjun on January 6, 2012, 11:46 GMT

    Ananth,

    Now pqi looks simple and easy to understand. As you have mentioned earlier bqi is a pre-match calculation and pqi is post-match calculation so instead of 'pitch quality', the better term could be 'Level of Run Scoring' in the match. [[ I agree. Also Pitch Quality indicates that at one level there is lack of quality. In reality quality exists at the middle levels with assistance to both batsmen and bowlers. I would probably combine both ideas and call it RSI: Run Scoring Index Ananth: ]]

  • Ananth on January 6, 2012, 10:09 GMT

    Arjun, the revised PQI table has been uploaded. The SCG Test came under PQI 0 (MT10-RpI=103.7). As per the Top-7 BpW it was 1. However considering the way it played for most of the match, it was a 0 or 1 level. So either seems okay. As I have mentioned in Ranga's comment, 622 for 1 certainly points to the flattest of pitches.

  • Ranga on January 6, 2012, 8:29 GMT

    An interesting observation of the summary of 2026 Tests (both methods throw a sort of normal distribution), is that Arjun's method threw a more normalized distribution, i.e., more identical numbers of 0&5, 1&4, 2&3, than Ananth's method. Whether any population, whatever be the data points is something that I don't know. Ananth's PQI supports batsmen (i.e., the number of PQI >3 is greater than the PQI<3) - Whereas Arjun's is a mirror image of that!!! (Ananth: 1052/974 vs Arjun: 960/1086). . . . This is a significant difference. While one method shows that batting is slightly tougher, the other says batting is slightly easier. Is this because of MT10, which would include some aberrations like No.8 and 9s scoring 50s more often than we think they would? Or is it not that simple as I think? [[ Ranga, it must be recognized that my earlier method considered only 64% of the population. As such Arjun's method is more correct. Even today, Ashwin's 62 and Zaheer Khan's 35 come into the Top-10 adding credence to the easy paced nature of the pitch. In the T7 method these two innings would have been ignored and the top order failures highlighted. Even then you will see that the Pitch Group is either 1 or 0. That will necessarily and correctly lower the value of all innings, including Clarke's. If 622 for 1 is not an absolutely flat track I don't know what is. Ananth: ]]

  • Ranga on January 6, 2012, 8:10 GMT

    I just managed to have a complete look of the PQI Table. Before this I had merely glanced. Now, I did look at it in detail. On face value, the numbers and the results, are bound to evoke a serious controversy amidst the fans of a country which has over 1 Billion cricket supporters. We could argue and try to pacify saying "If . . . " "If . . . " but the reality is actually startling (or may be, eye opener).

    Whatever be the discussion of the article, I am sure the real lovers of the game, who see it objectively, would enjoy the analysis. I for one, have been very profusely using the word "Legend" for most people in the modern era . . . I stand corrected. Without naming any cricketer here, I do feel that there are a lot of great batsmen, but very very few legends.

    The PQI also shows the capacity of teams to win on tough wickets. Let the truth come out. It is bound to create ruffles.

    Wonderful work Ananth!!

  • Arjun on January 6, 2012, 7:34 GMT

    Ananth,

    there is the summary of pqi groupings but where is the complete table ? [[ I still have it only as a text file. Will export into a XL file and load by evening. Ananth: ]]

  • Ananth on January 5, 2012, 15:37 GMT

    To: Arjun in particular and all readers in general The revised Pitch group tables based on the MT10-RpI have been posted.

  • Anshu N Jain on January 5, 2012, 11:55 GMT

    [[ Anshu Please go through the entire exchange of comments with Arjun. Now the BpW is out. Ananth: ]]

    I have. I understand and agree that BpW is both unavailable and untenable for all matches.

    My observations were on alternatives to calculate the RpW:

    1)The top 15 innings in the match which would cover about half the number of innings in most matches (the overall average number of innings per match across all tests to date is 35, completed innings is 30) 2)The top 50% of the innings in the match (rather than limiting artificially to a pre-determined number, allowing for the actual number of innings played in THE match to determine the top half to be considered)
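    A minimal sketch of the two alternatives, assuming the match figure is simply the mean score of the selected innings (the comment does not spell out the exact formula). The function names and the scores are invented for illustration.

```python
import math


def mean_of_top_n(scores, n):
    """Mean of the n highest individual innings in the match (alternative 1)."""
    top = sorted(scores, reverse=True)[:n]
    return sum(top) / len(top)


def mean_of_top_half(scores):
    """Mean of the top 50% of the innings actually played (alternative 2)."""
    n = math.ceil(len(scores) / 2)
    return mean_of_top_n(scores, n)


# Invented scores for a short match with only eight innings
scores = [100, 80, 60, 40, 20, 10, 5, 0]
fixed_cut = mean_of_top_n(scores, 3)   # fixed number of innings
half_cut = mean_of_top_half(scores)    # cut adapts to the innings played
```

    The second function takes its cut from the number of innings played in the match itself, which is the distinction drawn between the two options.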

  • Ranga on January 5, 2012, 7:40 GMT

    Hi Ananth - belated wishes for a very happy, healthy and prosperous new year. Now you have built the second block of the 3 article series. I feel this is a bit more key as most arguments on the best batsmen revolve around their performances in x or y type of pitch, based on historical evidence. You gave a thumping to history by showing 2 Motera Tests where the same pitch defied history.

    Talking of history, Clarke did create one for himself when he selflessly declared at 329*. Indian batting hasn't lasted for more than 120 overs for the last 9 away tests and saving 2 days would have been daunting. He had a good chance to ping 401*. He could have still declared tomorrow and would have had enough runs and time to bowl India out. Instead he said, "I'm not obsessed with numbers". I'm sure Clarke should figure in the list of future legends. The respect for him sure, would keep growing. Australian batting beyond their fab-3 still appears weak, but their leadership is in safe pair of hands. [[ While Ravi and Sourav were talking about Michael Clarke going for 400, he dropped a bombshell. I myself was totally surprised. I was sure he would not go for the 400 since that would have taken the rest of the day. But I thought he would play on for 30 minutes to take the lead to 500. However that was indeed a great decision. He did not even want to go past 334. I wonder how many other captains would do it. And he got Sehwag because of this declaration. Ananth: ]]

  • Arjun on January 5, 2012, 6:55 GMT

    Yes Ananth, average of Top-10 is to be used and not aggregate. My only concern is that if there are 3 or more not outs in top-10, then it might distort the figures completely, e.g. in the current Sydney test 329* and 150*.

    Is Runs per Innings of Top-10 a good option ? [[ Yes, Arjun. I had got the table and started the preliminary vetting work when your comment came in. Match# 684 is a perfect example of why we should not take Average. Win 363 ao. Ind 376 ao. Win 307/3. Ind 123/0. The third and fourth innings had two high score not outs. The first two innings had a high score not out each. Net result is 754 runs in (10-6) inns leading to a MT10-Avge of 188.5, the third best. A total farce. This is the only Test with 6 MT10 not outs. 5 not outs, there are two. One seems okay. The other not. Then come the matches with 4 not outs, led by two matches with the highest MT10-Avge of 191.0 and 190.3. Only the latter, Match# 1426 truly deserves this number. This is the Taylor 334 match. So taking average is truly out. Yes, today's match, with 3 top innings already as not outs, would go that way. One possibility is to limit the number of not outs to 3. Works well but rather artificial. Finally the simplest and most elegant solution is to take the MT10-RpI. After all these are the top-10 innings of the match. So remaining unbeaten does not mean that much of a difference. What does it matter whether we take 400*/329* or 400/329. Will do on that basis and post the summaries tomorrow morning. Ananth: ]]
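    The gap between the two MT10 measures discussed above can be sketched as follows. This is only an illustration: the function name `mt10_measures` and the individual scores are invented, chosen so that the ten highest innings total 754 runs with six not outs, the totals quoted for Match# 684. They are not the actual scorecard.

```python
def mt10_measures(innings):
    """Return (MT10-RpI, MT10-Avge) for one match.

    innings: list of (runs, not_out) tuples for every innings in the match.
    RpI divides by the number of innings taken; the average divides by
    dismissals only, so not outs inflate it.
    """
    top10 = sorted(innings, key=lambda inn: inn[0], reverse=True)[:10]
    runs = sum(r for r, _ in top10)
    dismissals = sum(1 for _, not_out in top10 if not not_out)
    rpi = runs / len(top10)
    avge = runs / dismissals if dismissals else None
    return rpi, avge


# Invented scores: six not outs among the ten highest innings, 754 runs in all
match = [
    (120, True), (110, True), (100, True), (90, True), (80, True), (70, True),
    (60, False), (50, False), (40, False), (34, False),
    (10, False), (5, False), (0, False),   # lower scores, outside the top 10
]
rpi, avge = mt10_measures(match)
```

    Here the average works out to 754/4 = 188.5 while the runs per innings is 754/10 = 75.4, which is why the not outs stop mattering once RpI is used.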

  • Gerry_the_Merry on January 5, 2012, 6:54 GMT

    MT10 makes very good sense. E.g. if Lara plays in a weak team but scores well, he should not get elevated merely because he was better than a bunch of mediocre batsmen from his own team. In such cases the MT10 will feature more batsmen from the opponents and the RpW will not get diluted because of poor batting in a weak team, and falsely suggesting that it was a bad pitch. There are several such hypothetical situations one can think of, but this seems to cover up well. Hard to fault it. [[ Pl see my latest response to Arjun. Yours is a very valid point. Many queries are answered on the new basis. Ananth: ]]

  • Anshu N Jain on January 5, 2012, 6:53 GMT

    Ananth, If the BpW measure is to be rid of, then I'd suggest the following alternatives be evaluated while calculating the match RpW: 1)The top 15 innings in the match which would cover about half the number of innings in most matches (the overall average number of innings per match across all tests to date is 35, completed innings is 30) 2)The top 50% of the innings in the match (rather than limiting artificially to a pre-determined number, allowing for the actual number of innings played in THE match to determine the top half to be considered) [[ Anshu Please go through the entire exchange of comments with Arjun. Now the BpW is out. Ananth: ]]

  • Ananth on January 5, 2012, 6:07 GMT

    Arjun, I am coming around to using the Top-10 match innings (MT10) mainly because it does not ignore any innings. Relevant innings, in whichever positions these are played, will be considered. However I will not come out with a corollary article. There are too many things on the plate for me to assign two consecutive articles to the same topic. I will post a Pitch Group summary in this article and refer to it in the next (Batsman Analysis) article. I have a few points and need your and the other readers' inputs. 1. Using the aggregate does not work out. Many matches were placed wrongly. 2. So I have to use the MT10 average. The idea is that if a batsman remained not out, to that extent the PQI value should be higher since he has not been dismissed. These results are also in line with the T7-RpW group sizes. 3. For the T7-PQI I used the BpW as an additional measure. However here there is no need to do that for the reason given below. 4. The 28 innings used to determine the T7-RpW had a number of small innings, with varying balls played associations. Hence I used the BpW measure to smoothen these wide variations within a match and across matches. However in this case we would select only the top-10 innings of the match. As such I have found that the MT10 Runs have a strong correlation to the MT10 Balls played. Hence there is no need to incorporate the Balls played information, which anyhow gets determined on a pro-rata basis for a third of the matches. 5. I will post the new table, based on MT10-Avge innings in the main article itself. Will do it by tomorrow morning.

  • Ramesh Kumar on January 5, 2012, 4:28 GMT

    Ananth,

    Very good analysis and looks sound. Only one issue--RpW & BpW views look very sound in most of the cases except when you look at a team with one/two good batsmen and has very good bowlers. A good batsman in a weak batting team will get favorable RpW & BpW due to weak batting support and his good bowlers will ensure low runs and high wickets in the match thus making his runs as if scored in a difficult pitch. One example could be Lara's WI which had very good bowlers. I have a feeling that Lara may move up due to this view--only my perception, I haven't seen the numbers. But I don't see a way out as the method answers most of the situations and I don't see an alternate method. [[ Pl see responses to Arjun. Ananth: ]]

  • Ananth on January 4, 2012, 17:05 GMT

    Arjun, Given below is the partnership summary. Total runs: 1988571 1-6 wkts Partnerships: 41216. Runs: 1548387 (77.9%). Avge: 37.6. 7-10 wkts Partnerships: 22866. Runs: 440184 (22.1%). Avge: 19.3.
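    The shares and averages in this summary follow directly from the partnership counts and run totals quoted. A small sketch that reproduces them (the variable names are made up for illustration):

```python
# Figures quoted in the partnership summary above
total_runs = 1_988_571
groups = {
    "1-6 wkts": {"partnerships": 41_216, "runs": 1_548_387},
    "7-10 wkts": {"partnerships": 22_866, "runs": 440_184},
}

summary = {}
for name, g in groups.items():
    share_pct = round(100 * g["runs"] / total_runs, 1)   # % of all partnership runs
    average = round(g["runs"] / g["partnerships"], 1)    # runs per partnership
    summary[name] = (share_pct, average)

# The two groups together account for every run in the total
runs_accounted = sum(g["runs"] for g in groups.values())
```

    So the last four wickets contribute a little over a fifth of the runs, somewhat below the 25+% estimated earlier in the thread.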

  • rohit on January 4, 2012, 16:03 GMT

    though its late but still.... a very happy new year to u ananth... sorry what am posting is out of context but just couldn't help myself... what about preparing a list of all time batting greats according to the position they batted?? lets say u take into account first six positions and make a list of all time top 5 who played on these respective positions... me n my frds had a very heated argument about this but couldn't reach to any conclusion.... so asking for your help... [[ I have already done this analysis during April 2010. There are two articles. The links are given below. http://blogs.espncricinfo.com/itfigures/archives/2010/04/best_batsmen_at_each_position.php http://blogs.espncricinfo.com/itfigures/archives/2010/04/test_batting_position_averages.php Ananth: ]]

  • Ananth on January 4, 2012, 14:09 GMT

    Arjun, the first of the analyses I promised. Top-10 innings across 2026 Tests: 20188 1-7 Batting position innings: 17869 8-11 Batting position innings: 2319 So your estimate of 30% is quite high. It is around 11.5%. Not insignificant but not as high as 30% also. But high enough to warrant consideration.

  • Arjun on January 4, 2012, 13:59 GMT

    No Ananth, Match RpW taking all innings is not correct. (It includes out-of-form batsmen, collapses not because of pitch, ducks due to good balls initially, and many others.)

    Also see what is the ratio of aggregate of first 6 partnerships and last four. I think last 4 partnerships have added 25+% runs ever. That is about 1/4th, too high to ignore. [[ Will do. Ananth: ]]

  • Smudge on January 4, 2012, 13:07 GMT

    Ananth, Yes I am saying to a degree that lower order runs can show that the pitch is not as bad as the top 7 made it look (or in the case of Bressie this summer, as bad as the other side's top 7 made it look!). I intend no disservice or for that matter undue credit to them, merely an observation. It could also be argued that the larger sample of the innings also partially mitigates the effect of transient conditions, such as overcast weather although I realise separating the literal "pitch" conditions from atmospheric ones is a minefield and not one I would expect to be taken into account. Test 1971 (ENG vs Pak at the Oval) for example looks very different with a top 7 vs top 10 methodology. Is this a statistical anomaly which should be discounted? My point is a minor one however. [[ Pl see my response to Arjun. I think the low order should not be ignored. Ananth: ]]

  • Arjun on January 4, 2012, 12:19 GMT

    Ananth,

    On a difficult pitch it is very difficult to develop a big partnership since wickets fall regularly. I assume in atleast 75% of Tests there were 30 partnerships or more; that is enough data. I think range/size of partnership runs will throw more light on pitch type qualifications. [[ All these are possibly true. Ananth: ]] About 30% of time, lower order make significant contribution to team's final score. In your Top-7 method that is completely ignored. [[ I feel myself that 30% is probably on the higher side. However both of us are fishing in the dark. I will do some work on this and post the results. Things would then be clear. To start with, Look for the 8-11 innings in the top-10 innings of the match. Look for 8-11 innings which are higher than 1-7 for each innings. Not fully comfortable because of the not outs. Should be able to post these results by the morning. And then we can continue this. Finally what about the Match RpW taking in all innings. I already have this information. I can also do a aggregate comparison of T7-RpW and All-RpW. Ananth: ]]

  • Ross Cameron on January 4, 2012, 12:17 GMT

    Thanks- keep up the good work.

    Something the stats do not tell us about is the quality of the bowling and batting. Many dropped catches, poor lines and lengths, poor shot selection may exaggerate or understate the difficulty of the pitch. Just wondering how the PQI fares with wholly one sided matches and matches where both sides were poor? [[ Ross, the scorecards tell only one side of the story and I am only using that. Everything else is subjective. These are not available in the scorecard. Today when Australia compiled 366 for 1 in 90 overs, a few misguided Indian fans have been telling that this was a flat track. Not yesterday, when barring Tendulkar for a short while, no one was comfortable. Let me take a match scoreline, quite possible. Ind 191. Aus: 600/7 decl. Ind 192. Does the pitch start as a green-top, change to a flat track and then change back to a green-top? This conclusion is what I am objectively trying to avoid. As I have mentioned in the article, this would neither be a 20-RpW green-top nor a 100-RpW flat track. The match RpW is likely to be a 40-RpW good wicket providing assistance to both. Ananth: ]]

  • Smudge on January 4, 2012, 9:35 GMT

    Ananth, I'm curious about why you are only using the top 7 as the tail "distorts" the figures. Is this because tailender not-outs disproportionately influence averages? I would have thought a lot of runs scored at 8 or 9 (some of Broad and Bresnan's recent efforts for example, if you will forgive a little anglocentrism) would have something "to say" statistically.

    Congrats to you (and your newly recruited army of volunteer elves) on a year of excellent analysis by the way. [[ Pl see my response to Arjun. I am averse to taking the top-10 innings of the match as suggested by Arjun provided there are convincing arguments favouring one method over the other. Pl remember that when you espouse Bresnan's case, you are really doing him a disfavour. What you are really telling me is that because Bresnan/Broad scored runs, the pitch was not as bad as the top-7 made it look like. Ananth: ]]

  • Arjun on January 4, 2012, 7:43 GMT

    Ananth,

    Another alternative to 'Top-10' individual scores of the match is sum of 'Top-10' partnerships. Many big partnerships in a match are indicators of good batting wicket. Similarly too many small partnerships suggests pitch was difficult to bat on. [[ Arjun, All three methods are eminently acceptable. Recently I have done all work with Top-7 XpW and am quite comfortable with that having visually inspected quite a few matches. I did some tentative work on Top-10 innings. If I used the aggregate, the whole thing went haywire and I could see many anomalies. However when I tweaked this into a Top-10 innings average, things came back well. However I must have a clear case for switching. There have to be convincing arguments. Only one I can see is the inclusion of innings such as Ian Smith's 173. Even there I can see that the two group values are comparable. Partnership is going to be similar. Let me hear from you and other readers arguments favouring one basis or other. Ananth: ]]

  • Gerry_the_Merry on January 4, 2012, 5:37 GMT

    Ananth, you have chosen Group 5 upper bound with 5% fill at 27.0. Is that to match the previous article upper bound? Should you not normalize to get 13% fill just like in previous article?

    By normalize I mean scale down all PQI in such a way that group 5 at 27 yields a 13% fill. The other groups matter less, but one needs to keep a watch on those also. [[ No, not really. I had 20 as my first cut-off when I had only RpW. Then when I changed to BpW + RpW I got 27 (66.7/33.3 basis). It worked well. When I adjusted that to 75-25, I changed to 25. However the % went below 3. So I went back to 27. I think this distribution is quite different to Bowling which is quite nicely normalized. Here there is a long and flat portion at the higher end. And I have 5 to 0. Ananth: ]]

  • Gerry_the_Merry on January 4, 2012, 3:13 GMT

    Dr. Naveed has forgotten the most important aspect. If the others had played as long as Tendulkar would they have scored their 100th international ton by now?

    For almost a year Indian cricket has paid the price for the national obsession with meaningless numbers like 100th international century. Tendulkar became the first to score 50 test tons, and that was a milestone to feel proud of. Not this. Now even Michael Clarke does not let go of an opportunity to write about Tendulkar...it has become a joke now. [[ Boucher has effected 984 international dismissals. He is ahead of Gilchrist by 10%. No one talks about it. For that matter there is no assurance that he will be allowed to reach that landmark. The way Tendulkar plays nowadays is amazing. He is ahead of the other Indian batsmen by a mile. However the elusive landmark is like an albatross around his neck (a better example than dear friend Ravi Shastri's "monkey on his back"). He himself starts so well but is weighed down by the prospect when he goes past 50. Why wouldn't he when one channel kept on displaying "20 to go", "19 to go", "16 to go" etc during the Mumbai Test. Ananth: ]]

  • dr naveed on January 3, 2012, 16:39 GMT

    i want to know, if you can provide with a list of batsmen who would have played similar number of tests , innings , their average , their runs per test ,how many centuries, half centuries, zeros they would have scored had they played similar number of tests etc etc. for example ,how many runs would bradman or lara or gavaskar or miandad or border and other top batsmen would have scored in details mentioned above ,had they played similar number of tests played by tendulkar,and similarly how many runs and other details would tendulkar had achieved had he played similar numbers of tests ,innings and other details of each individual player, i think mr tendulkar would come far down in that list. what do you say ? we will see who was the greatest. [[ I understand what you are getting at. However that is an exercise I am not going to do. Unlike some obscure Australian academic who wanted to get the attention at this time, I have no intentions of doing that sort of futile exercises. I have done lot of peer comparisons of specialized batting numbers and these show Bradman with an imposing figure of around 2.80. The next one is Sutcliffe with 1.70. A number of batsmen are in the 1.60-1.70 range. Most modern batsmen, including SRT, Kallis, Javed are just below 1.50. Others like Ponting, Younis, Yousuf, Dravid, Lara are still lower. This is not a peer comparison of runs scored (as Dr. Rohde has done) but average, which is a performance measure. For me that is enough. Bradman, huge daylight through which a truck can be driven, and then the others. As far as Test cricket is concerned. And ODI comparisons should be within the 40 years. There cannot be any fourth-dimensional extrapolation. The advantage with this comparison is that this negates all other factors like period, pitches, bowling, cricket laws, helmet, etc. I had done this work over 2 years back and will probably re-do this soon since so much has happened since then. Ananth: ]]

  • Arjun on January 3, 2012, 12:16 GMT

    Ananth,

    good work. Still believe that Top-10 scores of the batsmen accross whole Test Match irrespective of batting position is better indicator of pitch quality. [[ Yes, I remember our discussions earlier on this. I think there is a strong correlation between the top-10 batsmen scores and top-7 RpW. Is it worthwhile doing a corollary piece on top-10 batsman runs and balls and then conclude the series with the batsman analysis. If so I can interject that article. Balls faced is important since that has an influence on how the pitch played. Let us throw this to the other readers also. Ananth: ]]

  • Gerry_the_Merry on January 3, 2012, 12:01 GMT

    Ananth, I completely understand what you are saying in the context of an innings rating exercise. I should have clarified upfront - if this and the next article are for innings rating, I take back my comment as you would automatically incorporate innings importance. I perfectly remember that it was your Wisden-100 which got me to realize the extra weightage that should be given for second innings. So when you do an innings rating work, I have no doubt this will automatically come in, so no need for an explicit layer. Hence here we are on the same page.

    I am ONLY talking in the context of "batsman across bowling group across ages" type of analysis. In this, in the May 2011 version, there was no differentiation between team Ist and team IInd. So TOUGHER second innings runs were not given their due. I am surprised that you don't find it correct to incorporate this differential in a batting career stratification exercise, but I have absolutely no doubt about what I am saying. [[ Gerry, I am looking at doing Batsman analysis Home/Away. I can also do by innings. These are all different aspects of batting. The reason why the Bowler and Pitch are done together is because they are interlinked together. It is nice to look at the possibilities of Marshall/Ambrose on flat pitches or Shakib/Mashrafe on bowler-friendly tracks. Ananth: ]]

  • Anshu N Jain on January 3, 2012, 12:00 GMT

    {For the example given, the GM of 30 and 60 is 42.4 which is way too high}

    Couldn't understand "way too high"; compared to what? The actual value shouldn't matter as long as the underlying process is agreed and understood to be the most apt. [[ By "too high" I mean that the weight given to BpW is too high in proportion to the RpW. The 42.4 is too close to the straight mean of the two numbers. I want this to be weighted towards the RpW figure. In reality either separately or together can be used. Let us wait for the other comments. Ananth: ]]

    If this is an in-match index, why should we use an RpO figure of 3.0, which is across ALL matches played to date, to "derive" suitable weights to assign to the RpW and BpW? [[ That is only an example. Has not been used at all. Ananth: ]]

    It unnecessarily burdens any and every match with what's happened outside of it, rather than sticking to actual in-match facts, which is the premise that this analysis of Pitch Quality rests on.

  • Anshu N Jain on January 3, 2012, 11:25 GMT

    {What do I do for the 750 matches for which balls played info is not available}

    So, T7-RpW is the average of the scores of the top 7 batsmen across all innings.

    How is the T7-BpW calculated? Is it the average of the balls faced by the Top 7 batsmen (dismissed) across all innings?

    If so, how is it calculated for matches which have no balls played information available at the Batsmen level? Is it then based on the overall BpW for the match? Or is it derived from the T7-RpW and overall RpO for the match? [[ My blind spot. And my apologies. I should have really said ""I do not want to use a number which has been developed on a pro-rata basis for over 750 scorecards."" However I agree with you that the BpW figure is a good enough basis. My worry is that the scoring rate variation is so much across ages, probably 2.0 to 4.0, that the runs scored by top-7 batsman cannot be ignored. Ananth: ]]

  • Anshu N Jain on January 3, 2012, 9:06 GMT

    You may also want to look at using the geometric mean of the RpW and BpW to get rid of the "perceived" arbitrariness and subjectivity of the allocation of weights to these figures. The same end-result is achieved without having to explain why the weights are what they are. [[ At the outset a good suggestion. However will not work. For the example given, the GM of 30 and 60 is 42.4 which is way too high. If we take RpW as x and BpW as 2x (based on RpO of 3.0), the following are the results. 75/25: 1.25 x 66.7/33.3: 1.33x GM: 1.414 x. Ananth: ]]
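    The comparison in the reply above can be checked with a few lines. This is a minimal sketch using the RpW of 30 and BpW of 60 from the example; the function names are invented for illustration.

```python
import math


def weighted_mean(rpw, bpw, rpw_weight):
    """Weighted arithmetic mean: rpw_weight on RpW, the remainder on BpW."""
    return rpw_weight * rpw + (1.0 - rpw_weight) * bpw


def geometric_mean(rpw, bpw):
    """Geometric mean of the two figures."""
    return math.sqrt(rpw * bpw)


rpw, bpw = 30.0, 60.0                          # BpW = 2 x RpW, i.e. an RpO of 3.0
blend_75_25 = weighted_mean(rpw, bpw, 0.75)    # 1.25 x RpW
blend_67_33 = weighted_mean(rpw, bpw, 2 / 3)   # 1.33 x RpW
gm = geometric_mean(rpw, bpw)                  # 1.414 x RpW
```

    With these inputs the 75/25 blend gives 37.5, the 66.7/33.3 blend gives 40.0 and the geometric mean gives about 42.4, close to the straight mean of 45, which is the objection raised: the geometric mean leans further towards BpW than either weighted blend.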

  • Anshu N Jain on January 3, 2012, 8:29 GMT

    First up, a very happy new year to you Ananth, and your loved ones!

    Understand that you have used the Top 7 Batsmen in the batting order (in each of the 4 innings) to calculate the overall RpW and BpW.

    An alternative that could be looked at is to use the Top 7 Batsmen by the number of balls faced in the innings.

    That way, a larger part of each innings will be accounted for in the final RpW and BpW figures, thereby making the RpW and BpW more "representative" of the overall match. Also, why should numbers 8-11 be left out if they have actually faced more balls than their top-middle order counterparts, either due to their own application or due to the pitch easing out (especially sessions 2 and 3 on Day 1)? Isn't this precisely what we intend to measure?

    Of course, it would be the case that the Top 7 Batsmen by batting order would invariably also face a very high majority of total balls in each innings, across all 2000+ tests.

    Look forward to your inputs. [[ Anshu, what do I do for the 750 matches for which balls played info is not available? In most matches 1-7 really means the batting cream. Taking 8-11 will dilute the numbers and will not show how difficult or easy the pitch was to bat on. Ananth: ]]
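
The two selection rules discussed in this exchange can be compared with a minimal sketch. Everything here is an assumption made for illustration: the innings figures are invented, RpW and BpW are taken as total runs and total balls of the selected batsmen divided by their dismissals, and the names `Batsman`, `top7_by_order`, `top7_by_balls` and `rpw_bpw` are hypothetical. It is not the article's actual computation, which also spans all innings of a match.

```python
from typing import List, NamedTuple, Tuple

class Batsman(NamedTuple):
    position: int    # place in the batting order (1-11)
    runs: int
    balls: int
    dismissed: bool

# Hypothetical single innings, invented purely for illustration.
INNINGS = [
    Batsman(1, 12, 40, True),  Batsman(2, 5, 18, True),
    Batsman(3, 48, 110, True), Batsman(4, 3, 9, True),
    Batsman(5, 61, 150, True), Batsman(6, 22, 55, True),
    Batsman(7, 8, 20, True),   Batsman(8, 35, 90, True),
    Batsman(9, 0, 4, True),    Batsman(10, 14, 45, True),
    Batsman(11, 2, 6, False),
]

def top7_by_order(innings: List[Batsman]) -> List[Batsman]:
    """Batsmen at positions 1-7 in the batting order."""
    return [b for b in innings if b.position <= 7]

def top7_by_balls(innings: List[Batsman]) -> List[Batsman]:
    """The seven batsmen who faced the most balls, whatever their position."""
    return sorted(innings, key=lambda b: b.balls, reverse=True)[:7]

def rpw_bpw(batsmen: List[Batsman]) -> Tuple[float, float]:
    """Runs per wicket and balls per wicket for the selected batsmen (assumed definition)."""
    wickets = sum(1 for b in batsmen if b.dismissed)
    if wickets == 0:
        raise ValueError("no dismissals in the selection")
    runs = sum(b.runs for b in batsmen)
    balls = sum(b.balls for b in batsmen)
    return runs / wickets, balls / wickets

for label, pick in (("by batting order", top7_by_order),
                    ("by balls faced", top7_by_balls)):
    rpw, bpw = rpw_bpw(pick(INNINGS))
    print(f"Top 7 {label}: RpW {rpw:.1f}, BpW {bpw:.1f}")
```

In this made-up innings the balls-faced rule swaps in numbers 8 and 10 for numbers 2 and 4 and raises both figures. The balls-faced rule also needs ball-by-ball data for every batsman, which is the practical objection raised in the reply for the roughly 750 scorecards without it.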

  • Gerry_the_Merry on January 3, 2012, 8:22 GMT

    Also, the XL file is slightly difficult to understand. Would you be able to insert a column for the batting team please? [[ I have used Home/Away teams. Will redo this as Batting/Bowling teams and upload. Both of us are talking nonsense. The PQI is for the match, not for the innings ??? Ananth: ]]

  • Gerry_the_Merry on January 3, 2012, 7:42 GMT

    Ananth, I have mentioned several times previously that there is a 10% delta in the aggregate averages of team Ist inn and team IInd inn. This is even more of a universal truth than the home and away delta, as it is true for 75% of batsmen and bowlers.

    Hence batsmen who score a 100 in the team IInd inn should be given more credit than for the same score in the team Ist inn. Else noted IInd inn performers like Laxman will get short shrift.

    Now in the PQI method, a Ist inn century will receive an even greater weightage than a 4th inn century. So between matches, it is a perfect method, but within a match, it will definitely favour Ist inn scorers, whereas it should be doing the opposite.

    Hence I propose that there must be an explicit layer which transfers some weight from team Ist inn to team IInd inn (e.g. -5% and +5%). That will straighten out this within-match kink forever. [[ And, Gerry, I have mentioned many times that what you are asking for is a Ratings work and will be done as and when I do an innings ratings project. This is a Bowler/Pitch analysis and let us leave it at that. I get the feeling you are sceptical of my understanding of what you want and are just emphasising this without understanding what I say. I understand this, and more. In spades. Otherwise how could I have done the Wisden-100 work? Without incorporating the innings in which the batsman played his knock, or the situation he came in at and the target score, how could three scores either side of 150 have found their place in the top-5? What you are suggesting is a straightforward delta. That is not correct. The context is more important. Dhoni's innings today, coming in at 59 for 4, is far more important than many a 50 scored while a team is ahead by over 300 in the third innings or while chasing 100 in the fourth. Why do you bring in Innings in a Bowler/Pitch analysis? There is no context in this. Runs scored off a top class attack or on a difficult-to-score pitch are always valuable, irrespective of context. And the combination of the two is like gold. Ananth: ]]

  • Gerry_the_Merry on January 3, 2012, 7:06 GMT

    If the batsman is like Lara, who has scored significantly as % of his own team's runs (2nd highest after Bradman, as per one of your old pieces), but whose team's bowling is very weak, allowing the opposition batsmen to boost T7-RpW, then how would he be compensated?

    A recent such example is Mike Hussey in last year's Ashes test in Adelaide. He top scored in both innings but England won by an innings after making a big score, boosting T7-RpW hugely.

    Also Dravid in the 4-0 hammering against England. [[ Gerry, there will be problems if you consider only the PQI. However all these batsmen would benefit when you consider the PQI and BQI together. In almost all the cases you have referred to, the opposition bowling was very good. The fact is the pitch does not change suddenly when the other team bats. It does so only in the minds of the watchers. It remains the same. Ananth: ]]

  • Gerry_the_Merry on January 3, 2012, 12:01 GMT

    Ananth, I completely understand what you are saying in the context of an innings rating exercise. I should have clarified upfront - if this and the next article are for innings rating, I take back my comment as you would automatically incorporate innings importance. I perfectly remember that it was your Wisden-100 which got me to realize the extra weightage that should be given for second innings. So when you do an innings rating work, I have no doubt this will automatically come in, so no need for an explicit layer. Hence here we are on the same page.

    I am ONLY talking in the context of "batsman across bowling group across ages" type of analysis. In this, in the May 2011 version, there was no differentiation between team Ist and team IInd. So TOUGHER second innings runs were not given their due. I am surprised that you don't find it correct to incorporate this differential in a batting career stratification exercise, but I have absolutely no doubt about what I am saying. [[ Gerry, I am looking at doing Batsman analysis Home/Away. I can also do by innings. These are all different aspects of batting. The reason why the Bowler and Pitch are done together is because they are interlinked together. It is nice to look at the possibilities of Marshall/Ambrose on flat pitches or Shakib/Mashrafe on bowler-friendly tracks. Ananth: ]]

  • Arjun on January 3, 2012, 12:16 GMT

    Ananth,

    Good work. Still believe that the Top-10 scores of the batsmen across the whole Test match, irrespective of batting position, are a better indicator of pitch quality. [[ Yes, I remember our discussions earlier on this. I think there is a strong correlation between the top-10 batsmen scores and top-7 RpW. Is it worthwhile doing a corollary piece on top-10 batsman runs and balls and then concluding the series with the batsman analysis? If so I can interject that article. Balls faced is important since that has an influence on how the pitch played. Let us throw this to the other readers also. Ananth: ]]

  • dr naveed on January 3, 2012, 16:39 GMT

    I want to know if you can provide a list of batsmen showing how they would have fared had they played a similar number of Tests and innings: their average, their runs per Test, and how many centuries, half-centuries and zeros they would have scored. For example, how many runs would Bradman or Lara or Gavaskar or Miandad or Border and other top batsmen have scored, in the details mentioned above, had they played a similar number of Tests to Tendulkar, and similarly how many runs and other details would Tendulkar have achieved had he played a similar number of Tests and innings to each individual player? I think Mr Tendulkar would come far down in that list. What do you say? We will see who was the greatest. [[ I understand what you are getting at. However that is an exercise I am not going to do. Unlike some obscure Australian academic who wanted to get attention at this time, I have no intention of doing that sort of futile exercise. I have done a lot of peer comparisons of specialized batting numbers and these show Bradman with an imposing figure of around 2.80. The next one is Sutcliffe with 1.70. A number of batsmen are in the 1.60-1.70 range. Most modern batsmen, including SRT, Kallis, Javed are just below 1.50. Others like Ponting, Younis, Yousuf, Dravid, Lara are still lower. This is not a peer comparison of runs scored (as Dr. Rohde has done) but of average, which is a performance measure. For me that is enough. Bradman, huge daylight through which a truck can be driven, and then the others. As far as Test cricket is concerned. And ODI comparisons should be within the 40 years. There cannot be any fourth-dimensional extrapolation. The advantage with this comparison is that this negates all other factors like period, pitches, bowling, cricket laws, helmet, etc. I had done this work over 2 years back and will probably re-do this soon since so much has happened since then. Ananth: ]]