March 2, 2012

Tests - Pitch type analysis: The final solution ???

A detailed statistical analysis of the quality and type of pitches in each host country

Australia heavily dominated visiting teams in Tests between 1999 and 2004 Hamish Blair / © Getty Images

Finally I think I have found the solution to the vexed problem of how to handle the two very important factors faced by the batsmen. I am referring to the Bowler quality and Pitch type. These two are non-contextual in nature inasmuch as these are not influenced too much by the match conditions. I will briefly explain what has been done over the past 8 months in this blogspace. This will serve both as a recap and an introduction.

First the Bowling quality faced. I started with something simple and, with hundreds of wonderful responses from readers, I can confidently say that we have come very close to what is ultimately needed. A brief summary of the final method is given below.

1. BQI to be done based on actual bowlers who bowled. This will take care of situations such as Imran playing as a batsman.
2. Use the reciprocal weighting method, as suggested by Arjun. This takes away the excess dilution of the bowling quality by weaker bowlers.
3. Use career-to-date bowling average at the beginning of the concerned Test, with special methods to handle the initial Tests.
4. Use the appropriate home or away c-t-d bowling average depending on where the Test is being played. There was a clear consensus on these two methods.
5. Incorporate recent form of bowlers.

These have made the BQI (Bowling Quality Index) a very powerful and effective method of valuing runs scored.
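One possible reading of the reciprocal weighting in point 2 is a weighted harmonic mean of the bowlers' averages. The exact BQI formula is not spelled out in this article, so the weighting by balls bowled, the function names and the sample attack below are all assumptions made purely for illustration. The sketch only shows why a reciprocal-style mean is diluted less by a weak bowler than a plain weighted average is.

```python
def bqi_arithmetic(bowlers):
    """Balls-weighted arithmetic mean of bowling averages (for comparison only)."""
    total_balls = sum(balls for balls, _ in bowlers)
    return sum(balls * avg for balls, avg in bowlers) / total_balls


def bqi_reciprocal(bowlers):
    """Balls-weighted harmonic mean: total balls / sum(balls / average).

    Hypothetical reading of "reciprocal weighting"; not the article's code.
    """
    total_balls = sum(balls for balls, _ in bowlers)
    return total_balls / sum(balls / avg for balls, avg in bowlers)


# Made-up attack: (balls bowled in the innings, career-to-date bowling average).
attack = [(150, 22.0), (150, 24.0), (60, 45.0)]

print(round(bqi_arithmetic(attack), 2))
print(round(bqi_reciprocal(attack), 2))
```

Since a lower bowling average means a stronger attack, the reciprocal version rates this attack as stronger than the arithmetic version does: the part-timer averaging 45 pulls the figure up by less.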

Now for the Pitch Type methodology.

I had done this earlier as a post-match determination. At that time I was not comfortable with using a Pitch type measure using previous Tests in the concerned location. I was swayed by the wide variations which happened in Tests in locations like Hamilton, Leeds, Kingston, Chennai et al. I was quite unsure of the whole methodology and this was reflected in the analysis. However at least I got the measure used correctly, after a number of trials. This was the top-7/10 partnerships in the match. This was suggested by Arjun. This worked very well since this encapsulated about 15 batsmen performances.

However this single-match, post-match methodology had some basic problems, as effectively pointed out by Unni and Ali. There was the double counting of bowler performances. Batsmen in strong-bowling teams benefited since their own bowlers kept the other team's runs down and this, in turn, benefited the batsmen by making the match tilt towards a bowling one. I even have a complicated antidote to this problem sent by Unni.

So I set out to correct this. Using the single match lends itself to many varying situations, not all of which can be foreseen. So I have decided not to proceed on the single match basis. This article covers the revised work on Pitch type. I am sure the readers will find this more acceptable. In the next article I will look at the batsmen runs, adjusted by the BQI and revised Pitch type index.

I decided that I HAD to look at the history in detail. A couple of readers had also indicated that I must look at historic data and use the same to get an insight on the pitch type. So I put in some hard yards (or kilometres) in this area.

First I looked at the grounds. It is easy to get discouraged. The 2000+ Tests have been played on over 100 grounds. Only three grounds have hosted over 100 Tests, and even that works out to less than a Test per year. Only 11 grounds have hosted over 50 Tests. And to top it all, there are 57 grounds on which fewer than ten Tests have been played. How do I get a handle on a pattern? I set myself a minimum target of ten Tests and at some important grounds like Bangalore, this required a period coverage of 17 years. Even five Tests in Bangalore took seven years. After a few fruitless days, I realized that proper ground analysis can only be done for about ten grounds and two countries, viz., Australia and England. They have settled patterns for playing Test cricket, playing regularly on core grounds. So the ground option was a non-starter.

I was getting nowhere. Suddenly I thought, "why grounds, why not countries?" I first thought of the objections. Different types of grounds. Different levels of assistance. Flat tracks and dust bowls in the same country. But I realized that these could all be handled if I followed my instinct and set time frames suitably. I briefly toyed with, and discarded, taking fixed numbers of Tests for each country. This evened the distribution very well. However it did not allow me to introduce the almost mandatory peer comparisons across grounds/countries. The similar numbers required varying numbers of non-overlapping years. So I went back to my tried and trusted period method which I had earlier used for the Test analysis - across ages.

This has worked very well. I have given the salient points below.

1. There are nine periods in all. As expected, the first two cover 50 years because of the sparse nature of Test series then. These periods reflect clear trends in Tests over the long period of 14 decades.

2. For each period, by Test, the top-7 partnership averages are determined, that too for the two teams independently.

3. A very important distinction is made between the Home team's top-7 averages and Visiting team's top-7 averages. This separation has completely changed and strengthened the methodology. In general, the home team's numbers are better than the visiting team's numbers (barring countries like Bangladesh and, surprisingly, New Zealand) and these varying numbers are not mixed up together.

4. These home and visiting numbers for each country are compared to the period values for home and visiting teams across all countries. This ratio, which ranges from 0.72 to 1.50, gives a clear indication as to the relative weight of the runs scored. This is the cornerstone of this analysis. I will decide later how this range can be implemented: as a continuous ratio or in the group methodology. Readers can now understand the importance of keeping the time periods across all countries same.

5. A special note is needed for a few situations. Bangladesh: The seven Tests played during the 1950s-60s are treated as home Tests for Pakistan. The one Test played during 1999 between Sri Lanka and Pakistan is treated as neutral and visiting status is accorded to both. A similar treatment is given to the 12 Tests played in UAE: both teams in each of these Tests are accorded visiting team status. This is also the case for some of the 1912 Triangular series matches. In these cases, instead of the top-7, the top-10 partnerships are taken since all four innings fall into the "visiting" group.

6. Readers would know how much importance is given to peer comparisons in this blogspace. This method of working is peer comparison at its best. Let us not forget that we are looking at the pitch characteristics, and nothing else. A player's x runs scored in a specific country, during a specific period of time, are adjusted by a ratio between the home or visiting batsmen's average in that country and the home or visiting batsmen's average for the whole cricket world.

7. It can be seen that the problem of double counting has disappeared. Let me take the example of Lara which was used often earlier. The problem was that Lara benefited from the quality of his bowlers dismissing the opponents for low scores. Now Lara's own bowlers do not get into the picture at all. He would be evaluated by how he and his fellow West Indian batsmen performed at home as against the peer batsmen playing at their respective homes. And similarly for away batting.

8. It must be clearly understood that bowling quality faced by batsmen still is a very important measure. When Clarke scored 151 against India in 2004, these runs would be treated almost at par as far as Pitch type is concerned since scoring runs for visiting batsmen in India was almost the same as the world figure (65.4 against period average of 65.8). However this was against a very potent Group-5 level Indian attack and he would get substantial credit for this. This would apply to many an innings against Australia in Australia.

9. Arguments will be raised in favour of incorporating the first/second innings separation. The problem is that if I go with both home/visiting and first/second innings separation, there would be four numbers per match and the whole process will be diluted. The lower partnerships would be very small. And I am loath to take the first/second innings separation without considering the home/visiting teams, since that will add the stronger and weaker teams in a match and work out an average: a process I am not comfortable with.
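The steps in the list above can be sketched roughly as follows. This is a minimal illustration and not the actual program behind the tables: the record layout, the function names and the sample partnerships are assumptions, "top-7" is read as the seven highest partnerships of a team in a Test, and the period value is taken as the simple mean of the per-Test figures. The ratio is formed as the all-countries value divided by the country value, which is the direction that reproduces the figures in the tables (for example 66.3 / 60.1 = 1.10).

```python
from collections import defaultdict
from statistics import mean


def top_n_average(partnerships, n=7):
    """Mean of the n highest partnership totals of one team in one Test."""
    best = sorted(partnerships, reverse=True)[:n]
    return mean(best)


def pitch_type_ratios(records):
    """Compute the pitch-type ratio per (period, host country, status).

    records: iterable of (period, host_country, status, partnerships),
    one entry per team per Test, with status "home" or "visiting".
    """
    by_country = defaultdict(list)  # (period, country, status) -> per-Test T7 averages
    by_world = defaultdict(list)    # (period, status) -> same values, all countries
    for period, country, status, partnerships in records:
        if not partnerships:
            continue  # nothing to average for this team
        t7 = top_n_average(partnerships)
        by_country[(period, country, status)].append(t7)
        by_world[(period, status)].append(t7)

    ratios = {}
    for (period, country, status), values in by_country.items():
        world_avg = mean(by_world[(period, status)])
        ratios[(period, country, status)] = world_avg / mean(values)
    return ratios


# Made-up partnerships, for illustration only.
sample = [
    ("1980-1989", "New Zealand", "home", [100, 80, 60, 50, 40, 30, 20, 10, 5]),
    ("1980-1989", "Pakistan", "home", [150, 100, 90, 70, 50, 40, 25, 10]),
]
for key, ratio in pitch_type_ratios(sample).items():
    print(key, round(ratio, 2))
```

Home and visiting values are kept in separate buckets throughout, so a team's own bowlers never enter its batsmen's adjustment. For the neutral Tests described in point 5, the same function could be called with n=10 on the combined partnerships.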

Let me take the 1980-89 period. It will be obvious that run scoring in New Zealand, for the home team, which has a home-T7 average of 60.1, was more difficult than run scoring, for the home team, in Pakistan, which has a home-T7 average of 71.4. This is taken care of by the ratio between the respective home-T7 averages and the across-countries home-T7 average of 66.3. So Wright's 130, scored at Eden Park during 1984, will be weighted upwards and Zaheer Abbas's 168 at Lahore during 1984 will be weighted downwards.

Now let me take the 2006-12 period. It will be obvious that run scoring, for the teams visiting Sri Lanka, which has a visiting-T7 average of 62.2, was more difficult than run scoring for the teams visiting India, which has a visiting-T7 average of 72.0. This is taken care by the ratio between the respective visiting-T7 averages and across-countries visiting-T7 average of 68.4. So North's 128 in Bangalore during 2010 will be valued at a lower level and Shaun Marsh's 141 in Colombo during 2011 will be valued at higher level.
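The two examples above can be worked through numerically. The sketch below assumes the ratio is applied as a straight multiplier on the runs scored (the continuous option; the group methodology would treat it differently), and the function names are only illustrative. The averages are the ones quoted in the two paragraphs above.

```python
def pitch_ratio(world_t7_avg, country_t7_avg):
    """Across-countries T7 average divided by the country's T7 average."""
    return world_t7_avg / country_t7_avg


def adjusted_runs(runs, ratio):
    """Runs scaled by the pitch-type ratio (assumed continuous application)."""
    return runs * ratio


# 1980-89, home batsmen: across-countries home-T7 average 66.3
wright = adjusted_runs(130, pitch_ratio(66.3, 60.1))   # New Zealand, about 143.4
zaheer = adjusted_runs(168, pitch_ratio(66.3, 71.4))   # Pakistan, about 156.0

# 2006-12, visiting batsmen: across-countries visiting-T7 average 68.4
north = adjusted_runs(128, pitch_ratio(68.4, 72.0))    # in India, about 121.6
marsh = adjusted_runs(141, pitch_ratio(68.4, 62.2))    # in Sri Lanka, about 155.1

for name, value in [("Wright", wright), ("Zaheer", zaheer),
                    ("North", north), ("Marsh", marsh)]:
    print(f"{name}: {value:.1f}")
```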

In this analysis, it must always be remembered that, because the ratio is a run multiplying factor, sub-1.00 ratios indicate easier batting conditions and ratios above 1.00 indicate tougher batting conditions. I will later use these numbers to do a bowler analysis also. This will come out very well since the elements of playing at home or away are automatically incorporated.

I suggest readers read the above a couple of times to understand the methodology fully before moving on to the tables. These are organized by period and country. In view of the number of tables, the comments are kept to a minimum.

Period : 1877-1914

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1877-1914  South Africa  26     41.6         1.23        52.8             0.97

A somewhat lower scoring period. However that does not matter since we are using a ratio and a peer comparison. Scoring home runs against Australia was easier than doing so in England. A similar trend for the visiting batsmen.

Period : 1920-1939

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1920-1939  New Zealand   8      52.8         1.34        71.0             0.90
1920-1939  South Africa  28     58.8         1.20        69.2             0.92
1920-1939  West Indies   8      64.4         1.10        69.1             0.92

The averages increased dramatically with the arrival of the big scoring batsmen. England eased somewhat for the home batsmen. And the visiting batsmen did well.

Period : 1946-1959

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1946-1959  New Zealand   16     37.1         1.50*       61.3             0.98
1946-1959  South Africa  25     52.0         1.18        59.6             1.01
1946-1959  West Indies   24     78.9         0.78        82.7             0.73

The post-war period saw a drop in averages. The New Zealanders found it very difficult to score in their backyard, as did the Pakistani and South African batsmen. Visitors to Pakistan had it really tough. West Indies was a feather-bed for all batsmen.

Period : 1960-69

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1960-1969  New Zealand   19     49.1         1.32        59.5             1.05
1960-1969  South Africa  15     68.3         0.95        60.7             1.03
1960-1969  West Indies   20     74.9         0.87        66.7             0.93

Australia eased considerably for all batsmen. The Indian batsmen found their home scoring touch.

Period : 1970-79

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1970-1979  New Zealand   21     57.8         1.17        70.3             0.91
1970-1979  South Africa  4      78.8         0.86        47.4             1.35
1970-1979  West Indies   34     75.0         0.90        71.0             0.90

Pakistan changed dramatically for their own batsmen. With the advent of quality spinners, batting in India was not that easy. Both Australia and England became slightly easier for the batsmen.

Period : 1980-89

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1980-1989  New Zealand   28     60.1         1.10        57.8             1.07
1980-1989  Sri Lanka     12     55.4         1.20        57.1             1.08
1980-1989  West Indies   30     72.2         0.92        58.8             1.05

Look at the change in West Indies for the visiting batsmen. To be expected with the advent of the great fast men. England struggled at home.

Period : 1990-1998

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1990-1998  New Zealand   34     62.8         1.06        68.4             0.89
1990-1998  South Africa  30     62.2         1.07        54.3             1.12
1990-1998  Sri Lanka     26     67.5         0.98        62.5             0.97
1990-1998  West Indies   37     65.0         1.02        58.5             1.03

England became a much better country for all batsmen. Pakistan became tougher for all. Travelling to West Indies and Australia was becoming easier.

Period : 1999-2004

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
1999-2004  New Zealand   25     65.3         1.05        68.5             0.96
1999-2004  South Africa  31     76.9         0.89        57.2             1.14
1999-2004  Sri Lanka     36     71.6         0.96        59.4             1.10
1999-2004  West Indies   33     62.9         1.09        64.9             1.01

Look at Australia's home average and visiting average. That is one of the biggest differences we have ever had.

Period : 2005-2012

Period     Country       Tests  Home T7 Avg  Home Ratio  Visiting T7 Avg  Visiting Ratio
2005-2012  New Zealand   29     64.5         1.15        66.5             1.01
2005-2012  South Africa  39     70.2         1.05        61.0             1.09
2005-2012  Sri Lanka     31     77.9         0.95        58.7             1.14
2005-2012  West Indies   30     65.2         1.13        72.7             0.93

The last period sees a narrowing of the Australian figures. India and Pakistan became big scoring countries for all batsmen. The only Top-7 partnerships average exceeding 100 happened in Pakistan. Sri Lanka showed a wide variation between home batsmen and visiting batsmen.

Home Top-7 partnership averages: by country

Home T7-PsAvg  1877  1920  1946  1960  1970  1980  1990  1999  2005  All
Bangladesh                                               1.63  1.27  1.29
India                1.09  0.97  1.06  1.06  0.94  0.89  0.96  0.87  0.96
New Zealand          1.34  1.65  1.32  1.
South Africa   1.                                  1.07  0.89  1.05  1.07
Sri Lanka                                    1.20  0.98  0.96  0.95  0.95
West Indies          1.10  0.78  0.87  0.90  0.92  1.

Now for a graphical representation of how the numbers have stacked up, by country. Please remember that the lower part of the graph indicates that run scoring was on the easier side while the top half represents tougher run scoring conditions.

Top-seven partnership averages of home batsmen over the years © Anantha Narayanan

Most countries, barring Australia and England, have found it tough during their first period, even in their own countries. Pakistan seems to have wild swings. England seems to be the most stable of countries for the home batsmen. Barring a period or two, Australia have found their own backyard very comfortable.

Visiting Top-7 partnership averages: by country

Visiting T7-PsAvg  1877  1920  1946  1960  1970  1980  1990  1999  2005  All
Bangladesh                                                   0.82  0.84  0.78
India                    0.97  0.87  0.95  1.
New Zealand              0.90  0.98  1.05  0.91  1.07  0.89  0.96  1.01  0.96
Pakistan                       1.41  1.04  0.95  1.10  1.07  0.97  0.79  1.04
South Africa       0.97  0.92  1.01  1.03  1.35
Sri Lanka                                        1.08  0.97  1.11  1.15  1.05
West Indies              0.92  0.73  0.93  0.90  1.
Zimbabwe                                               1.01  0.92  0.95  0.93

Top-seven partnership averages of visiting teams over the years © Anantha Narayanan

It was indeed very tough for the visiting batsmen to travel to Pakistan during the initial periods. South Africa, during the 1970s, was similar. England was somewhat tough during the early stages but has eased off somewhat recently. India has almost always been a reasonably easy place to visit. Look at Australia over the past 15 years: not too easy a place to tour.

I will have a follow-up article like my previous one, grouping batsman scores against a combination of the BQI and the new PTI (this article). I will use the same groups methodology. 5 for BQI, already explained. And 5 for PTI, suitably allocated depending on the distribution of 150+ values.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems

Comments have now been closed for this article

  • testli5504537 on March 11, 2012, 4:32 GMT

    @Gerry_the_Merry Your comments make interesting reading, and your knowledge about the game is summed up best in your last sentence. But your suggestion about computing the T7B3 for India/WI baffled me. It would be better if you checked all the matches between the two teams and saw for yourself how many matches had that 25 BQI Windies bowling lineup up against India! For all the talk of Lara mostly playing in a strong bowling lineup compared to his Indian counterparts, he has played about 38% of his matches against attacks which had at least one great fast bowler (McGrath, Donald, Pollock, Akram, Waqar, Bond, Asif; the last two unfortunately couldn't play many; Ambrose included in the case of the Indians). In contrast Dravid and Tendulkar played merely 26% of their matches against them.

  • testli5504537 on March 11, 2012, 3:43 GMT

    @Ananth: Sure, all those 100-100 things would be nice and, in fact, necessary for a gaping hole of wants. I think SRT sold all his good fortune with that Ferrari last June. The decisions he made during April-June 2011 are as bizarre & embarrassing as his silence since then (not that he ever had something very perceptive to say!).

    I don't understand how a team can function with such an individual in it. Let the selectors take pity on team India and MSD now. I don't believe SRT will score a 100 in BD and am rooting for Kallis to hold the records of the aggregate test runs and test hundreds!

  • testli5504537 on March 10, 2012, 18:20 GMT

    @Ananth: SRT does not belong in ODI's anymore. Also, stats in tests since April 2011:

    VVS: 1. vs WI in WI: 3 tests, 243 runs, ave=49, 3 50's. 2. vs Eng in Eng: 4, 182, 22, 2 50's. 3. vs WI in Ind: 3, 298, 99, 1 hundred, 1 fifty. 4. vs Oz in Oz: 4, 155, 19, 1 fifty. SRT: 1. vs Eng in Eng: 4, 273, 34, 2 50's. 2. vs WI in Ind: 3, 218, 44, 2 50's. 3. vs Oz in Oz: 4, 287, 36, 2 50's. [[ Total: VVS: 14-878- 36.58 SRT: 11-778-37.04 Very little to choose from these two. Ananth: ]]

    VVS failed miserably in Eng & Oz but did quite well in WI and in Ind. So, if he is to go, SRT must also be removed. SRT did not fail that miserably in Eng & Oz --- in both cases, he was probably India's 2nd best performer. But when 37+ year olds post such stats series in series out, it is better if young batsmen take their place ... even if they post sub-30 averages, there is some upside to it. Now, I think even the media has had enough of SRT and wants the game to move on. Hope SRT does not view that as a challenge!! [[ I am flabbergasted to see his opting to go to Bangladesh, to play in a tournament which is generating less interest than the various Leagues which have cropped up. After more than 3 months on the road, playing in miserable losing matches, why would he not have stayed back in Mumbai and let Rahane take his place. Has the 100th 100 become that important. Maybe so. Some company might gift him a Rolls-Royce and the government might exempt tax. Pepsi's 100-100 colas would flood the market. Toshiba might come with a 100-100 laptop, autographed by SRT and charge Rs.100k. Reynolds might make a 100-100 pen and sell it for Rs.100. A few 100-100 outlets might be opened. Contrary to the perception of some misguided readers, I am a great fan of Tendulkar. That does not mean I cannot have other favourites. However this particular decision absolutely put me off. There is a, say, 50% chance that Tendulkar does not get his 100 in the 3/4 matches he gets to play. What then ??? Ananth: ]]

  • testli5504537 on March 9, 2012, 17:28 GMT

    Ananth: Dravid's timing & presentation of his retirement announcement had as much class as the impeccable manner in which he represented India as a player. A brilliant career as an ambassador/administrator awaits this cerebral gentleman. It is immensely satisfying that after rather sub-par periods of 2007-08 and 2010, he was at his glorious best for 6 months, May 2011-Nov 2011. Can you do a brief tribute article on him? I suggest the following idea (which I had done last year as well): - N = # innings played by a batsman; - A1 = # innings in which he either crossed 80 or played 150+ deliveries ... (a "dominant" innings); - A2 = # innings in which he either crossed 35 or played 80+ deliveries ... (a "supportive" innings); - B1 = A1/N; - B2 = A2/N. Now, compute A1, A2, B1, B2 for all batsmen, and determine where Dravid ranks in it. [[ Will certainly do a special article on Dravid. That would be our repayment to someone who has always stayed in the background, getting very little recognition and along with VVS represents the quintessential gentleman. Even SRT has exasperated us at times, with his pick-and-choose policy, upto and including the Asia Cup. But not these two. I only hope VVS follows soon. He should not fall into the trap of thinking "Now that Dravid has gone, India might need me in the middle-order" and wait until the next home series. The selectors may not select him for the home series. Then he loses face and it would be a forced exit. He does not deserve that. Ananth: ]]

  • testli5504537 on March 8, 2012, 6:49 GMT

    @Milpand. "The outcome of every single delivery is random and independent of previous outcome. So a Markov Chain, Monte Carlo simulation and Bayesian probabilities would be the basis of the simulation." - not quite sure how we got onto the topic of simulation in an analysis of pitch types based on historical performances, but anyway...and I'm far from an expert on these matters, but... Applying statistical models to simulate games is all very well, but once again I think your first sentence here is an incorrect summation. From what I understand of Markov chains, particularly relating to sport, no match in its entirety fits a Markov state. While they can be used to predict various possibilities from a certain match situation, they don't imply the randomness and independence as you suggest. Really the only aspect of a cricket match which is random and independent of previous outcomes is the toss - although MS might tend to disagree after recent difficulties [[ First let me apologize for the delay in replying. I was unwell and then Cricinfo servers went on the blink for a couple of days. In Football, the quality and level of success of a pass from player A to player B will determine what happens next. Also the defensive wall in front. Does B pass the ball back to A, pass to C or take a shot at goal etc. In Cricket, in my simulation, each ball is a contest by itself. What happened in the previous ball, be it a wicket, dot ball, 1-6 runs, play-and-miss, dropped catch et al, would go to update the match status. Then the new contest begins. It will be coloured and influenced by the match status but not directly by what happened in the previous ball. I can only see one exception. A no-ball and a free-hit, which I have not introduced in my simulation. The other peripheral impact might be if this was a hat-trick ball. Would not the batsman go completely on the defensive, trying to prevent a wicket? Not really. If they needed 10 from 7 balls and this was a hat-trick ball, the batsmen would swing, no matter. Ananth: ]]

  • testli5504537 on March 6, 2012, 23:35 GMT

    Ananth, will it be possible to view the player group analysis sheet in your followup article in the traditional format which shows Mat,Inns,NO,Runs,HS,Ave,100,50,0 separately for each group? [[ First let me apologize for the delay in replying. I was unwell and then Cricinfo servers went on the blink for a couple of days. Not to that level. But Inns/Nos/Runs/Avge.. Ananth: ]]

  • testli5504537 on March 5, 2012, 19:15 GMT

    I sense that this is a continuation of a previous article (articles?). Links to them would be most appreciated. [[ First let me apologize for the delay in replying. I was unwell and then Cricinfo servers went on the blink for a couple of days. It is difficult to provide links to all. If you go over recent archives you yourself will find them. Ananth: ]]

  • testli5504537 on March 5, 2012, 8:17 GMT

    (contd.) For Tendulkar, take all WI scores and scale down by 25/35, then compute T7B3. For the two batsmen, the relative BQI will be stripped out of the equation.

    Now which BQI to take? Since you are already splitting Home/away, for the purpose of the above calculation, you must use simple CTD (home+away) average based BQI, else there will be another case of double counting.

    I have thought about this carefully, and am certain that you will get an exact fix on pitch quality, which will hence be captured on a single-match basis.

    So please don’t define pitch quality as the character of a pitch – it just doesn’t exist at a venue level consistently enough.

    Can't resist throwing in this example also: Ahmedabad 1983 v/s West Indies – Clive Lloyd called it the worst pitch he had ever seen, whereas in 1986-87 v/s Pak, it was the most somnolent pitch in living memory. Final Example: When England plays two 3 test series in a summer, why do Indians prefer to visit in the latter half??? [[ Gerry I am going to complete this task as I have set out. Conceding all variations you have mentioned, the nature of pitch over a period is a good indicator of how tough or easy run scoring was. This is also a method to quantify subjective statements such as "Run scoring in Sri Lanka is more difficult than India, during the past few years". If so, I am going to assume that a century scored in Sri Lanka is more valuable than a similar one scored in India. However, as I have already mentioned to Arjun, when I come to Innings Rating analysis I will revert to single match basis. In addition to the 5 or so I already have, I have Unni's method, Arjun's revised method and now yours. Let me evaluate all and come to a fresh one. Ananth: ]]

  • testli5504537 on March 5, 2012, 8:09 GMT

    Ananth, why must you at all carry out multi match assessment of pitch quality? Is there such a quality at all? 1) The MCG pitch on which India bowled Aus out for 83 in 1980 was totally different from the one on which Hughes made 100* a year later. 2) The Adelaide 1990 pitch for the Ashes was relaid for Aus-India in 1991 (Aus 145 all out on day 1) 3) I have already talked about the massive difference in Pak pitches whenever WI visited in '80s (see 1980-81 series and 1986 series) 4) Madras is sometimes fast, like Brisbane (Pak 1980, England 1985) and sometimes very slow (Aus 2001). What you did in the previous edition was absolutely on the right track. Let me suggest a way (please comment if you publish) to neutralize double counting:

    Taking a hypothetical example of WI/India, with focus on Lara (weak batting strong bowling, 25 BQI) / Tendulkar (opposite, 35 BQI). Compute two different RSI. For Lara, adjust all Indian scores upwards by 35/25, then compute T7B3. For Tendulkar, take (contd.)

  • testli5504537 on March 4, 2012, 22:26 GMT

    For academic interest, read the paper on ODI simulation published in Canadian Journal of Statistics. [[ Let me do so. Ananth: ]]
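The innings classification proposed in the comment of March 9 above (N, A1, A2, B1, B2) is simple enough to sketch. The thresholds are taken from that comment, reading "crossed 80" as more than 80 runs and "150+ deliveries" as 150 or more balls; the function name and the sample innings are made up for illustration.

```python
def innings_profile(innings):
    """innings: list of (runs, balls_faced) tuples for one batsman."""
    n = len(innings)
    # "Dominant": crossed 80 runs or faced 150+ deliveries.
    a1 = sum(1 for runs, balls in innings if runs > 80 or balls >= 150)
    # "Supportive": crossed 35 runs or faced 80+ deliveries.
    a2 = sum(1 for runs, balls in innings if runs > 35 or balls >= 80)
    return {
        "N": n,
        "A1": a1,
        "A2": a2,
        "B1": a1 / n if n else 0.0,
        "B2": a2 / n if n else 0.0,
    }


# Made-up innings: (runs, balls faced).
sample_innings = [(120, 200), (40, 60), (10, 160), (5, 12)]
print(innings_profile(sample_innings))
```

As defined, every dominant innings also counts as a supportive one, so B2 is never smaller than B1.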
