Test cricket is a subtle sport from the point of view of ranking teams. Four results are technically possible in each match, and three of these are common. Unlike in other sports, a draw does not require that both sides end up with equal points. The draw only means that neither team did enough to earn an outright result. The Test calendar is not regular in the way the professional tennis calendar or the IPL is. The length of a Test series also varies. For example, if India had played a two-Test series in England this year, as Sri Lanka did, they would have come away with a 1-0 series win. Pitches and the weather affect how strong a team's performance is. All these characteristics make the problem of comparing Test teams an interesting one.
There are three major challenges:
How does one measure the margin of victory or defeat?
How does one measure a draw? If one team earns a draw through a desperate tenth-wicket stand that survives the final hour while still 250 runs short of the fourth-innings target, how does one account for the relative positions of the two teams at the end of the match?
Even if the first two questions can be answered, how does one decide which matches to consider in the rating? What is a reasonable cut-off? Does one consider the most recent home and away series by each team against each of the other teams? Or does one simply consider all Tests in the last 24 months with some arbitrary decay set in?
The ratings published by the ICC ignore the margin of victory entirely and only consider wins, defeats and draws. Each team gets equal points in a draw. A Test win results in one point, a defeat results in zero. The ICC's ratings methodology does take into account the strength of the opposition.
Anantha Narayanan has developed an interesting advanced variant of this idea. He uses it to identify periods of success, but his method could easily provide ratings.
I first worked out a method for rating Test teams in 2003. My approach has been to rely on the score and the scorecard. I'll use the example of the Galle Test. The Test can be described as follows:
The cost of one wicket for the Galle Test is 39.5 runs. Each team's performance per delivery in the Test can be calculated from its batting and bowling figures. Sri Lanka's were:
Batting: scored 632 runs in 1077 deliveries
Bowling: took 20 wickets in 1327 deliveries
Sri Lanka's performance per delivery for the Galle Test is:
(632/1077) + (20*39.5/1327) = 0.587 + 0.595 = 1.182
Similarly, Pakistan's performance per delivery for the Galle Test is:
(631/1327) + (12*39.5/1077) = 0.476 + 0.440 = 0.916
In case of an outright win, the winning team is given a "win bonus" - the average points scored by a team in the Test. In this instance, win bonus = average (1.182, 0.916) = 1.049
The final rating for each team:
Sri Lanka: 1.182 + 1.049 = 2.231
Pakistan: 0.916 + 0 = 0.916
In case of a draw, the win bonus is not awarded. In case of a tie, both teams are awarded the win bonus.
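The Galle arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the exact implementation: the function names and the `result` labels are hypothetical, and the wicket cost is assumed to be total match runs divided by total wickets fallen, an assumption that happens to reproduce the 39.5 quoted above (1263 runs for 32 wickets).

```python
def wicket_cost(runs_a, wkts_lost_a, runs_b, wkts_lost_b):
    """Assumed definition: total match runs divided by total wickets fallen."""
    return (runs_a + runs_b) / (wkts_lost_a + wkts_lost_b)


def performance_per_delivery(runs_scored, balls_faced, wkts_taken, balls_bowled, cost):
    """Runs per ball faced, plus the run value of wickets taken per ball bowled."""
    return runs_scored / balls_faced + wkts_taken * cost / balls_bowled


def match_ratings(perf_a, perf_b, result):
    """Add the win bonus. `result` is 'a', 'b', 'draw' or 'tie' (hypothetical labels)."""
    bonus = (perf_a + perf_b) / 2  # average performance of the two teams
    bonus_a = bonus if result in ("a", "tie") else 0.0
    bonus_b = bonus if result in ("b", "tie") else 0.0
    return perf_a + bonus_a, perf_b + bonus_b


# Galle Test figures from the text; the cost is rounded to 39.5 as in the text.
cost = round(wicket_cost(632, 12, 631, 20), 1)
sri_lanka = performance_per_delivery(632, 1077, 20, 1327, cost)
pakistan = performance_per_delivery(631, 1327, 12, 1077, cost)
sl_rating, pak_rating = match_ratings(sri_lanka, pakistan, result="a")
print(round(sl_rating, 3), round(pak_rating, 3))  # 2.231 0.916
```

Passing `result="draw"` returns the two per-delivery figures unchanged, and `result="tie"` adds the bonus to both sides, matching the rules stated above.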
This method gives us a rating for each team in any type of Test. With this basic score, we can advance in two directions. First, we can calculate any team's average rating over a given sequence of Tests. Second, we can calculate the average rating for each player over a career, or at any point in a career (similar to the way we calculate batting average).
The first is simply an average of a team's rating in each Test over the given sequence. Later in this post, I use a moving average over the ten most recent Tests to make the chart easier to read.
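A ten-Test moving average of this kind might be computed as follows. This is only a sketch: the function name is hypothetical, the sample ratings are made up for illustration, and averaging over fewer Tests at the start of the sequence is an assumption about how the first nine points are handled.

```python
def moving_average(ratings, window=10):
    """Trailing mean over the most recent `window` Tests.

    Early in the sequence, when fewer than `window` Tests exist,
    the mean is taken over the Tests played so far (an assumption).
    """
    smoothed = []
    for i in range(len(ratings)):
        recent = ratings[max(0, i - window + 1): i + 1]
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Made-up match ratings in chronological order, for illustration only.
sample = [2.2, 0.9, 1.1, 2.4, 1.0, 2.3, 2.1, 0.8, 1.2, 2.5, 2.6, 0.7]
print([round(x, 2) for x in moving_average(sample)])
```

A trailing window is used so that each point depends only on Tests already played, which keeps the chart readable without looking ahead.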
For the second, assign the team's rating for a Test to each player in the XI. This way, a player gets a rating for each Test played. A player's rating at the end of their nth Test is the average of ratings for each of the first n Tests (similar to batting average). This would be the player's rating at the start of the (n+1)th Test. A player on Test debut has a rating of zero.
For each player in a starting XI for a Test, we now have a rating at that point in their individual career. The rating for the playing XI is simply the sum of the individual ratings for the 11 players (zero for players on debut).
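The bookkeeping for player and XI ratings can be sketched like this. It is an illustration under stated assumptions: the function name, the input layout (a chronological list of XI and match-rating pairs for one team) and the player labels are all hypothetical, and the match ratings in the example are invented.

```python
from collections import defaultdict


def xi_ratings(tests):
    """Pre-match XI rating for each Test in a chronological sequence.

    tests: list of (xi, team_match_rating), where xi is a list of 11 names.
    Each player's rating before a Test is the average of the team ratings
    from the Tests they have already played (zero on debut).
    """
    totals = defaultdict(float)  # sum of match ratings credited to each player
    caps = defaultdict(int)      # number of Tests each player has played so far
    out = []
    for xi, match_rating in tests:
        # Rating of the XI at the start of this Test: sum of career averages.
        out.append(sum(totals[p] / caps[p] if caps[p] else 0.0 for p in xi))
        # After the Test, every member of the XI is credited with the team's rating.
        for p in xi:
            totals[p] += match_rating
            caps[p] += 1
    return out


# Hypothetical players P1..P11; P12 replaces P11 for the third Test.
first_xi = [f"P{i}" for i in range(1, 12)]
changed_xi = first_xi[:10] + ["P12"]
history = [(first_xi, 2.0), (first_xi, 1.0), (changed_xi, 1.5)]
print(xi_ratings(history))  # [0.0, 22.0, 15.0]
```

In the example, the third XI rates 15.0 because ten players each carry an average of 1.5 and the debutant contributes zero, which is how a personnel change shows up directly in the rating.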
This method of calculating the rating eliminates the problem of finding relevant matches to include in a rating. Normally this problem is solved by using some mathematical decay. The ICC's rating method assigns different weights to different matches. Matches since the most recent August have the highest weight while matches in the 12 months leading up to the most recent August are assigned half the weight.
Anantha Narayanan's method uses geometric decay. My approach relies on personnel changes. Among other things, this accounts for injuries and other absences. But mainly it relies on the idea that teams are selected to win, and if they don't produce wins, players are changed. Further, whether they win or lose is due to individual players. The best available players are used to build a balanced team. Fans tend to be cynical about selectors, but these points hold true for Test team selection. More generally, the majority (sometimes even up to ten) of players in an XI pick themselves.
For example, India held Australia to a draw in Australia in 2003-04, a feat that looks less impressive when you account for the fact that Australia were missing Glenn McGrath and Shane Warne in that series. I'll use the remainder of this post to illustrate the results using this method.
The chart above shows the rating for Australia in the 21st century. Each blue data point represents a Test match. Each orange data point represents the median experience, measured in number of Tests, for Australia. I've shown a ten-Test moving average to make reading the chart easier. The blue band represents rating and the orange band represents median experience. Ratings for teams range between 0 and 30. The maximum rating for a Test in Test history is by England in 1930. But it is a mistake to read too much into ratings for individual Tests. A more reliable picture emerges when a team builds a sequence of highly rated Tests. This means that (a) it is winning, and (b) it is not losing its winning players to retirement, and hence is likely to keep winning.
When I made these charts, I realised that tracing median experience was just as illustrative as tracing the rating. I'll chart England over the same period next. I've maintained the scale of the chart to make comparison easier. Incidentally, the Australian side that played England in Melbourne on Boxing Day in 2006 is the most experienced Test team in history. The median player in that XI, Matthew Hayden, had played 88 Tests.
The chart for England shows their two distinct high points in the 21st century. The first, under Michael Vaughan in 2004 and 2005, culminated in winning the Ashes against Australia after 16 years. The second, a steadier, less spectacular rise, began under Andrew Strauss in early 2009 and ended with two comprehensive defeats - 3-0 to Pakistan in the UAE in 2011-12 and 2-0 to South Africa in England in 2012. Alastair Cook took over the captaincy for England's tour of India. His team was more or less the same as Strauss' team. As we reach the right edge of the chart, we see that while Cook won in India and won the Ashes in 2013, the most recent episode has been the departure of the players he inherited from Strauss.
Since the subject has come up recently, I will conclude this post by showing the picture that this method paints of the relative merits of Australia and West Indies in their heyday. To do so, I present charts showing the rating and median experience for each since 1945.
The West Indian side of the late-1970s and '80s has been the subject of much myth-making. The award-winning documentary film Fire in Babylon is symptomatic of this. It's worth remembering, however, that the best phase of West Indian success coincided with a decline in the strength of England, Australia and India. Australia lost Greg Chappell, Dennis Lillee and Rod Marsh to retirement in the early 1980s. India couldn't replace their spinners adequately. Sunil Gavaskar was in decline in the 1980s. England didn't produce a single bowler of note in the 1980s. In fact, it would be fair to say that England didn't produce a single world-class bowler between Ian Botham and James Anderson. There were bowlers who had stray seasons of high performance, but nobody with the class to sustain it season after season.
West Indies may have had high-quality bowling in the mid-1980s, led by the incomparable Malcolm Marshall, but their record against the one team that had a great bowler of its own, Pakistan, is revealing. The West Indian batting in that period was suspect, as Gordon Greenidge and Viv Richards aged, and Brian Lara was yet to emerge.
If anything, I suspect that a hypothetical West Indies team between 1979 and 1982 would beat the West Indian side of 1984-1987. The batting would have Clive Lloyd, Greenidge, Alvin Kallicharran and Richards, most of them in their prime. The bowling would have Michael Holding, the wily Andy Roberts and a young Marshall.
By comparison, the Australians faced at least two challenging teams at their peak. They had to play India in India - a difficult proposition against Anil Kumble and Harbhajan Singh and India's batsmen (with the power of Virender Sehwag on subcontinental wickets). And they faced Michael Vaughan's English side that hammered every team it faced in the lead-up to the 2005 Ashes.
Ponting's side from 2005 to 2009 dominated Test cricket far more than Steve Waugh's team did. From October 2005 to May 2008, they won 21 out of 25 Tests, including four out of five against West Indies, five out of six against South Africa, five out of five against England, two out of two apiece against Sri Lanka and Bangladesh, and two out of four against India.
This was, coincidentally, a period of decline for both South Africa and England. Shaun Pollock was ageing and Dale Steyn was yet to emerge as the dominant force he has become. England's fast bowlers from 2005 never bowled together again with anything near comparable effect. India were the only team capable of challenging Australia. They did so, winning the only Test Australia lost during this period.
Dominance in Test cricket has tended to be the result of other teams not having high-quality bowling and/or competent batting. Waugh's Australian side may have been as good as the West Indian side of the 1980s, but Ponting's side of the mid-2000s was more dominant than either of those teams.
With this rating design, I have answered the three big questions I consider to be of central importance in the matter. The method allows me to account for margin of victory, for draws, and it allows me to build a rating without arbitrarily deciding which matches to consider in the rating. More importantly, it allows us to examine teams in time.
Ratings are interesting and instructive - far more so, in my view, over a sequence of games than they are for one solitary point in time.