December 19, 2008

The most efficient strike bowlers in Tests

A stats analysis to identify the most efficient strike bowlers in Test history

My usual lair is Different Strokes, but that's a place for (semi-)topical opinion rather than discussion of statistics methodology, and Rajesh has been kind enough to allow me to interlope and put this little study before you.

Although I didn't start out that way, what I've ended up with, I think, is a pretty good cross-era ranking of the most efficient strike bowlers Test cricket has known. I don't claim that it's definitive: what I do claim is that the method I've used is quite interesting, and I'd like to see what other stats mavens make of it.

The first decision I made was to eliminate all minnow matches. Leaving out Bangladesh and Zimbabwe is pretty commonplace, but if we're being realistic, England and Australia are the only sides that did not serve a bedding-in period as minnows before becoming a team to be reckoned with. It seems essential to eliminate minnow matches because otherwise some bowlers are at a distinct disadvantage: a bowler whose career ran from 1970 to 1980 never got a chance to bowl at a minnow team, whereas Fred Trueman had endless fun with weak Asian teams in the 1950s. Since I use Ric Finlay's Tastats, this sort of exclusion is very easily accomplished.

There being no formal event which declares a team to have "arrived" in Test cricket, I had to make some arbitrary judgements about when to regard a team as having graduated. I took South Africa's entry to senior ranks as having occurred when they unveiled their quartet of googly bowlers and comprehensively thrashed the fairly weak England team which toured in 1905-6, after which they were generally difficult to beat. Though West Indies won a series against England in 1934-35, the touring side was again half-strength; I decided that they did not really graduate until 1945. India's graduation I took to be 1961, Pakistan's 1965, New Zealand's 1969 and Sri Lanka's 1990. One might be able to argue that Zimbabwe were of a reasonable standard from about 1998-2003, but it seemed simpler to leave them and Bangladesh out of all consideration. Since I also have a prejudice against non-Test matches being included, the ICC Superflop game is also left out.

Subtracting those games has a widely-varying effect on a bowler's career total of wickets. Muralitharan drops from 751 wickets to 588 and Trueman from 307 to 192, whereas Jeff Thomson and Michael Holding's figures remain untouched.

Next, I decided to find a way to give greater credit for taking top-order wickets, because they are the ones you really want your strike bowlers to be cleaning up.

I was initially tempted to weight them on the basis of the runs scored at each position, but then realised that the top order contribution is exaggerated by declarations and innings cut short by the match being over. I then moved on to using the batting averages at each position.

Adding up the averages for each position gives a total of 307.27. The share for each wicket is given by positional average/total average, so the #3 average of 39.662 is 0.129 of the total. Those shares sum to 1, so if we multiply them by 11, they will sum to 11. This gives us the following weightings:

Position  1     2     3     4     5     6     7     8     9     10    11
Weight    1.34  1.27  1.42  1.48  1.34  1.15  0.98  0.75  0.56  0.41  0.331

If the dismissal of a batsman is worth the above number of wickets, then a bowler taking one of each will have a total of 11 wickets, whereas someone with a top-order bias will have more and someone who wipes up tail-enders exclusively will have a lot less. Owing to a limitation in Tastats, whose breakdown of bowlers' victims by position does not differentiate between openers, in practice I used 1.30 for both 1 and 2.
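A minimal sketch of this weighting in Python, assuming the victims-by-position counts are available as a simple mapping. The names `POSITION_WEIGHTS`, `position_weight` and `adjusted_wickets` are illustrative labels only, not anything taken from Tastats, and the weights are copied as printed in the table above.

```python
# Weights per batting position as printed above; both openers use 1.30
# because the victims-by-position breakdown does not separate #1 from #2.
POSITION_WEIGHTS = {
    1: 1.30, 2: 1.30, 3: 1.42, 4: 1.48, 5: 1.34, 6: 1.15,
    7: 0.98, 8: 0.75, 9: 0.56, 10: 0.41, 11: 0.331,
}


def position_weight(position_average, total_of_averages, positions=11):
    """Share of the summed positional averages, scaled so the weights sum to `positions`."""
    return position_average / total_of_averages * positions


def adjusted_wickets(victims_by_position):
    """Weighted wicket total for a mapping of batting position (1-11) to victims."""
    return sum(POSITION_WEIGHTS[pos] * count
               for pos, count in victims_by_position.items())


# The #3 figure quoted in the text: 39.662 / 307.27 * 11
print(round(position_weight(39.662, 307.27), 2))    # 1.42

# One victim at every position comes out at roughly 11 wickets.
one_of_each = {pos: 1 for pos in range(1, 12)}
print(round(adjusted_wickets(one_of_each), 2))      # 11.02
```

The total for one victim at each position is a shade over 11 rather than exactly 11 because the printed weights are rounded and the openers share a single value.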

To take three examples, Shane Warne's total gets adjusted from 685 to 685.2, Glenn McGrath's from 549 to 605.6, and Stuart MacGill's from 164 to 159.0. Given that in practice a lot more top-order batsmen than tail-enders get dismissed, most bowlers actually show a profit, so MacGill's reduction is evidence that he really was a tail-end cleaner.

If we apply this wicket adjustment to the figures for non-excluded matches and remove everyone who played fewer than 20 relevant games or took under 100 relevant wickets, this is the resultant top ten by average:

Player M Balls Runs Wkts Adj W AdjAve AdjSR
SF Barnes 27 7873 3106 189 203.6 15.26 38.67
R Peel 20 5216 1715 101 108.7 15.78 48.00
MD Marshall 81 17,584 7876 376 410.3 19.20 42.86
CEL Ambrose 96 21,641 8401 397 433.1 19.40 49.97
GD McGrath 120 28,485 11,930 549 605.6 19.70 47.04
AK Davidson 34 8997 3033 142 153.9 19.71 58.48
JC Laker 36 10,312 3611 162 178.3 20.25 57.84
AA Donald 69 14,906 7113 316 350.7 20.28 42.50
H Trumble 32 8099 3072 141 150.8 20.37 53.71
J Garner 58 13,175 5433 259 265.9 20.44 49.56

The right-hand column shows that there is a wide disparity between bowlers' strike rates. A strike bowler's efficiency does not depend solely on runs conceded; his strike rate is also an important factor because of the runs scored at the other end and the overall time taken. If Dale Steyn bowls six overs and takes a wicket but concedes 30 runs while Makhaya Ntini concedes 18 in his six without taking a wicket, the opposition are 48/1 at the end of these spells. If Shaun Pollock bowls 11 overs and concedes 20 runs while taking a wicket, 33 runs get conceded at the other end and the opposition reach 53/1 although the game is ten overs older.
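The arithmetic in that example can be checked with a short sketch. It assumes, as the example implies, that the bowler at the other end bowls the same number of overs, takes no wickets and concedes a flat three runs an over; the function name `score_after_spells` is hypothetical.

```python
def score_after_spells(overs, strike_runs, strike_wickets, partner_runs_per_over=3.0):
    """Return (runs, wickets, total_overs) once both bowlers have bowled `overs` overs.

    Assumes the partner bowls the same number of overs from the other end,
    goes wicketless, and concedes a flat `partner_runs_per_over`.
    """
    partner_runs = overs * partner_runs_per_over
    return strike_runs + partner_runs, strike_wickets, 2 * overs


print(score_after_spells(6, 30, 1))     # (48.0, 1, 12)  Steyn with Ntini
print(score_after_spells(11, 20, 1))    # (53.0, 1, 22)  Pollock with Ntini
```

The second case ends five runs worse off for the fielding side and ten overs later, which is the point about strike rate the example is making.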

I have for some time been toying with a measure I call the Power Index, which combines the average and strike rate by multiplying them together and taking the square root. Sqrt((runs/wickets)*(balls/wickets)) has a denominator of wickets, so the numerator can be seen as representing the resources used up in taking a wicket.
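As a minimal sketch, the Power Index can be computed directly from balls, runs and (adjusted) wickets. The function names here are illustrative, and the figures used are SF Barnes's row from the table above.

```python
from math import sqrt


def bowling_average(runs, wickets):
    """Runs conceded per wicket."""
    return runs / wickets


def strike_rate(balls, wickets):
    """Balls bowled per wicket."""
    return balls / wickets


def power_index(balls, runs, wickets):
    """Geometric mean of average and strike rate."""
    return sqrt(bowling_average(runs, wickets) * strike_rate(balls, wickets))


# SF Barnes: 7873 balls, 3106 runs, 203.6 adjusted wickets
barnes_pi = power_index(7873, 3106, 203.6)
print(round(barnes_pi, 2))                              # 24.29

# Equivalent form with wickets as the single denominator
print(round(sqrt(7873 * 3106) / 203.6, 2))              # 24.29
```

Because it is a geometric mean, the index treats a given percentage improvement in average and the same percentage improvement in strike rate as equally valuable.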

If we apply that algorithm, we get a new top ten, as follows:

Player M Balls Runs Wkts Adj W AdjAve AdjSR AdjPI
SF Barnes 27 7873 3106 189 203.6 15.26 38.67 24.29
R Peel 20 5216 1715 101 108.7 15.78 48.00 27.52
MD Marshall 81 17,584 7876 376 410.3 19.20 42.86 28.68
AA Donald 69 14,906 7113 316 350.7 20.28 42.50 29.36
CEH Croft 27 6165 2913 125 141.7 20.55 43.50 29.90
DW Steyn 23 4414 2706 114 114.7 23.60 38.49 30.14
GD McGrath 120 28,485 11,930 549 605.6 19.70 47.04 30.44
CEL Ambrose 96 21,641 8401 397 433.1 19.40 49.97 31.13
J Garner 58 13,175 5433 259 265.9 20.44 49.56 31.82
Waqar Younis 73 13,517 7374 293 312.3 23.61 43.28 31.97

Donald moves up, Ambrose and McGrath drop down, and Colin Croft, Dale Steyn and Waqar Younis come in at the expense of Davidson, Laker and Trumble.

However, this is deeply unsatisfactory because we know that Barnes and Peel played in a time when scores were lower and wickets fell much more often. Today's fashion is to bat aggressively from the word go, whereas in the middle of the last century caution was the Test batsman's watchword. We need a way of equalising for the changes in general pitch conditions and style of play.

This is a well-known problem, and what follows does not claim to be universally applicable.

But the essential aspects of what we are examining here are the balls bowled, runs conceded and wickets taken. If we can find a way of keeping one or more invariant, then we have a fixed point while scaling the others to fit.

I decided to use the first match innings of Tests as the way to fix par. The first innings of the match is the least likely to be cut short by weather, and the least likely to be affected by tactical considerations. A third innings can be anything from a stonewall grind trying to save a match to a hell-for-leather bash while trying to set a target, but a first innings is always going to be played at whatever pace the side think appropriate given the conditions and they will nearly always get as many runs as the conditions allow. The dimensions of the first match innings may change, but its tactical purpose does not.

Across our population of matches, the mean first match innings notches up 327 runs off 678 balls.

What I did was to find out the dimensions of the average first match innings in a particular bowler's period. I decided not to restrict the sample to matches that the bowler played in, because then his performances are effectively the norm and we don't see how he stood out (or not) from his contemporaries. I think we are more interested in how a bowler's performances stack up relative to everything that happened in his period, so I used all the non-excluded matches played in the cricket years (running May-April) which his career spanned. Someone who debuted on 11th November 1982 and finished on 25th August 1994 would thus have his period defined as 1982-1995 (Ric Finlay will recognise his "years from and to" filter option).
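One way to read the May-April rule is sketched below. The function name `cricket_year_span` is hypothetical, and the handling of dates between January and April (assigned to the cricket year that began the previous May) is an assumption drawn from the worked example rather than a documented Tastats behaviour.

```python
from datetime import date


def cricket_year_span(debut, final_match):
    """Return (from_year, to_year) for the cricket years (May-April) a career spans.

    Assumption: a match in January-April belongs to the cricket year that
    started the previous May.
    """
    start = debut.year if debut.month >= 5 else debut.year - 1
    end = final_match.year + 1 if final_match.month >= 5 else final_match.year
    return start, end


# The example from the text: debut 11 November 1982, last match 25 August 1994
print(cricket_year_span(date(1982, 11, 11), date(1994, 8, 25)))    # (1982, 1995)
```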

I then scaled each bowler's figures for balls bowled and runs conceded accordingly. So a bowler whose period averaged 340 runs off 650 balls is adjusted to concede his actual runs * 327/340 off his actual balls * 678/650. We now have adjusted figures for each of balls, runs and wickets and can run through our standard calculations for average, strike rate and PI to come up with our final result, the top ten of which looks like this:

Player B/I1 NewB R/I1 NewR NewW NewSR NewAve NewPI
MD Marshall 659 18,091.0 321 8023.2 410.3 44.10 19.56 29.37
SF Barnes 552 9670.1 266 3818.3 203.6 47.50 18.75 29.85
DW Steyn 630 4750.3 358 2471.7 114.7 41.42 21.55 29.88
AA Donald 644 15,693.0 319 7291.4 350.7 44.74 20.79 30.50
GD McGrath 645 29,942.4 335 11,645.1 605.6 49.44 19.23 30.83
KR Miller 800 8199.6 329 3597.0 175.3 46.77 20.52 30.98
RR Lindwall 798 9219.3 325 4337.5 200.3 46.03 21.66 31.57
CEH Croft 641 6520.9 308 3092.7 141.7 46.01 21.82 31.69
FS Trueman 764 9085.6 325 4665.5 203.5 44.64 22.92 31.99
JC Laker 790 8850.0 321 3678.5 178.3 49.64 20.63 32.00

B/I1 and R/I1 are the average first match innings balls and runs for that bowler's period.
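The linear scaling described above can be sketched as follows, using the all-time par of 327 runs off 678 balls and MD Marshall's figures from the tables. The function name `era_adjusted` and the dictionary keys are illustrative only.

```python
from math import sqrt

PAR_RUNS = 327     # mean first match innings runs across all non-excluded matches
PAR_BALLS = 678    # mean first match innings balls across all non-excluded matches


def era_adjusted(balls, runs, adj_wickets, period_balls, period_runs):
    """Scale balls and runs to the all-time par, then recompute SR, average and PI."""
    new_balls = balls * PAR_BALLS / period_balls
    new_runs = runs * PAR_RUNS / period_runs
    sr = new_balls / adj_wickets
    ave = new_runs / adj_wickets
    return {"balls": new_balls, "runs": new_runs,
            "sr": sr, "ave": ave, "pi": sqrt(sr * ave)}


# MD Marshall: 17,584 balls, 7876 runs, 410.3 adjusted wickets,
# in a period averaging 321 runs off 659 balls per first match innings
marshall = era_adjusted(17584, 7876, 410.3, period_balls=659, period_runs=321)
print(round(marshall["balls"], 1), round(marshall["runs"], 1))    # 18091.0 8023.2
```

The strike rate, average and PI from this sketch agree with Marshall's row to within about 0.01; small differences of that size are to be expected when working from the rounded figures printed in the table.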

As a dedicated supporter of SF Barnes as the king of bowlers, I am mortified to discover that Malcolm Marshall pips him to the top spot - but if Barnes had to be toppled, I'm glad it was Macko.

On a very contemporary note, Dale Steyn has made an incredible start to his career, since he comes in at number three (with a bullet) on this all-time list. Waqar Younis's figures at the same stage of his career were even more spectacular, with a PI of 27.29, so we can probably assume that Steyn will also descend the list as his career unfolds.

Of the top ten, only McGrath and Laker were ever really used in a containing role on dead pitches, and they did not do that much. In the full table, those who spend time keeping the runs down without taking wickets lose out, with the result that Shane Warne comes a lowly 55th. But then, this is not a merit ranking but an assessment of how nearly they approached the ideal of incessant lethality.

It's not an unbelievable top ten. If the model is wrong, it still manages to produce a sensible result.

But it can certainly be challenged on a number of points.

Are the cut-off dates for minnowhood reasonable?

Are the relative batting averages of the positions in the batting order a sound way to weight the value of wickets?

Is the Power Index a sensible way of combining parsimony and frequency to measure attacking prowess?

Is the first match innings a useful point of reference?

Even if comparing first match innings is reasonable, should one average the dimensions thereof for all matches or just the ones the bowler played in?

Whatever those averages are, is it sufficient to scale them in a linear fashion or should some more complex function be used?

So let the debate on those and no doubt other questions commence.

The full table is available here.


  • testli5504537 on September 20, 2009, 19:08 GMT

    Good article, but some changes could be made: (i) changes should also be made for the conditions the bowler bowls in, e.g. if the pitch is not conducive to spin, the wickets taken by a spinner should be taken into account, and wickets taken by a paceman in, say, the subcontinent should also be adjusted; (ii) adjustments should also be made for the strength of the bowler's strike partners, i.e. Marshall bowled alongside Holding, Garner and Croft. McGrath had Warne. While someone like Steyn has less support.

  • testli5504537 on August 3, 2009, 9:56 GMT

    It is amusing reading the comments by the subcontinental apologists all agog because their favourite bowler is not higher :) This list is not anti-spin.

    The only thing I would suggest is that when you removed the minnow matches, you removed the minnows. What of the minnow players like Heath Streak? They weren't usually bowling against minnow batsmen, were they? Just curious to see how folks like Streak and co (Strangs, Olonga) did.

    The ODI and T20 list would be interesting too.

  • testli5504537 on February 15, 2009, 17:14 GMT

    itz a great analysis...i appreciate the work done...i'l tell u people 1 thing!!...dont take the credit away of some great legends the sport has had..such as Murali Wasim and Warne..and even Kumble...watever said and done playing cricket for a long time and getting so much of wickets is not an easy thing as you sit on write comments..please be aware of it...and don't underestimate their performances...

  • testli5504537 on January 29, 2009, 9:49 GMT

    I have one suggestion: as a bowler gains a reputation as a strike bowler, opposition batsmen tend to be more wary of that bowler and tend to attack the other bowlers more. So you should take into account the drop in strike rate (and increase in economy) of his team-mates when the bowler in question is playing as opposed to when he is not playing. Also, spell-wise analysis may make some sense, as many times a change of bowler brings about a wicket not because the bowler is a top strike bowler but because of the batsmen's lack of adaptability to the new bowler.

  • testli5504537 on January 27, 2009, 9:19 GMT

    Hi Mike,

    I have been crunching some cricket stats of my own and part of my analysis uses the one that you have done. Can you also include the number of tests played by each one of these bowlers, after the removal of the 'minnow' tests? Thanks a lot.

  • testli5504537 on January 15, 2009, 23:59 GMT

    Analysis aside, though it does show lots of sense, I have watched several great bowlers bowling, the flair of their run, and intimidating batsmen over the years. I must say that Malcolm (MD) Marshall is the best of all.


  • testli5504537 on January 2, 2009, 3:52 GMT

    I’m afraid your analysis of the importance of strike rates, in terms of the example of Ntini partnering either Steyn, with a low strike rate and higher average, or Pollock with the higher strike rate but lower average, has some flaws. In your example you assume that Ntini won’t take any wickets himself, but when one assigns an average to him, then the situation changes. If Ntini has an average of 33 then partnering Pollock the score after the 11 overs would be 53/2, whereas partnering Steyn the 2nd wicket would fall at 88/2 (Ntini concedes 33, Steyn 55). Even with Ntini having a relatively high average, it can be shown that the Pollock/Ntini partnership will bowl the opposition out for a lower score. For example, if Ntini has an avg. of 33, Pollock/Ntini will take 10 wickets for 265 in 110 overs, whereas Steyn/Ntini will take 10 for 336 in 84 overs. The simple average is still the best statistic for measuring a bowler’s ability and is far more important than the strike rate.

  • testli5504537 on December 22, 2008, 16:11 GMT

    Great analysis, I must say! But bowlers like Steyn always have good strike rates because they bowl short bursts. This is slightly unfair to bowlers like McGrath who toil in over after over on unhelpful featherbeds just because they are not as fast as a Lee or Steyn. Also the weighting of positions is always biased against spinners, especially those who play in teams with a good battery of fast bowlers.

  • testli5504537 on December 21, 2008, 6:52 GMT

    I appreciate your hard work, but disagree, for a few reasons: 1. Many people have tried multiplying different statistics for bowlers and batsmen, but have never found a way that can encompass all measures of statistics in one single statistical measure, like average, strike rate, economy rate etc. Or we would be using it, instead of arguing if Jayasuriya is better than Dravid, despite having an inferior average, because Sanath’s strike rate is better. 2. Few would dispute that Waqar Younis was one of the best "strike" bowlers of all time. Many bowlers like McGrath are very efficient bowlers, but I would not consider them strike bowlers in that they do not strike as often as say Lee, Younis or Marshall. You could argue that McGrath is a better all-round bowler than Waqar Younis, but certainly not as efficient a strike bowler.

  • testli5504537 on December 21, 2008, 4:53 GMT

    I like this analysis, however I think there is a glaring difference. The conditions when a spinner comes on are greatly different to when a pacer does. Batsmen will have to be pried out from having been set for spinners, where pacers, or the best strikers, tend to face batsmen who are trying to play themselves in. And what about pitches? A bowler like Murali has played most of his career on a dustbowl whereas Warne at home really only has one spin-conducive pitch to bowl at. Same with pacers, where in the WI and in Perth pace bowling was much more helpful than in Pakistan.
