The most efficient strike bowlers in Tests

A stats analysis to identify the most efficient strike bowlers in Test history

25-Feb-2013

My usual lair is Different Strokes, but that's a place for (semi-)topical opinion rather than discussion of statistics methodology, and Rajesh has been kind enough to allow me to interlope and put this little study before you.

Although I didn't start out that way, what I've ended up with, I think, is a pretty good cross-era ranking of the most efficient strike bowlers Test cricket has known. I don't claim that it's definitive: what I do claim is that the method I've used is quite interesting, and I'd like to see what other stats mavens make of it.

The first decision I made was to eliminate all minnow matches. Leaving out Bangladesh and Zimbabwe is pretty commonplace, but if we're being realistic, only England and Australia did not have a bedding-in period as minnows before they became teams at least to be reckoned with. It seems essential to eliminate minnow matches because otherwise some bowlers are at a distinct disadvantage: a bowler whose career ran from 1970 to 1980 never got a chance to bowl at a minnow team, whereas Fred Trueman had endless fun with weak Asian teams in the 1950s. Since I use Ric Finlay's Tastats, this sort of exclusion is very easily accomplished.

There being no formal event which declares a team to have "arrived" in Test cricket, I had to make some arbitrary judgements about when to regard a team as having graduated. I took South Africa's entry to senior ranks as having occurred when they unveiled their quartet of googly bowlers and comprehensively thrashed the fairly weak England team which toured in 1905-6, after which they were generally difficult to beat. Though West Indies won a series against England in 1934-35, the touring side was again half-strength; I decided that they did not really graduate until 1945. India's graduation I took to be 1961, Pakistan's 1965, New Zealand's 1969 and Sri Lanka's 1990. One might be able to argue that Zimbabwe were of a reasonable standard from about 1998-2003, but it seemed simpler to leave them and Bangladesh out of all consideration. Since I also have a prejudice against non-Test matches being included, the ICC Superflop game is also left out.

Subtracting those games has a widely varying effect on a bowler's career total of wickets. Muralitharan drops from 751 wickets to 588 and Trueman from 307 to 192, whereas Jeff Thomson's and Michael Holding's figures remain untouched.

Next, I decided to find a way to give greater credit for taking top-order wickets, because they are the ones you really want your strike bowlers to be cleaning up.

I was initially tempted to weight them on the basis of the runs scored at each position, but then realised that the top order contribution is exaggerated by declarations and innings cut short by the match being over. I then moved on to using the batting averages at each position.

Adding up the averages for each position gives a total of 307.27. The share for each wicket is the positional average divided by that total, so the #3 average of 39.662 is 0.129 of the total. Those shares sum to 1, so if we multiply them by 11, they will sum to 11. This gives us the following weightings:

1	2	3	4	5	6	7	8	9	10	11
1.34	1.27	1.42	1.48	1.34	1.15	0.98	0.75	0.56	0.41	0.33

If the dismissal of a batsman is worth the above number of wickets, then a bowler taking one of each will have a total of 11 wickets, whereas someone with a top-order bias will have more and someone who wipes up tail-enders exclusively will have a lot less. Owing to a limitation in Tastats, whose breakdown of a bowler's victims by position does not differentiate between openers, in practice I used 1.30 for both 1 and 2.
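The weighting described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet actually used: the function names are invented here, the victim breakdowns are hypothetical rather than any real bowler's, and the #11 weight is assumed to be 0.33.

```python
# Published weightings by batting position, with 1.30 used for both openers
# as described in the text. The #11 value is taken as 0.33 (an assumption).
WEIGHTS = {1: 1.30, 2: 1.30, 3: 1.42, 4: 1.48, 5: 1.34, 6: 1.15,
           7: 0.98, 8: 0.75, 9: 0.56, 10: 0.41, 11: 0.33}


def position_weight(position_average, total_of_averages, positions=11):
    """Share of the summed averages, scaled so all weights sum to `positions`."""
    return position_average / total_of_averages * positions


def adjusted_wickets(victims_by_position, weights=WEIGHTS):
    """Sum of weight * number of victims at each batting position."""
    return sum(weights[pos] * n for pos, n in victims_by_position.items())


# The #3 figures quoted in the text reproduce the published weight.
print(round(position_weight(39.662, 307.27), 2))  # 1.42

# Hypothetical breakdowns (not real bowlers): one victim per position,
# versus ten wickets taken only from positions 9 to 11.
one_of_each = {pos: 1 for pos in range(1, 12)}
tail_only = {9: 4, 10: 3, 11: 3}
print(round(adjusted_wickets(one_of_each), 2))
print(round(adjusted_wickets(tail_only), 2))
```

The one-of-each bowler comes out at roughly 11 adjusted wickets, while the tail-only bowler's ten raw wickets shrink to fewer than five.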

To take three examples, Shane Warne's total gets adjusted from 685 to 685.2, Glenn McGrath's from 549 to 605.6, and Stuart MacGill's from 164 to 159.0. Given that in practice a lot more top-order batsmen than tail-enders get dismissed, most bowlers actually show a profit, so MacGill's reduction is evidence that he really was a tail-end cleaner.

If we apply this wicket adjustment to the figures for non-excluded matches and remove everyone who played fewer than 20 relevant games or took under 100 relevant wickets, this is the resultant top ten by average:

Player	M	Balls	Runs	Wkts	Adj W	AdjAve	AdjSR
SF Barnes	27	7873	3106	189	203.6	15.26	38.67
R Peel	20	5216	1715	101	108.7	15.78	48.00
MD Marshall	81	17,584	7876	376	410.3	19.20	42.86
CEL Ambrose	96	21,641	8401	397	433.1	19.40	49.97
GD McGrath	120	28,485	11,930	549	605.6	19.70	47.04
AK Davidson	34	8997	3033	142	153.9	19.71	58.48
JC Laker	36	10,312	3611	162	178.3	20.25	57.84
AA Donald	69	14,906	7113	316	350.7	20.28	42.50
H Trumble	32	8099	3072	141	150.8	20.37	53.71
J Garner	58	13,175	5433	259	265.9	20.44	49.56

The right-hand column shows that there is a wide disparity between bowlers' strike rates. A strike bowler's efficiency does not depend solely on runs conceded; his strike rate is also an important factor because of the runs scored at the other end and the overall time taken. If Dale Steyn bowls six overs and takes a wicket but concedes 30 runs while Makhaya Ntini concedes 18 in his six without taking a wicket, the opposition are 48/1 at the end of these spells. If Shaun Pollock bowls 11 overs and concedes 20 runs while taking a wicket, 33 runs get conceded at the other end and the opposition reach 53/1 although the game is ten overs older.
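The arithmetic of that example can be checked with a small sketch. It assumes, as the figures above imply, that the bowler at the other end concedes 3 runs an over (Ntini's 18 off six) and bowls the same number of overs; the function name is made up for illustration.

```python
def score_after_spell(overs, runs_conceded, other_end_rate=3.0):
    """Return (runs on the board, overs elapsed) for a spell bowled in tandem.

    Assumes the other end bowls the same number of overs and concedes
    `other_end_rate` runs per over.
    """
    total_runs = runs_conceded + other_end_rate * overs
    return total_runs, 2 * overs


print(score_after_spell(6, 30))   # (48.0, 12): the Steyn spell
print(score_after_spell(11, 20))  # (53.0, 22): the Pollock spell
```

Both spells produce one wicket, but the slower one leaves the opposition five runs better off and the match ten overs further on.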

I have for some time been toying with a measure I call the Power Index, which combines the average and strike rate by multiplying them together and taking the square root. Sqrt((runs/wickets)*(balls/wickets)) has a denominator of wickets, so the numerator can be seen as representing the resources used up in taking a wicket.
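As a minimal sketch, the Power Index can be written as a one-line function; the name `power_index` is just a label chosen here. The check uses the Barnes row from the table above (3106 runs, 7873 balls, 203.6 adjusted wickets).

```python
import math


def power_index(runs, balls, wickets):
    """Geometric mean of bowling average and strike rate."""
    average = runs / wickets        # runs conceded per wicket
    strike_rate = balls / wickets   # balls bowled per wicket
    return math.sqrt(average * strike_rate)


print(round(power_index(3106, 7873, 203.6), 2))  # 24.29
```

Because both factors share the wickets denominator, the same value is sqrt(runs * balls) / wickets, which is the "resources per wicket" reading given above.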

If we apply that algorithm, we get a new top ten, as follows:

Player	M	Balls	Runs	Wkts	Adj W	AdjAve	AdjSR	AdjPI
SF Barnes	27	7873	3106	189	203.6	15.26	38.67	24.29
R Peel	20	5216	1715	101	108.7	15.78	48.00	27.52
MD Marshall	81	17,584	7876	376	410.3	19.20	42.86	28.68
AA Donald	69	14,906	7113	316	350.7	20.28	42.50	29.36
CEH Croft	27	6165	2913	125	141.7	20.55	43.50	29.90
DW Steyn	23	4414	2706	114	114.7	23.60	38.49	30.14
GD McGrath	120	28,485	11,930	549	605.6	19.70	47.04	30.44
CEL Ambrose	96	21,641	8401	397	433.1	19.40	49.97	31.13
J Garner	58	13,175	5433	259	265.9	20.44	49.56	31.82
Waqar Younis	73	13,517	7374	293	312.3	23.61	43.28	31.97

Ambrose and McGrath drop down, Colin Croft rises, and Dale Steyn and Waqar Younis come in instead of Davidson and Trumble.

However, this is deeply unsatisfactory because we know that Barnes and Peel played in a time when scores were lower and wickets fell much more often. Today's fashion is to bat aggressively from the word go, whereas in the middle of the last century caution was the Test batsman's watchword. We need a way of equalising for the changes in general pitch conditions and style of play.

This is a well-known problem, and what follows does not claim to be universally applicable.

But the essential aspects of what we are examining here are the balls bowled, runs conceded and wickets taken. If we can find a way of keeping one or more of them invariant, then we have a fixed point against which to scale the others.

I decided to use the first match innings of Tests as the way to fix par. The first innings of the match is the least likely to be cut short by weather, and the least likely to be affected by tactical considerations. A third innings can be anything from a stonewall grind trying to save a match to a hell-for-leather bash while trying to set a target, but a first innings is always going to be played at whatever pace the side think appropriate given the conditions and they will nearly always get as many runs as the conditions allow. The dimensions of the first match innings may change, but its tactical purpose does not.

Across our population of matches, the mean first match innings notches up 327 runs off 678 balls.

What I did was to find out the dimensions of the average first match innings in a particular bowler's period. I decided not to restrict the sample to matches that the bowler played in, because then his performances are effectively the norm and we don't see how he stood out (or not) from his contemporaries. I think we are more interested in how bowlers' performances stack up relative to everything that happened in their period, so I used all the non-excluded matches played in the cricket years (running May-April) which each career spanned. Someone who debuted on 11th November 1982 and finished on 25th August 1994 would thus have his period defined as 1982-1995 (Ric Finlay will recognise his "years from and to" filter option).
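One way to read that period rule is sketched below. It is my interpretation of the May-April convention rather than anything taken from Tastats itself, and the function name is hypothetical.

```python
from datetime import date


def cricket_year_span(debut, last_match):
    """Return (from_year, to_year) for cricket years running May to April."""
    # A season starting in May of year Y covers May Y to April Y+1,
    # so a date in January-April belongs to the season that began the year before.
    debut_season_start = debut.year if debut.month >= 5 else debut.year - 1
    last_season_start = last_match.year if last_match.month >= 5 else last_match.year - 1
    # The period runs from the start of the debut season to the end of the last one.
    return debut_season_start, last_season_start + 1


print(cricket_year_span(date(1982, 11, 11), date(1994, 8, 25)))  # (1982, 1995)
```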

I then scaled their figures for balls bowled and runs conceded accordingly. So a bowler whose period averaged 340 runs off 650 balls is adjusted to concede his actual runs * 327/340 off his actual balls * 678/650. We now have adjusted figures for each of balls, runs and wickets and can run through our standard calculations for average, strike rate and PI to come up with our final result, the top ten of which looks like this:

Player	B/I1	NewB	R/I1	NewR	NewW	NewSR	NewAve	NewPI
MD Marshall	659	18,091.0	321	8023.2	410.3	44.10	19.56	29.37
SF Barnes	552	9670.1	266	3818.3	203.6	47.50	18.75	29.85
DW Steyn	630	4750.3	358	2471.7	114.7	41.42	21.55	29.88
AA Donald	644	15,693.0	319	7291.4	350.7	44.74	20.79	30.50
GD McGrath	645	29,942.4	335	11,645.1	605.6	49.44	19.23	30.83
KR Miller	800	8199.6	329	3597.0	175.3	46.77	20.52	30.98
RR Lindwall	798	9219.3	325	4337.5	200.3	46.03	21.66	31.57
CEH Croft	641	6520.9	308	3092.7	141.7	46.01	21.82	31.69
FS Trueman	764	9085.6	325	4665.5	203.5	44.64	22.92	31.99
JC Laker	790	8850.0	321	3678.5	178.3	49.64	20.63	32.00

B/I1 and R/I1 are the average first match innings balls and runs for that bowler's period.
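The linear scaling can be sketched as follows, using Marshall's raw figures (17,584 balls, 7876 runs, 410.3 adjusted wickets) and his period's first match innings (659 balls, 321 runs). The function and constant names are mine. Because the adjusted wickets figure is itself rounded to one decimal, the derived rates can differ from the table in the last digit.

```python
import math

# Mean first match innings across all non-excluded matches (from the text).
PAR_BALLS = 678
PAR_RUNS = 327


def era_adjust(balls, runs, adj_wickets, era_balls, era_runs):
    """Scale balls and runs to par, then recompute strike rate, average and PI."""
    new_balls = balls * PAR_BALLS / era_balls
    new_runs = runs * PAR_RUNS / era_runs
    new_sr = new_balls / adj_wickets
    new_ave = new_runs / adj_wickets
    new_pi = math.sqrt(new_sr * new_ave)
    return new_balls, new_runs, new_sr, new_ave, new_pi


marshall = era_adjust(17584, 7876, 410.3, era_balls=659, era_runs=321)
print([round(x, 2) for x in marshall])
```

The scaled balls and runs match the NewB and NewR columns to one decimal place, and the strike rate, average and PI land within 0.01 of the published row.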

As a dedicated supporter of SF Barnes as the king of bowlers, I am mortified to discover that Malcolm Marshall pips him to the top spot - but if Barnes had to be toppled, I'm glad it was Macko.

On a very contemporary note, Dale Steyn has made an incredible start to his career, since he comes in at number three (with a bullet) on this all-time list. Waqar Younis's figures at the same stage of his career were even more spectacular, with a PI of 27.29, so we can probably assume that Steyn will also descend the list as his career unfolds.

Of the top ten, only McGrath and Laker were ever really used in a containing role on dead pitches, and they did not do that much. In the full table, those who spend time keeping the runs down without taking wickets lose out, with the result that Shane Warne comes a lowly 55th. But then, this is not a merit ranking but an assessment of how nearly they approached the ideal of incessant lethality.

It's not an unbelievable top ten. If the model is wrong, it still manages to produce a sensible result.

But it can certainly be challenged on a number of points.

Are the cut-off dates for minnowhood reasonable?

Are the relative batting averages of the positions in the batting order a sound way to weight the value of wickets?

Is the Power Index a sensible way of combining parsimony and frequency to measure attacking prowess?

Is the first match innings a useful point of reference?

Even if comparing first match innings is reasonable, should one average the dimensions thereof for all matches or just the ones the bowler played in?

Whatever those averages are, is it sufficient to scale them in a linear fashion or should some more complex function be used?

So let the debate on those and no doubt other questions commence.

The full table is available here.
