Adjusting averages to account for bowling strengths
Some of you may recall the quotient of BQI to bowling average discussed in this post. Roughly speaking, the idea is to reward bowlers who take the wickets of better batsmen. In this post, I'll flip the idea round, and reward batsmen who score against better bowling attacks.
Firstly, a digression on Ananth's post. The quotient was defined by taking the mean of the batting averages of the batsmen dismissed by a particular bowler, and then dividing by the bowler's regular average. This is, to my mind, a very useful stat, perhaps the best of its kind for its simplicity (you can of course make it better by making it more complicated in appropriate ways). The only problem is that the numbers you get don't correspond to numbers we're used to in following cricket. How good is a 1.2 bowler? A 0.9 bowler?
Happily, there's an interpretation of this stat that puts the numbers on a scale we're familiar with. It's equivalent to the usual average (runs conceded divided by number of wickets taken), with each wicket weighted by the average of the batsman dismissed. You can set a 'benchmark' average (its value is arbitrary), and I'll set it at 31.5. Dismissing a batsman who averages 31.5 is worth 1 wicket. Dismissing a batsman who averages 47.25 is worth 1.5 wickets. A quotient Q is then equivalent to an average of 31.5 / Q. So, a bowler with a quotient of 1.2 has an 'adjusted average' of 31.5 / 1.2 = 26.25. This is the sort of number we're used to thinking about with bowlers' averages.
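As a rough illustration of why the two views agree, here is a minimal Python sketch. It assumes the quotient is the mean batting average of the batsmen dismissed, divided by the bowler's raw average. The function names and the sample bowler are made up for the example and are not taken from the original post.

```python
BENCHMARK = 31.5  # benchmark batting average from the text (its value is arbitrary)


def adjusted_bowling_average(runs_conceded, victim_averages, benchmark=BENCHMARK):
    """Bowling average with each wicket weighted by the dismissed batsman's average."""
    # A batsman averaging `benchmark` counts as 1 wicket, one averaging 47.25 as 1.5.
    weighted_wickets = sum(avg / benchmark for avg in victim_averages)
    return runs_conceded / weighted_wickets


def quotient(runs_conceded, victim_averages):
    """Mean batting average of the victims divided by the raw bowling average (assumed definition)."""
    wickets = len(victim_averages)
    mean_victim_average = sum(victim_averages) / wickets
    raw_bowling_average = runs_conceded / wickets
    return mean_victim_average / raw_bowling_average


# Hypothetical bowler: 100 runs conceded for four wickets.
victims = [47.25, 31.5, 31.5, 15.75]
runs = 100

adj = adjusted_bowling_average(runs, victims)
q = quotient(runs, victims)

print(adj)            # weighted wickets = 1.5 + 1 + 1 + 0.5 = 4, so 100 / 4
print(BENCHMARK / q)  # agrees with adj up to floating-point rounding
```

Under that assumed definition, the weighted-wicket average and 31.5 / Q are the same quantity written two ways, which is why the choice of benchmark only rescales the numbers and never changes the ranking.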
I don't know who first came up with the idea of weighting wickets in this way – it was first suggested to me by a friend of mine. Probably various people over the years have thought of it.
Working in the reverse direction (adjusting batsmen's averages) is more difficult, since apart from the last few years, we don't know which bowlers each batsman faced. But we can make a first attempt, by taking the average of the bowlers' averages for each innings, weighting each by the number of overs that they bowled.
To take an example, suppose that in one innings, four bowlers were used:
Bowler A, career average 28, bowls 30 overs.
Bowler B, career average 30, bowls 30 overs.
Bowler C, career average 35, bowls 25 overs.
Bowler D, career average 40, bowls 20 overs.
The "average average" is then (28*30 + 30*30 + 35*25 + 40*20) / (30 + 30 + 25 + 20) = 32.52.
Each batsman's runs for this innings would be multiplied by 31.5 / 32.52 – they'll all be slightly decreased, because the attack is slightly weaker than our benchmark average of 31.5.
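The arithmetic in this example can be reproduced with a short Python sketch. The function name `attack_average` and the `(career_average, overs)` tuple layout are choices made for the illustration, and overs are treated as plain decimal numbers rather than overs-and-balls notation.

```python
BENCHMARK = 31.5  # benchmark average used in the text


def attack_average(spells):
    """Overs-weighted mean of the bowlers' career averages.

    `spells` is a list of (career_average, overs_bowled) pairs.
    """
    total_overs = sum(overs for _, overs in spells)
    return sum(avg * overs for avg, overs in spells) / total_overs


# Bowlers A to D from the example above.
spells = [(28, 30), (30, 30), (35, 25), (40, 20)]

strength = attack_average(spells)  # 3415 / 105
factor = BENCHMARK / strength      # multiplier applied to each batsman's runs

print(round(strength, 2))  # 32.52
print(round(factor, 4))    # 0.9685
```

So a score of 100 against this attack would count as roughly 96.9 adjusted runs.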
(Note: if a bowler never took a wicket, or has an average above 100, then I set that bowler's average at 100. This seems reasonable to me.)
We do this for all innings, and we get adjusted averages for all batsmen.
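Put together, the whole procedure might look something like the Python sketch below. This is an illustration, not the code used to produce the results: the data layout is invented, a bowler with no wickets is represented by `None`, and the adjusted average is assumed to be total adjusted runs divided by the number of dismissals, as with a conventional batting average.

```python
BENCHMARK = 31.5     # benchmark average used in the text
AVERAGE_CAP = 100.0  # wicketless bowlers and averages above 100 are set to 100


def capped(career_average):
    """Apply the cap; None stands for a bowler who never took a wicket."""
    if career_average is None:
        return AVERAGE_CAP
    return min(career_average, AVERAGE_CAP)


def attack_average(spells):
    """Overs-weighted mean of capped career averages; spells = [(average, overs), ...]."""
    total_overs = sum(overs for _, overs in spells)
    return sum(capped(avg) * overs for avg, overs in spells) / total_overs


def adjusted_batting_average(innings):
    """innings = [(runs, was_out, spells), ...]; returns None if never dismissed."""
    adjusted_runs = 0.0
    dismissals = 0
    for runs, was_out, spells in innings:
        # Scale the runs by benchmark / strength of the attack faced.
        adjusted_runs += runs * BENCHMARK / attack_average(spells)
        if was_out:
            dismissals += 1
    if dismissals == 0:
        return None
    return adjusted_runs / dismissals


# Hypothetical batsman with three innings.
innings = [
    (50, True, [(31.5, 10)]),              # benchmark-strength attack: 50 stays 50
    (30, False, [(None, 5), (120.0, 5)]),  # both bowlers capped at 100: 30 becomes 9.45
    (63, True, [(63.0, 20)]),              # weak attack: 63 becomes 31.5
]

result = adjusted_batting_average(innings)
print(round(result, 1))  # 45.5
```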
One useful feature of this method (for both batsmen and bowlers) is that it adjusts across changes in the relative strength of bat and ball (as well as rewarding players who do well against strong opposition). In an era where averages are high (such as today), bowlers are rewarded more for wickets and batsmen less for runs. For players in the low-scoring years before 1900, the reverse is true. Of course, it's possible that in a given era, runs are low because there happen to be a lot of good bowlers and not many good batsmen, and in this case the bowlers are unfairly punished (and batsmen unfairly rewarded). But to my mind the results are better than raw averages.
So, on to the results. Qualification: 20 Test innings. Here are the top 21.
| name | inns | no | runs | avg | adj avg |
|---|---|---|---|---|---|
| DG Bradman | 80 | 10 | 6996 | 99.9 | 90.4 |
| GA Headley | 40 | 4 | 2190 | 60.8 | 62.8 |
| MEK Hussey | 42 | 8 | 2325 | 68.4 | 59.4 |
| CL Walcott | 74 | 7 | 3798 | 56.7 | 58.3 |
| ED Weekes | 81 | 5 | 4455 | 58.6 | 55.9 |
| FS Jackson | 33 | 4 | 1415 | 48.8 | 55.4 |
| JB Hobbs | 102 | 7 | 5410 | 56.9 | 55.0 |
| GS Sobers | 160 | 21 | 8032 | 57.8 | 54.6 |
| L Hutton | 138 | 15 | 6971 | 56.7 | 53.8 |
| H Sutcliffe | 84 | 9 | 4555 | 60.7 | 53.6 |
| AD Nourse | 62 | 7 | 2960 | 53.8 | 53.4 |
| KF Barrington | 131 | 15 | 6806 | 58.7 | 52.6 |
| GS Chappell | 151 | 19 | 7110 | 53.9 | 52.3 |
| GE Tyldesley | 20 | 2 | 990 | 55.0 | 52.2 |
| RG Pollock | 41 | 4 | 2256 | 61.0 | 52.0 |
| KS Ranjitsinhji | 26 | 4 | 989 | 45.0 | 50.8 |
| BC Lara | 230 | 6 | 11912 | 53.2 | 50.4 |
| J Ryder | 32 | 5 | 1394 | 51.6 | 50.4 |
| RT Ponting | 197 | 26 | 9999 | 58.5 | 50.3 |
| FMM Worrell | 87 | 9 | 3860 | 49.5 | 49.5 |
| AG Steel | 20 | 3 | 600 | 35.3 | 49.0 |
The modern-day greats are surprisingly low down. That their averages should be heavily reduced is not surprising, since the bat has been very dominant over the ball in the past few years. But they're still further down than I had expected. Perhaps there is some bias in the method, or perhaps we should pay more attention to Neil Harvey when he compares modern players to those of his day.
(There's another possibility worth thinking about, and that is a gradual increase in competitiveness of the sport, so that today there are fewer players on the high and low extremes and more players towards the middle. I don't know how big an effect this would be.)
Here is a list of players from recent years:
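| name | inns | no | runs | avg | adj avg |
|---|---|---|---|---|---|
| MEK Hussey | 42 | 8 | 2325 | 68.4 | 59.4 |
| BC Lara | 230 | 6 | 11912 | 53.2 | 50.4 |
| RT Ponting | 197 | 26 | 9999 | 58.5 | 50.3 |
| KP Pietersen | 80 | 3 | 3890 | 50.5 | 48.8 |
| JH Kallis | 207 | 32 | 9678 | 55.3 | 48.3 |
| V Sehwag | 100 | 4 | 5074 | 52.9 | 48.0 |
| Moh'd Yousuf | 134 | 12 | 6770 | 55.5 | 47.8 |
| SR Tendulkar | 244 | 25 | 11877 | 54.2 | 47.4 |
| RS Dravid | 214 | 26 | 10223 | 54.4 | 47.3 |
| A Flower | 112 | 19 | 4794 | 51.5 | 46.5 |
| KC Sangakkara | 125 | 9 | 6356 | 54.8 | 46.5 |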
Tendulkar's low position is a bit of a surprise. It's an anomaly that jars with most people's impressions. But remember that averages are not perfect indicators of a batsman's 'true talent' – there's some inherent uncertainty with them.
For a full list of batsmen (with an adjusted average of at least 25), click here.
Some further comments:
- Opening batsmen face the opening bowlers disproportionately often, and this isn't taken into account.
- The conditions or characteristics of the batsmen on a given day can change the effectiveness of the bowlers, and the captain would use his bowlers accordingly. So the simple weighting by career average is not a perfect reflection of the overall skill of the attack. But in the long run the above method should be pretty close.
- There's no allowance for ground or pitch conditions, etc.
- I've ignored not-outs. This is worthy of a post of its own, but not-outs don't affect averages much in Test cricket.
- I've used career averages of the bowlers, mainly because it's easy to do. Career-to-date averages can be unstable. It would be reasonable to add a correction factor for the experience of each bowler. But while I've done a small amount of work in this area, I don't have enough results for it to be usable.