Stats Analysis

# How do the top Test batters compare to Don Bradman?

A new measure, the Z-Score, gives us a more complete picture of the gap between the greatest batter and the rest

Quite a few years back, I presented an article comparing Don Bradman's career to other elite batters. That Bradman is at the top is a given. So who is second and how far away are he and the rest from Bradman? I am revisiting that theme now with a more sophisticated measure called Z-Score, which is a much better way to examine the concept of relative positions of variables than my previous ones. I will look at three batting measures, three bowling measures, one fielding measure, and one team measure. All these are performance-centric and mean the same irrespective of how many Tests a player has played. That means no longevity measures, other than a passing reference at the end.
Let's take a simple example to get the concepts of Mean, Standard Deviation (SD), Coefficient of Variation (CoV), and Z-Score clear. These are the cornerstones of this article and are baby steps in statistics. The mean is nothing but an average of all the values. The median is the mid-point value. The Standard Deviation is a measure of how dispersed the data is in relation to the Mean and is determined using the formula shown. The squaring is done to take away the impact of the negative values.
The CoV is determined by the formula "SD/Mean" and is used to indicate how widely the data points have been dispersed around the Mean. All these variables have been explained with examples in the sidebar.
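The formula itself is not reproduced in this text. A standard statement of the four quantities, assuming the population form of the SD (dividing by n, which is the form that reproduces the 27.08 quoted in the example that follows), would be:

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}, \qquad
\mathrm{CoV} = \frac{\sigma}{\mu}, \qquad
z_i = \frac{x_i - \mu}{\sigma}
```

Here x_i is an individual value, n the number of values, mu the Mean and sigma the SD. The symbols are chosen for illustration and are not taken from the original sidebar.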
Let us say that 11 students have scored 10, 15, 20, 30, 40, 45, 50, 65, 70, 85, and 95 marks in a test. These could as well be the ordered scores of 11 batters in a Test innings.
• The Mean works out to 47.73
• The SD for this distribution is 27.08
• The CoV for this distribution is 0.567. This is quite high and represents a widely distributed population. It can be seen that the scores are distributed across the spectrum
• Each of these values has a Z-Score, which represents how many SDs away from the Mean the value is. The Z-Score for 85 is 1.376 ((85-47.73)/27.08) and for 20 is -1.024 ((20-47.73)/27.08). Z-Scores can be positive or negative. By definition, the sum of the Z-Scores for any distribution is always 0.0.
• The median is the value of the middle entry. The median here is 45, which is the sixth entry (out of 11)
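The numbers in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; it assumes the population SD (`pstdev`, dividing by n), since that is the form that matches the 27.08 quoted above. The variable names are illustrative.

```python
from statistics import fmean, median, pstdev

# The 11 marks from the example above.
marks = [10, 15, 20, 30, 40, 45, 50, 65, 70, 85, 95]

mean = fmean(marks)
sd = pstdev(marks)   # population SD (divide by n) - an assumption
cov = sd / mean      # Coefficient of Variation = SD / Mean

# Z-Score of each value: how many SDs it lies from the Mean.
z_scores = {m: (m - mean) / sd for m in marks}

print(round(mean, 2), round(sd, 2), round(cov, 3))    # 47.73 27.08 0.567
print(round(z_scores[85], 3), round(z_scores[20], 3))  # 1.376 -1.024
print(median(marks))                                    # 45
print(abs(sum(z_scores.values())) < 1e-9)               # True: Z-Scores sum to zero
```

Using the sample SD instead (`stdev`, dividing by n - 1) would give a slightly larger SD and therefore slightly smaller Z-Scores.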
The Z-Scores are dimensionless since the numerator and denominator have the same units, which cancel each other out. As such, the Z-Score value has universal relevance irrespective of the nature of the data being analysed. In this analysis, the Z-Score carries the same meaning whether the unit is Runs, Runs/Test, Wickets, Wickets/Test, Runs/Over etc.
Do not worry at all if you have not fully grasped the above. I will make sure that the explanations are non-technical. I will also present an easily understood percentage value, which is a simpler representation of the Z-Score. I will not leave anyone in the dark.
One final statement before we move on to the tables. Normally the population selected is 100% of the values. Doing that here will mean that Chris Martin (batting average 2.37) will be treated at par with Don Bradman (99.94), and Kevin Pietersen (bowling average 88.60) at the same level as Sydney Barnes (16.43). That will not work since there are bowlers who are applauded when they score a run because the second run is unlikely, and batters who turn their arm over once in a few Tests. Hence I will set a minimum qualifying level for each of these measures so that we get populations that are reasonably homogeneous. Thus these will be truly performance-centric measures.
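The effect of a qualifying cut-off on the Z-Scores can be sketched as follows. The player records here are invented purely for illustration (they are not real career figures), and the 2000-run threshold is the batting cut-off stated in the Batting section.

```python
from statistics import fmean, pstdev

# HYPOTHETICAL (name, career runs, batting average) records, made up for illustration.
players = [
    ("Batter A", 8200, 52.4),
    ("Batter B", 5100, 44.1),
    ("Batter C", 2300, 38.7),
    ("Batter D", 3900, 31.5),
    ("Tailender E", 140, 2.9),
    ("Tailender F", 410, 7.8),
]

MIN_RUNS = 2000  # batting cut-off used in the article


def z_scores(rows):
    """Z-Score of each player's average within the given population."""
    values = [avg for _, _, avg in rows]
    mean, sd = fmean(values), pstdev(values)
    return {name: (avg - mean) / sd for name, _, avg in rows}


everyone = z_scores(players)
qualified = z_scores([p for p in players if p[1] >= MIN_RUNS])

print(sorted(qualified))  # only the four batters with 2000+ runs remain
print(round(everyone["Batter D"], 2), round(qualified["Batter D"], 2))
```

With these made-up numbers, Batter D sits just above the Mean when the tailenders are included but well below it among the qualifiers, which is the distortion the cut-off is meant to remove.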
A suggestion to all readers who are not completely au fait with the statistical measures: just concentrate on the percentage-of-top field and relate it to the narrative. Everything will be very clear.
Let us move on to the tables. These tables are current up to and including the Chattogram Test between Bangladesh and Sri Lanka.
Batting
A reasonable cut-off of 2000 runs has been set for all the batting measures. It is true that a few late-order batters get in; however, it is also true that these batters have performed well often, and the cut-off allows certain top batters who didn't have long careers, like Graeme Pollock and George Headley, to be part of the population.
A total of 334 batters qualify, making this a small but relevant subset of the huge population. The CoV of 0.228 indicates a reasonably distributed sample set.
Well, Bradman has a Z-Score of 6.6, which makes him a complete outlier, way off the normal high Z-Score of 3.0. The second and third entries have Z-Scores below 2.5. Once you take away Bradman, the distribution follows normal patterns. The Z-Score distribution is within the classical +3.0 to -3.0 range. The three batters with 60-plus averages follow next. As recently as last year, Steven Smith was part of this elite group. Unfortunately he has now fallen away badly.
Just as an interesting bit of information, I also show the player with the median value. The percent-of-top value for this player indicates whether the distribution is lopsided or balanced. Here, Alec Stewart is the median entry (the 167th one) and his 39.6% indicates that it is a bottom-heavy distribution - half the entries below 40%.
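The article does not spell out how the percent-of-top value is computed. A minimal sketch, assuming it is simply a player's value expressed as a percentage of the leader's value (an assumption, not the author's stated method), applied to the 11 marks from the earlier example:

```python
from statistics import median

# The 11 marks from the earlier worked example.
marks = [10, 15, 20, 30, 40, 45, 50, 65, 70, 85, 95]


def percent_of_top(value, top):
    """ASSUMED definition: value as a percentage of the leading value."""
    return 100.0 * value / top


top = max(marks)
mid = median(marks)
mid_pct = percent_of_top(mid, top)

print(mid, round(mid_pct, 1))  # 45 47.4
```

Under this assumption, a median entry whose percentage is well below 50 means that half the population sits in the lower part of the range, which is the "bottom-heavy" reading used in the text.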
The low negative Z-Scores are dominated by the bowlers who could bat well. Among the pure batters, Mohammad Ashraful has a Z-Score value of -1.7 and Daren Ganga, -1.5. They are not featured, however.
Now we move on to the second batting measure - Runs per Test. The same 334 batters qualify. The CoV of 0.236 indicates a reasonably distributed sample set.
The first table almost duplicates itself here, except that Bradman is less of an outlier, with a Z-Score of nearly 5. But he's the only outlier. The second-placed batter, George Headley, is at 2.54. He is at around 74% of Bradman's RpT value. Graeme Pollock follows close behind. Of the modern batters, Kumar Sangakkara, Brian Lara, and Smith feature.
The median value is that of Sadiq Mohammad, who scored at around 63 runs per Test. His value of 46.8% indicates that this is a much more balanced distribution than the batting average one. That is nearly the mid-point. In other words, the mean and median are quite close together - the hallmark of a really balanced distribution.
At the lower end, we have the bowlers who could bat well. Among the pure batters, Syd Gregory has a Z-Score of -1.56, Ken Rutherford has -1.24 and Ashraful, a value of -1.18.
The third batting measure is a quaint one - the Innings per Hundred value. The number of hundreds scored is a longevity measure and I prefer the performance measure of Innings per Hundred rather than Tests per Hundred since that represents a purer basic measure. The cut-off is ten hundreds.
This time only 142 batters qualify, making this a still smaller subset of the huge population. The CoV of 0.260 indicates a reasonably distributed sample set.
Well, Bradman is at the top, as expected. However, he is not even an outlier now, with a Z-Score value of 2.7. He needed 2.8 innings per hundred. Headley has a hundred every four innings and the Z-Score is a not-too-far-off 2.19. Headley's value is at 68.9%. Clyde Walcott is third at 1.80. Kane Williamson leads the quartet of modern batters featured.
Alvin Kallicharran's 9.1 is the median value (the 71st one) and it is at 30.4% indicating that half the batters are at sub-30% values. It is to be expected with many batters needing upwards of ten innings to score a hundred. It can be seen that the steep cut-off of ten hundreds takes most of the "bowlers who can bat" out of the equation. The few featured players with the lowest values are mostly batters.
Bowling
Now for the bowlers. To start with, the bowling average. A reasonable cut-off of 75 wickets has been set for all the bowling measures. A few casual bowlers will get in; however, these bowlers have performed well often to even reach this number, and it also allows for certain top bowlers, like Shane Bond and Frank Tyson, to be part of the population.
A total of 273 bowlers qualify, making this a small but relevant subset of the huge population. The CoV of 0.197 indicates a very reasonably distributed sample set.
George Lohmann does not do a Bradman but sits comfortably on top with a Z-Score of over 3.0. He is just about an outlier. Barnes and Charlie Turner come in next with close values around 2.25. They are some distance away from Lohmann. The table is dominated by the pre-World War I bowlers, with Tyson the exception. Kyle Jamieson is the only modern bowler with an average under 20 and a featured entry in the table.
Chaminda Vaas is right in the middle (the 137th entry). This again is a bottom-heavy distribution, with half the bowlers below the 36% mark. Three bowlers, Daren Powell, Carl Hooper, and Mohammad Sami, are the outliers at the other end of the table, with Z-Score values below -3.0.
The next bowling measure considered is "wickets per Test". The bowler count remains the same at 273. The CoV of 0.269 indicates a reasonably distributed sample set.
Barnes, with a rounded value of 7.0 wickets per Test leads the table with a Z-Score value of well over 3.5. Tom Richardson and Lohmann follow closely behind. Of great interest is the next entry - that of Muthiah Muralidaran. His terrific haul of six wickets per Test gives him a high Z-Score of 2.6. The top half of the featured table is completed by three pre-World War I bowlers.
This is undeniably the most balanced distribution of all the ones featured. Abdul Qadir's median value of 3.52 wickets per Test is only 0.01 away from the mean, meaning that the mean and median are almost the same. Almost exactly 50% of the bowlers are below the mean. There are a few non-regular bowlers with Z-Score values below -2.3. Wally Hammond, Sanath Jayasuriya, and Steve Waugh have captured fewer than a wicket per Test and are hovering near the outlier mark, with Waugh being the true outlier.
The third bowling measure is the bowling strike rate - balls per wicket. The bowler count continues to stay at 273. The CoV of 0.217 indicates a very reasonably distributed sample set.
Lohmann is on top, but with a Z-Score of just over 2.2, not really an outlier. Readers will be glad to note the presence of Bond in a close second position - with a strike rate of 39 balls per wicket and a Z-Score of 1.9. Kagiso Rabada follows very closely behind. After Barnes come two modern greats - Waqar Younis and Dale Steyn.
Peter Siddle's median position is above the Mean value of 65.9, indicating an almost evenly balanced distribution, with half the bowlers above the 55% mark. Some regular bowlers, like Bapu Nadkarni, Hedley Howarth and John Emburey, have strike rates exceeding 100 balls per wicket and have very low Z-Score values. Hooper, who needed all of 20 overs to get a wicket, is the true outlier with a Z-Score of -3.85.
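For measures where a lower value is better, the Z-Scores quoted in this section are positive for the best bowlers, which suggests the sign is flipped. A rough check, assuming Z = (Mean - value) / SD and recovering the SD as CoV x Mean from the strike-rate figures quoted above (0.217 and 65.9). The 120-ball figure for Hooper is only the rounded "20 overs" from the text, so his result is approximate.

```python
MEAN_SR = 65.9  # mean balls per wicket, as quoted
COV_SR = 0.217  # Coefficient of Variation, as quoted

# SD recovered from the definition CoV = SD / Mean.
sd_sr = COV_SR * MEAN_SR


def z_lower_is_better(value, mean, sd):
    """ASSUMED sign convention: fewer balls per wicket gives a positive Z-Score."""
    return (mean - value) / sd


z_bond = z_lower_is_better(39, MEAN_SR, sd_sr)     # Bond: 39 balls per wicket
z_hooper = z_lower_is_better(120, MEAN_SR, sd_sr)  # Hooper: roughly 20 overs per wicket

print(round(sd_sr, 1), round(z_bond, 1), round(z_hooper, 1))  # 14.3 1.9 -3.8
```

Bond's value rounds to the 1.9 quoted, and Hooper's lands close to the quoted -3.85, which is consistent with the assumed sign convention.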
Fielding: dismissals per Test
I wanted to do two analysis segments: one for wicketkeepers and one for fielders. There are no problems with dedicated wicketkeepers. However, I have many grey areas in my database when it comes to players like Sangakkara, Walcott, and Mushfiqur Rahim, who donned the wicketkeeping gloves for only part of their careers. I am not 100% certain as to when a catch was taken with the gloves or without. Hence, I decided to have one consolidated fielding segment, taking the total number of dismissals as the base. I am aware that this favours wicketkeepers, but I will make sure that fielders also get enough coverage. The cut-off is 50 dismissals. This will require around 15 Tests for keepers and 40 Tests for fielders.
A total of 233 players qualify, making this a small but relevant subset of the huge population. The high CoV of 0.607 indicates a very widely distributed sample set. The CoV for this measure is well over twice that of the other measures. The range of values, from 4.49 to 0.33, substantiates this.
Tim Paine just about pips Adam Gilchrist to be top of the table. Quinton de Kock and Alex Carey follow close behind. The top 84 entries are wicketkeepers. In 85th position is Eknath Solkar, who has a Z-Score of 0.21. He and Bobby Simpson are the only fielders to have positive Z-Score values. Daren Sammy and Smith fall just below the 0.0 mark. The lowest-placed wicketkeepers are Wayne Phillips, with a Z-Score of 0.17, and Farrokh Engineer, who has a Z-Score of 0.04. Both average below 2.0 dismissals per Test.
Ollie Pope's median position (the 117th one) has a percent-of-top value of only 28.5, meaning that 50% of the entries have percentage values below 28.5, making this a bottom-heavy distribution. The lowest entries are mainly bowlers, like Kapil Dev, Nathan Lyon, Anil Kumble and Stuart Broad, who have taken fewer than one catch every two Tests.
Team: Innings Scoring Rate
Finally, we come to the team-based metric. After going through various options, I decided that the only true performance-based measure I can consider for a team is the scoring rate. Of the options, the innings scoring rate is the cleanest measure since it can be derived from the scorecard for all Tests. I have set a cut-off of 30 overs (a session and some) merely to add significance to the values available. Still, over 8600 innings qualify, making this the largest population sample. The CoV is a comfortable 0.236, making this an evenly distributed population.
The highest entry is that of Australia, who scored at a rate of 7.53 against Pakistan in January 2017. This gave the innings a Z-Score of 6.67. This is closely followed by the England onslaught in Rawalpindi in December 2022, with a seven-plus scoring rate and Z-Score value of 6.44. Then the South Africa blitzkrieg of 6.80 runs per over against Zimbabwe in Cape Town in March 2005. Finally, the innings extraordinaire, also from the same Rawalpindi Test - England's first-innings massacre of 657 runs in 101 overs at 6.50 runs per over. All these are outliers with Z-Score values far exceeding 3.0, which is why they are not highlighted in the table above. In reality, there are 70 outlier innings that exceed Z-Score values of 3.0.
At the other end, we have three innings hovering around scoring rates of 1.00 and two innings that have scoring rates way below 1.0. Both of these are outliers at the lower end. One of these low-scoring innings, the brave South African fightback in Delhi, took place in the last decade - they scored 143 in 143.1 overs, with AB de Villiers (43 off 297 balls) as one of the architects of this battle that nearly saved the Test.
The median entry (the 4337th one) has a low scoring rate of 2.51 and a Z-Score of -0.61. Also half the entries have values below 33%, making this a relatively bottom-heavy distribution.
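The Mean and SD for this measure are not quoted, but they can be estimated from any two (scoring rate, Z-Score) pairs given in this section. A rough sketch using the top entry (7.53, 6.67) and the median entry (2.51, -0.61); because these inputs are rounded, the results are approximations only, and the helper names are illustrative.

```python
def mean_sd_from_pairs(x1, z1, x2, z2):
    """Solve z = (x - mean) / sd for mean and sd, given two (x, z) pairs."""
    sd = (x1 - x2) / (z1 - z2)
    mean = x1 - z1 * sd
    return mean, sd


def overs_to_decimal(overs, balls):
    """Convert cricket notation (e.g. 143.1 = 143 overs and 1 ball) to decimal overs."""
    return overs + balls / 6


mean_rr, sd_rr = mean_sd_from_pairs(7.53, 6.67, 2.51, -0.61)
cov_rr = sd_rr / mean_rr

# Z-Scores implied for two other innings mentioned in the text.
z_cape_town = (6.80 - mean_rr) / sd_rr        # South Africa v Zimbabwe, 6.80 per over
delhi_rr = 143 / overs_to_decimal(143, 1)     # South Africa in Delhi, 143 in 143.1 overs
z_delhi = (delhi_rr - mean_rr) / sd_rr

print(round(mean_rr, 2), round(sd_rr, 2), round(cov_rr, 3))
print(round(z_cape_town, 2), round(delhi_rr, 3), round(z_delhi, 2))
```

With these inputs the Mean comes out near 2.93 and the SD near 0.69, giving a CoV close to the 0.236 quoted, so the published figures hang together.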
Longevity-based landmarks
Sachin Tendulkar scored 15,921 runs in 200 Tests. Currently Ricky Ponting is in second position with 13,378 runs (84%). Jacques Kallis, with 13,289 runs (83%), is in third position. Among the active players, Joe Root is currently at 11,736 runs (74%). In my last Cricket Monthly article, I had projected that Root is likely to end at 14,531 runs (91%), close to Tendulkar, but not close enough. So, it is almost certain that Tendulkar's record will never be broken.
Murali captured 800 wickets in Tests. Currently Shane Warne is in second position with 708 wickets (89%). Jimmy Anderson, with 700 wickets (87%), is in third position. In my Cricket Monthly article, I had projected that Anderson is likely to end at 734 wickets (92%). So, it is almost certain that Murali's record will never be broken.
Mark Boucher's haul of 555 dismissals is so far ahead of Gilchrist's 416 (75%) and Ian Healy's 395 (71%) that it is inconceivable that this record will ever be broken. As far as the batters and bowlers are concerned, the second-placed player may at least end at around 91%. Look at the vast gap that already exists in the keeping category.
The quirky stats section
In each article, I present a numerical/anecdotal outlier relating to Test and/or ODI cricket. This time the outlier query is: What is the lowest number of runs scored, or wickets lost, while securing a Test win?
The answers are given below, in chronological order. All the low-run-aggregate wins are from the pre-World War II period, while all the low-wickets-lost wins, barring one, are from the post-war period.
Wins after scoring very few runs
Wins after losing very few wickets
• Lord's, 1924: England won after losing only two wickets, against South Africa
• Leeds, 1958: England won after losing only two wickets, against New Zealand
• Birmingham, 1974: England won after losing only two wickets, against India
• Chittagong, 2003: South Africa won after losing only two wickets, against Bangladesh
• The Oval, 2012: South Africa won after losing only two wickets, against England
Talking Cricket Group
Any reader who wishes to join my general-purpose cricket-ideas-exchange group of this name can email me a request for inclusion, providing their name, place of residence, and what they do.
Email me your comments and I will respond. This email id is to be used only for sending in comments. Please note that readers whose emails are derogatory to the author or any player will be permanently blocked from sending in any feedback in future.

Anantha Narayanan has written for ESPNcricinfo and CastrolCricket and worked with a number of companies on their cricket performance ratings-related systems