Stats Analysis

# What lines and lengths should you bowl to Kohli, Smith, Williamson, Root and Pujara?

An analysis of beehive data of lines and lengths bowled to these batters from 2012 onwards provides some insights

Kartikeya Date
05-Jul-2021
For the last decade or so, the ICC has published Hawk-Eye information (when available) for international matches. So far, this data is available for 241 Tests from November 2012 to March 2021. While ball-by-ball records (runs and dismissals data) are available for all Tests in this period, ball-tracking information (speed, pitching point, and beehive point for each delivery) seems to be available only for those Tests where Hawk-Eye provided the service.
This article uses this data to look at how right-hand batters have fared against right-arm seam bowlers - those classed as right-arm fast, right-arm fast-medium and right-arm medium-fast.
The graph below shows all deliveries (124,916) for which ball-tracking data is available in the period mentioned above in which the batter is a right-hander and the bowler is a right-arm seamer according to the above classification. For 16 batters, ball-tracking data is available for at least 2000 deliveries each.
The placement in the beehive for any given delivery in the charts below marks the point relative to the stumps at which the ball is estimated to cross the plane of the stumps by the ball-tracking software.
The beehive is amenable to several kinds of analysis. One can determine the share of deliveries hitting the stumps, or the share of deliveries that are outside off stump. Since the outcome of each delivery in terms of runs and wickets has been recorded, we can group deliveries together to determine outcomes when the ball is delivered in different parts of the beehive.
A common approach to preparing groups of this kind is k-means clustering. This algorithm sorts points into a specified number of clusters by starting from assumed centroid locations and updating them iteratively: each point in the data is assigned to its nearest centroid, and each centroid is then moved to the mean of the points assigned to it, until the assignments stop changing.
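The assign-and-update loop described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code used for this article: the function name `kmeans`, the random choice of starting centroids and the `(x, y)` tuple format for beehive points are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Group 2-D points into k clusters with the basic k-means loop.

    points: list of (x, y) tuples, e.g. beehive coordinates in cm.
    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Assumed initialisation: k of the input points, picked at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids have stopped moving
        centroids = new_centroids
    return centroids, labels
```

Because the result depends on the starting centroids, a run like this would normally be repeated with several values of `seed` and several values of `k` before settling on a set of clusters.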
In this article I have organised the beehive into eight clusters. This number was chosen after trials with different numbers of clusters (from five to 12) to find the clusters that were most stable. (More about stability here.) The clusters are numbered from one to eight and are named for line and length attributes. This is provided for all available deliveries from Tests all over the world in the first graph below. The average, scoring rate, and share of total deliveries falling in each cluster are then provided separately in tables for India, England and Australia; for those countries, a sufficient number of deliveries are available in the data set.
The clusters allow us to organise the beehive record beyond the basic picture, which shows where a delivery crosses the stumps in relation to the right-hander's off stump. For example, even without building the clusters, we can tell from the beehive data that the bowling in India is marginally straighter than in Australia or England. Right-arm seamers also attack the stumps more often in India (13.5% of deliveries would hit the stumps) than in England (9.4%) or Australia (9.7%). The clusters enable a more granular inquiry into lines, lengths and the propensities of different players, as this article will show.
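A share of deliveries hitting the stumps can be computed directly from beehive points. The sketch below assumes each delivery is stored as `(x_cm, height_cm)`, with `x_cm` measured from the centre of middle stump and `height_cm` from the ground. It uses the standard wicket dimensions (9 inches wide, 28 inches high) and ignores the bails. The function names and the optional `ball_radius_cm` allowance are illustrative, and the ball-tracking provider's own definition of "hitting" may differ.

```python
STUMP_HEIGHT_CM = 71.12        # 28 in: top of the stumps above the ground
WICKET_HALF_WIDTH_CM = 11.43   # half of the 9 in (22.86 cm) wicket width

def hits_stumps(x_cm, height_cm, ball_radius_cm=0.0):
    """True if a ball centred at (x_cm, height_cm) is within the wicket.

    The wicket is treated as a rectangle, widened and raised by
    ball_radius_cm as a rough allowance for the size of the ball.
    """
    return (abs(x_cm) <= WICKET_HALF_WIDTH_CM + ball_radius_cm
            and 0.0 <= height_cm <= STUMP_HEIGHT_CM + ball_radius_cm)

def share_hitting_stumps(deliveries, ball_radius_cm=0.0):
    """Fraction of (x_cm, height_cm) deliveries that would hit the stumps."""
    if not deliveries:
        return 0.0
    hits = sum(hits_stumps(x, h, ball_radius_cm) for x, h in deliveries)
    return hits / len(deliveries)
```

Applying `share_hitting_stumps` to the deliveries from each host country separately would give figures of the kind quoted above.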
The eight zones organise the deliveries in eight line and/or length areas. In the parlance of everyday commentary, I class three zones (one, two and three) to broadly be outside off stump, three zones (four, five and six) to be straight, and the remaining two (zones seven and eight) to be either at the body or down the leg side.
Three zones show scoring rates below three runs per over - the bouncer (zone four), top of off stump (five), and wide outside off stump (one). Of these, zone one is a defensive zone in the sense that the low scoring rate does not produce a correspondingly low batting average. Batters can safely ignore deliveries in this zone if they are so inclined.
The bouncer zone is an attacking zone for the bowler. It produces a wicket every 43 balls. The other two slow-scoring zones (five and one) are also the two that produce wickets most infrequently. Bowling wide outside off stump (zone one) produces a wicket every 92 balls. Right-hand batters find it hardest to score off the line and length in zone five, no matter where the Test is being played; the average right-hand batter's wicket in zone five costs 19.5 runs and comes once every 78 balls.
Short-of-good-length deliveries, whether they are outside off stump (zone two) or into the body (zone seven), are hit for about 3.5 runs per over. The most profligate zone, unsurprisingly, is eight (down the leg side). Bowlers concede more than four runs an over against the average right-hand batter when they bowl in this zone.
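The three measures quoted for each zone are linked: runs per over equals six times the average divided by the balls per wicket. For zone five, 6 × 19.5 / 78 works out to 1.5 runs per over. A minimal sketch of how such a per-zone summary could be built is given below. It assumes each delivery is a `(zone, runs, is_wicket)` record and ignores extras. The record layout and the name `zone_summary` are hypothetical.

```python
from collections import defaultdict

def zone_summary(deliveries):
    """Summarise (zone, runs, is_wicket) records for each zone.

    Returns {zone: {...}} with balls, runs, wickets, share of all
    deliveries, runs per over, average and balls per wicket. The last
    two are None for a zone with no dismissals.
    """
    totals = defaultdict(lambda: {"balls": 0, "runs": 0, "wickets": 0})
    for zone, runs, is_wicket in deliveries:
        t = totals[zone]
        t["balls"] += 1
        t["runs"] += runs
        t["wickets"] += int(is_wicket)

    all_balls = sum(t["balls"] for t in totals.values())
    summary = {}
    for zone, t in totals.items():
        w = t["wickets"]
        summary[zone] = {
            **t,
            "share": t["balls"] / all_balls,
            "runs_per_over": 6 * t["runs"] / t["balls"],
            "average": t["runs"] / w if w else None,
            "balls_per_wicket": t["balls"] / w if w else None,
        }
    return summary
```

A zone with a low `runs_per_over` but a high `average` is what the text above calls a defensive zone: it keeps the scoring down without producing wickets cheaply.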
Bowling straight (in zones four, five or six) produces more dismissals than does bowling wide. Intuitively, this makes sense because bowling straight makes it more likely that the batter has to defend the stumps, as against letting the delivery pass. When the stumps don't have to be defended, bowled and lbw are more or less eliminated as modes of dismissal.
Australian pitches offer more bounce than those in England and India. This is most clearly evident from the average length for deliveries in zone five: 6.7m in Australia versus 8.0m in India and 7.6m in England. The zone-wise patterns of bowling, scoring and wicket-taking are more or less consistent across venues.
Next, I look at the records for five right-hand batters - Virat Kohli, Cheteshwar Pujara, Joe Root, Kane Williamson, and Steven Smith. Each of these players (with the exception of Williamson) has faced at least 4000 deliveries of right-arm seam for which details are available in the database.
In the World Test Championship final, Virat Kohli was dismissed twice by Kyle Jamieson. In the first innings Jamieson beat Kohli's inside edge and dismissed him lbw. The ball would have hit the top of leg stump about 3cm below the top of the bail, and 10.9cm to the leg side of the centre of middle stump - squarely in zone eight. In the second innings Jamieson had Kohli fending at a back-of-a-length delivery that would have crossed the plane of the stumps 111cm above the ground, 43.3cm to the off side of the middle stump - squarely in zone two.
These are archetypal dismissals for Kohli. Of his 50 dismissals against right-arm seamers in the database, nine fall in zone two, and ten in zone eight. The chart below shows how a sizeable number of dismissals are clustered around the top of leg stump and wide of off stump (in zones one, two and three). Zones one, two and eight account for 28 of his 50 dismissals (56%) against right-arm seam in the database, even though only 34% of the deliveries he faces from right-arm seam fall in these zones.
This doesn't mean that Kohli has a weakness in these areas. They reflect a technical choice he has made. Kohli commits to the front foot when he shuffles across the stumps because he wants to cover his off stump well. As his average and scoring rate in zone six, and even zones three and five, show, he is a master of the bowling in these areas. But this leaves his inside edge exposed from time to time. Bowlers who can hit the pitch hard short of a length can get him fending or chasing outside off stump, "outside his eyes" as commentators like to say. Kohli's dismissals in the WTC final were classic Kohli dismissals by a bowler who seems to be able to execute the right-arm seamer's ploys effortlessly and relentlessly.
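The concentration of Kohli's dismissals in zones one, two and eight can be expressed as a ratio of dismissal share to delivery share. The short sketch below uses only the figures quoted above (28 of 50 dismissals, 34% of deliveries). The helper name `over_representation` is made up for the illustration.

```python
def over_representation(dismissals_in_zones, total_dismissals, delivery_share):
    """Return (dismissal share, dismissal share / delivery share)."""
    dismissal_share = dismissals_in_zones / total_dismissals
    return dismissal_share, dismissal_share / delivery_share

share, ratio = over_representation(28, 50, 0.34)
print(f"{share:.0%} of dismissals from 34% of deliveries: {ratio:.2f}x")
```

A ratio above one only says that dismissals are more concentrated in these zones than deliveries are. It describes where the dismissals fall, not why.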
Pujara's approach is, in one sense, the mirror image of Kohli's. His strongest areas are wide outside off stump on the square cut, and on his pads or into his body (zones seven and eight: deliveries around the leg stump, generally - except for the bouncer). On or around off stump, Pujara is cautious to a fault. It is not surprising that a great number of his dismissals cluster around the middle and off stump rather than the top of leg stump. Pujara's straight bat is rarely beaten, and when it is, it is as likely to be on the outside edge as on the inside. His major limitation is his inability to score against the short, straight ball. This gives the right-arm seamer ways to keep him quiet. Pujara manages 1.1 runs per over against the bouncer (zone four), compared to 4.1 runs per over for Kohli.
Pujara's approach can be considered an example of the classic "wait for the bowler to make a mistake" approach to batting against right-arm seamers. Kohli's approach, on the other hand, involves trying to score off the right-arm seamer's stock delivery. There are trade-offs in both cases. If Pujara were prepared to hook and pull, he would probably average more. Rahul Dravid, another classical batter who was similarly circumspect around off stump, did play the hook.
If Pujara could hook, he might be a player like Joe Root. Root is slightly less assured than Pujara when right-arm seamers bowl straight, and while Pujara has a phenomenal record on and around his leg stump, Root is stronger when the ball is wide outside off - he averages more and scores quicker (and deliveries wide outside off are more common than deliveries on the pads in cricket). Root plays the hook shot against the bouncer regularly. He is also especially severe on the shorter stuff in zone two.
The database is unfortunately not as extensive for Williamson as for the other four batters here, but its emerging contours suggest that Williamson is the closest among the five to being the perfect orthodox batter. As with Root and Pujara, his scoring strengths are in the classic areas - wide outside off stump (especially zones two and three) and on his pads. To this, he adds a magnificent defence when the bowler attacks his stumps. Williamson has been dismissed three times in 585 balls in zones five and six. His superb dead defensive bat was evident in the WTC final.
Finally, we come to Steven Smith, who is sui generis. Smith has worked out a way to dominate his off stump, as Kohli wants to, without having to commit to the front foot as Kohli does. Consequently, Smith has a magnificent record against the classic bowler's errors (width, drifting on the pads), and unusual assurance when the bowler attacks the top of off stump. The best way to attack Smith, such as it is, is to target his stumps. This brings all modes of dismissal into play and gives the bowler the best chance of dismissing Smith if he misses something. On the rare occasion, as against Jasprit Bumrah at the MCG in December 2020, Smith is bowled leg stump. But it takes an exceptionally deceptive quick bowler to beat him there. Over the last four years or so, teams like England and New Zealand have sought, with mixed success, to keep Smith quiet. This is a very difficult task, and there isn't a team in the world right now that has four bowlers who can achieve this. New Zealand fared better at this in 2019-20 than England did in 2017-18 and 2019.
The beehive record provides a textured picture of the contest between bat and ball, and as shown here, it is possible to identify patterns of play and differences in the approaches taken by different players. As mentioned, this record is not exhaustive, but a significant volume of deliveries is now available, and that makes it possible to describe the way different batters play for their off stump against right-arm seamers in Test cricket.

Kartikeya Date writes at A Cricketing View. @cricketingview