Launching Superstats, the new language for cricket analysis

Superstats is a new language for numbers-based storytelling in cricket. The bouquet consists of three elements: Smart Stats, which puts each batting and bowling performance into context by looking at match situations and player quality; Luck Index, which, for the first time, quantifies the element of chance in cricket; and Forecaster, which looks into the future and predicts win probabilities in each game.

These metrics have been derived from ESPNcricinfo's rich ball-by-ball database, and from complex algorithms developed by IIT Madras and Gyan Data, an IIT-M incubated company.

To start with, here are the basic FAQs on Luck Index and Forecaster: the factors considered, and the outputs you will get from each.

Luck Index

What is Luck Index?
Luck Index is a metric that quantifies luck. This is done by identifying every lucky event that happens in a match, and then calculating, through a complex algorithm, the run value of that event.

Why do we need a Luck Index in cricket?
Cricket is a sport where luck often plays a significant role in the outcome of the game. An umpiring error, a dropped catch or missed run out, even an edged boundary in the final overs, can have a huge impact on a game.

While the luck factor has always been acknowledged by cricket pundits, so far it has been limited to subjective analysis. (The only objective measure has been in terms of runs scored by a batsman after being given a reprieve.)

However, Luck Index is able to quantify the run value for each lucky event - in terms of how much it helped or hurt the teams - and so also calculate if that event actually affected the result of the game.

Which events are considered 'lucky'?
The events can be broadly classified into two categories: dismissal-related events and non-dismissal-related events. A dismissal-related event is one where either the batsman should have been out but wasn't, or where the batsman shouldn't have been out, but was. Examples of these are:

- Dropped catch
- Missed run-out/stumping
- Batsman wrongly given not out by umpire
- Batsman wrongly given out by umpire
- Batsman dismissed off a no-ball

Non-dismissal-related events are those where a batsman's dismissal doesn't come into the picture. Examples of these are:

- Misfield/overthrows
- No-ball/wide
- Edged boundary
- Batsman injury

How do you quantify the impact of each luck event?
For all dismissal-related events, the Luck Index model performs a simulation by replacing that event with the opposite event. So, if a batsman is reprieved through a dropped catch, the game is simulated from that point, assuming that the catch had been taken and the batsman dismissed. The difference between the team totals in the two cases is the run impact of that event. A similar simulation is carried out if the batsman is reprieved thanks to an umpiring error.

Is the run impact of a lucky event the same for a batsman compared to his team?
No, it is not. The reason is simple: for the batsman, all the runs scored after the luck event are bonus runs; for the team, though, another batsman would have come in and faced the balls played by the lucky batsman (unless the team is nine wickets down). Hence, the luck impact of the event for the team is only the difference in runs scored between the original batsman and others who would have replaced him.

For example, let's look at Shane Watson in the match between Chennai Super Kings and Rajasthan Royals in IPL 2018. Watson was dropped off the fifth ball of the innings when he was on 8, and he went on to score 106 off 57 balls.

From an individual point of view, his luck score for that innings was clearly 98, for he added that many extra runs, but from a team point of view the benefit was calculated to be 32. That is because the algorithm calculated that, had the catch been taken and the balls that Watson faced after being dropped been distributed among the other batsmen, they would have scored 32 fewer runs than Watson scored off those deliveries.
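The Watson numbers can be reproduced with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the figure of 66 replacement runs is not stated anywhere in the article; it is merely what the quoted 98 and 32 imply. The simulation that would actually produce such a figure is not shown.

```python
def batsman_luck_runs(final_score: int, score_at_reprieve: int) -> int:
    """Runs a batsman added after a reprieve (all of them count as luck for him)."""
    return final_score - score_at_reprieve


def team_luck_runs(runs_after_reprieve: int, simulated_replacement_runs: int) -> int:
    """Team benefit: lucky batsman's extra runs minus what the simulated
    replacements would have scored off the same deliveries."""
    return runs_after_reprieve - simulated_replacement_runs


# Watson: dropped on 8, finished on 106.
watson_individual = batsman_luck_runs(final_score=106, score_at_reprieve=8)

# Assumption: 66 is implied by the article's figures (98 - 32), not quoted in it.
watson_team = team_luck_runs(watson_individual, simulated_replacement_runs=66)

print(watson_individual, watson_team)
```

The gap between the two numbers is the point of the distinction: the batsman gets credit for every run after the drop, while the team is credited only with the runs it would not otherwise have had.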

If a batsman is incorrectly given out, the simulation is carried out with him continuing his innings. Again, the difference in team scores between the simulated and original score is the impact of that event.

The impact of a non-dismissal event is, quite simply, the runs scored off that ball and off the extra ball bowled due to that event.

Not all lucky events are the same in terms of impact, are they?
No, they're not. Dismissal-related events (batsman wrongly given out/not out, missed catch/run-out/stumping) usually have a higher run cost than non-dismissal-related events (overthrow, no-ball etc). Even here, Luck Index further differentiates between events according to whether or not they actually caused the match result to flip: if the run impact of the event is greater than the result margin of the game (in the direction of the team which lost the match), then it is an event which has caused the result to change.
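The result-flip test described here reduces to a single comparison. The following is a minimal sketch, assuming the winning margin is available in runs and using a hypothetical sign convention in which a positive impact means the event helped the eventual winner; the article does not say how margins expressed in wickets are handled.

```python
def flipped_result(impact_for_winner: float, margin_runs: float) -> bool:
    """Return True if the event's run impact was large enough to change the result.

    impact_for_winner: run value of the lucky event, positive when it helped
        the team that went on to win (hypothetical sign convention).
    margin_runs: winning margin expressed in runs (assumed to be available).
    """
    return impact_for_winner > margin_runs


# An event worth 15 runs to the winner in a game decided by 10 runs flips the result.
print(flipped_result(15, 10))
# An event that went against the winner can never be result-changing.
print(flipped_result(-20, 10))
```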

Is it a zero-sum game between batsman/bowler or batting/bowling team?
The nature of cricket dictates that luck is not a zero-sum game between teams or players, because what is luck for one might be the result of poor execution of a skill by another.

Example: a dropped catch is lucky for the batsman and the batting team, unlucky for the bowler (because he induced a chance and deserved a wicket), but not bad luck for the fielding team because they failed to grab a wicket chance only due to poor execution of a skill.

So, using Luck Index, can you tell us the luckiest team/player of IPL 2018?
Yes, we can. The luckiest team, in terms of just the net run impact of all luck events (whether in their favour or against them), were Kolkata Knight Riders, with 349 Luck Runs, while the least lucky were Rajasthan Royals, with 163 Luck Runs. Watson was the luckiest batsman of the tournament and Aaron Finch the unluckiest, while Rashid Khan was the unluckiest bowler of IPL 2018.

Forecaster

What is the Forecaster?
Want to know if CSK will chase down 180 against SRH when they are 85 for 2 after ten overs? The Forecaster is the tool to answer that question, giving a win probability for CSK at every stage of their chase. Apart from this, the Forecaster also gives the expected final total for the team batting first during their innings, and the expected runs and wicket probabilities for each bowler for the next over of an innings.

How does the Forecaster work?
The following factors are taken into account when calculating the predicted score:

- Batting strength of the team (including batsmen to follow, at every stage of the innings)
- Bowling strength of opposition
- Batsman v bowler head-to-head numbers
- Phase-wise strike rates and economy rates for batsmen and bowlers

Based on these factors, there is an expected score for the batting team at every stage. The win percentage for the chasing team also takes into account the team momentum (runs and wickets off the last six balls), and the historical probability of teams winning from that position.

Apart from the win probability and expected score, the Forecaster also predicts the runs and wicket probability in the next over for each bowler in the opposition attack.

How accurately does Forecaster predict the winning team?
Over the last three seasons of the IPL (2016 to 2018), the Forecaster had a 60% success rate in correctly predicting the winning team at the start of the run-chase, which went up to 80% by the 15th over of the chase.
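One plausible way to arrive at a success rate like this is to count how often the side given more than a 50% win probability went on to win. The sketch below assumes that definition, which the article does not spell out, and runs on made-up toy values rather than real Forecaster output.

```python
from typing import Sequence, Tuple


def success_rate(predictions: Sequence[Tuple[float, bool]]) -> float:
    """Fraction of matches in which the predicted favourite actually won.

    Each item is (win probability given to the chasing team, chasing team won).
    Assumption: a prediction counts as correct when the side given a
    probability above 0.5 wins the match.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    hits = sum((prob > 0.5) == chasing_won for prob, chasing_won in predictions)
    return hits / len(predictions)


# Toy data for illustration only; these are not real match predictions.
toy_predictions = [(0.62, True), (0.41, False), (0.55, False), (0.30, False), (0.48, True)]
toy_rate = success_rate(toy_predictions)
print(toy_rate)
```

Running the same calculation on the probabilities recorded at the start of the chase and again on those recorded at the 15th over would give the two kinds of figure quoted above.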

How does the quality of batsmen remaining or bowlers left to bowl impact the win probability?
Since the predicted scoring rate is a function of batsman/bowler quality, and the head-to-head stats between them, the win probability is heavily dependent on all these factors.

For example, in the IPL 2018 match between Rajasthan Royals and Sunrisers Hyderabad, the Forecaster gave Royals a 41% chance of winning when they needed 50 from 30 balls with six wickets in hand, whereas the historical probability of the chasing team winning from that position in the last three IPL seasons was 60%.

The reduced probability was because of the relative strengths of the teams at that point: Rashid had two overs left to bowl out of the last five, while the Royals didn't have much batting firepower left in their line-up at that stage. As it turned out, Royals scored only 38 from their last five and lost by 11 runs.
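The arithmetic behind this example is easy to check. The helper names below are hypothetical; the sketch only restates the figures quoted above (50 needed from 30 balls, 38 scored) and is not part of the Forecaster itself.

```python
def runs_per_over(runs: int, balls: int) -> float:
    """Scoring rate in runs per six-ball over."""
    return runs * 6 / balls


def losing_margin(runs_needed_to_win: int, runs_scored: int) -> int:
    """Margin of defeat in runs for a chasing side that falls short.

    Scoring one fewer than the runs needed to win would tie the match,
    so the margin is measured from that tying score.
    """
    return runs_needed_to_win - 1 - runs_scored


required_rate = runs_per_over(50, 30)  # rate Royals needed over the last five overs
actual_rate = runs_per_over(38, 30)    # rate they actually managed
margin = losing_margin(50, 38)         # agrees with the 11-run defeat

print(required_rate, actual_rate, margin)
```

Royals needed 10 an over and managed 7.6, which is consistent with the Forecaster rating their chances below the historical average for that position.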