June 4, 2012

Stats

A Moneyball analysis of the IPL

Devashish Fuloria

From Rahul Oak, USA

Most sports fans are geeks at heart, and cricket fans even more so. We tend to obsess about numbers: batting averages, strike rates, economy rates, and more stats than the average punter can keep track of while also living a normal life, managing a day job, having a family and that sort of thing. The one thing that has always bothered me a bit, however, is how rankings for cricketers are always split into batsmen’s and bowlers’ rankings. At the end of a tournament or season, there is always a list of top scorers, wicket takers and so on. You might also have the odd pundit rate performances on a series-by-series basis on a scale of 1 to 10. Which got me thinking: could we figure out a fairly objective rating system that puts batsmen and bowlers on the same scale?

There are obvious issues with this, given that batsmen and bowlers are judged by totally different criteria. Runs are paramount to a batsman, just as wickets are the currency bowlers are measured in. Being a programmer in my day job, and having worked on statistical analysis of baseball stats in the past, I tried to come up with a system that would work reasonably well for cricket. I’ll try and explain the method below. For those who fell asleep in the back benches when statistics classes were in progress, I would strongly suggest skipping the next paragraph. You are very welcome.

The biggest challenge is coming up with a scale on which to judge players. For this purpose, I went with the number of standard deviations from the mean. In plain English, we are trying to measure how good someone is compared to their peers. It isn’t very different from grading a class of students on a bell curve. There are two obvious difficulties with this: firstly, which stats to use for batsmen and bowlers, and secondly, how to select the pool of players on which to calculate the mean and standard deviation for each of those stats. I used five basic batting and bowling stats (runs, wickets, average and so forth) that make the most sense in Twenty20 cricket, and used the following criteria for calculating the mean and standard deviation: for a single season, a player should have batted in at least seven innings (half the season until IPL 2011) or bowled at least 28 overs. For the combined previous seasons’ stats, these criteria were doubled. I combined the raw scores for the two sets of data and then attached more weight to numbers from the current season to take form into account. Then I just closed my eyes and spun the wheel and held my breath. Let’s move on to the exciting bits.
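For readers who prefer code to prose, the idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the exact calculation behind the rankings in this article. The stat names, the decision to simply add up the standard-deviation scores, the sign flip for stats where lower is better, and the weight of 2 for the current season are all made up for the example. Only the qualification thresholds (seven innings or 28 overs, doubled for combined seasons) come from the description above.

```python
from statistics import mean, pstdev

MIN_INNINGS = 7   # single-season batting threshold from the article
MIN_OVERS = 28    # single-season bowling threshold from the article


def qualifies(player, min_innings=MIN_INNINGS, min_overs=MIN_OVERS):
    """True if the player batted or bowled enough to enter the pool."""
    return (player.get("innings", 0) >= min_innings
            or player.get("overs", 0) >= min_overs)


def z_scores(values):
    """Distance of each value from the pool mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Everyone is identical on this stat, so nobody stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pool_scores(pool, stats, lower_is_better=()):
    """Sum the z-scores over `stats` for every player in a qualified pool.

    Assumption: stats are combined by plain addition, and stats named in
    `lower_is_better` (an economy rate, say) have their sign flipped.
    """
    totals = {p["name"]: 0.0 for p in pool}
    for stat in stats:
        sign = -1.0 if stat in lower_is_better else 1.0
        for p, z in zip(pool, z_scores([p[stat] for p in pool])):
            totals[p["name"]] += sign * z
    return totals


def combine(current, previous, current_weight=2.0):
    """Weight the current season more heavily (the weight is hypothetical)."""
    names = set(current) | set(previous)
    return {n: current_weight * current.get(n, 0.0) + previous.get(n, 0.0)
            for n in names}


if __name__ == "__main__":
    # Entirely made-up numbers, for illustration only.
    season = [
        {"name": "Batsman A", "innings": 14, "runs": 600, "strike_rate": 150.0},
        {"name": "Batsman B", "innings": 12, "runs": 350, "strike_rate": 125.0},
        {"name": "Batsman C", "innings": 3, "runs": 40, "strike_rate": 110.0},
    ]
    pool = [p for p in season if qualifies(p)]        # Batsman C drops out
    current = pool_scores(pool, ["runs", "strike_rate"])
    previous = {"Batsman A": 1.0, "Batsman B": -0.5}  # pretend earlier seasons
    for name, score in sorted(combine(current, previous).items(),
                              key=lambda item: item[1], reverse=True):
        print(f"{name}: {score:.2f}")
```

Because every stat is converted to standard deviations before being added, runs and economy rates end up on the same unitless scale, which is what lets batsmen and bowlers share one list. For the combined previous seasons, the same `qualifies` check could be called with the thresholds doubled, for instance `qualifies(player, 14, 56)`.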

With me so far? Great, because now it’s time for the good stuff. Below is a list of the top 25 (and bottom 26) performers in the IPL, with their raw scores rounded to two decimal places:

1. CH Gayle (32.07)
2. V Sehwag (17.29)
3. SL Malinga (16.53)
4. G Gambhir (14.39)
5. RG Sharma (13.64)
6. SP Narine (13.21)
7. DA Warner (12.08)
8. KP Pietersen (11.09)
9. AB de Villiers (10.26)
10. SE Marsh (10.07)
11. S Dhawan (10.02)
12. AM Rahane (9.11)
13. SK Raina (8.87)
14. DW Steyn (8.83)
15. SR Watson (8.45)
16. M Vijay (8.31)
17. MM Patel (8.3)
18. L Balaji (7.89)
19. JP Duminy (7.73)
20. SR Tendulkar (7.62)
21. AC Gilchrist (7.61)
22. M Muralitharan (6.47)
23. CL White (6.2)
24. MS Dhoni (5.79)
25. A Mishra (5.5)
...
115. S Nadeem (-3.64)
116. AB Agarkar (-3.85)
117. SC Ganguly (-3.94)
118. JEC Franklin (-4.06)
119. A Singh (-4.16)
120. M Kaif (-4.16)
121. R Sathish (-4.28)
122. RR Powar (-4.3)
123. HV Patel (-4.36)
124. VR Aaron (-4.39)
125. SS Tiwary (-4.51)
126. AL Menaria (-4.62)
127. K Goel (-4.73)
128. AD Mathews (-5.03)
129. PA Patel (-5.96)
130. N Saini (-6.11)
131. Y Venugopal Rao (-6.17)
132. Harbhajan Singh (-6.2)
133. M Manhas (-6.35)
134. DL Vettori (-6.44)
135. P Kumar (-6.87)
136. IK Pathan (-7.13)
137. VRV Singh (-7.57)
138. LR Shukla (-7.76)
139. B Lee (-13.56)
140. M Kartik (-17.47)

It’s really no surprise that Gayle is right up there on top of the list. It also isn’t surprising that, based on his raw score, Gayle is almost twice as valuable as the next best on the list (Sehwag). Malinga being the top bowler at No. 3 isn’t much of a surprise either. It is a bit of a surprise to see Rohit Sharma as high as No. 5, but then stats don’t lie. What’s also interesting is that Kallis and Watson are down at Nos. 29 and 15 respectively. Narine’s season was good enough that not having played in the IPL before does not stop him from being No. 6 on this list. He was also helped by the fact that this season’s stats have been weighted more heavily than the ones from before.

So if Billy Beane of Moneyball fame were to look at these numbers and make some decisions, what is he likely to come up with? The fact that Gayle, Malinga and Co. are valuable is hardly a Sherlock-level deduction. Also, you don’t need to crunch numbers to know that the multi-million buys for Vinay Kumar and Jadeja were plain idiotic.

It doesn’t take a genius to know that the likes of Rahane, Dhawan and Awana are going to be fairly in demand. But let’s look at the not-so-obvious ‘buys’:

1. Munaf Patel: M Vijay fought hard for this spot, but he has played some high-profile knocks in important games, which makes him the more obvious pick. Munaf Patel, for all that he is, is the one person you definitely don’t expect to see on that list. He takes top spot among the less obvious buys.

2. JP Duminy: While his name isn’t one that would come up in cocktail party discussions about the IPL, he’s been a steady performer whenever he’s been given a chance. Takes No. 2 spot.

3. Amit Mishra: It’s his bad luck that he’s been part of a terrible team during this season, but he deserves more credit than he gets for his IPL performances.

4. Anil Kumble: The grand-daddy of Indian spin bowling has retired and isn’t a legitimate ‘buy’ in the strictest sense, but still commands a place in this list for pure nostalgic reasons.

5. Doug Bollinger: Considering that Ben Hilfenhaus has become MS Dhoni’s go-to fast bowler, Bollinger should be on the radar of many teams.

Sourav Ganguly, Saurabh Tiwary and Irfan Pathan have been paragons of underachievement, but they have got enough headlines. Let’s try and pick the busts that don’t immediately come to mind:

1. Brett Lee: He’s been bought for fairly high sums across the seasons and given the experience and quality he brings to the table, he’s been the most disappointing pick in the IPL.

2. Virat Kohli: Fairly surprising that he’s made No. 2 on the ‘sell’ list. But for a player of his caliber, and considering that he was retained by Bangalore for a nominal value of $1.8m, you’d expect a lot more.

3. Brendon McCullum: After he blazed away to a 158* in the first ever game of the IPL, his performances have been middling. Definitely a ‘sell’. It’s fairly surprising that the No. 1 and No. 3 on this list are players who were part of the winning Kolkata Knight Riders team during IPL 2012.

4. Daniel Christian: While on the surface, he remains an important player on the Deccan lineup, his performances leave a lot to be desired.

5. Brad Hodge: Given the reputation that preceded him, as well as the Rajasthan Royals’ tendency to cull their squad, I’d put Hodge as a definite ‘sell’. Maybe Kevin Pietersen was referring to our No. 4 and No. 5 (among others) when he made his now famous “second-rate Aussies” quip.

This analysis is by no means perfect, and I’m sure people way smarter than this author could come up with sophisticated mathematical models that factor many more variables into this kind of exercise. Even so, this list would be fairly useful in the hands of a smart IPL owner.

Devashish Fuloria is a sub-editor with ESPNcricinfo

© ESPN Sports Media Ltd.

Posted by sashah on (June 7, 2012, 1:33 GMT)

This is a very innovative article. I'm sure there are lots of ways to improve upon the model but coming up with the idea deserves a lot of credit. Keep posting, Oak!

Posted by Ravi on (June 6, 2012, 10:18 GMT)

If you have different weightages for batsmen in the top order and middle order, and for bowlers in the powerplay overs, middle overs and death overs, you might have a better analysis on your hands. Also, if you can correlate the score with their price at the auction, many new domestic players might come up on the buy list.

Posted by PLI on (June 6, 2012, 9:53 GMT)

Superb analysis. Very well done. I know people will complain about various factors that weren't included, but don't worry about that.

Posted by raj khanna on (June 6, 2012, 5:43 GMT)

I haven't seen a better analysis than this on cricket in recent times. Great work Rahul. Such analysis will certainly help team pickers and choosers get value for their bucks.

Raj khanna

Posted by NK on (June 6, 2012, 4:51 GMT)

The best part of this article is how different it is from everything out there! A great read but more importantly something that the IPL team owners should take notice of. With the IPL being relatively new and not enough analysis being done on the performance, these stats would prove very handy. And I completely disagree with the person who spoke about Virat Kohli's brand image. In a sporting arena, your brand image only takes you as far as your actual performance. The minute the performance dips, so does that image. That's the beauty of sport vs. films. Again kudos to the author who did the analysis and wrote this article!!

Posted by Sifter on (June 5, 2012, 21:44 GMT)

Great start! I would also echo some of the comments here about batsmen getting a lot of help. Especially opening batsmen or #3 eg. Dhawan, Rahane, Vijay. I'd love to have a system where guys who bat a bit lower can be honoured. Maybe less value on runs, more on strike rate? Similarly for bowlers: Brett Lee or Harbhajan Singh haven't taken many wickets, but generally their economy has been pretty good.

Posted by HP on (June 5, 2012, 16:40 GMT)

No doubt about your analysis, but baseball doesn't depend on the pitch. 50 runs in Kolkata is worth no less than 80 runs in Bangalore. So this method doesn't apply to cricket, especially T20, where six balls are enough to change the result. Bring a better analysis that takes situation and pitch into consideration. :)

Posted by unni on (June 5, 2012, 12:37 GMT)

Good attempt. Some comments. 1. I didn't understand how did you combine the distance from the mean for several parameters into one single score. This cannot be simply added since the units are different. 2. Also, in my opinion, you could project two lists one for batting and another for bowling for the presentation purposes. Malinga's relative low score and absence of many bowlers in the top 25 list indicates that there were lot of good bowlers but, nobody went to bowl significantly brilliantly than others. So, this cannot be clubbed with the batsmen list easily. However, the problem happens only when you publish top x. 3. How do you propose to handle the case for players who didn't play enough matches? For some of them, it could be because the lack of form(especially for the ones who played 2-3 games, where they got a chance.) So, probably they are sell candidates. How can you adapt your system to incorporate that?

Posted by Daki on (June 5, 2012, 10:25 GMT)

finally a good article to read!!! Great work

Posted by phil on (June 5, 2012, 9:40 GMT)

I'm not a mathematician but it looks very batsman dominant. Bowlers win you games too.
