March 2, 2014

The perils of data-driven cricket

For all his triumphs as England coach, Andy Flower ultimately got the balance between trusting people and numbers wrong

Cricket is an art, not a science. It's a fact that needs restating after the disintegration of Andy Flower's reign as England coach.

Slavery to data had gone too far. The triumphs of the more jocund Darren Lehmann, Flower's coaching antithesis, are a salutary reminder of the importance of fun and flair in a successful cricket team. And it's not only cricket that could learn from the tale.

Big data - the vogue term used to describe the manifold growth and availability of data, both structured and not - is an inescapable reality of the 21st century. There are 1200 exabytes (an exabyte being one billion gigabytes) of data stored in the world; translated, that means that, if it were all placed on CD-ROMs and stacked up, it would stretch to the moon in five separate piles, according to Kenneth Cukier and Viktor Mayer-Schonberger's book Big Data. Day-to-day life can often feel like a battle to stay afloat against the relentless tide of information. One hundred and sixty billion instant messages were sent in Britain in 2013. Over 500 million tweets are sent worldwide every day.

Kevin Pietersen was the subject of a good number of those after his sacking as an England cricketer. Amid the cacophony of opinions, one voice we could have done without was David Cameron's. The prime minister gave a radio interview saying that there was a "powerful argument" for keeping the "remarkable" Pietersen in the team. Cameron had once recognised the dangers of descending into a roving reporter, promising, "We are not going to sit in an office with the 24-hour news blaring out, shouting at the headlines." Downing Street's impulse to comment on the Pietersen affair is a manifestation of information overload at its worst: with so much space to fill, politicians feel compelled to fill it. The result is that they have less time to do their day jobs.

Datafication often brings ugly and perverse consequences. The easiest way to reduce poverty is to give people just enough money to inch them ahead of an arbitrarily defined standard of poverty, rather than tackle the deep-rooted and more complex causes. Schools are routinely decried for a narrow-minded approach to education - "teaching to the test" - but this is the inevitable result of the obsession with standardised tests. California has pioneered performance-related pay for teachers, but a huge rise in teacher-enabled cheating has been one unforeseen result.

No industry has been permeated by datafication quite like the financial sector. The complex - oh, so complex - algorithms that underpinned the financial system had a simple rationale. In place of impulsive human beings, decision-making would be transferred to formulas that dealt only in cold logic, ensuring an end to financial catastrophes. We know what happened next. Yet the crash has changed less than is commonly supposed: around seven billion shares change hands every day in the US equity markets - and five billion of those are traded by algorithms.

The Ashes tour felt like English cricket's crash. The numbers said that it couldn't possibly happen; those who spotted the warning signs were belittled as naysayers who let emotions cloud their judgement. The Ashes series was caricatured as the triumph of the old school - Lehmann's penchant for discussing the day's play over a beer - over Flower's pseudo-scientific approach. While clearly a simplification - Lehmann is no philistine when it comes to data - the accusation contains a grain of truth.

Flower's attraction to big data originated from reading Moneyball, the book that examined how the scientific methods of Oakland Athletics manager Billy Beane helped the baseball team punch above its financial limitations. But it is too readily forgotten that the Oakland Athletics ran out of steam in knockout games. "My shit doesn't work in the playoffs," Beane exclaimed. "My job is to get us to the playoffs. What happens after that is luck." Not even Beane found an empirical way of measuring flair, spontaneity and big-game aptitude.

After the debris of England's tour Down Under, the Sun published its list of the 61 "guilty men" - including 29 non-players - involved in England's Ashes tour. It was hard not to ask what on earth the backroom staff was doing. And, more pertinently, if England's total touring party had numbered only 51 or 41, could England really have performed any worse? The proliferation of specialist coaches and analysts seemed antithetical to the self-expression of players on the pitch.

Similar questions are being asked in different fields. The average businessman now sends 108 emails a day. But as inboxes get bigger, so the opportunity for creativity decreases. This reality is slowly being recognised: a multi-million dollar industry has grown around filtering emails to liberate businessmen from the grind. The world is running into the limits of Silicon Valley's favoured mantra: "In God we trust - all others bring data."

No one would advocate pretending that big data isn't valid. Datafication is happening at a staggering rate; the amount of digital data doubles every three years. Flower's reign, for the most part, showed the virtues of using it smartly. But cricket data is affected by the unpredictability of human beings and so constantly fluctuates. Data is emphatically not a substitute for intuition and flair - either in the office or on the cricket field.

By the last embers of Flower's rule, England seemed not empowered by data but inhibited by it, as instinct, spontaneity and joy seeped from their cricket. Accusations of England lacking flair on the field had a point - witness Alastair Cook's insistence on having a cover sweeper regardless of the match situation. Going back to 2011, consider England's approach to tying down Sachin Tendulkar in the home series against India: they relied obsessively on drawing Tendulkar outside his off stump in the early part of his innings rather than letting him get his runs on the on side, in strict adherence to a computer-simulated plan created by their team analyst, Nathan "Numbers" Leamon.

The selection of three beanpole quick bowlers to tour Australia was rooted in data that showed such bowlers were most likely to thrive in Australia. The ECB looked at the characteristics of the best quick bowlers - delayed delivery, braced front leg and so on - and then tried to coach those virtues into their own players, seemingly not realising it was too late; you can't change those things once bowlers are more than about 15. It did not matter how many boxes Steven Finn, Boyd Rankin and Chris Tremlett ticked in theory when they were utterly bereft of fitness and form in practice. It was proof of the pitfalls of excessive devotion to data and reliance on bogus statistics. "Garbage in, garbage out," as some who work with data are prone to saying.

Data is a complement to intuition and judgement, not a replacement for them. As Cukier and Mayer-Schonberger argue in their study, big data "exacerbates a very old problem: relying on the numbers when they are far more fallible than we think".

Criticisms of Flower's reliance on data always lingered under the surface, as when South Africa expressed bafflement at Graham Onions being dropped for Ryan Sidebottom in 2010, a data-driven decision largely made before the tour even began.

For all his triumphs as England coach, and there were many, Flower ultimately got the balance between trusting people and numbers wrong. He was in good company. In the brave new world, those who thrive will not be those who use data most - but those who use it most smartly.
