The Data Detective by Tim Harford Ten Easy Rules to Make Sense of Statistics

What's it about?

The Data Detective (2021) is a smart, practical guide to understanding the ways in which statistics –⁠ and our reactions to them –⁠ distort and obscure reality. Using psychological research and illuminating examples, it reveals some of the ways our brains influence how we see data and statistics and how we draw incorrect conclusions as a result. By picking apart our cognitive biases and misconceptions, we gain the ability to see data, and in turn, the world, for what it really is.

Did you know that storks deliver babies? Statistics prove it: In countries with higher stork populations, more babies are born than in countries with low stork populations. This, of course, isn’t true –⁠ storks don’t deliver babies. But it’s very easy to make it seem like they do, using faulty statistical arguments.
The ease of lying with statistics has made many people understandably leery of them. The trouble is that without statistics, we’d never have discovered that cigarette smoking makes you 16 times likelier to get lung cancer or that COVID-19 is transmitted human-to-human. These lessons will give you ten strategies for understanding statistics, so you can enjoy the wisdom of the good while confidently discarding the bad. In these lessons, you’ll learn how a famous art critic was tricked by a forgery; why it’s good that London’s murder rate is higher than New York’s; and why experts are such terrible forecasters.
Abraham Bredius was an art critic, collector, and world-renowned expert on Dutch painters. He had special expertise when it came to Johannes Vermeer, the seventeenth-century master revered for works like Girl With a Pearl Earring. One day in 1937, a lawyer named Gerard Boon paid Bredius a visit to show him a recently discovered Vermeer painting called Christ at Emmaus. Bredius was immediately awestruck,⁠ but he was still careful.
He inspected the painting for all the signs of forgery – and found none. Bredius declared Emmaus a genuine Vermeer, perhaps even his finest work. He also said that when he saw the painting, he “had difficulty controlling his emotion.” Unfortunately, Bredius’s heightened emotions were his undoing – because Christ at Emmaus was totally fake. The key message here is: Notice your emotional reactions to data and information. Emmaus wasn’t even a very good painting, but still, Bredius was fooled.
He wanted so badly to believe that Emmaus was a genuine Vermeer that his emotions clouded his logical reasoning. Unfortunately, most people are likely to be fooled in just the same way when presented with information that stirs their emotions. Some statistics don’t cause emotional reactions – no one gets upset when they hear “Mars is more than 30 million miles away from Earth.” But other issues – particularly political ones – easily get a rise out of us. When that happens, we’re likely to ignore the information if it doesn’t fit our preconceived beliefs or use it as evidence if it does. Expertise in a subject doesn’t make us immune to that effect – in fact, some studies have shown that experts are even less likely to change their opinions in the face of contradictory evidence.
That’s because they’re both motivated to avoid uncomfortable information and good at producing arguments in their own favor. So no one is immune to motivated reasoning. Fortunately, following a couple of simple protocols can help you reduce your likelihood of doing it. It starts with noticing how you feel when you see a statistical claim.
Are you outraged, overjoyed, or in denial? After noticing your emotions, pause and reflect to see whether you’re straining to reach a particular conclusion. Your commitment to weighing the facts will help you think more clearly – and, as an added bonus, you’ll set an example of clear thinking for others, too.
When the author got his job as a presenter on a BBC radio show, he immediately fell in love with the work. But he was less fond of the morning commute from East to West London. He had to ride a crowded bus and then the tube – a subway train – that was practically bursting at the seams. Because of those miserable mornings, the author was interested in learning more about just how busy London’s public transport system actually was.
He was shocked to find that the average occupancy of a London bus was just 12 people, and on the tube, it was less than 130. Those statistics felt totally wrong; they completely contradicted the author’s personal experience. What was going on? Here’s the key message: Learn when it’s better to trust a statistical claim or personal experience. We know that our personal beliefs and emotions can sometimes distort our perception of a statistical claim. But sometimes, personal experiences can be as informative as statistics.
The key is to find a balance between the two. It starts with analyzing the quality of the statistical claim itself by determining its origin. In the case of London's public transport, the numbers came from a government organization called Transport for London, or TfL, which collects data from people using their payment cards before boarding. So the data’s origins here seem credible. Next, we should look at why personal experience –⁠ in this case, the author’s –⁠ might differ from the statistics. Here, the math involved in calculating averages is relevant.
Say there’s a train line with ten trains a day. One of those trains carries a thousand passengers, while the other nine carry zero. The average occupancy for each train on that line would be 100 people –⁠ pretty close to the real average in London. So TfL’s statistics weren’t lying –⁠ but they said nothing about the personal experiences of the people packed into those extra-crammed trains. In this case, statistics and personal experience were equally informative. But there are some times when one or the other works better.
Statistics often win out when it comes to health-related issues since statistics show the most likely outcome for the greatest number of people. For instance, cigarette smoking still makes you 16 times more likely to get lung cancer, even if your 90-year-old chain-smoking grandmother is doing fine. On the other hand, statistics can tell lies, too, particularly when it comes to things like performance reviews. People are much more likely to manipulate, fake, or distort data when a monetary or professional benefit is at stake, so judging performance on a case-by-case basis is preferable. True understanding comes from knowing when statistics, personal experience, or a combination of both is the most relevant.
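The train example above can be checked with a few lines of arithmetic. The sketch below, in Python, uses the hypothetical ten-train line from the text (one train with a thousand passengers, nine empty ones); the function names are illustrative assumptions, not anything from the book or from TfL. It contrasts the average per train, which is what an operator's statistic reports, with the average crowding a passenger actually sees, which weights each train by the number of people on it.

```python
def average_per_train(loads):
    """Mean occupancy across trains: the operator's view."""
    return sum(loads) / len(loads)


def average_per_passenger(loads):
    """Mean occupancy experienced by a randomly chosen passenger.

    Each train is weighted by its number of passengers, because a
    train carrying 1,000 people is experienced by 1,000 people.
    """
    total_passengers = sum(loads)
    if total_passengers == 0:
        return 0.0
    return sum(n * n for n in loads) / total_passengers


# Hypothetical line from the text: one packed train, nine empty ones.
loads = [1000] + [0] * 9

print(average_per_train(loads))      # 100.0
print(average_per_passenger(loads))  # 1000.0
```

Both numbers are correct. They simply answer different questions, which is why the statistic and the commuter's experience can disagree without either being wrong.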
In the late 2010s, the UK seemed to be in the midst of an infant mortality crisis. Rates of early death varied substantially across the country – and at first, it wasn’t clear why. It turned out that the difference in mortality rates ultimately resulted from differences in definition – specifically, whether a baby born at 22 or 23 weeks should be recorded as a miscarriage or a live birth followed by an early death. In London, those pregnancies were recorded as miscarriages.
In the English Midlands, by contrast, they were instead considered live births. The difference was enough to explain the gap in mortality rate at London versus Midlands hospitals. This story points to the importance of uncovering what a claim is genuinely saying beyond the surface. The key message is this: Carefully consider what a statistic is actually measuring. Measuring something like infant mortality seems simple on the surface: count the babies who died. But go a little deeper and it gets confusing because the difference between a fetus and a baby is complex and often highly contentious.
This has a real bearing on the field of statistics, which is, at its core, all about measuring or counting things. Yet, when we see a statistic, we rarely question exactly what or who is being counted. Consider the following claim: Children who play violent video games are more likely to be violent in reality. It’s unclear just what that claim is measuring. For instance: What counts as a violent video game? How frequently are these children playing video games?
And how exactly were the researchers measuring violence? The murkiness of definitions can aid people whose goal is to distort the facts – perhaps to advance a particular political perspective. For example, consider a policy proposal for a “five-year freeze on unskilled immigration,” published by a Brexit lobbyist group in 2017. But what exactly does “unskilled” mean? In this case, the term included anyone making a salary lower than £35,000. That would prevent immigration for most nurses, primary school teachers, paralegals, and pharmacists.
You can still support or reject the policy, but it’ll undoubtedly help you to know who exactly is being considered “unskilled” under it. So remember to question the definitions used in a claim before you accept or refute it. And if you see a claim that inequality has risen, first ask: Inequality of what?
Alarming headlines blared out from London’s newspapers in April 2018. They all read some variation of: “London’s Murder Rate is Higher Than New York’s for the First Time Ever!” Setting aside the fact that the definition of “murder” is different in each city, the claim was technically true. There were 14 murders in New York City in February 2018, while in London, there were 15.
What can we conclude from that statistic? Well, really, nothing at all. Numbers alone don’t tell us much. In order to understand what’s actually happening in the world, we need to consider the broader context and perspective from which the data is presented. The key message here is: Put claims into context before you draw conclusions. Let’s step back and consider some facts about murders in London versus New York.
In 1990, London had 184 murders; New York had more than ten times that number, at 2,262. Since then, murder rates have fallen in both places. In London in 2017, there were 130 total murders. In New York, there were 292 – a significant improvement. With this context in mind, we start to understand the actual situation better. Because New York is now much safer, its murder rate sometimes dips below London’s.
London hasn’t suddenly fallen into crime-ridden, gang-infested mayhem –⁠ both cities are, in fact, safer than they were before! Unfortunately, the current news environment prioritizes immediate, of-the-moment news that obscures the bigger picture. What would the news look like if it were instead published on a 25-year basis? The stories might be about the rise of the World Wide Web and China’s emergence as a global power –⁠ certainly not the murder rates in London and New York in a single month. Looking at broader time scales can help you understand the real significance of a statistic, but broad numerical scales are also helpful. Consider, for instance, the cost of the border wall Donald Trump wanted to build between the US and Mexico: $25 billion.
On the surface, that number seems big. But compare it to the entire US defense budget of just under $700 billion a year, or about $2 billion a day: In that context, the wall would cost about two weeks’ worth of US military operations. You, of course, may still decide that the cost of the wall or the murder rate in London is cause for alarm. But understanding the full context will make your opinion much better-informed.
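The two comparisons in this section are simple enough to reproduce. The following Python sketch uses only figures quoted above; the helper names are hypothetical, and the defense budget is rounded up to $700 billion as an assumption, since the text says "just under".

```python
def days_of_spending(cost, annual_budget, days_per_year=365):
    """Express a one-off cost as a number of days of an annual budget."""
    daily_budget = annual_budget / days_per_year
    return cost / daily_budget


def percent_change(old, new):
    """Percentage change from old to new (negative means a fall)."""
    return (new - old) / old * 100


# Figures quoted in the text, in US dollars.
WALL_COST = 25e9        # proposed border wall
DEFENSE_BUDGET = 700e9  # "just under $700 billion a year", rounded up

wall_days = days_of_spending(WALL_COST, DEFENSE_BUDGET)
print(f"Wall = about {wall_days:.0f} days of defense spending")

# Murders per year, 1990 versus 2017, as quoted in the text.
print(f"London:   {percent_change(184, 130):.0f}%")
print(f"New York: {percent_change(2262, 292):.0f}%")
```

Run as written, this puts the wall at roughly 13 days of defense spending, in line with the "about two weeks" above, and shows murders falling by about 29 percent in London and about 87 percent in New York over the period.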
Have you heard about the famous jam-tasting experiment run by psychologists Sheena Iyengar and Mark Lepper? In it, the researchers set up a jam-tasting stall that sometimes offered 24 varieties of jam and other times offered six. After customers tasted the jam, they were given a voucher to buy it at a discount. Ultimately, the bigger display attracted more customers – but only three percent of them ended up buying the jam.
Meanwhile, 30 percent of customers bought the jam at the smaller display. The psychologists concluded that people respond better to fewer choices and worse to more choices. Since its publication, the study has become a sensation. You can find the results plastered everywhere, from pop-psychology articles to TED Talks. But can we trust them? Here’s the key message: Even scientific research can be influenced by bias.
It turns out that the research on choice is much more inconclusive than the original jam study would suggest. Published papers on the subject were more likely to find that offering many choices had a major effect, though that effect could be strongly positive or strongly negative. Unpublished papers were more likely to find no effect at all. Those results added up to a spectacularly exciting average effect of zero. It might be unnerving to think about, but it turns out that academic publications are no less vulnerable to certain biases than the news. One example is publication bias: journals are much more likely to publish experiments with surprising or counterintuitive results than those with inconclusive ones.
After all, no one wants to read a study with humdrum, boring results. Additionally, many researchers’ careers and incomes are tied to their ability to conduct and publish research. This norm creates perverse incentives for them to manipulate data so it’ll seem more significant than it is in reality. As a result, the social sciences are dealing with a “replication crisis” in which large numbers of prominent studies turn out not to be replicable.
Until this issue is solved, it’s worthwhile to consider just how trustworthy a study is before you go around touting its results. First, get a sense of whether the study makes intuitive sense or feels more like a strange outlier. Then check to see whether there are many other studies that draw similar conclusions. Just these simple steps can help you avoid spreading misleading or incorrect information.
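The way publication bias hides an average effect of zero can be shown with a toy example. The Python sketch below is illustrative only: the effect sizes are invented, not taken from any real study, and the `is_published` rule (journals accept only striking results) is an assumed simplification.

```python
from statistics import mean

# Hypothetical effect sizes from ten imaginary studies of "more choice".
# Positive = more choice helped sales, negative = it hurt, 0 = no effect.
# These numbers are invented for illustration only.
all_studies = [0.8, -0.8, 0.6, -0.6, 0.0, 0.0, 0.05, -0.05, 0.1, -0.1]


def is_published(effect, threshold=0.5):
    """Assumed filter: only results with a large absolute effect get published."""
    return abs(effect) >= threshold


published = [e for e in all_studies if is_published(e)]
unpublished = [e for e in all_studies if not is_published(e)]

print("published:", published)
print("unpublished:", unpublished)
print("mean effect, all studies:", mean(all_studies))
print("mean absolute effect, published only:", mean(abs(e) for e in published))
```

In this made-up set, every published study reports a dramatic effect in one direction or the other, while the full set of studies averages out to nothing. A reader who sees only the published papers would never guess that.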
How much pressure do people feel to conform to their peers? There’s a large amount of research suggesting that the answer is: a lot. In the 1950s, psychologist Solomon Asch conducted a study in which he showed subjects two images. One depicted three lines of different lengths, and the other depicted a “reference line.”
Their task was simply to identify which of the three lines was the same length as the reference line. However, there was a catch: Subjects were surrounded by “plants” – that is, people placed there to purposely choose the wrong line. The subjects’ choices were influenced by the errors of their peers a significant amount of the time. These experiments were elegant and fascinating, but we can’t conclude that Asch discovered some universal truth about human nature. That’s because his research was limited to a specific population: white, male 1950s American college students. The key message is this: Statistics and data aren’t always equally applicable to all people.
Recently, psychologists have become aware of the problem that many studies limit their research to very specific populations. In particular, experiments tend to be done on subjects that are considered “WEIRD” – an acronym that stands for Western, Educated, and from Industrialized Rich Democracies. So does that mean Asch’s conclusions were inaccurate? Well, by 1996, his study had inspired 133 follow-up studies – and the overall results held up. Most of the follow-up studies weren’t very diverse, but the ones that were showed some interesting effects –⁠ for instance, that people were more likely to conform with groups of friends rather than strangers, and that groups of women were more likely to conform than men. It shouldn’t be difficult to obtain a representative sample of the population when it comes to academic research.
But getting representative data is much more difficult in other areas, particularly polling. The main issue with polling is sample bias: some types of people are simply more likely to respond to polls than others. Another issue is the particular place from which the data is pulled. For instance, a poll of American Twitter users may overrepresent young, college-educated people, who are more likely than others to use the platform. Keep these facts in mind when you encounter a piece of data, and always ask yourself the question: Who might be missing from this sample? Do your best to investigate – because you might just find the blind spot.
Upon its release in 2009, Google Flu Trends was touted as a revolutionary tool in tracking the spread of seasonal influenza. By counting searches for “flu symptoms” and “pharmacies near me,” Google could accurately estimate new daily flu cases faster than the CDC. Google Flu Trends was in many ways the herald of a new age: that of “big data” and algorithms. Big data refers to the information we produce when we surf the web, pay with credit cards, or use mobile phones.
Algorithms are computer programs often used to find patterns in datasets. Google Flu Trends used big data and algorithms to –⁠ it seemed –⁠ produce good data on flu trends. Yet, just four years after the project was announced, it completely collapsed. Why? The key message here is: Maintain a healthy skepticism of algorithms and big data. Google Flu Trends crashed and burned when, one winter, it suggested there was a severe outbreak when there wasn’t one.
At one point, it estimated that the spread of flu was two times worse than suggested by the official CDC data. So what was the problem? Mainly it was that Google didn’t, in fact, know what the connection was between search terms and the spread of flu. The algorithm was searching for patterns in the data, but it found connections involving things unrelated to flu, such as “high school basketball.” As a result, the algorithm became less of a flu detector and more of a general-purpose winter detector. That meant it was unable to detect a summer outbreak of flu that occurred in 2009.
Of course, there are sometimes cases in which it is worth trusting algorithms over human-produced estimates. For instance, there’s a wealth of evidence to suggest that human judges are neither wholly objective nor consistent when handing down criminal sentences. Algorithms are much better at producing fair sentences by comparing cases to similar ones in the past. Sometimes algorithms will produce accurate, quality results, and sometimes they won’t.
Consequently, we’ll need to judge each algorithm on a case-by-case basis and not take its accuracy as a given. Doing this can be difficult because many companies don’t want to reveal the secrets behind their money-making engines. But if everyone is allowed to peer under the hood of an algorithm, we’ll be much more likely to understand how they’re making the decisions they are –⁠ and how they can improve.
The Congressional Budget Office was established in the US in 1974 to provide Congress with reports on the budgetary costs of policy proposals. As one CBO official described it, the process was like dropping a bill down into a manhole and then having the cost estimates sent back up 20 minutes later –⁠ objective and noncontroversial. But not every president accepted the CBO’s estimates with grace. The first president to complain was Jimmy Carter, who wanted to improve America’s energy efficiency.
But the CBO evaluated Carter’s proposals for doing so and found they wouldn’t work as well as planned. The Carter administration was unhappy because the CBO “wasn’t helping.” But that was precisely the point – the best government organizations will present statistics accurately, whether or not they make politicians happy. Here’s the key message: Don’t dismiss the importance and usefulness of official statistics. When politicians attempt to distort or discredit the work of statistical agencies, disaster can result. Just take the example of Greece, whose official statistics in the early 2000s were about as untrustworthy as they come.
In order to remain in the eurozone, a country must keep its budget deficit below three percent of its GDP. Greece couldn’t do that through legitimate methods, so officials decided to fudge the numbers a little, leaving out several billion euros the country had borrowed here and there. This numerical fiddling became apparent in 2009. In the midst of the global financial crisis, the EU realized that Greece had borrowed much more money than it had admitted to – and, what’s more, that it was unable to pay it back. The Greek economy promptly crashed and burned. Independent statistical agencies keep a country honest, but they’re also worth their price tag for other reasons.
One cost-benefit analysis conducted in the UK, for instance, found that data from the national census was instrumental in everything from pension policy to building schools and hospitals in the right areas. In addition, it enabled other organizations to calculate all sorts of per-capita statistics. Unfortunately, the analysis couldn’t put a numerical value on all the statistical calculations, but its estimate – a rather conservative one – was that the measurable benefits totaled £500 million a year. The census itself costs less than that and applies for ten years, leading to a tenfold return on the initial investment. If governments are expected to take action on issues, then they need a solid statistical bedrock from which to do so. Official statistics are their best shot at creating that.
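The tenfold figure for the census follows directly from the numbers quoted above. Here is a minimal Python sketch of that arithmetic. The exact cost of the census is not given in the text, so the code assumes the upper bound implied by "costs less than that"; the result is therefore a lower bound on the return, and the function name is hypothetical.

```python
# Figures from the text, in millions of pounds.
ANNUAL_BENEFIT = 500  # estimated measurable benefit per year
YEARS_IN_USE = 10     # a census applies for ten years
MAX_COST = 500        # assumption: the text says only that the cost is below the annual benefit


def minimum_return_multiple(annual_benefit, years, max_cost):
    """Lower bound on total benefit divided by cost."""
    return annual_benefit * years / max_cost


print(minimum_return_multiple(ANNUAL_BENEFIT, YEARS_IN_USE, MAX_COST))  # 10.0
```

Any actual cost below £500 million would push the multiple above ten.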
David McCandless, author of Information is Beautiful, once produced a striking and unforgettable animation called Debtris. Just like the classic computer game Tetris, Debtris showed large colored blocks falling to the bottom of a screen. The size of each block represented the cost of various items, like the UN budget, the estimated cost of the 2003 Iraq war, and Walmart’s revenue. The catchy music, colorful graphics, and slow presentation of the various comparisons are beautiful to look at.
But unfortunately, those elements obscure the many problems in the data used to make the graphic. The key message is this: Don’t be fooled by the slick aesthetics of a graph or chart. Sometimes, the visualization of a statistic is beautiful, but the data behind it is ugly. This is, unfortunately, the case with Debtris, which makes many mistakes. For instance, it conflates net measures with gross measures. This is like comparing a company’s profit with its turnover.
Given the issues within Debtris, should we simply discard any attempts at presenting data beautifully? Not necessarily. Sometimes, someone can hit the sweet spot between beauty and informativeness. One such example is Florence Nightingale, a legendary figure known today as the founder of modern nursing. Lesser-known –⁠ but no less impressive –⁠ was Nightingale’s work as a statistician. In 1858, she began circulating something called the rose diagram, with the goal of proving that sanitary measures could reduce the number of deaths from infectious diseases.
At the time, scientists didn’t know that bad hygiene helped transmit germs. Nightingale’s rose diagram was designed to look like two roses side by side. One represented diseases and deaths before the measures; the other showed them after. The result was a stark visual representation of all the deaths the sanitary measures helped prevent.
The diagram managed to convince hesitant doctors about the soundness of Nightingale’s sanitation measures, and public health acts were eventually passed in response. To avoid falling prey to misleading graphs and charts, it’s worth checking your emotional response to them as you would when faced with other data. Then, after noticing those emotions, verify that you understand what the graph is actually saying –⁠ what the axes mean, what is being counted, or what experiments it’s reflecting. Recognize that someone may be trying to persuade you of something –⁠ which, sometimes, is quite all right.
Philip Tetlock was a young Canadian-born psychologist who, along with a group of other social scientists, had been given a huge task: prevent nuclear war between the US and the Soviet Union. To do that, Tetlock interviewed countless experts to get their take on all the possibilities about what might happen next and why. However, Tetlock became frustrated when he found that experts of all stripes were extremely stubborn, constantly refusing to change their minds when given contradictory evidence. Many of them were also relentless in trying to justify incorrect forecasts they’d made in the past.
So Tetlock created a devious study that vividly showed just how bad the forecasters were at forecasting. The key message here is: Always keep an open mind and be willing to revise your opinions. For his experiment, Tetlock collected some 27,500 predictions from almost 300 experts in politics, geopolitics, and, to a lesser extent, economics. He asked clear questions that would be easy to declare true or false in the future. And then he waited 18 years for the results. In 2005, Tetlock finally published his conclusions.
The overarching takeaway was simple: experts were terrible forecasters. The predictions they made were incorrect, they were overconfident, and they even selectively misremembered their own forecasts, claiming they’d been right all along when the records showed they’d been wrong. Does this just mean that the world is too complex to predict? Tetlock thought not. So he decided to conduct another ambitious study that drew forecasting opinions from 20,000 experts and amateurs alike. The most interesting takeaway from this study was that there are, indeed, some people who are better than others at making predictions –⁠ not perfect, but above average.
Additionally, those same people got better at forecasting over time, which meant they hadn’t just gotten lucky the first time. Tetlock called this group “superforecasters.” A few qualities tied them together, but perhaps the most crucial of all was the personality trait of open-mindedness. In other words, the superforecasters didn’t cling stubbornly to a particular forecasting approach, and they were happy to change their views when shown new evidence.
Tetlock’s study is proof that our mistakes in making statistical predictions aren’t as much a result of insufficient knowledge as they are of our refusal to accept data. So always keep your mind wide open. That, combined with a solid base of statistical knowledge, will make your understanding of the world that much clearer. The key message in these lessons: To look at any kind of data with a clear mind and a focus on the facts, remember some important rules.
These include watching your emotional reactions to information – whether visual or verbal – and being willing to update your opinions in the face of new evidence. You should also look at the big picture of a statistic, making sure to examine the overarching context and identify potential distortions, exclusions, or oversights. The ultimate goal tying all of these together is to always be curious – look deeply for the facts and keep asking questions. Actionable advice: Memorize a few “landmark numbers.”
Entrepreneur Andrew Elliott advocates keeping a short list of “landmark numbers” in your head so it’s easier for you to understand the relative significance of other numbers. Here are a few examples: The population of the United States is 325 million; the UK’s is 65 million. The drive from Boston to Seattle is 3,000 miles. And the average novel is 100,000 words long. Once you’ve got those in your head, you can use them to make comparisons – for instance, a 10,000-word report might seem long, but it’s ten times shorter than the average novel.
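The landmark-number habit amounts to dividing an unfamiliar figure by a familiar one. A small Python sketch, using only the landmark values quoted above; the dictionary keys and the `compare_to_landmark` helper are illustrative names, not part of Elliott's advice.

```python
# Landmark numbers quoted in the text.
LANDMARKS = {
    "US population": 325_000_000,
    "UK population": 65_000_000,
    "Boston to Seattle drive (miles)": 3_000,
    "average novel (words)": 100_000,
}


def compare_to_landmark(value, landmark_name):
    """Return value as a fraction of the named landmark number."""
    return value / LANDMARKS[landmark_name]


share = compare_to_landmark(10_000, "average novel (words)")
print(f"A 10,000-word report is {share:.0%} of an average novel")
```

The same helper works for any other pairing, for example expressing the UK's population as a share of the US population.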
