How Distinct Is a “Distinct Possibility”?
Vague Verbiage in Forecasting

“What does a ‘fair chance’ mean?”

It is a question posed to a diverse group of professionals—financial advisers, political analysts, investors, journalists—during one of Good Judgment Inc’s virtual workshops. The participants have joined the session from North America, the EU, and the Middle East. They are about to get intensive hands-on training to become better forecasters. Good Judgment’s Senior Vice President Marc Koehler, a Superforecaster and former diplomat, leads the workshop. He takes the participants back to 1961. The young President John F. Kennedy asks his Joint Chiefs of Staff whether a CIA plan to topple the Castro government in Cuba would be successful. They tell the president the plan has a “fair chance” of success.

The workshop participants are now asked to enter a value between 0 and 100: what, in their view, is the probability of success implied by a “fair chance”?

When they compare their numbers, the results are striking. Their answers range from 15% to 75%, with a median of 60%.

Figure 1. Meanings behind vague verbiage according to a Good Judgment poll. Source: Good Judgment.

The story of the 1961 Bay of Pigs invasion is recounted in Good Judgment co-founder Philip Tetlock’s Superforecasting: The Art and Science of Prediction (co-authored with Dan Gardner). The adviser who wrote the words “fair chance,” the story goes, later said what he had in mind was only a 25% chance of success. But like many of the participants in the Good Judgment workshop some 60 years later, President Kennedy took the phrase to imply a more optimistic assessment of the odds. By using vague verbiage instead of precise probabilities, the analysts failed to communicate their true evaluation to the president. The rest is history: The Bay of Pigs plan he approved ended in failure and loss of life.

Vague verbiage is pernicious in multiple ways.

1. Language is open to interpretation. Numbers are not.

According to research published in the Journal of Experimental Psychology, “maybe” ranges from 22% to 89%, meaning radically different things to different people under different circumstances. Survey research by Good Judgment shows the implied ranges for other vague terms, with “distinct possibility” ranging from 21% to 84%. Yet “distinct possibility” was the phrase used by White House National Security Adviser Jake Sullivan on the eve of the Russian invasion of Ukraine.

Figure 2. How people interpret probabilistic words. Source: Andrew Mauboussin and Michael J. Mauboussin in Harvard Business Review.

Other researchers have found equally dramatic variation in the probabilities people attach to vague terms. In a survey of 1,700 respondents, Andrew Mauboussin and Michael J. Mauboussin found, for instance, that the probability range most people attribute to an event with a “real possibility” of happening spans about 20% to 80%.

2. Language avoids accountability. Numbers embrace it.

Pundits and media personalities often use such words as “may” and “could” without even attempting to define them because these words give them infinite flexibility to claim credit when something happens (“I told you it could happen”) and to dodge blame when it does not (“I merely said it could happen”).

“I can confidently forecast that the Earth may be attacked by aliens tomorrow,” Tetlock writes. “And if it isn’t? I’m not wrong. Every ‘may’ is accompanied by an asterisk and the words ‘or may not’ are buried in the fine print.”

Those who use numbers, on the other hand, contribute to better decision-making.

“If you give me a precise number,” Koehler explains in the workshop, “I’ll know what you mean, you’ll know what you mean, and then the decision-maker will be able to decide whether or not to proceed with the plan.”

Tetlock agrees. “Vague expectations about indefinite futures are not helpful,” he writes. “Fuzzy thinking can never be proven wrong.”

If we are serious about making informed decisions about the future, we need to stop hiding behind hedge words of dubious value.

3. Language can’t provide feedback to demonstrate a track record. Numbers can.

In some fields, the transition away from vague verbiage is already happening. In sports, coaches use probability to understand the strengths and weaknesses of a particular team or player. In weather forecasting, the standard is to use numbers. We are much better informed by “30% chance of showers” than by “slight chance of showers.” Furthermore, since weather forecasters get ample feedback, they are exceptionally well calibrated: When they say there’s a 30% chance of showers, there will be showers three times out of ten—and no showers the other seven times. They are able to achieve that level of accuracy by using numbers—and we know what they mean by those numbers.
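To make the idea of calibration concrete, here is a minimal sketch in Python of how one might check a forecaster’s calibration against a record of past forecasts. The records below are made-up illustrations, not actual forecast data:

```python
from collections import defaultdict

# Hypothetical forecast records: (stated probability, did the event occur?).
# Illustrative made-up data, not real forecasts.
records = [
    (0.3, False), (0.3, True), (0.3, False), (0.3, False), (0.3, True),
    (0.3, False), (0.3, False), (0.3, True), (0.3, False), (0.3, False),
    (0.7, True), (0.7, True), (0.7, False), (0.7, True), (0.7, True),
]

# Group outcomes by the probability that was stated.
buckets = defaultdict(list)
for prob, occurred in records:
    buckets[prob].append(occurred)

# A well-calibrated forecaster's 30% calls come true about 30% of the time.
for prob in sorted(buckets):
    outcomes = buckets[prob]
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {prob:.0%}: happened {observed:.0%} of the time (n={len(outcomes)})")
```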

Another well-calibrated group of forecasters is the Superforecasters at Good Judgment Inc, an international team of highly accurate forecasters selected for their track records from among hundreds of others. When assessing questions about geopolitics or the economy, the Superforecasters use numeric probabilities that they update regularly, much as weather forecasters do. This takes mental discipline, Koehler says. When forecasters are forced to translate terms like “serious possibility” or “fair chance” into numbers, they have to think carefully about how they are thinking, question their assumptions, and seek out arguments that could prove them wrong. And their track record is available for all to see. All this leads to better-informed, more accurate forecasts that decision-makers can rely on.

 

Good Judgment Inc is the successor to the Good Judgment Project, which won a massive US government-sponsored geopolitical forecasting tournament and generated forecasts that were 30% more accurate than those produced by intelligence community analysts with access to classified information. The Superforecasters are still hard at work providing probabilistic forecasts along with detailed commentary and reporting to clients around the world. For more information on how you can access FutureFirst™, Good Judgment’s exclusive forecast monitoring tool, visit https://goodjudgment.com/services/futurefirst/.

The Future of Health and Beyond: The Economist features Good Judgment’s Superforecasts

This summer, Good Judgment Inc collaborated with The Economist for the newspaper’s annual collection of speculative scenarios, “What If.” The theme this year was the future of health. In preparing the issue, The Economist asked the Superforecasters to work on several hypothetical scenarios—from America’s opioid crisis to the possibility of the Nobel prize for medicine being awarded to an AI. “Each of these stories is fiction,” the editors wrote in the 3 July edition, “but grounded in historical fact, current speculation, and real science.”

This was unlike most of the work that Good Judgment Inc does for clients. Our Superforecasters typically forecast concrete outcomes on a relatively short time horizon to inform decision- and policymakers about the key issues that matter to them today. The Economist’s “What If” project instead focused on a more speculative, distant future. To address the newspaper’s imaginative scenarios without sacrificing the rigor that Good Judgment’s Superforecasters and clients have become accustomed to, our question generation team crafted a set of relevant, forecastable questions to pair with each topic.

As a result, The Economist’s “What if America tackled its opioid crisis? An imagined scenario from 2025” was paired with our Superforecast: “How many opioid overdoses resulting in death will occur in the US in 2026?”

“What if biohackers injected themselves with mRNA? An imagined scenario from 2029” was paired with: “How many RNA vaccines and therapeutics for humans will be FDA-approved as of 2031?”

And “What if marmosets lived on the Moon? An imagined scenario from 2055” was paired with: “When will the first human have lived for 180 days on or under the surface of the Moon?”

Karen Hagar, a Superforecaster, social scientist, and archaeologist from Tempe, Arizona, participated in forecasting these “far into the future” questions because, she says, she likes challenges.

“These questions were different than standard forecasting questions which typically resolve a year into the future,” she explains. “Both types of questions have inherent challenges. The questions with shorter resolution require extreme accuracy. One must research and mentally aggregate all incoming information. This includes any possible Black Swan events, current geopolitical and any social developments that may change within the short time frame. The dynamics of predicting outcomes of questions 10-20 years into the future required the same skill, but possibly even more research.”

The most exciting aspects of the “What If” project for Karen included learning the degree to which science has advanced. “For example, uncovering the scientific data regarding CRISPR technology and its application to Alzheimer’s research was amazing,” she says.

In making her forecasts for The Economist, she studied the questions from all angles and played devil’s advocate to challenge her colleagues’ thinking. This technique of red-teaming is frequently used by professional Superforecasters to counter groupthink and elicit more accurate predictions.

“What If” is only one of Good Judgment’s several collaborative projects with The Economist. The newspaper’s “World in 2021,” which has recurred annually since the “World in 2017” and looks to forecast key metrics for the year ahead, consisted of questions with shorter time horizons that were of immediate importance to decision-makers.

Superforecaster and Show Producer JuliAnn Blam says she is particularly interested in forecasting questions that focus on economic issues, and the “World in 2021” project “didn’t disappoint.”

“The questions tended to be more pertinent to everyday life and issues that were of practical interest to me,” JuliAnn explains.

The “World in 2021” project included forecasting the world’s GDP growth, ESG (Environmental, Social, and Governance) investment, and work-from-home dynamics. But one of JuliAnn’s favorite questions was about the racial diversity of board members in S&P 500 companies.

A screenshot of Good Judgment’s forecast monitor, FutureFirst, featuring the racial diversity forecast for The Economist’s “World in 2021” project.

“That one was hopeful, ‘woke’, and had me looking more closely at what a diversified board of directors can bring to a company’s outlook, marketing, product line, treatment of employees, etc.,” JuliAnn says. “It was a sort of stepping stone to looking into a lot more than just how many companies will appoint board members of color within the next year, and pushed the argument of why they should and what they would gain by doing so.”

Despite having a shorter time span than the “What If” forecasts, the “World in 2021” questions also required taking into account numerous factors, some of which weren’t even on the horizon when the questions were launched in October 2020. Take, for instance, the global GDP question.

“There are so many factors to consider, between Xi and Evergrande and the resultant fallout of the cascade from that default, to new COVID variants stopping workforces, anti-vax movements, the infrastructure bill and the green new deal, and then inflation,” JuliAnn says. “Tons to balance and think about!”

Whether it’s a forecast of global GDP next year or the possibility of using the Moon as a base for space exploration decades from now, the Superforecasters always apply their rigorous process and tested skills to provide thoughtful numeric forecasts on questions that matter. As for their reward, Karen puts it best: “The enjoyment from forecasting is honing and improving forecasting skill, acquiring new information, and interacting with intellectuals of the same knowledge base.”

You can find Good Judgment’s Superforecasts on the “What If” questions in The Economist’s print edition of 3 July 2021 or on its website, or ask us about a subscription to FutureFirst, Good Judgment’s forecast monitor, to view all our current forecasts from our team of professional Superforecasters.

Handicapping the odds: What gamblers can learn from Superforecasters

Successful gamblers, like good forecasters, need to be able to translate hunches into numeric probabilities. For most people, however, this skill is not innate. It requires cultivation and practice.

In Superforecasting: The Art and Science of Prediction, a best-selling book co-authored with Dan Gardner, Philip Tetlock writes: “Nuance matters. The more degrees of uncertainty you can distinguish, the better a forecaster you are likely to be. As in poker, you have an advantage if you are better than your competitors at separating 60/40 bets from 40/60—or 55/45 from 45/55.”
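To see why such fine distinctions matter, consider a quick illustrative calculation, a hypothetical even-money bet rather than an example from the book:

```python
# Expected value of a 1-unit even-money bet (win +1 unit, lose -1 unit)
# at different win probabilities. Purely illustrative numbers.
for p_win in (0.60, 0.55, 0.45, 0.40):
    ev = p_win * 1 + (1 - p_win) * (-1)
    print(f"win probability {p_win:.0%}: EV = {ev:+.2f} units per bet")
```

Mistaking a 45% proposition for a 55% one does not merely shrink an edge; it turns a bet that wins 0.10 units on average into one that loses 0.10 units.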

Good Judgment’s professional Superforecasters excel at this, but thinking in probabilities doesn’t come naturally to the majority of human beings. Daniel Kahneman and Amos Tversky, who studied decision-making under risk, found that most people tend to overweight low probabilities (e.g., the odds of winning a lottery) and underweight outcomes that are probable but not certain. In other words, people on average evaluate probabilities incorrectly even when making critical decisions.
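One common way to model this distortion is the probability weighting function from Tversky and Kahneman’s 1992 cumulative prospect theory paper; the sketch below uses their functional form with the gain-domain parameter they estimated (gamma ≈ 0.61):

```python
def weight(p: float, gamma: float = 0.61) -> float:
    """Probability weighting function from Tversky & Kahneman (1992), gains domain."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

# Small probabilities are overweighted; moderate-to-large ones are underweighted.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"true probability {p:.0%} is weighted like {weight(p):.0%}")
```

Under these assumptions, a 1% chance is weighted as if it were roughly 6%, while a 50% chance is weighted closer to 42%.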

Base rate neglect often leads to poor decisions in forecasting, finance, or gambling.

If you’ve participated in any Good Judgment training, you’ll know that the first step in estimating correct probabilities is to identify the base rate—the underlying likelihood of an event. This is also the step that the majority of decision makers tend to ignore. Base rate neglect is one of the most common cognitive biases we see in training programs and workshops, and it generally leads to poor investing, betting, and forecasting outcomes.

For those new to the concept, consider this classic example: “Steve is very shy and withdrawn, invariably helpful, but with little interest in people or the social world. A meek and tidy soul, he has a need for order and structure and a passion for detail.”

Is Steve more likely to be a librarian or a farmer? A librarian or a salesman?

While the description, offered in Daniel Kahneman’s Thinking, Fast and Slow, may be that of a stereotypical librarian, Steve is in fact 20 times more likely to be a farmer—and 83 times more likely to be a salesman—than a librarian. There are simply far more farmers and salespeople in the United States than male librarians.
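The arithmetic behind that result is Bayes’ rule in odds form: posterior odds equal prior odds (the base rate) times the likelihood ratio. Here is a minimal sketch using the roughly 20:1 farmer-to-librarian ratio Kahneman cites and a hypothetical likelihood ratio for how well the description fits:

```python
# Posterior odds = prior odds (the base rate) x likelihood ratio.
# Base rate: roughly 20 male farmers per male librarian in the US,
# the ratio Kahneman cites; the likelihood ratio below is hypothetical.
prior_odds = 20.0         # farmer : librarian, before reading the description
likelihood_ratio = 1 / 4  # suppose the description fits a librarian 4x better

posterior_odds = prior_odds * likelihood_ratio
print(f"posterior odds (farmer : librarian) = {posterior_odds:.0f} : 1")
# Even with a description that favors "librarian" 4-to-1,
# the base rate still makes Steve 5x more likely to be a farmer.
```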

Base rate neglect is the mind’s irrational tendency to disregard the underlying odds. Failure to account for the base rate could lead, for example, to the belief that participating in commercial forms of gambling is a good way of making money. Likewise, failure to factor in the house edge could lead to poor betting decisions.

Fortunately, the mind’s tendency to overlook the base rate can be corrected with training and practice.

Recognition of bias and noise, and techniques to mitigate their detrimental effects, should be at the heart of any training on better decision making. In Good Judgment workshops, we have consistently observed tangible improvements in the quality of forecasting as a result of debiasing interventions.

The other essential component is practice. On Good Judgment Inc’s public platform, GJ Open, anyone can try their hand at forecasting—from predicting the next NBA champion to estimating the future price of bitcoin. Unsurprisingly, forecasters who use base rates and forecast on the platform regularly tend to have better results.

To stay on top, gamblers, like successful forecasters and professional Superforecasters, need to actively seek out the base rate and mitigate other cognitive biases that interfere with their judgment. While “Thinking in Bets,” as professional gambler Annie Duke puts it in her best-seller, does not come easily to most people, better decision-making—in forecasting, investing, and gambling alike—is a skill that can be learned. With an awareness of cognitive biases, debiasing techniques, and regular practice, anyone can acquire the mental discipline to handicap the odds more effectively.

* This article originally appeared in Luckbox Magazine and is shared with their permission.