Superforecasters: Still Crème de la Crème Six Years On

The multi-year geopolitical forecasting tournament sponsored by the research arm of the US Intelligence Community (IARPA) that led to the groundbreaking discovery of “Superforecasters” ended in 2015. Since then, public and private forecasting platforms and wisdom-of-the-crowd techniques have only proliferated. Six years on, are Good Judgment’s Superforecasters still more accurate than a group of regular forecasters? What, if anything, sets their forecasts apart from the forecasts of a large crowd?

From the paper: A bar graph of daily error scores showing the Superforecasters’ accuracy outstrips wisdom-of-the-crowd scores.

A new white paper by Dr. Chris Karvetski, senior data and decision scientist with Good Judgment Inc (GJ Inc), compares six years’ worth of forecasts on the GJ Inc Superforecaster platform and the GJ Open public forecasting platform to answer these questions.

Key takeaway: Superforecasters, though a comparatively small group, are significantly more accurate than their GJ Open forecasting peers. The analysis shows that their forecasts made 300 days before a question resolves are more accurate than their peers’ forecasts made just 30 days before resolution.

Who are “Superforecasters”?

During the IARPA tournament, Superforecasters routinely ranked in the top 2% for accuracy among their peers and were a winning component of the Good Judgment Project, one of five research teams that competed in the initial tournaments. Notably, these elite forecasters were over 30% more accurate than US intelligence analysts forecasting the same events with access to classified information.

Key Findings

From the paper: A calibration plot showing that regular forecasters tend toward overconfidence, whereas the Superforecasters are 79% closer to perfect calibration.

Dr. Karvetski’s analysis, presented in “Superforecasters: A Decade of Stochastic Dominance,” draws on six years of forecasting data (2015-2021) from 108 geopolitical questions posted simultaneously on Good Judgment Inc’s Superforecaster platform (available to FutureFirst™ clients) and on Good Judgment Open (GJ Open), a public platform where anyone can sign up, make forecasts, and track their accuracy over time and against their peers.

The data showed:

  • Despite their relatively small number, the Superforecasters are far more prolific, making almost four times as many forecasts per question as GJ Open forecasters.
  • They are also much more likely to update their beliefs via small, incremental changes to their forecasts.
  • Based on daily average error scores, the Superforecasters are 35.9% more accurate than their GJ Open counterparts. (A minimal sketch of how such error and calibration scores can be computed follows this list.)
  • Aggregation improves GJ Open forecasts notably more than it improves Superforecaster forecasts; even so, the aggregate Superforecaster forecasts are, on average, 25.1% more accurate than aggregates built from GJ Open forecasts.
  • The average error score of GJ Open forecasters at 30 days from resolution is larger than the Superforecasters’ average error score on any day up to 300 days prior to resolution.
  • GJ Open forecasters, in general, were overconfident in their forecasts. The Superforecasters, in contrast, are 79% better calibrated. “This implies a forecast from Superforecasters can be taken at its probabilistic face value,” Dr. Karvetski explains.
  • Finally, the amount of between-forecaster noise is minimal, implying the Superforecasters are better at translating a variety of different signals into a numeric estimate of chance.
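
To make these metrics concrete, below is a minimal sketch, in Python, of how Brier-style error scores, a simple median crowd aggregate, and a rough calibration check can be computed from a table of forecasts. This is not Dr. Karvetski’s exact methodology, and every record and field name in it is hypothetical.

```python
# Minimal sketch only: hypothetical data, not the paper's actual methodology.
from collections import defaultdict
from statistics import median

# Hypothetical records: (question_id, days_before_resolution, probability, outcome)
forecasts = [
    ("q1", 300, 0.70, 1),
    ("q1", 150, 0.80, 1),
    ("q1", 30, 0.90, 1),
    ("q2", 200, 0.40, 0),
    ("q2", 30, 0.10, 0),
]

def brier(p, outcome):
    """Binary Brier score: the squared gap between a stated probability
    and the realized outcome (0 = perfect, 1 = maximally wrong)."""
    return (p - outcome) ** 2

# Average error score at each horizon (days before resolution).
by_day = defaultdict(list)
for _, days, p, outcome in forecasts:
    by_day[days].append(brier(p, outcome))
daily_error = {days: sum(s) / len(s) for days, s in by_day.items()}

# Simple wisdom-of-the-crowd aggregate: the median forecast per question.
by_question = defaultdict(list)
for q, _, p, _ in forecasts:
    by_question[q].append(p)
crowd = {q: median(ps) for q, ps in by_question.items()}

# Rough calibration check: within each stated-probability bin, compare
# the bin's probability with the realized frequency of the event.
bins = defaultdict(list)
for _, _, p, outcome in forecasts:
    bins[round(p, 1)].append(outcome)
calibration = {b: sum(o) / len(o) for b, o in bins.items()}

print(daily_error)   # horizon in days -> mean Brier score
print(crowd)         # question -> median crowd forecast
print(calibration)   # probability bin -> realized frequency
```

Aggregation tends to improve accuracy because forecasters’ independent errors partly cancel, which is why the paper compares aggregate as well as individual performance.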

You can read the full paper here.

Where Can I Learn More About Superforecasting?

A subscription to FutureFirst, Good Judgment’s exclusive monitoring tool, gives clients 24/7 access to Superforecasters’ forecasts, helping companies and organizations quantify risk, improve judgment, and make better decisions about future events.

Our Superforecasting workshops incorporate Good Judgment research findings and practical Superforecaster know-how. Learn more about our private workshops, tailored to the needs of your organization, or the public workshops we offer.

The journey to becoming a Superforecaster begins at GJ Open. Learn more about how to become a Superforecaster.

The Future of Health and Beyond: The Economist features Good Judgment’s Superforecasts

This summer, Good Judgment Inc collaborated with The Economist for the newspaper’s annual collection of speculative scenarios, “What If.” The theme this year was the future of health. In preparing the issue, The Economist asked the Superforecasters to work on several hypothetical scenarios—from America’s opioid crisis to the possibility of the Nobel prize for medicine being awarded to an AI. “Each of these stories is fiction,” the editors wrote in the 3 July edition, “but grounded in historical fact, current speculation, and real science.”

This was unlike most of the work that Good Judgment Inc does for clients. Our Superforecasters typically forecast concrete outcomes on a relatively short time horizon to inform decision- and policymakers about the key issues that matter to them today. The Economist’s “What If” project instead focused on a more speculative, distant future. To address the newspaper’s imaginative scenarios without sacrificing the rigor that Good Judgment’s Superforecasters and clients have become accustomed to, our question generation team crafted a set of relevant, forecastable questions to pair with each topic.

As a result, The Economist’s “What if America tackled its opioid crisis? An imagined scenario from 2025” was paired with our Superforecast: “How many opioid overdoses resulting in death will occur in the US in 2026?”

“What if biohackers injected themselves with mRNA? An imagined scenario from 2029” was paired with: “How many RNA vaccines and therapeutics for humans will be FDA-approved as of 2031?”

And “What if marmosets lived on the Moon? An imagined scenario from 2055” was paired with: “When will the first human have lived for 180 days on or under the surface of the Moon?”

Karen Hagar, a Superforecaster, social scientist, and archaeologist from Tempe, Arizona, participated in forecasting these “far into the future” questions because, she says, she likes challenges.

“These questions were different than standard forecasting questions which typically resolve a year into the future,” she explains. “Both types of questions have inherent challenges. The questions with shorter resolution require extreme accuracy. One must research and mentally aggregate all incoming information. This includes any possible Black Swan events, current geopolitical and any social developments that may change within the short time frame. The dynamics of predicting outcomes of questions 10-20 years into the future required the same skill, but possibly even more research.”

The most exciting aspects of the “What If” project for Karen included learning the degree to which science has advanced. “For example, uncovering the scientific data regarding CRISPR technology and its application to Alzheimer’s research was amazing,” she says.

In making her forecasts for The Economist, she studied the questions from all angles and played devil’s advocate to challenge her colleagues’ thinking. This technique of red-teaming is frequently used by professional Superforecasters to confront groupthink and elicit more accurate predictions.

“What If” is only one of Good Judgment’s several collaborative projects with The Economist. The newspaper’s “World in 2021” project, which has run annually since “The World in 2017” and forecasts key metrics for the year ahead, consisted of questions with shorter time horizons that were of immediate importance to decision-makers.

JuliAnn Blam, a Superforecaster and show producer, says she is particularly interested in forecasting questions that focus on economic issues, and the “World in 2021” project “didn’t disappoint.”

“The questions tended to be more pertinent to everyday life and issues that were of practical interest to me,” JuliAnn explains.

The “World in 2021” project included forecasting the world’s GDP growth, ESG (Environmental, Social, and Governance) investment, and work-from-home dynamics. But one of JuliAnn’s favorite questions was about the racial diversity of board members in S&P 500 companies.

A screenshot of Good Judgment’s forecast monitor, FutureFirst, featuring the racial diversity forecast for The Economist’s “World in 2021” project.

“That one was hopeful, ‘woke’, and had me looking more closely at what a diversified board of directors can bring to a company’s outlook, marketing, product line, treatment of employees, etc.,” JuliAnn says. “It was a sort of stepping stone to looking into a lot more than just how many companies will appoint board members of color within the next year, and pushed the argument of why they should and what they would gain by doing so.”

Despite having a shorter time span than the “What If” forecasts, the “World in 2021” questions also required taking into account numerous factors, some of which weren’t even on the horizon when the questions were launched in October 2020. Take, for instance, the global GDP question.

“There are so many factors to consider, between Xi and Evergrande and the resultant fallout of the cascade from that default, to new COVID variants stopping workforces, anti-vax movements, the infrastructure bill and the green new deal, and then inflation,” JuliAnn says. “Tons to balance and think about!”

Whether it’s a forecast of global GDP next year or the possibility of using the Moon as a base for space exploration in the decades ahead, the Superforecasters always apply their rigorous process and tested skills to provide thoughtful numeric forecasts on questions that matter. As for their reward, Karen puts it best: “The enjoyment from forecasting is honing and improving forecasting skill, acquiring new information, and interacting with intellectuals of the same knowledge base.”

You can find Good Judgment’s Superforecasts on the “What If” questions in The Economist’s print edition from 3 July 2021 or on their website, or ask us about a subscription to FutureFirst, Good Judgment’s forecast monitor, to view all our current forecasts from our team of professional Superforecasters.

Books on Making Better Decisions: Good Judgment’s Back-to-School Edition

Since the publication of Tetlock and Gardner’s seminal Superforecasting: The Art and Science of Prediction, many books and articles have been written about the groundbreaking findings of the Good Judgment Project, its corporate successor Good Judgment Inc, and the Superforecasters.

This is not surprising: Decision-makers have a lot to learn from the Superforecasters. Thanks to being actively open-minded and unafraid to rethink their conclusions, the Superforecasters have been able to make accurate predictions where experts often failed. They know how to think in probabilities (or “in bets”), reduce the noise in their judgments, and mitigate cognitive biases such as overconfidence. As Tetlock and Good Judgment Inc have shown, these are skills that can be learned.

Here is a short list of eight notable books that present a wealth of information on ways to evaluate an uncertain future and improve decision-making.

In 2011, IARPA, the research arm of the US intelligence community, launched a massive competition to identify cutting-edge methods for forecasting geopolitical events. Four years, 500 questions, and over a million forecasts later, the Good Judgment Project (GJP), led by Philip Tetlock and Barbara Mellers at the University of Pennsylvania, emerged as the undisputed victor of the tournament. GJP’s forecasts were so accurate that they even outperformed those of intelligence analysts with access to classified data. One of the biggest discoveries of the GJP was the Superforecasters: GJP research found compelling evidence that some people are exceptionally skilled at assigning realistic probabilities to possible outcomes, even on topics outside their primary subject-matter training.

In their New York Times bestseller, Superforecasting, our cofounder Philip Tetlock and his colleague Dan Gardner profile several of these talented forecasters, describing the attributes they share, including open-minded thinking, and argue that forecasting is a skill to be cultivated, rather than an inborn aptitude.

Noise, defined as unwanted variability in judgments, can be corrosive to decision-making. Yet, unlike its better-known companion, bias, it often remains undetected, and therefore unmitigated, in decision processes. In addition to research-based insights into better decision-making and remedies for identifying and reducing noise as a source of error, Kahneman and his colleagues take a close look at a select group of forecasters, the Superforecasters, whose judgments are not only less biased but also less noisy than those of most decision-makers. As Noise co-author Cass Sunstein says, “Superforecasters are less noisy—they don’t show the variability that the rest of us show. They’re very smart; but also, very importantly, they don’t think in terms of ‘yes’ or ‘no’ but in terms of probability.”
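
The error accounting the book builds on can be stated precisely for squared-error loss: overall error decomposes into a bias term and a noise term (MSE = bias² + noise²). Here is a minimal sketch of that standard decomposition, using entirely hypothetical judgment data:

```python
# Standard squared-error decomposition discussed in Noise: MSE = bias^2 + noise^2,
# where bias is the average deviation of judgments from the truth and noise is
# the standard deviation of the judgments around their own mean.
from statistics import mean, pstdev

true_value = 100.0
judgments = [92.0, 97.0, 104.0, 88.0, 109.0]  # hypothetical panel of judges

mse = mean((x - true_value) ** 2 for x in judgments)
bias = mean(judgments) - true_value
noise = pstdev(judgments)  # population standard deviation

# The identity holds exactly when the population variance is used.
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-9
print(f"MSE = {mse:.2f}, bias^2 = {bias ** 2:.2f}, noise^2 = {noise ** 2:.2f}")
```

Because the noise term enters the equation squared, just like bias, judges who are individually unbiased can still produce large overall error if their judgments scatter widely; that scatter is exactly what the Superforecasters keep small.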

Intelligence is often seen as the ability to think and learn, but in a rapidly changing world, there’s another set of cognitive skills that might matter more: the ability to rethink and unlearn. In Think Again, organizational psychologist Adam Grant investigates how we can embrace the joy of being wrong, bring nuance to charged conversations, and build schools, workplaces, and communities of lifelong learners. He also profiles Good Judgment Inc’s Superforecasters Kjirste Morrell and Jean-Pierre Beugoms, who embody the outstanding thought processes the book recommends. You can read more about Morrell and Beugoms in our interviews here.

In Range, David Epstein examines the world’s most successful athletes, artists, musicians, inventors, and forecasters to show that in most fields, especially those that are complex and unpredictable, generalists, not specialists, are primed to excel. In a chapter on the failure of expert predictions, he discusses Phil Tetlock’s research, the GJP, and how “a small group of foxiest forecasters—just bright people with wide-ranging interests and reading habits—destroyed the competition” in the IARPA tournament. Good Judgment Inc’s Superforecasters Scott Eastman and Ellen Cousins, profiled in the book, weigh in on topics such as curiosity, aggregating perspectives, and learning from specialists without being swayed by their often narrow worldviews.

Other books that mention Superforecasting, Good Judgment Inc, or the Good Judgment Project