In November 2013, the Washington Post’s David Ignatius reported that the “top forecasters [in the Good Judgment Project], drawn from universities and elsewhere, performed about 30 percent better than the average for intelligence community analysts who could read intercepts and other secret data.”
This tantalizing tidbit – while never denied outright by the US government – has never been confirmed, either … until now.
A new article in the journal Judgment and Decision Making cites an unpublished paper written by US government analysts with access to the classified data needed to make this comparison. The new evidence is conclusive:
Indeed, using Brier scores to measure accuracy, Goldstein, Hartman, Comstock and Baumgarten (2016) found that superforecasters outperformed U.S. intelligence analysts on the same questions by roughly 30%.
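For readers unfamiliar with Brier scores, here is a minimal sketch of how such a comparison might be computed. It uses the simple binary form of the score (the mean squared gap between a forecast probability and the 0/1 outcome, where lower is better) and assumes that "30% better" means a Brier score about 30% lower than the baseline. The forecasts and outcomes below are invented purely for illustration; they are not data from the Good Judgment Project or from the cited paper, and the function names are our own.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Binary form of the Brier score: 0.0 is perfect, 1.0 is the worst possible.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def relative_improvement(score, baseline):
    """Fraction by which `score` beats `baseline` (positive means lower, i.e. better)."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (baseline - score) / baseline


# Hypothetical data: five yes/no questions and how each one resolved (1 = yes).
outcomes = [1, 0, 1, 1, 0]
group_a = [0.9, 0.1, 0.8, 0.7, 0.2]  # illustrative forecasts from one group
group_b = [0.7, 0.3, 0.6, 0.6, 0.4]  # illustrative forecasts from a comparison group

score_a = brier_score(group_a, outcomes)
score_b = brier_score(group_b, outcomes)
print(f"Group A Brier score: {score_a:.3f}")
print(f"Group B Brier score: {score_b:.3f}")
print(f"Group A improvement over B: {relative_improvement(score_a, score_b):.0%}")
```

Because both groups are scored on the same questions, the comparison isolates forecasting skill rather than differences in question difficulty.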
The newly published research, authored by a team that includes Good Judgment Chief Scientist Eva Chen, takes a deeper dive into what makes Superforecasters so “super.” The authors examine four different standards that scientists use to evaluate the quality of judgment. As they state in the article’s abstract:
Good judgment is often gauged against two gold standards – coherence and correspondence. Judgments are coherent if they demonstrate consistency with the axioms of probability theory or propositional logic. Judgments are correspondent if they agree with ground truth. When gold standards are unavailable, silver standards such as consistency and discrimination can be used to evaluate judgment quality. Individuals are consistent if they assign similar judgments to comparable stimuli, and they discriminate if they assign different judgments to dissimilar stimuli. We ask whether “superforecasters”, individuals with noteworthy correspondence skills (see Mellers et al., 2014) show superior performance on laboratory tasks assessing other standards of good judgment. Results showed that superforecasters either tied or out-performed less correspondent forecasters and undergraduates with no forecasting experience on tests of consistency, discrimination, and coherence. While multifaceted, good judgment may be a more unified concept than previously thought.
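To make "coherence" concrete, the sketch below checks whether a set of probability judgments about two events obeys a few basic rules of probability theory. It is an illustrative assumption on our part, not one of the laboratory tasks used in the article, and the function name and example numbers are hypothetical.

```python
def is_coherent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Check judgments about events A and B against basic probability rules.

    Returns True only if every probability lies in [0, 1] and the judged
    probability of "A and B" is compatible with the judged P(A) and P(B).
    """
    in_range = all(0.0 <= p <= 1.0 for p in (p_a, p_b, p_a_and_b))
    # Conjunction rule: "A and B" can never be more likely than A or B alone.
    conjunction_ok = p_a_and_b <= min(p_a, p_b) + tol
    # Lower bound: P(A and B) >= P(A) + P(B) - 1.
    lower_bound_ok = p_a_and_b >= p_a + p_b - 1.0 - tol
    return in_range and conjunction_ok and lower_bound_ok


# Hypothetical judgments, chosen only to illustrate the checks.
print(is_coherent(0.6, 0.5, 0.3))  # consistent with the rules above
print(is_coherent(0.2, 0.5, 0.4))  # conjunction judged more likely than A alone
```

A coherent set of judgments can still be wrong about the world, which is why the abstract treats coherence and correspondence as separate standards.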
Good Judgment co-founder Phil Tetlock tweeted this punchline to describe the article’s conclusion:
Good Judgment Inc is proud to bring the scientifically validated superior judgment and forecasting skills of its professional Superforecasters – drawn from the top forecasters in the Good Judgment Project – to clients in government, industry, and non-profits. How can these “unusually thoughtful humans” help your organization solve problems? Contact email@example.com to learn more.