Dr. Philip Tetlock’s Forecasting Research
Dr. Philip Tetlock is a Professor of Political Science at the University of Pennsylvania who is seeking support for research on forecasting global catastrophic risks.
What problem are they trying to solve?
Professor Tetlock’s research could be very valuable for understanding, forecasting and mitigating global catastrophic risks, such as those caused by natural or engineered pathogens, artificial intelligence, nuclear weapons or extreme climate change.
Accurately forecasting future events is extremely challenging, but Professor Tetlock’s research has developed methods that improve forecasting accuracy, as displayed in Figure 1.
What do they do?
Over the last 35 years, Professor Tetlock has pioneered the science of forecasting: how to make predictions about future events more accurate and useful. The economist Tyler Cowen has described him as “one of the greatest social scientists in the world”2 and his papers have garnered almost 50,000 citations.
Professor Tetlock’s research is particularly relevant from a long-term perspective because it has the potential to improve our ability to predict global catastrophic risks. To prioritise efforts to reduce such risks, we must predict their relative probabilities in advance. Of course, this is extremely difficult. Many of the most concerning risks humanity faces in the coming centuries, such as engineered pandemics or superintelligent computer systems, are unprecedented. Professor Tetlock’s forecasting methods could help us make rigorous, comparable and calibrated assessments of the likelihood of these risks and the effectiveness of efforts to reduce them.
Professor Tetlock has an ambitious research agenda related to forecasting global catastrophic risks. He is also an active public figure, building support for forecasting by writing books, going on podcasts and participating in interviews. With additional funding, we expect Professor Tetlock and his collaborator, Dr. Pavel Atanasov, to focus more on global risk forecasting and increase their output in this area.
Why do we recommend them?
- Open Philanthropy, our research partner, recommends Professor Tetlock’s research as one of the world’s highest-impact funding opportunities for mitigating global catastrophic risks.
- Professor Tetlock has a strong track record of conducting innovative, actionable research.
- With additional funding, Professor Tetlock and Dr. Atanasov could produce high-value research directly related to forecasting global catastrophic risks.
Here we briefly review the history of Professor Tetlock’s forecasting research and achievements.
Pioneering forecasting work
Between 1984 and 2003, Professor Tetlock ran a number of forecasting tournaments in which predictions about future events were solicited from hundreds of experts. He then analysed which traits best predicted the accuracy of individual forecasters. The findings, published in the 2005 book Expert Political Judgment: How Good Is It? How Can We Know?, famously suggested that the predictions of experts fared especially poorly. Experts were generally outperformed by simple extrapolation algorithms and generalists who relied on a range of different sources of information.
IARPA collaboration and the Good Judgment Project
After publishing Expert Political Judgment, Professor Tetlock entered into a collaboration with IARPA to further test different forecasting strategies. Between 2011 and 2015, IARPA sponsored a large forecasting tournament involving thousands of different forecasters and over a million forecasts. Professor Tetlock worked with a team of forecasters who had performed well in previous forecasting tournaments. He calls these individuals, who were consistently able to finish in the top 10 percent of the tournaments they entered, “superforecasters.” The superforecasting team, which Professor Tetlock called the Good Judgment Project, beat teams of other experts and intelligence professionals to win the IARPA tournament. Professor Tetlock analysed the significance of this result in the 2015 book Superforecasting.
These past successes have led to the development of multiple forecasting platforms. Professor Tetlock co-founded Good Judgment, a consultancy that offers bespoke forecasting and workshops to private clients. Good Judgment also runs Good Judgment Open, an open platform for crowd-based forecasts. Metaculus and Hypermind are similar platforms inspired by Professor Tetlock’s forecasting research.
As an example of the value of forecasting, Open Philanthropy contracted Good Judgment to make public forecasts related to the development of the COVID-19 pandemic.
Why do we trust this organisation?
For this recommendation, we are grateful to be able to utilise the in-depth expertise of, and background research conducted by, current and former staff at Open Philanthropy, the world’s largest grant-maker on global catastrophic risk. Open Philanthropy identifies high-impact giving opportunities, makes grants, follows the results and publishes its findings. (Disclosure: Open Philanthropy has made several unrelated grants to Founders Pledge.)
As indicated above, Professor Tetlock has demonstrated a strong track record of success, not only as a researcher, but as a science communicator and project manager too.
Dr. Philip Tetlock is the Penn-Integrates-Knowledge Annenberg Professor at the University of Pennsylvania, with cross-appointments at Wharton and the School of Arts and Sciences. He has spent the last three decades researching the accuracy of forecasts across a wide range of geopolitical, economic and military outcomes.
Dr. Pavel Atanasov has previously worked closely with Professor Tetlock on research projects involving a variety of forecasting methods, including aggregation algorithms, prediction markets and identifying accurate forecasters. Professor Tetlock considers Dr. Atanasov to be crucial to the success of the project.
What would they do with more funding?
We expect that additional funding at this time would help Professor Tetlock and Dr. Atanasov extend their forecasting methods to global catastrophic risks. They are seeking support for work on methodological questions, with an eye towards hosting a forecasting tournament focused on global catastrophic risks in summer 2021. They call this “second generation forecasting”: forecasting that predicts events over longer timescales and in the face of deep uncertainty. To date, most forecasting research has focused on questions that resolve within six months. Some questions from Professor Tetlock’s early tournaments had timescales of 25 years or more, but had not been resolved by the time Expert Political Judgment was published.3 To apply forecasting to global catastrophic risks, more research on the feasibility of forecasting over longer timescales is needed.
To extend forecasting methods to questions of global catastrophic risks, Professor Tetlock contends that more work is required in two main areas. The first is the development of “Bayesian question clusters.” It is especially difficult to forecast probabilities of global catastrophic risks directly. However, it may be possible to develop sets of nearer, more precise indicators that have a causal connection to the global catastrophic risk in question. Once the Bayesian question cluster is defined, forecasting the sub-questions can provide information on the risk.
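As an illustrative sketch only (not Professor Tetlock’s actual method), the logic of a question cluster can be expressed as Bayesian updating: each resolved indicator question shifts the odds on the underlying risk by a likelihood ratio. All numbers and the scenario below are invented for illustration, and the sketch assumes the indicators are conditionally independent given the hypothesis.

```python
import math

def update_log_odds(prior_prob, indicators):
    """Combine a prior probability with likelihood ratios from resolved
    indicator questions in a Bayesian question cluster.

    Each indicator is (resolved_yes, lr_if_yes, lr_if_no): hypothetical
    likelihood ratios saying how much more probable that resolution is
    if the risk will materialise than if it will not.
    """
    # Work in log-odds so each indicator contributes additively.
    log_odds = math.log(prior_prob / (1 - prior_prob))
    for resolved_yes, lr_if_yes, lr_if_no in indicators:
        lr = lr_if_yes if resolved_yes else lr_if_no
        log_odds += math.log(lr)
    odds = math.exp(log_odds)
    return odds / (1 + odds)

# Invented cluster for a hypothetical risk: two early-warning questions
# resolved "yes", one resolved "no".
posterior = update_log_odds(
    prior_prob=0.01,
    indicators=[(True, 3.0, 0.8), (True, 2.0, 0.9), (False, 4.0, 0.5)],
)
print(round(posterior, 4))  # → 0.0294
```

The point of the sketch is that short-term, resolvable sub-questions can move a long-range estimate without the long-range question ever resolving, which is exactly the role early-warning indicators are meant to play.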
The second area of interest concerns how to better and more quickly combine the views of forecasters to improve group forecasts. Professor Tetlock refers to this as “Making Conversations Smarter, Faster”. Methods to test here may include things like getting participants and judges to rank the quality of teammates’ contributions using formal and informal criteria, heuristics and examples; testing mechanisms for accountability and feedback; or identifying “supercommenters” who are able to comment on and improve other forecasters’ decision-making processes.
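One aggregation technique studied in this research tradition is to average individual forecasts in log-odds space and then “extremize” the result, on the reasoning that each forecaster holds only part of the available evidence, so a simple average is too timid. The sketch below is a minimal illustration, not the project’s actual algorithm; the extremizing exponent is an invented placeholder that would in practice be tuned on historical data.

```python
import math

def extremized_aggregate(probs, a=2.5):
    """Average probability forecasts in log-odds space, then extremize.

    Averaging pulls the group forecast toward 0.5; raising the aggregate
    odds to a power a > 1 pushes it back toward the extremes. a=2.5 is
    purely illustrative.
    """
    mean_log_odds = sum(math.log(p / (1 - p)) for p in probs) / len(probs)
    odds = math.exp(a * mean_log_odds)
    return odds / (1 + odds)

# Three forecasters who independently lean the same way yield a group
# forecast more confident than any individual forecast.
print(round(extremized_aggregate([0.70, 0.65, 0.75]), 3))
```

A maximally uncertain crowd (everyone at 0.5) is left at 0.5, since its mean log-odds is zero and extremizing has nothing to amplify.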
Professor Tetlock and Dr. Atanasov have identified five key methodological questions that must be resolved before a global catastrophic risk forecasting tournament can be held:
- Developing “Bayesian question clusters” composed of short-term resolvable questions that can serve as early warning indicators for much longer-range existential risks;
- Strengthening methods to incentivise forecasters to report their true probabilistic beliefs (and minimise distortions linked to forecasters’ values and relative aversions to false-positive and false-negative errors);
- Devising ways to forecast which forecasters are more likely to be accurate longer-term, including question clusters, coherence metrics for checking logical consistency with axioms of probability theory (e.g., scope sensitivity) and propensities to make extreme historical-counterfactual judgments;
- Supplementing “gold-standard objective reality” correspondence metrics with silver-standard metrics for judging “intersubjective reality.” These would assess forecasters’ skill at anticipating shifts in mass-public and experts’ views on global catastrophic risks;
- Adapting methods for assessing the accuracy of causal/counterfactual judgments (developed in ongoing IARPA tournaments) for assessing the likely impact of increasingly speculative technologies for improving life on our planet and eventually elsewhere in the universe.
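The incentive problem in the second item above is usually attacked with proper scoring rules, such as the Brier score used throughout Professor Tetlock’s tournaments, under which a forecaster minimises their expected penalty only by reporting their true belief. A minimal demonstration with illustrative numbers:

```python
def brier_score(forecast, outcome):
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (forecast - outcome) ** 2

def expected_brier(reported, true_belief):
    """Expected penalty if the event truly occurs with probability true_belief."""
    return (true_belief * brier_score(reported, 1)
            + (1 - true_belief) * brier_score(reported, 0))

# A forecaster who believes p = 0.7 does worse, in expectation, by
# shading their report in either direction:
scores = {r: expected_brier(r, 0.7) for r in (0.5, 0.7, 0.9)}
best = min(scores, key=scores.get)
print(best)  # → 0.7: reporting the true belief minimises expected penalty
```

Because the rule is strictly proper, this holds for any true belief, which is what removes the incentive to distort reports toward a forecaster’s preferred error trade-off.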
Since July 2017, the Open Philanthropy Project has granted roughly $1.5 million to Professor Tetlock and his collaborators “to support the initial development and pre-testing of [Making Conversations Smarter, Faster], laying the groundwork for future confirmatory studies.”4 If he does not receive funding from Founders Pledge members at this time, Dr. Atanasov will focus on topics other than global catastrophic risk forecasting in the coming year.
What are the major open questions?
One concern we have about this research is that questions related to forecasting global catastrophic risks do not seem well-suited to forecasting tournaments like those Professor Tetlock has previously carried out. For example, the questions used in the Expert Political Judgment tournaments were chosen for clear resolution criteria, had mutually exclusive answers and were amenable to base-rate forecasting.5 Some questions relevant to global catastrophic risk work will very likely lack some or all of these characteristics. For this reason, as well as the inherent difficulty of making tangible progress on reducing global catastrophic risks, we see this project as part of our portfolio of high-risk, high-reward research. Successfully inventing new ways to forecast global catastrophic risks would be a hugely valuable achievement; however, the most likely result of this project is that it will make marginal progress, if any, towards this goal.
Another concern we have is that Professor Tetlock is involved with other projects until at least June 30, 2021. Additional funding at this time would support the work of Dr. Pavel Atanasov, who would launch the foundational methodological research previously described. Although Dr. Atanasov’s track record is less established than Professor Tetlock’s, Dr. Atanasov has collaborated closely with Professor Tetlock previously and has his full support.
Message from the organisation
We see value in an overview article that explores, in depth, the methodological challenges of designing forecasting tournaments aimed at existential risks. We would start by exploring the root causes of the failure of the Taleb and Tetlock (2013) collaboration and by documenting, in didactic detail, the central role that implicit probability judgments inevitably play in deciding how to prioritize expensive precautionary policy investments. We would also make the case that although probability estimation of tail-risk contingencies is extremely problematic, it is not inherently impossible, and the expected value of even tiny increments in accuracy is high.
The core of the article would, however, lay out the specific methodological steps that we need to take to render a seemingly intractable challenge semi-tractable. We currently see six such steps: (a) developing Bayesian question clusters of short-term resolvable questions, each with non-zero, independent diagnostic value vis-à-vis long-range existential risks, that can serve as early warning indicators; (b) developing methods of incentivizing forecasters to report their true probabilistic beliefs (uncontaminated by value thresholds that influence tolerance for errors of under- vs. over-estimation); (c) forecasting which forecasters are likelier to be accurate (e.g., correspondence metrics/track records; coherence metrics for checking logical consistency with axioms of probability theory; cognitive-style metrics of propensity to make extreme historical-counterfactual judgments); (d) supplementing gold-standard correspondence metrics grounded in objective reality with silver-standard metrics grounded in intersubjective reality—and assessing skill at anticipating shifts in mass-public and experts’ views on existential risks; (e) developing new reciprocal scoring rules together with Ezra Karger to assess risk mitigation strategies; (f) adapting risk-detection methods to opportunity-detection and the feasibility of increasingly speculative technologies for improving life on our planet and elsewhere in the universe. We consider this project a key first step toward existential risk tournaments and, eventually, better guidance to policymakers about existential risks and mitigation strategies.
- Professor Philip Tetlock and Dr. Pavel Atanasov
Further resources
- 80,000 hours podcast with Professor Tetlock
- Freakonomics podcast episode with Professor Tetlock
- Foreign Affairs article describing question clusters co-authored by Professor Tetlock
- Scientific American articles on identifying accurate forecasters and prediction systems by Dr. Atanasov
- Washington Post post-mortem piece on 2016 election forecasts co-authored by Dr. Atanasov
- Edge Masterclass led by Professor Tetlock
Disclaimer: We do not have a reciprocal relationship with any charity, and recommendations are subject to change based on our ongoing research.
“Most GJP forecasts had time horizons of 1-6 months, and thus can tell us little about the feasibility of long-range (≥10yr) forecasting. In Tetlock’s EPJ studies, however, forecasters were asked a variety of questions with forecasting horizons of 1-25 years. (Forecasting horizons of 1, 3, 5, 10, or 25 years were most common.) Unfortunately, by the time of Tetlock (2005), only a few 10-year forecasts (and no 25-year forecasts) had come due, so Tetlock (2005) only reports accuracy results for forecasts with forecasting horizons he describes as “short-term” (1-2 years) and “long-term” (usually 3-5 years, plus a few longer-term forecasts that had come due)” (Luke Muehlhauser, “How Feasible Is Long-Range Forecasting”, Open Philanthropy blog). ↩
Muehlhauser, “How Feasible Is Long-Range Forecasting” ↩