Developing Guardrails for Military AI Risks

Notes from a Workshop on Confidence-Building Measures for Uncrewed and AI-Enabled Military Systems

▲ A U.S. Army expeditionary modular autonomous vehicle. Photo by U.S. Army Sgt. Marita Schwab

If you are interested in supporting work like this, please consider donating to the Global Catastrophic Risks Fund.

Imagine the following scenario: In the near future, in the contested waters of the South China Sea, China has deployed ships, underwater vessels, and aerial systems, including many uncrewed and autonomous/AI-enabled weapon systems. In response to harassment of the Philippines, the U.S. sends a carrier strike group and its own uncrewed systems into the area. The area is increasingly crowded with autonomous systems, and rhetoric between the United States and China is growing more heated. War seems about to break out.

What, if any, guardrails can help avoid catastrophic escalation from uncrewed/autonomous/AI-enabled systems in such a U.S.-China confrontation?

Last month, I attended a workshop at the Center for a New American Security (CNAS) in Washington, DC, that asked exactly this question (including with the scenario exercise described above). It convened experts from AI companies, national governments and militaries, and leading scholars on AI risk, as part of a project on confidence-building measures for autonomous weapons systems enabled by a grant that the Global Catastrophic Risks Fund made earlier this year. At the conclusion of the workshop, one person remarked that they believed more person-hours were spent that day thinking about confidence-building measures for uncrewed/autonomous systems than had cumulatively been spent on the topic before the workshop.

Background

Earlier this year, we made a $100,000 grant to CNAS based on our research on great power conflict, AI, autonomy, and international stability. The grant was designed to explore the feasibility of an “International Autonomous Incidents Agreement” and related “confidence-building measures” as described in the military AI report (pp. 32-41). Specifically, the case for the grant centered on how Cold War-style confidence-building successes could be leveraged to deal with current and future AI-related risks. We also advised Effektiv Spenden’s Zukunft Bewahren fund (with which the GCR Fund works closely) to make a generous additional $100,000 donation to CNAS to enable this workshop for red-teaming, gaming, and stress-testing the ideas.

There are several related ideas behind this grant. First, we believe that advances in AI may be among the most consequential technological developments in human history, and that research, development, and applications of AI driven by the world’s great powers are likely to play a major part in the trajectory of these advances (including by shaping the regulatory environment of private AI labs training frontier models). Second, we recognize that major war is not a thing of the past, but a very real risk to human civilization, and that major militaries have an unprecedented warmaking capacity that — if unleashed — could precipitate a global catastrophe, including nuclear war. AI-enabled military systems stand at the intersection of these two looming risks, which makes the safety of such AI applications a major priority for grantmakers. Military AI issues may also be more tractable in policy circles than more general AI safety discussions, because of their salience to U.S. national security. These risks are discussed in greater detail in the Founders Pledge report mentioned above.

The Participants

The workshop confirmed one consideration that led us to make this grant: CNAS has strong convening power on this issue. Participants at the workshop (which was held under the Chatham House Rule, with some off-the-record components) included:

  • Current and former senior U.S. policymakers with expertise and experience on military AI and U.S. policy, including from the Department of Defense, the Department of State, and other relevant parts of the U.S. government
  • Active and retired high-ranking military officials
  • International diplomatic experts
  • AI experts, including from leading labs
  • Academics, think tank analysts, and experts on confidence building measures

The Replicator Initiative and the Urgency of Military AI Risks

The workshop took place in the shadow of a major announcement that the U.S. Department of Defense had just made: the ominously-named Replicator initiative. The initiative, in the words of Deputy Secretary of Defense Kathleen Hicks, aims to “field attritable, autonomous systems at a scale of multiple thousands [and] in multiple domains within the next 18-to-24 months.” The initiative, as described by Hicks, will cross all domains of military competition, with autonomous systems on land, sea, air (“flocks of [...] systems flying at all sorts of altitudes, doing a range of missions, building on what we've seen in Ukraine. They could be deployed by larger aircraft, launched by troops on land or sea, or take off themselves”) and even space, “flung into space scores at a time, numbering so many that it becomes impossible to eliminate or degrade them all.”

The goal, explicitly, is to build and deploy thousands of autonomous systems in fleets and swarms, in order to compete with China. While much remains unknown about the initiative — and I think it is too early to judge whether and how it affects the risks of conflict — Replicator underscores the direct and urgent policy relevance of this grant and the workshop. The DoD is developing these systems in the thousands, extremely quickly, and in the framework of U.S.-China competition.

The Workshop

The first part of the workshop was an off-the-record dinner discussion with a senior U.S. government official, which I therefore cannot describe here. The second part dove into the details of a draft paper that CNAS had shared with the attendees. Thomas Shugart and Paul Scharre led the discussions (the next day, Scharre was announced as one of the TIME 100 Most Influential People in AI).

The read-ahead paper framed the discussion:

During previous times of military innovation, international tension, or some combination of these or other factors, even adversarial nations have managed to come to agreement on rule sets that standardize and manage interactions, increase communications, and mitigate the negative impacts of novel weapons and platforms (emphasis mine). The ongoing introduction of large-scale autonomous and uncrewed systems, particularly in a time of increased great power tension, calls for the development of analogous rule sets and confidence-building measures (CBMs) to mitigate the risks associated with their operational employment. This project endeavors to develop realistic and mutually beneficial such rule sets and CBMs for the consideration of U.S. and international leaders and policymakers.

Specifically, the project was scoped as:

  • Looking at developments 5-15 years in the future
  • Prioritizing the air and maritime domain
    • (discussion of undersea systems surfaced repeatedly, pun intended)
  • Within the framework of U.S.-China competition in the Indo-Pacific
    • “an air- and maritime-dominated theater where both those players (and others) have nuclear weapons that significantly increase the potential consequences of unintended escalation. This competition also includes two nations at the forefront of artificial intelligence (AI) and uncrewed system development.”

Additionally, the project focused on uncrewed systems as a broad category, rather than autonomous systems exclusively. Uncrewed (which has replaced the more gendered “unmanned” that used to be common) therefore refers to both autonomous and remotely piloted systems (i.e. drones). Autonomous and remotely piloted systems are difficult to distinguish in practice, because the difference between them lies in software. One reason for treating them as a single category is that this software difference cannot be verified from a distance, and the invasive inspections that would be needed to verify it are unworkable between adversaries that do not trust each other. I felt that this scoping decision was largely the right choice, in part because we may expect many (or most) future remotely piloted systems to have autonomous capabilities that go online in the event of communications disruptions; as the paper states, “Since both sides are likely to focus on disrupting each other’s communications in the event of armed conflict, some autonomous capability will be necessary to ensure combat effectiveness.” (Once a conflict has broken out and the parties are targeting each other’s communications, however, systems with some autonomous capability will become distinguishable from those without, because the latter may cease to operate appropriately.)

Both the discussion and paper sought to answer a variety of questions related to confidence-building measures:

  • What information should be shared between participating nations? For example, information about the deployment or use of autonomous systems in contested or disputed areas.
  • How might information be shared between participating nations? For example, through military-to-military contacts or a dedicated hotline.
  • How might autonomous systems communicate their level of autonomy, such as through standardized signals or marking?
  • Should there be specific geographic areas within which autonomous system use is mutually curtailed or prohibited?
  • How should we define basic responsible behavior for autonomous systems?
  • How might we share results from accidents or near-accidents caused by AI-enabled autonomous systems?
  • Which types of autonomous systems are of greatest relevance for an International Autonomous Incidents Agreement?
  • To what extent may the accepted behavior for crewed systems and platforms need to be adapted for autonomous systems? For example, if autonomous maritime craft follow the Convention on the International Regulations for Preventing Collisions at Sea [known as “COLREGs”], is that sufficient to ensure responsible behavior?
  • What unilateral declaratory statements could be useful as precursors to full international rule set implementation? For example, states communicating a “shoot second” posture that armed autonomous systems will not fire first but will return fire if fired upon.
  • Should autonomy of nuclear-capable systems be uniquely restricted? If so, how? How would such restrictions interact with existing nuclear arms-control agreements? (Of note, there are no such bilateral agreements between China and the U.S.).

The paper dove into confidence-building measures (CBMs) for AI in greater detail than anything I had seen before. Most previous writing on such CBMs (e.g. in the National Security Commission on AI’s Final Report) had stayed at an abstract level, pointing to the promise of CBM-like tools in an environment where formal arms control or an outright “ban” is highly unlikely. (An exception is a recent paper that emerged from a collaboration between the Berkeley Risk and Security Lab — which Founders Pledge funded as part of its grant-making on military AI risks — and OpenAI.)

Unlike hand-waving gestures at CBMs in the abstract, the CNAS paper pointed to specific clauses and phrases in two Memoranda of Understanding (MOUs) between the U.S. and China in 2014 and 2015, the Code for Unplanned Encounters at Sea (CUES), the UN Convention on the Law of the Sea (UNCLOS), and more. These rules and regulations cross-reference each other in confusing ways. For example, Section I of the 2014 U.S.-China MOUs references Article 29 of UNCLOS as well as CUES paragraph 1.3.3 in order to define what “military vessel” even means in the first place, but does so in a way that seems to exclude uncrewed systems from the definitions. This was the kind of fine-grained analysis we were hoping to see from the grant: a real foundation for policymakers to start thinking about the practicalities of negotiating new CBMs or modifying existing ones.

The discussion then focused on strengths and weaknesses of the paper. Some participants who had recently taken part in a Track II dialogue between the U.S. and China shared their perspectives on what they believed was and was not feasible with Chinese counterparts. Others, with expertise in the U.S. bureaucracy or the military, shared practical observations about the tractability of different proposed CBMs. At times, AI safety itself emerged as a topic for discussion, including the point that participants need to think about the near-future implications of ever-more-powerful frontier AI models. I think the discussion illustrated the importance of understanding these various AI risks as linked, not separate.

The third part of the workshop was a scenario exercise run by Jacquelyn Schneider of the Hoover Institution: The year was 2030, and China was conducting live-fire military exercises in the Spratly Islands of the South China Sea. Supporting the Philippines, the United States sent vessels and aircraft in an attempt to restore the status quo and uphold international law in the area. Both sides deployed significant numbers of autonomous systems in the air and at sea; the scenario appeared realistic, and close to war. Participants were divided into groups and asked to imagine how the crisis might play out, depending on what kinds of CBMs had been put in place before the crisis. Like the discussion, this exercise was clearly designed to stress-test the ideas of the paper, and I think it uncovered several areas for improvement.

Next Steps on AI and Great Power Conflict

As mentioned above, at the conclusion of the workshop, one person remarked that they believed more person-hours were spent that day thinking about confidence-building measures for uncrewed/autonomous systems than had cumulatively been spent on the topic before the workshop. I think this is true. I look forward to following the project further, including its socialization in policy circles — to really make a difference and create new guardrails, these ideas actually need to be implemented. Judging from the workshop’s attendees, I think many of the relevant decision-makers were already there, laying the groundwork not only for more policy on autonomous military systems, but also for safety questions around more powerful AI and for U.S.-China competition that does not end in catastrophe.

Notes

  1. My travel and lodging expenses were covered by Founders Pledge, not CNAS.


    About the author

    Christian Ruhl

    Global Catastrophic Risks Lead

    Christian Ruhl is our Global Catastrophic Risks Lead based in Philadelphia. Before joining Founders Pledge in November 2021, Christian was the Global Order Program Manager at Perry World House, the University of Pennsylvania's global affairs think tank, where he managed the research theme on “The Future of the Global Order: Power, Technology, and Governance.” Before that, Christian studied on a Dr. Herchel Smith Fellowship at the University of Cambridge for two master’s degrees, one in History and Philosophy of Science and one in International Relations and Politics, with dissertations on early modern submarines and Cold War nuclear strategy. Christian received his BA from Williams College in 2017.