Exploring the Feasibility and Utility of Machine Learning-Assisted Command and Control
Machine Learning-Assisted Command and Control
Volume 1, Findings and Recommendations
For more information on this publication, visit www.rand.org/t/RRA263-1.
About RAND
The RAND Corporation is a research organization that develops solutions to public policy
challenges to help make communities throughout the world safer and more secure, healthier
and more prosperous. RAND is nonprofit, nonpartisan, and committed to the public interest. To
learn more about RAND, visit www.rand.org.
Research Integrity
Our mission to help improve policy and decisionmaking through research and analysis is
enabled through our core values of quality and objectivity and our unwavering commitment
to the highest level of integrity and ethical behavior. To help ensure our research and analysis
are rigorous, objective, and nonpartisan, we subject our research publications to a robust and
exacting quality-assurance process; avoid both the appearance and reality of financial and
other conflicts of interest through staff training, project screening, and a policy of mandatory
disclosure; and pursue transparency in our research engagements through our commitment to
the open publication of our research findings and recommendations, disclosure of the source
of funding of published research, and policies to ensure intellectual independence. For more
information, visit www.rand.org/about/principles.
RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors.
Published by the RAND Corporation, Santa Monica, Calif.
© 2021 RAND Corporation
RAND® is a registered trademark.
Library of Congress Cataloging-in-Publication Data is available for this publication.
ISBN: 978-1-9774-0709-2
Cover: U.S. Air Force photo; MF3d/Getty Images.
Limited Print and Electronic Distribution Rights
This document and trademark(s) contained herein are protected by law. This representation
of RAND intellectual property is provided for noncommercial use only. Unauthorized posting
of this publication online is prohibited. Permission is given to duplicate this document for
personal use only, as long as it is unaltered and complete. Permission is required from RAND
to reproduce, or reuse in another form, any of its research documents for commercial use. For
information on reprint and linking permissions, please visit www.rand.org/pubs/permissions.
Preface

Contents
Preface
Figures and Tables
Summary
Acknowledgments
Abbreviations

CHAPTER ONE
Introduction and Project Overview
Study Context
Study Methodology
Organization of Report

CHAPTER TWO
Taxonomy of Problem Characteristics
Taxonomy and Definitions
Analysis of Games and Command and Control Problems
Summary

CHAPTER THREE
Taxonomy of Solution Capabilities
Taxonomy and Definitions
Analysis of Artificial Intelligence Systems
Summary
CHAPTER FOUR
Mapping Problem Characteristics to Solution Capabilities
Expert Panel on Artificial Intelligence for Command and Control
Scoring Alignment Between Command and Control Processes and Artificial Intelligence Systems
Summary

CHAPTER FIVE
Metrics for Evaluating Artificial Intelligence Solutions
Measures of Effectiveness
Measures of Performance
Measures of Suitability
Analysis of Metric Categorization
Summary

CHAPTER SIX
Conclusion and Recommendations
Conclusion 1. Command and Control Processes Are Very Different From Many of the Games and Environments Used to Develop and Demonstrate Artificial Intelligence Systems
Conclusion 2. The Distinctive Nature of Command and Control Processes Calls For Artificial Intelligence Systems Different From Those Optimized For Game Play
Conclusion 3. New Guidance, Infrastructure, and Metrics Are Needed to Evaluate Applications of Artificial Intelligence to Command and Control
Conclusion 4. Hybrid Approaches Are Needed to Deal With the Multitude of Problem Characteristics Present In Command and Control Processes

References
Figures and Tables
Figures
S.1. Artificial Intelligence System Capability Mapping and Command and Control Process Evaluation
1.1. Number of Requirements by Type and by Air Operations Center Mission Thread
1.2. Determining Alignment Between Problem Characteristics and Solution Capabilities
1.3. Evaluative Framework
2.1. Average Values of Problem Characteristics
3.1. Average Values of Solution Capabilities
4.1. Expert Panel Protocol
4.2. Median Ratings of Importance by Problem-Solution Pair
4.3. Relative Importance of Solution Capabilities Across Ten Command and Control Processes and Games, and Capabilities of Artificial Intelligence Systems Analyzed
5.1. Defense Advanced Research Projects Agency Metric Classifications by Number (top) and by Percentage of Programs with Metric (bottom)
6.1. Artificial Intelligence System Capability Mapping and Command and Control Process Evaluation

Tables
1.1. Recent Milestones in Artificial Intelligence Game Play
1.2. Recent Milestones in Applied Artificial Intelligence
2.1. Literature Review of Problem Characteristics
2.2. Problem Characteristics, Descriptions, and Command and Control Examples
Summary

Issues
• A key priority for the U.S. Air Force is to use artificial intelligence
(AI) to enhance military command and control (C2).
• The academic and commercial contexts in which AI systems have
been developed and deployed are qualitatively different from the
military contexts in which they are needed.
• The Air Force lacks an analytical framework for understanding
the suitability of different AI systems for different C2 problems
and for identifying pervasive technology gaps.
• The Air Force lacks sufficient metrics of merit for evaluating the
performance, effectiveness, and suitability of AI systems for C2
problems.
Approach
The RAND team proposes a structured method for determining the suit-
ability of an AI system for any given C2 process (Figure S.1). The meth-
odology involves (1) evaluating the C2 problem characteristics, (2) eval-
uating the AI system capabilities, (3) comparing alignment between
problem characteristics and solution capabilities, (4) selecting measures
of merit, and (5) implementing, testing, and evaluating potential AI sys-
tems. In addition to providing a methodology to determine alignment
between C2 problems and AI solutions, this research supports several
conclusions shown in Figure S.1 along with associated recommendations.
Figure S.1
Artificial Intelligence System Capability Mapping and Command and
Control Process Evaluation
[Figure: the command and control process, its measures of merit (measures of suitability, effectiveness, and performance), and the report's conclusions, including Conclusion 1 (C2 processes are very different from many of the games and environments used to develop and demonstrate AI systems) and Conclusion 4 (hybrid approaches are needed to deal with the multitude of problem characteristics present in C2 processes).]
Abbreviations
CHAPTER ONE
Introduction and Project Overview
1 DoD, “Secretary of Defense Speech, Reagan National Defense Forum Keynote,” Defense.gov,
December 7, 2019. DoD lacks agreed-upon definitions of autonomy and of artificial
intelligence. Although the two terms are not synonymous, any autonomous system contains
one or more forms of AI.
2 W. J. Dahm, Technology Horizons: A Vision for Air Force Science and Technology During
2010‒2030, Arlington, Va.: U.S. Air Force, 2010.
3 Sections 238 and 1051 of the National Defense Authorization Act, respectively.
4 DARPA, “DARPA Announces $2 Billion Campaign to Develop Next Wave of AI Tech-
nologies,” Arlington, Va., March 12, 2020.
5 For example, see Bernard Marr and Matt Ward, Artificial Intelligence in Practice: How
50 Successful Companies Used AI and Machine Learning to Solve Problems, Chichester, U.K.:
Wiley, 2019.
6 U.S. Air Force Scientific Advisory Board, Technologies for Enabling Resilient Command
and Control MDC2 Overview, Washington, D.C., 2018; G. Zacharias, Autonomous Hori-
zons: The Way Forward, Maxwell Air Force Base, Ala.: Air University Press, Curtis E. LeMay
Center for Doctrine Development and Education, 2019a.
7 This has been reported extensively elsewhere. For example, see Yuna Huh Wong, John M.
Yurchak, Robert W. Button, Aaron Frank, Burgess Laird, Osonde A. Osoba, Randall Steeb,
Benjamin N. Harris, and Sebastian Joon Bae, Deterrence in the Age of Thinking Machines,
Santa Monica, Calif.: RAND Corporation, RR-2797-RC, 2020.
8 U.S. Air Force, Science and Technology Strategy: Strengthening USAF Science and Technol-
ogy for 2030 and Beyond, Washington, D.C., April 2019b.
Study Context
Terminology
For the purposes of this report, we define AI and machine learning
(ML) as follows: AI is an academic discipline concerned with machines
demonstrating intelligence—that is, behaving in a rational way given
what they know;10 ML is a subfield of AI that concerns machines per-
forming tasks without first receiving explicit instructions. The field of
AI is expansive and includes topics such as problem-solving, knowledge
and reasoning, planning, and learning. ML is a type of AI in which the
machine learns to perform tasks through exposure to training data or
through interactions with a simulation environment. Neural networks
are but one class of ML techniques, along with many other statistical
methods.
11 Joint Publication 3-0, Joint Operations, Washington, D.C.: U.S. Joint Chiefs of Staff,
January 17, 2017. Command is the authority lawfully exercised over subordinates, and con-
trol is the process—inherent in command—by which commanders plan, guide, and conduct
operations.
12 Lockheed Martin Information Systems and Global Services, Technical Requirements Doc-
ument (TRD), for the Air and Space Operations Center (AOC) Weapon System (WS), draft,
AOCWS-TRD-0000-U-R8C0, prepared for 652 ELSS/KQ Electronic Systems Center,
Hanscom AFB, Colorado Springs, Colo.: Lockheed Martin Information Systems and Global
Services, November 16, 2009. Not available to the general public.
Figure 1.1
Number of Requirements by Type and by Air Operations Center Mission
Thread
[Figure: bar chart of the number of AI/ML and non-AI/ML requirements for each Air Operations Center mission thread.]
NOTE: ACO: Airspace Control Order Development; ACP: Airspace Control Plan; AADP:
Area Air Defense Plan; AOD: Air Operations Directive; ATO (air tasking order): ATO
Development; CAS: Close Air Support; CSAR: Combat Search and Rescue; DT: Dynamic
Targeting; ISR (intelligence, surveillance, and reconnaissance): ISR Planning; JAOP: Joint
Air Operations Planning; JIPTL: Joint Integrated Prioritized Target List Development; PBA:
Predictive Battlespace Awareness; TMD: Theater Missile Defense.
“smart agent decision aids.” Following the cancellation of the AOC 10.2
program in 2016, these capabilities have not yet been delivered.
The retirement of legacy AOC systems and the deployment
of new Block 20 applications by Kessel Run provide an on-ramp for
AI into operational-level C2. Additionally, the enterprise services and
platform managed by Kessel Run enable the transition of software—
potentially including AI—to the AOC. Finally, other Kessel Run
13 Robert Winkler, The Evolution of the Joint ATO Cycle, Norfolk, Va.: Joint Advanced
Warfighting School, 2006.
14 Joint Publication 3-30, Command and Control of Joint Air Operations, Washington, D.C.:
U.S. Joint Chiefs of Staff, January 12, 2010.
Table 1.1
Recent Milestones in Artificial Intelligence Game Play
Silver et al., 2018 | Go, chess, shogi (Japanese chess) | High dimensionality | Deep reinforcement learning, Monte Carlo tree search
18 David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai,
Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy
Lillicrap, Karen Simonyan, and Demis Hassabis, “A General Reinforcement Learning Algo-
rithm that Masters Chess, Shogi, and Go Through Self-Play,” Science, Vol. 362, No. 6419,
December 2018.
19 F. Gobet and H. A. Simon, “Templates in Chess Memory: A Mechanism for Recalling
Several Boards,” Cognitive Psychology, Vol. 31, No. 1, 1996.
20 In some games, such as chess, advanced play by teams of expert humans and computer
programs has been explored, although the strongest players are now purely computational.
Table 1.2
Recent Milestones in Applied Artificial Intelligence
21 The DoD defense industrial base is also advancing applied AI in such areas as processing,
exploitation, and dissemination (e.g., Project Maven), operational C2 (e.g., DARPA Resil-
ient Synchronized Planning and Assessment for the Contested Environment), and tactical
control (e.g., DARPA Air Combat Evolution).
22 National Research Council, Funding a Revolution: Government Support for Computing
Research, Washington, D.C.: National Academy Press, 1999.
During the past 70 years, and across the first and second AI “winters,”23
the Advanced Research Projects Agency and DARPA have provided
continuous support for basic and applied AI research. This support has
contributed to various commercial successes. For example, the multi-
billion-dollar market for autonomous cars can be traced back to the
first DARPA Grand Challenge; and Siri emerged from the DARPA
Personal Assistant that Learns program. This support has also contrib-
uted to various military successes. For example, U.S. Transportation
Command used the Dynamic Analysis and Replanning Tool during
Operation Desert Storm to move tanks and heavy artillery to Saudi
Arabia three weeks faster than would have otherwise been possible, and
the Command Post of the Future has become a U.S. Army program
of record.
Notwithstanding these successes, few AI systems have been tran-
sitioned to the military. To enable such transitions, the right techno-
logical capabilities must be aligned to operational needs and integrated
with existing and emerging systems. The following four issues encom-
pass some of the primary challenges to this transition:
23 For example, see Kathleen Walch, “Are We Heading for Another AI Winter Soon?,”
Forbes, October 20, 2019.
Study Methodology
As AI moves out of the laboratory and into the home, workplace, and
battle space, the need to identify high-quality solutions to real-world
problems grows ever more acute. Applied AI demands methodologies
24 Air Force Life Cycle Management Center, Battle Management Directorate, Descriptive
List of Applicable Publications (DLOAP) for the Air Operations Center (AOC), Hanscom Air
Force Base, Mass., April 1, 2019. Not available to the general public.
25 DoD, A Critical Change to the Air Operations Center—Weapon System Increment 10.2
Program Increased Costs and Delayed Deployment for 3 Years, Washington, D.C.: Inspector
General, DODIG-2017-079, 2017a.
Figure 1.2
Determining Alignment Between Problem Characteristics and Solution
Capabilities
[Figure: a C2 problem is described by its problem characteristics and an AI solution by its solution capabilities; alignment is assessed by comparing the two.]
Figure 1.3
Evaluative Framework
[Figure: the five-step evaluative framework: (1) evaluate problem characteristics, (2) evaluate solution capabilities, (3) compare alignment between problem and solution, (4) select measures of merit, and (5) implement, test, and evaluate.]
Organization of Report
This report comprises two volumes. The first contains the primary
findings and recommendations. It is designed for the policymaker. The
second contains the supporting analysis. It is designed for those inter-
ested in technical details and potential extensions. The remainder of this
volume follows the evaluative framework outlined in Figure 1.3:
CHAPTER TWO
Taxonomy of Problem Characteristics
Table 2.1
Literature Review of Problem Characteristics
Rittel and Webber, 1973 | 10 properties of planning problems that make them “wicked”
Table 2.2
Problem Characteristics, Descriptions, and Command and Control Examples
Grouping | Problem Characteristic | Description | C2 Example
4 The variability of scores was somewhat lower for games because so many values were zero.
Table 2.3
Scoring of Problem Characteristics
(Columns give ratings for the ten problem characteristics: operational tempo, rate of environment change, problem complexity, reducibility, data availability, environmental clutter/noise, stochasticity of action outcomes, clarity of goals/utility, incompleteness of information, and operational risks/benefits.)
Game
Tic-tac-toe 3 0 0 0 0 0 0 0 0 0
Tetris 4 0 0 0 0 0 0 0 0 0
Checkers 3 0 2 0 0 0 0 0 0 0
Chess 3 0 2 3 0 0 0 0 0 0
Go 3 0 3 3 0 0 0 0 0 0
Texas Hold’em 3 0 2 0 0 0 2 0 3 1
CartPole-v1 4 0 3 0 0 0 0 0 0 0
HalfCheetah-v2 4 0 3 0 0 0 0 0 0 0
Bridge 3 0 2 2 2 0 2 0 4 0
StarCraft II 4 0 2 3 0 1 0 0 2 0
C2 Process
Army Intelligence Preparation of the Battlefield 1 3 2 3 3 3 0 3 3 3
MAAP 2 2 2 2 3 1 0 1 2 3
Nuclear retargeting 3 4 2 2 3 3 1 2 3 4
Operational assessment 2 1 2 3 2 1 0 1 4 2
Sensor management 3 3 2 2 2 3 1 1 2 3
Figure 2.1
Average Values of Problem Characteristics
[Figure: average ratings of the problem characteristics, with axes including operational tempo, operational risks/benefits, problem complexity, reducibility, and incompleteness of information.]
Summary
CHAPTER THREE
Taxonomy of Solution Capabilities
Table 3.1
Literature Review of Solution Capabilities
4 Zacharias, 2019a.
5 Dahm, 2010.
6 McKinsey Global Institute, Jacques Bughin, Eric Hazan, Sree Ramaswamy, Michael
Chui, Tera Allas, Peter Dahlström, Nicolaus Henke, and Monica Trench, Artificial Intel-
ligence: The Next Digital Frontier?, New York: McKinsey & Company, June 2017.
Table 3.2
Solution Capabilities and Definitions
Table 3.3
Scoring of Solution Capabilities
(Columns give ratings for the eight solution capabilities: learning, optimality, computational efficiency, data efficiency, robustness, assuredness, soundness, and explainability.)
AI System
Deep Q-Learning 4 1 3 0 0 3 0 0
AlphaZero 3 4 3 0 0 3 0 0
Instance-based learning 2 1 1 2 2 3 2 0
Iterated-Width Planning 1 4 3 4 3 0 3 3
Alpha-beta pruning 3 4 2 4 2 0 4 4
Greedy heuristic 4 4 1 4 2 0 4 4
Influence network 1 4 4 3 2 0 3 4
Genetic algorithm 2 3 2 3 1 0 0 1
ent to a large extent). The mean rating across systems and capabili-
ties equaled 2.1 out of 4, and no single system had all capabilities.
The ratings illustrate a general trade-off between systems that learn
and systems that do not. As compared with systems that learn, systems
that do not learn have higher average ratings for data efficiency (3.7
versus 0.8), assuredness (3.3 versus 0), soundness (3.8 versus 1.8), and
explainability (2.8 versus 0.8). Conversely, systems that do not learn
have lower average ratings for computational efficiency (1.8 versus
3.3) and similar average ratings for optimality (2.7 versus 2.5) and
robustness (2 versus 0.8).
Figure 3.1 shows the average ratings for solution capabilities
across all AI systems. Overall, the systems had the highest average ratings
for soundness, optimality, and data efficiency. The finding that data
efficiency was rated relatively high and learning was rated relatively
low reflects the different numbers of learning and nonlearning systems
included in the sample (four and six, respectively). Robustness had
moderate-to-low ratings for learning and nonlearning systems alike.
Figure 3.1
Average Values of Solution Capabilities
[Figure: average ratings of the solution capabilities across the AI systems analyzed, with axes including computational efficiency, soundness, optimality, robustness, explainability, and assuredness.]
Summary
CHAPTER FOUR
Mapping Problem Characteristics to Solution Capabilities

Expert Panel on Artificial Intelligence for Command and Control
but given the general nature of the problem characteristics and solution
capabilities, C2 expertise was not needed to participate.
The panel featured an embedded mixed-methods design and fol-
lowed established practices for eliciting expert judgments.1 Quantita-
tive data were used to determine the importance of solution capabili-
ties for each problem characteristic, and qualitative data were used to
understand factors influencing those ratings. Experts completed two
rating rounds interspersed with a discussion round (Figure 4.1).
In the first round, experts reviewed definitions of all problem
characteristics and solution capabilities. The instructions explained
that the purpose of the panel was to determine the importance of each
solution capability for each problem characteristic. Experts were pre-
sented with all 80 pair-wise combinations of problem characteristics
and solution capabilities, and they rated and commented on the impor-
tance of the solution capability for each pair. Experts used nine-point
scales to rate the importance of the solution capability given the prob-
lem characteristic. The scale ranged from not important (ratings 1 to 3)
to critically important (ratings 7 to 9).
Figure 4.1
Expert Panel Protocol
[Figure: the panel proceeded through three phases: assessment (answer questions and explain your position), feedback and discussion (engage with other participants, compare answers, and share perspectives online), and reassessment (revise your original responses based on group feedback and discussion).]
1 Kathryn Fitch, Steven J. Bernstein, Maria Dolores Aguilar, Bernard Burnand, Juan
Ramon LaCalle, Pablo Lazaro, Mirjam van het Loo, Joseph McDonnell, Janneke Vader, and
James P. Kahan, The RAND/UCLA Appropriateness Method User’s Manual, Santa Monica,
Calif.: RAND Corporation, MR-1269-DG-XII/RE, 2001.
Figure 4.2
Median Ratings of Importance by Problem-Solution Pair
[Figure: matrix of median importance ratings (on the nine-point scale) for each problem-solution pair; rows are problem characteristics (operational tempo, nonstationarity, complexity, reducibility, data availability, environmental clutter, clarity of goal/utility, incomplete information, and operational risks/benefits) and columns are the solution capabilities.]
beneficial. More detail about expert ratings and free responses may be
found in Volume 2.
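To make the quantitative aggregation concrete, the following Python sketch shows one way ratings like these could be reduced to median importance scores and a set of critical problem-solution pairs. The expert ratings, the pairs shown, and the cutoff of 7 are illustrative assumptions for this sketch, not data from the panel.

    from statistics import median

    # Hypothetical ratings from five experts for a few of the 80 problem-solution
    # pairs, each on the nine-point importance scale used by the panel.
    ratings = {
        ("operational tempo", "computational efficiency"): [9, 8, 9, 7, 8],
        ("operational tempo", "data efficiency"): [3, 4, 2, 5, 3],
        ("incompleteness of information", "robustness"): [7, 8, 6, 9, 7],
    }

    CRITICAL_CUTOFF = 7  # assumed threshold for calling a pair "critical"

    # Reduce each pair's ratings to a median, then keep the pairs above the cutoff.
    median_importance = {pair: median(scores) for pair, scores in ratings.items()}
    critical_pairs = [pair for pair, m in median_importance.items() if m >= CRITICAL_CUTOFF]

    for pair, m in sorted(median_importance.items()):
        print(pair, m)
    print("Critical pairs:", critical_pairs)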
Scoring Alignment Between Command and Control Processes and Artificial Intelligence Systems

The results from the expert panel provide a general and systematic way
to judge the suitability of an AI system for a given problem. We demon-
strate the method with three worked examples, beginning with AI for
computer chess and ending with AI for C2. The first example involves
applying alpha-beta pruning to the game of chess. The method is as
follows:
• Rate the problem characteristics. Volume 2 lists ratings for the ten
problem characteristics for chess. We duplicate these values down
the column labeled “Rating” in Table 4.1.
Table 4.1
Determining the Suitability of Alpha-Beta Pruning for Chess
(Columns are the eight solution capabilities; the “Rating” row gives alpha-beta pruning’s capability ratings.)
Problem Characteristic Rating 3 4 4 2 2 0 4 4
Operational tempo 3 9 12 6 12
Problem complexity 2 6 8 4 0 8 8
Reducibility 3 12
Data availability 0 0 0 0 0
Environmental clutter/noise 0 0 0 0
Incompleteness of information 0 0 0 0 0
• Rate the solution capabilities. Volume 2 lists ratings for the eight
solution capabilities for alpha-beta pruning. We duplicate these
values across the row labeled “Rating” in Table 4.1.
• Multiply the values of problem characteristics by the values of solution
capabilities. We then multiply ratings for problem characteristics in
chess with ratings for solution capabilities in alpha-beta pruning.
Note that we only do this for the 36 critical problem-solution pairs
identified by the expert panel, which are shaded in gray in Table 4.1.
• Sum over the critical pairs. The bottom row of Table 4.1 provides
the sum of scores for each column. The right-most value in the
bottom row is the sum across all columns and represents a com-
posite measure of alpha-beta pruning’s suitability for chess.
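The following Python sketch walks through the same arithmetic. The capability ratings and the subset of critical pairs shown here are illustrative placeholders standing in for the full values in Table 4.1 and Volume 2.

    # Sketch of the suitability calculation: multiply each problem characteristic
    # rating by each solution capability rating, but only over the critical
    # problem-solution pairs, then sum the products.

    chess_characteristics = {       # problem characteristic ratings for chess (0-4)
        "operational tempo": 3,
        "problem complexity": 2,
        "reducibility": 3,
        "incompleteness of information": 0,
    }

    alpha_beta_capabilities = {     # assumed capability ratings for alpha-beta pruning (0-4)
        "computational efficiency": 3,
        "optimality": 4,
        "soundness": 4,
        "robustness": 0,
    }

    # Assumed subset of the 36 critical problem-solution pairs from the expert panel.
    critical_pairs = [
        ("operational tempo", "computational efficiency"),
        ("operational tempo", "optimality"),
        ("problem complexity", "computational efficiency"),
        ("problem complexity", "soundness"),
        ("reducibility", "soundness"),
        ("incompleteness of information", "robustness"),
    ]

    suitability = sum(
        chess_characteristics[problem] * alpha_beta_capabilities[capability]
        for problem, capability in critical_pairs
    )
    print("Suitability of alpha-beta pruning for chess (partial):", suitability)

Repeating the calculation with another system’s capability ratings and comparing the totals mirrors the comparison made with AlphaZero in Table 4.2.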
3 One could argue that the assessment of optimality for alpha-beta pruning was too gener-
ous and the assessment for AlphaZero was too harsh. If we set the values of optimality to
1 and 4, respectively, the new suitability scores still strongly favor alpha-beta pruning (145
versus 65).
Table 4.2
Determining the Suitability of AlphaZero for Chess
(Columns are the eight solution capabilities; the “Rating” row gives AlphaZero’s capability ratings.)
Problem Characteristic Rating 3 0 4 3 0 3 0 0
Operational tempo 3 9 12 0 0
Problem complexity 2 6 0 0 6 0 0
Reducibility 3 0
Data availability 0 0 0 0 0
Environmental clutter/noise 0 0 0 0
Incompleteness of information 0 0 0 0 0
AlphaZero total 15 0 12 0 0 6 0 0 33
4 Additional details about the MIP and the heuristic are provided in Volume 2.
Table 4.3
Determining the Suitability of a Mixed-Integer Program and a Greedy
Heuristic for a Master Air Attack Plan
(Columns are the eight solution capabilities; the “Rating” row gives the capability ratings for the MIP and the greedy heuristic.)
Problem Characteristic Rating 0, 4 4, 4 4, 4 4, 1 2, 2 0, 0 3, 4 4, 4
Operational tempo 2 0, 8 8, 8 4, 4 8, 8
Problem complexity 2 0, 8 8, 8 4, 4 0, 0 6, 8 8, 8
Reducibility 2 8, 8
Environmental clutter/noise 1 2, 2 0, 0 4, 4
Incompleteness of information 2 8, 8 4, 4 0, 0 6, 8
NOTE: The first value in each cell is for the MIP, and the second value in each cell is for
the heuristic.
the system’s different capabilities. The MIP was rated higher for opti-
mality, whereas the heuristic was rated higher for computational effi-
ciency and explainability. Given the problem characteristics embod-
ied in MAAP, the latter two capabilities, computational efficiency and
explainability, are more important than optimality.
Finally, this method can be used to determine which solution
capabilities are most called for across a collection of problems or pro-
cesses. Chapter Two contains an analysis of problem characteristics for
ten games and C2 processes (Table 2.3). The results from that analysis
combined with the 36 critical problem-solution pairs identified by the
expert panel can be used to determine the relative importance of the
eight solution capabilities for each set of problems.
Figure 4.3 shows the importance of the eight solution capa-
bilities separately for games and C2 processes.5 Values are higher for
C2 processes—because they embody more problem characteristics,
they also call for more solution capabilities. Games of strategy and
Figure 4.3
Relative Importance of Solution Capabilities Across Ten Command and
Control Processes and Games, and Capabilities of Artificial Intelligence
Systems Analyzed
[Figure: relative importance of the solution capabilities (including computational efficiency, soundness, optimality, robustness, explainability, and assuredness) for games and C2 processes, overlaid with the capabilities of the AI systems analyzed.]
5 For each solution capability, we determined the problem characteristics that called for it.
We then summed across the ratings for those problem characteristics for a given game or C2
process. For example, computational efficiency is called for by problems with high opera-
tional tempo, high rate of environment change, and high complexity. The importance of
computational efficiency for MAAP equals 2 + 2 + 2, or 6 (Table 4.3). The values shown in
Figure 4.3 reflect the average importance of each capability taken across the ten games and
the ten C2 processes.
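A minimal Python sketch of that aggregation, using the computational efficiency example from the footnote, is shown below; extending it to other capabilities would require the full set of critical problem-solution pairs from Volume 2.

    # Relative importance of a solution capability for one process: sum the
    # process's ratings for the problem characteristics that call for that
    # capability (per the expert panel's critical pairs).

    calls_for = {   # mapping for computational efficiency, per the footnote's example
        "computational efficiency": [
            "operational tempo",
            "rate of environment change",
            "problem complexity",
        ],
    }

    maap_ratings = {    # MAAP problem characteristic ratings cited in the footnote
        "operational tempo": 2,
        "rate of environment change": 2,
        "problem complexity": 2,
    }

    importance = sum(maap_ratings[c] for c in calls_for["computational efficiency"])
    print("Importance of computational efficiency for MAAP:", importance)  # 2 + 2 + 2 = 6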
Summary
CHAPTER FIVE
Metrics for Evaluating Artificial Intelligence Solutions

1 DoD, Glossary of Defense Acquisition Acronyms and Terms, Fort Belvoir, Va.: Defense
Acquisition University, 2017b.
Table 5.1
Categories of Measure
Category | Definition
MoE | The data used to measure the military effect (mission accomplishment)
that comes from the use of the system in its expected environment.
That environment includes the system under test and all interrelated
systems, that is, the planned or expected environment in terms of
weapons, sensors, C2, and platforms, as appropriate, needed to
accomplish an end-to-end mission in combat.
Measures of Effectiveness
2 MoE cannot be derived from the C2 problem characteristics discussed in Chapter Two.
Those characteristics describe the mathematical nature of the problem, but they do not cap-
ture the associated military benefits—for example, improved outcomes.
(1) the inherent complexity of C2 systems and (2) the wide variety of
C2 missions.
The largest challenge in measuring C2 systems is the inherent
complexity of those systems.3 C2 systems involve many coordinated
processes, require human decisionmaking, and are subject to such
external factors as environmental conditions and adversary actions. To
isolate the effect of a single change to the C2 system while controlling
for all other variables is often not feasible. As the North Atlantic Treaty
Organization code of best practices for C2 assessment explains,
The second major challenge is that different missions call for dif-
ferent metrics. For example, traditional C2 metrics, such as mission
success and force exchange ratios, are not relevant for humanitarian
assistance/disaster relief operations, which may themselves require a
different set of metrics than peacekeeping operations. Furthermore,
changes to C2 processes may alter effectiveness differently in different
missions.
For these reasons, no single, standard set of MoE can be derived
for all C2 problems: MoE must be tailored for each mission. In light of
this, we do not provide a fixed list of MoE but rather a set of subcatego-
ries and questions that should be considered when devising them. Our
goal here is to identify groups broad enough to be applicable to most
C2 functions and to cover the areas in which we anticipate AI solutions
will be most appropriate. These different groups are listed in Table 5.2,
and we discuss each in more detail below.
Table 5.2
Measures of Effectiveness
Decision quality is perhaps the most direct measure of an effective
C2 process. U.S. Marine Corps doctrine holds that “a principal aim of
command and control is to enhance the commander’s ability to make
sound and timely decisions,”5 while joint doctrine notes that “the C2
function supports an efficient decision-making process.” 6 The relevant
question here is whether the choice made was the best one possible
given the information available. However, determining whether this
5 U.S. Marine Corps, Command and Control, Doctrinal Publication 6, Washington, D.C.,
2018.
6 Joint Publication 3-0, 2017.
7 David S. Alberts and Richard E. Hayes, Understanding Command and Control, Washing-
ton, D.C.: Command and Control Research Program, 2006.
8 Abbie Tingstad, Dahlia Anne Goldfeld, Lance Menthe, Robert A. Guffey, Zachary Hal-
deman, Krista S. Langeland, Amado Cordova, Elizabeth M. Waina, and Balys Gintautas,
Assessing the Value of Intelligence Collected by U.S. Air Force Airborne Intelligence, Surveillance,
and Reconnaissance Platforms, Santa Monica, Calif.: RAND Corporation, RR-2742-AF,
2021.
9 U.S. Air Force Doctrine, Annex 3-30: Command and Control, Maxwell Air Force Base,
Ala.: Lemay Center for Doctrine, 2020.
10 Mica R. Endsley, “Design and Evaluation for Situation Awareness Enhancement,” Pro-
ceedings of the Human Factors Society Annual Meeting, Vol. 32, No. 2, 1988.
11 There are many definitions of the dimensions of data quality. An often-cited paper is
Nicola Askham, Denise Cook, Martin Doyle, Helen Fereday, Mike Gibson, Ulrich Land-
beck, Rob Lee, Chris Maynard, Gary Palmer, and Julian Schwarzenbach, The Six Primary
Dimensions for Data Quality Assessment: Data Quality Dimensions, Bristol, U.K.: Data Man-
agement Association and Data Quality Dimensions Working Group, October 2013.
12 John R. Boyd, “Patterns of Conflict,” unpublished briefing slides, 1986.
13 Army Doctrine Publication 6-0, Mission Command: Command and Control of Army
Forces, Washington, D.C.: U.S. Department of the Army, 2019.
Here we refer to how efficiently resources are employed and what trade-
offs must be made to obtain them, including opportunity costs. These
measures are of particular importance for C2 of logistics processes.
Measures of Performance
18 Kiri L. Wagstaff, “Machine Learning that Matters,” Proceedings of the 29th International
Conference on Machine Learning, Madison, Wisc.: Omnipress, 2012.
19 Note that categories associated with practicality—V&V and explainability—are missing
from this list because they are not truly benchmarks or properties of the algorithm itself.
V&V is an activity performed on the algorithm, and explainability is about human under-
standing of the process. We include these as MoS (see next section).
Table 5.3
Measures of Performance
Measures of Suitability
21 Public-Private Analytic Exchange Program, AI: Using Standards to Mitigate Risks, Wash-
ington, D.C.: U.S. Department of Homeland Security, 2018.
Table 5.4
Measures of Suitability
Group Definitions
can never really be sure of just what you will be getting until it
arrives.22
Summary
1. Too little focus on MoE and MoS. Our review of DARPA metrics
shows that the primary focus of AI evaluation tends to be on
performance accuracy and optimality. While this is certainly
important, it keeps the focus on the solution space. Strate-
Figure 5.1
Defense Advanced Research Projects Agency Metric Classifications by
Number (top) and by Percentage of Programs with Metric (bottom)
[Figure: DARPA program metrics classified as MoP, MoE, MoS, or other, shown by number of observations (top) and by percentage of programs (bottom); metric categories include soundness, optimality, computational efficiency, robustness, learning, data efficiency, situational awareness, mission success/failure, survivability/lethality, resource management, timeliness, human-machine teaming, explainability/credibility, scalability, interoperability, maintainability/sustainability, reliability, and cybersecurity.]
33 For example, see Lance Menthe, Dahlia Anne Goldfeld, Abbie Tingstad, Sherrill Lingel,
Edward Geist, Donald Brunk, Amanda Wicker, Sarah Soliman, Balys Gintautas, Anne
Stickells, and Amado Cordova, Technology Innovation and the Future of Air Force Intelligence
Analysis, Vol. 1, Findings and Recommendations, Santa Monica, Calif.: RAND Corporation,
RR-A341-1, 2021.
34 DIB, 2019.
CHAPTER SIX
Conclusion and Recommendations

Conclusion 1. Command and Control Processes Are Very Different From
Many of the Games and Environments Used to Develop and Demonstrate
Artificial Intelligence Systems
Games such as chess, Go, and even StarCraft II are qualitatively differ-
ent from most real-world tasks. These games have well-defined rules
(even if some of them are hidden from the player) that remain constant
over time. Game-playing algorithms exploit this regularity to achieve
superhuman performance. Unfortunately, nature and the adversary
intervene to break this simplifying assumption in military tasks.
Figure 6.1
Artificial Intelligence System Capability Mapping and Command and
Control Process Evaluation
[Figure: the command and control process, its measures of merit (measures of suitability, effectiveness, and performance), and the report's conclusions, including Conclusion 1 (C2 processes are very different from many of the games and environments used to develop and demonstrate AI systems) and Conclusion 4 (hybrid approaches are needed to deal with the multitude of problem characteristics present in C2 processes).]
References

Air Force Life Cycle Management Center, Battle Management Directorate, Descriptive
List of Applicable Publications (DLOAP) for the Air Operations Center (AOC), Hanscom
Air Force Base, Mass., April 1, 2019. Not available to the general public.
Alberts, David S., and Richard E. Hayes, Understanding Command and Control,
Washington, D.C.: Command and Control Research Program, 2006.
Anderson, J. R., Cognitive Psychology and Its Implications, New York: Macmillan,
2005.
Arbel, T., and F. P. Ferrie, “On the Sequential Accumulation of Evidence,”
International Journal of Computer Vision, Vol. 43, 2001, pp. 205–230.
Army Doctrine Publication 6-0, Mission Command: Command and Control of
Army Forces, Washington, D.C.: U.S. Department of the Army, 2019.
Askham, Nicola, Denise Cook, Martin Doyle, Helen Fereday, Mike Gibson, Ulrich
Landbeck, Rob Lee, Chris Maynard, Gary Palmer, and Julian Schwarzenbach,
The Six Primary Dimensions for Data Quality Assessment: Data Quality Dimensions,
Bristol, U.K.: Data Management Association and Data Quality Dimensions Working
Group, October 2013.
Boyd, John R., “Patterns of Conflict,” unpublished briefing slides, 1986.
Brown, N., and T. Sandholm, “Superhuman AI for Heads-Up No-Limit
Poker: Libratus Beats Top Professionals,” Science, Vol. 359, No. 6374, 2018,
pp. 418‒424.
Cordova, Amado, Lindsay D. Millard, Lance Menthe, Robert A. Guffey, and Carl
Rhodes, Motion Imagery Processing and Exploitation (MIPE), Santa Monica, Calif.:
RAND Corporation, RR-154-AF, 2013. As of December 15, 2020:
https://www.rand.org/pubs/research_reports/RR154.html
Dahm, W. J., Technology Horizons: A Vision for Air Force Science and Technology
During 2010‒2030, Arlington, Va.: U.S. Air Force, 2010.
DARPA—See Defense Advanced Research Projects Agency.
DoD, Glossary of Defense Acquisition Acronyms and Terms, Fort Belvoir, Va.:
Defense Acquisition University, 2017b.
———, Artificial Intelligence Strategy, Washington, D.C., 2018.
———, “Secretary of Defense Speech: Reagan National Defense Forum Keynote,”
Defense.gov, December 7, 2019. As of December 22, 2020:
https://www.defense.gov/Newsroom/Speeches/Speech/Article/2035046/reagan
-national-defense-forum-keynote-remarks/
U.S. Marine Corps, Command and Control, Doctrinal Publication 6, Washington,
D.C., 2018.
Vinyals, Oriol, Igor Babuschkin, Wojciech M. Czarnecki, et al., “Grandmaster
Level in StarCraft II Using Multi-Agent Reinforcement Learning,” Nature, Vol. 575,
No. 7782, 2019, pp. 350‒354.
Wagstaff, Kiri L., “Machine Learning that Matters,” Proceedings of the 29th International
Conference on Machine Learning, Madison, Wisc.: Omnipress, 2012, pp. 529‒536.
Walch, Kathleen, “Are We Heading for Another AI Winter Soon?,” Forbes,
October 20, 2019. As of December 22, 2020:
https://www.forbes.com/sites/cognitiveworld/2019/10/20/are-we-heading-for
-another-ai-winter-soon/?sh=7fbdfba156d6
Winkler, Robert, The Evolution of the Joint ATO Cycle, Norfolk, Va.: Joint Advanced
Warfighting School, 2006.
Wong, Yuna Huh, John M. Yurchak, Robert W. Button, Aaron Frank, Burgess
Laird, Osonde A. Osoba, Randall Steeb, Benjamin N. Harris, and Sebastian Joon
Bae, Deterrence in the Age of Thinking Machines, Santa Monica, Calif.: RAND
Corporation, RR-2797-RC, 2020. As of December 22, 2020:
https://www.rand.org/pubs/research_reports/RR2797.html
Zacharias, Greg, Autonomous Horizons: The Way Forward, Maxwell Air Force Base,
Ala.: Air University Press, Curtis E. LeMay Center for Doctrine Development and
Education, 2019a.
———, Emerging Technologies: Test and Evaluation Implications, Washington, D.C.:
U.S. Department of Defense, April 10, 2019b.
PROJECT AIR FORCE
This report concerns the potential for artificial intelligence (AI)
systems to assist in Air Force command and control (C2) from a
technical perspective. The authors present an analytical framework
for assessing the suitability of a given AI system for a given C2
problem. The purpose of the framework is to identify AI systems
that address the distinct needs of different C2 problems and to identify the
technical gaps that remain. Although the authors focus on C2, the analytical
framework applies to other warfighting functions and services as well.
RR-A263-1