AI-Driven Command Systems Face Complex Real-World Challenges
In the rapidly evolving landscape of modern warfare, artificial intelligence (AI) is increasingly hailed as the linchpin of next-generation military capabilities. At the forefront of this transformation lies the command information system—a digital nervous system that integrates intelligence, surveillance, reconnaissance, communications, and strike coordination across complex, multi-domain battlefields. While AI has demonstrated astonishing prowess in controlled environments like chess and Go, its application to real-world military decision-making remains fraught with formidable technical and operational obstacles.
The failure of "Deep Green," a DARPA program launched in the late 2000s for the U.S. Army, stands as a sobering reminder of these challenges. Conceived as an AI-augmented command and control system to accelerate battlefield decision-making, Deep Green aimed to provide commanders with predictive analytics, rapid scenario modeling, and intuitive visual interfaces. Despite its ambition, the program faltered under the weight of reality: the messy, incomplete, and highly dynamic nature of actual combat. This historical setback continues to inform current research efforts, particularly in nations seeking to close the strategic technology gap with leading military powers.
A recent paper published in Ordnance Industry Automation by Sun Danhua, Wang Chen, and Su Huanhuan from the Academy of Artillery & Air Defense in Zhengzhou, China, revisits these persistent hurdles through the lens of contemporary AI breakthroughs. Their analysis offers a timely and nuanced perspective on why AI-enabled command systems remain largely in the theoretical domain—and how emerging technologies might finally bridge that gap.
The Illusion of Perfect Information
One of the most fundamental distinctions between AI success stories like IBM’s Deep Blue or Google DeepMind’s AlphaGo and real-world military command lies in the nature of information itself. Chess and Go are games of perfect information: all pieces are visible, all rules are fixed, and all possible moves are computable within a finite, albeit vast, state space. In contrast, the battlefield is defined by uncertainty, deception, sensor noise, communication delays, and the fog of war.
As the authors emphasize, battlefield situational awareness—the foundational input for any command decision—is inherently subjective. At the tactical level, commanders rely on fragmented sensor feeds, intelligence reports, and human judgment to build a coherent picture of the enemy’s disposition and intent. This task becomes exponentially more complex at the operational and strategic levels, where assessments involve political, logistical, economic, and psychological dimensions that resist quantification.
Current AI systems, even those powered by deep learning, struggle to replicate this kind of contextual, adaptive cognition. While convolutional neural networks can identify tanks in satellite imagery or classify radar signatures with near-human accuracy, they lack the higher-order reasoning needed to infer an adversary’s operational plan from ambiguous signals. Without this capacity for strategic intuition—what military theorists often call “commander’s judgment”—AI remains a tool for data processing, not decision-making.
The Input-Output Bottleneck
Beyond situational understanding, the very mechanics of human-AI interaction in command environments pose another critical challenge. Deep Green attempted to solve this through sketch-based interfaces, allowing commanders to draw tactical plans directly onto digital maps. While innovative, this approach proved insufficient.
Military decisions encompass more than spatial coordinates and unit placements; they involve intent, morale, risk tolerance, and doctrinal nuance—elements that resist graphical representation. Moreover, standardization across echelons remains elusive. A battalion commander’s sketch may use symbols unfamiliar to a joint task force headquarters, creating interoperability gaps that AI cannot resolve without robust semantic frameworks.
The paper suggests that emerging multimodal human-machine interfaces—combining voice, gesture, eye tracking, and even neural input—could offer a more natural and expressive channel for command intent. Companies like Google and Microsoft have already commercialized advanced speech and image recognition systems that function reliably in noisy, real-world conditions. Extending these capabilities to military contexts could streamline the flow of directives from command centers to individual weapon platforms, especially as loitering munitions and autonomous drones become more prevalent.
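To make the idea concrete, here is a minimal sketch of how a multimodal front end might fuse a transcribed voice order with a map gesture into a structured, machine-checkable directive. Everything in it, from the intent schema to the confidence threshold, is a hypothetical illustration rather than anything specified in the paper:

```python
from dataclasses import dataclass

# Hypothetical sketch: fuse a transcribed voice order with a map gesture
# into a structured directive. Names and thresholds are illustrative.

@dataclass
class CommandIntent:
    action: str          # e.g. "suppress", "screen", "strike"
    unit: str            # addressed unit identifier
    target_grid: tuple   # (easting, northing) taken from the map gesture
    confidence: float    # combined recognition confidence

def fuse_modalities(transcript: str, asr_conf: float,
                    gesture_grid: tuple, gesture_conf: float) -> CommandIntent | None:
    """Combine speech and sketch channels; defer to a human below threshold."""
    verbs = {"suppress", "screen", "strike", "hold"}
    tokens = transcript.lower().split()
    action = next((t for t in tokens if t in verbs), None)
    unit = next((t for t in tokens if t.startswith("unit-")), "unknown")
    confidence = min(asr_conf, gesture_conf)  # weakest channel dominates
    if action is None or confidence < 0.9:
        return None  # route back to the commander for confirmation
    return CommandIntent(action, unit, gesture_grid, confidence)

intent = fuse_modalities("Unit-3 suppress the ridge", 0.95, (351200, 442800), 0.93)
print(intent)
```

Taking the minimum of the channel confidences is deliberately conservative: as the next paragraph notes, a misinterpreted order in combat is not a tolerable failure mode.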
However, the transition from commercial to military-grade interfaces demands extreme reliability, low latency, and resilience against electronic warfare. An AI that misinterprets a commander’s verbal order in a simulated environment is inconvenient; in combat, it could be catastrophic.
The Data Dilemma
Perhaps the most intractable barrier to AI adoption in command systems is the scarcity of high-quality training data. Deep learning models thrive on massive, labeled datasets—something readily available in domains like consumer tech, where billions of user interactions generate continuous feedback. In military operations, however, real combat data is both rare and highly classified.
As the authors note, China’s People’s Liberation Army has not engaged in large-scale, high-intensity conflicts since the late 20th century. While military exercises generate valuable data, they often lack the chaos, unpredictability, and stress of actual warfare. Units may follow scripted scenarios, avoid high-risk maneuvers, or withhold full capabilities for security reasons. The resulting data, though voluminous, may not reflect true combat behavior.
This data gap limits the efficacy of supervised learning—the dominant paradigm behind systems like AlphaGo, which trained on 160,000 human Go games before refining its strategy through self-play. In the absence of equivalent historical battles, military AI researchers must turn to alternatives.
One promising avenue is simulation-based training. High-fidelity wargaming platforms—akin to sophisticated versions of StarCraft II or Red Alert—can generate synthetic battle data at scale. DeepMind’s AlphaStar, which defeated top human players in StarCraft II, demonstrates that complex, real-time strategy environments can serve as viable testbeds for AI decision-making. The authors propose collaboration with commercial game developers to leverage their AI expertise and simulation infrastructure.
Another approach is deep reinforcement learning with self-play, where AI agents compete against themselves to discover novel tactics. While this method doesn't require historical data, it demands accurate physics and doctrine models to ensure that learned behaviors are militarily relevant. Misaligned simulations risk producing "clever" but tactically unsound strategies, a failure mode akin to reward hacking, in which an agent exploits quirks of its reward function or environment instead of learning behavior that would survive contact with reality.
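A toy sketch can make that reward-design risk concrete. The loop below compresses self-play training into a single agent repeatedly sampling actions in a bandit-style stand-in environment; the actions, reward terms, and penalty weight are all invented for illustration. Drop the loss penalty and the "learned" doctrine collapses into a reckless frontal advance, a miniature example of the trap:

```python
import random

# Toy training loop (illustrative only, not the paper's method). The
# reward mixes ground taken with a penalty for losses; removing the
# penalty makes "advance" dominate, a miniature reward-hacking failure.

ACTIONS = ["advance", "flank", "hold"]
GROUND = {"advance": 1.0, "flank": 0.7, "hold": 0.1}   # objective term
LOSSES = {"advance": 0.6, "flank": 0.2, "hold": 0.0}   # simulated attrition

def reward(action: str) -> float:
    return GROUND[action] - 2.0 * LOSSES[action]       # penalty keeps it sound

values = {a: 0.0 for a in ACTIONS}
for _ in range(5000):
    # noisy greedy selection gives the agent some exploration
    action = max(ACTIONS, key=lambda a: values[a] + random.gauss(0, 0.1))
    values[action] += 0.01 * (reward(action) - values[action])

print(max(values, key=values.get))  # "flank"; without the penalty, "advance"
```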
Finally, the authors highlight the need for expert-in-the-loop data labeling. Unlike tagging images of cats or cars, annotating battlefield decisions requires deep domain knowledge. A team of seasoned officers must review and validate each training example to ensure doctrinal correctness—a labor-intensive but indispensable step toward building trustworthy AI.
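As a miniature sketch of what such a pipeline might enforce, the hypothetical schema below admits a training example only after a quorum of independent officer reviews; the record layout and quorum rule are illustrative assumptions, not the authors' design:

```python
from dataclasses import dataclass, field

# Illustrative expert-in-the-loop gate (hypothetical schema): a training
# example enters the dataset only after independent officer reviews agree.

@dataclass
class LabeledDecision:
    situation_id: str
    proposed_label: str                           # e.g. "delaying action"
    reviews: list = field(default_factory=list)   # (officer_id, verdict) pairs

def accept(example: LabeledDecision, quorum: int = 3) -> bool:
    approvals = sum(1 for _, verdict in example.reviews if verdict == "approve")
    return approvals >= quorum    # doctrinal correctness needs consensus

ex = LabeledDecision("ex-0412", "delaying action",
                     [("maj_li", "approve"), ("col_wu", "approve"),
                      ("maj_zhao", "approve")])
print(accept(ex))   # True: three independent approvals meet the quorum
```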
Complexity Beyond Computation
Even with abundant data and intuitive interfaces, the sheer complexity of modern warfare dwarfs that of any board or video game. The paper cites estimates that the state space of real-world joint operations exceeds 10^1685—orders of magnitude larger than Go’s 10^170. This complexity stems from the interplay of land, sea, air, space, cyber, and electromagnetic domains, each governed by distinct physical laws and operational tempos.
Unlike AlphaGo, which optimized for a single objective (winning the game), military AI must balance competing priorities: mission success, force protection, political constraints, collateral damage avoidance, and resource conservation. This multi-objective optimization defies simple reward functions and necessitates hierarchical, modular architectures.
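To see why this is hard, consider the naive fix of scalarizing everything into one weighted reward, sketched below with invented weights. The hard constraint and the coefficients are assumptions for illustration; picking defensible values for them is exactly the open problem the authors describe:

```python
# Illustrative only: collapsing competing objectives into one scalar
# reward with weights and a hard constraint. The numbers are invented.

def mission_reward(objective: float, casualties: float,
                   collateral: float, ammo_used: float) -> float:
    if collateral > 0.05:          # political red line as a hard constraint
        return float("-inf")       # no trade-off can buy this back
    return (1.0 * objective        # mission success
            - 0.8 * casualties     # force protection
            - 0.3 * ammo_used)     # resource conservation

print(mission_reward(objective=0.9, casualties=0.2, collateral=0.01, ammo_used=0.4))
```

Every coefficient encodes a value judgment that doctrine, politics, and context can overturn, which is why a single static reward function rarely survives contact with a real campaign.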
Here, the authors advocate a “divide-and-conquer” strategy. Rather than attempting end-to-end learning across the entire command chain—as AlphaGo did from board state to move—they propose decomposing the system into functional modules, each suited to a specific AI paradigm.
For instance, intelligence fusion and task dissemination involve structured data, clear rules, and established protocols—ideal for knowledge-based AI systems that encode doctrine and procedures into ontologies or rule engines. In contrast, situation assessment and course-of-action generation rely heavily on experience, intuition, and adaptive reasoning—domains where deep learning and probabilistic modeling excel.
This hybrid approach mirrors how human commanders operate: applying doctrine when possible, but improvising when the situation demands. By aligning AI methods with cognitive tasks, militaries can build systems that augment, rather than replace, human judgment.
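A minimal sketch of such a decomposition might route each command-cycle task to the paradigm suited to it, as below. The module registry and task names are hypothetical; the paper argues for the principle, not a particular implementation:

```python
# Hypothetical decomposition sketch: dispatch each command-cycle task to
# the AI paradigm suited to it, rather than learning end to end.

RULE_BASED = {"intelligence_fusion", "task_dissemination"}   # doctrine-friendly
LEARNED    = {"situation_assessment", "coa_generation"}      # experience-driven

def dispatch(task: str, payload: dict) -> str:
    if task in RULE_BASED:
        return f"rule engine handles {task}: apply doctrine to {payload}"
    if task in LEARNED:
        return f"learned model handles {task}: infer from patterns in {payload}"
    raise ValueError(f"no module registered for {task}")

print(dispatch("task_dissemination", {"order": "screen left flank"}))
print(dispatch("coa_generation", {"enemy": "armor column, axis north"}))
```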
From Theory to Battlefield
Despite these promising directions, the authors stress that command system intelligence remains largely theoretical. While commercial AI races ahead in consumer applications, military adoption lags due to stringent safety, explainability, and ethical requirements. A facial recognition algorithm can afford occasional errors; a targeting AI cannot.
Moreover, adversarial dynamics in warfare introduce unique threats. Unlike commercial systems that operate in cooperative environments, military AI must contend with deliberate deception, spoofing, and AI-on-AI competition. An enemy might feed false sensor data to confuse an autonomous system or exploit its decision-making biases—risks absent in civilian contexts.
To accelerate progress, the paper urges closer collaboration between defense institutions and private AI leaders. Technologies matured in autonomous vehicles, natural language understanding, and predictive analytics could be adapted for military use with proper safeguards. However, such transfers must be guided by military-specific validation frameworks to ensure operational fidelity.
Ultimately, the goal is not full autonomy, but “human-on-the-loop” synergy: AI that processes vast data streams, identifies patterns, and proposes options—while commanders retain final authority. This balance preserves accountability, leverages human creativity, and mitigates the brittleness of purely algorithmic systems.
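In code, the essence of that arrangement is a hard authorization gate between proposal and execution, as in this illustrative sketch (the ranking scores and function names are invented):

```python
# Sketch of "human-on-the-loop" control (illustrative): the system ranks
# options and stages them; nothing executes without commander approval.

def propose_options(candidates: list[tuple[str, float]], top_k: int = 3):
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]

def execute(option: str, commander_approval: bool) -> str:
    if not commander_approval:
        return f"HELD: {option} awaits human authorization"
    return f"EXECUTING: {option}"

options = propose_options([("flank right", 0.71), ("hold and fix", 0.64),
                           ("deep strike", 0.58), ("withdraw", 0.22)])
best = options[0][0]
print(execute(best, commander_approval=False))  # AI proposes, human disposes
```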
As global powers race toward intelligentized warfare, the lessons from Deep Green remain pertinent. Success will not come from brute-force scaling of models or naive replication of commercial AI. It will emerge from a deep understanding of military cognition, rigorous data practices, modular system design, and—above all—humility in the face of war’s irreducible complexity.
Sun Danhua, Wang Chen, Su Huanhuan (Academy of Artillery & Air Defense, Zhengzhou 450000, China). Ordnance Industry Automation, Vol. 40, No. 8, August 2021. DOI: 10.7690/bgzdh.2021.08.002