AI in Oncology: Progress, Pitfalls, and the Path Forward
In the quiet hum of a radiology suite, a physician scrolls through a mammogram, eyes scanning for subtle distortions that might signal the earliest traces of breast cancer. Nearby, a deep learning algorithm processes the same image—faster, tireless, and unblinking. This is not science fiction. It is the evolving reality of clinical oncology, where artificial intelligence (AI) is no longer a speculative tool but an active participant in patient care.
At the forefront of this transformation is Professor Song Er-wei, a leading oncologist at Sun Yat-sen Memorial Hospital, Sun Yat-sen University, whose recent commentary in the Chinese Journal of Practical Surgery offers a clear-eyed assessment of AI’s role in cancer medicine. Co-authored with Shang Tong-rui and Chen Kai, the article dissects both the tangible progress and the unresolved challenges that define the current landscape of AI in clinical oncology. Their analysis, grounded in real-world applications and systemic constraints, provides a critical framework for understanding how AI is reshaping—yet still falling short of—its full potential in cancer care.
The global momentum behind AI in healthcare is undeniable. Governments from Beijing to Washington have elevated AI to a strategic priority, pouring funding into research and infrastructure. In oncology, this has translated into a surge of AI-driven tools aimed at improving early detection, refining treatment selection, and predicting patient outcomes. Yet, as Song and his colleagues emphasize, the journey from algorithm to clinic is fraught with technical, ethical, and regulatory complexities that demand more than just computational power.
One of the most significant contributions of the paper is its conceptual distinction between two classes of AI systems in medicine. The authors define Level I AI as systems that perform at or near the level of human clinicians—capable of matching expert accuracy but not surpassing it. These are the workhorses of today’s clinical AI: tools like Transpara, an FDA-cleared system that assists radiologists in detecting breast cancer on mammograms, or Veolity, a platform that automatically identifies and tracks lung nodules on CT scans. These systems do not replace physicians but augment their workflow, reducing cognitive load and the risk of missed findings in high-volume screening settings.
Level I AI has achieved commercial and regulatory success precisely because it aligns with existing clinical paradigms. It operates within the boundaries of human capability, serving as a reliable second reader rather than an autonomous decision-maker. Its value is largely engineering-driven: efficiency, consistency, and scalability. For hospitals overwhelmed by imaging volumes, such tools offer a pragmatic solution to diagnostic bottlenecks.
But the authors argue that the true frontier lies in Level II AI—systems that transcend human performance, uncovering patterns invisible to even the most experienced specialists. These are not mere assistants; they are explorers of the unseen. One example cited in the paper is an AI model developed by Saltz et al. that analyzes whole-slide images of tumor tissue to map the spatial distribution of tumor-infiltrating lymphocytes (TILs). This spatial architecture, the model revealed, carries prognostic significance across 13 different cancer types—a discovery that would have been nearly impossible for pathologists to detect manually due to the sheer complexity and scale of the data.
Another breakthrough involves using deep learning to predict genetic mutations from routine imaging. A model trained on 18F-FDG PET/CT scans can infer EGFR mutation status in non-small cell lung cancer patients with high accuracy. This means that a simple imaging test, widely available in clinical settings, could potentially obviate the need for invasive biopsies in certain cases. Such capabilities represent a paradigm shift—from reactive diagnosis to predictive insight, from observation to inference.
Yet, despite their scientific promise, Level II AI systems remain largely confined to research labs. None have received regulatory approval for clinical use, and investment in this space remains cautious. The reasons are multifaceted. First, the technical bar is significantly higher. Training such models requires vast, high-quality datasets, sophisticated algorithms, and immense computational resources. Second, the clinical validation process is more complex. Unlike Level I systems, which can be benchmarked against human experts, Level II AI often produces insights that have no human equivalent, making traditional evaluation methods inadequate.
Moreover, the clinical integration of Level II AI raises profound epistemological questions. If an AI detects a biomarker or predicts a treatment response based on patterns no human can interpret, how should clinicians act on that information? Trust becomes a central issue. Physicians are trained to understand mechanisms, to follow logical chains of evidence. An AI that operates as a “black box”—accurate but inscrutable—challenges this foundation.
This tension is exemplified by the rise and fall of IBM Watson for Oncology (WFO). Initially hailed as a revolutionary decision-support tool, WFO promised to deliver personalized, evidence-based treatment recommendations by analyzing vast troves of medical literature and patient data. Early studies showed high concordance with multidisciplinary tumor boards, with one report from India showing 93% agreement in breast cancer cases.
But enthusiasm waned as real-world performance revealed critical flaws. In documented cases, WFO recommended contraindicated therapies—such as an anticoagulant for a patient experiencing severe bleeding. These errors, likely stemming from training-data biases or algorithmic limitations, eroded physician confidence. The high-profile termination of IBM’s collaboration with MD Anderson Cancer Center further underscored the difficulty of translating AI from concept to clinical reality.
The WFO case underscores a crucial lesson: technical capability alone is insufficient. For AI to succeed in oncology, it must be trustworthy, transparent, and clinically relevant. This requires more than just better algorithms—it demands a robust ecosystem of data governance, regulatory oversight, and ethical frameworks.
Song and his team highlight two foundational challenges that must be addressed: data and privacy, and legal and regulatory clarity.
Medical AI is fundamentally data-hungry. Training a reliable model requires not just large volumes of data, but high-quality, well-annotated, and diverse datasets. However, most healthcare institutions generate raw, unstructured data that is difficult to standardize and share. Data silos, incompatible electronic health record systems, and a lack of dedicated infrastructure for AI development further compound the problem.
Even when data is available, sharing it across institutions raises serious privacy concerns. Patient health information is among the most sensitive categories of personal data. The 2021 passage of China’s Data Security Law marked a significant step toward protecting individual rights, but the law lacks specific provisions for medical AI, leaving many questions unanswered. Who owns patient data used to train AI models? How should consent be obtained? What happens if a data breach occurs?
To address these issues, the authors point to emerging technologies that enable secure, privacy-preserving collaboration. Federated learning, for instance, allows multiple hospitals to jointly train an AI model without sharing raw patient data. Each institution trains the model locally, and only the model updates—not the data—are shared. This approach minimizes privacy risks while maximizing data diversity, leading to more generalizable models.
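To make the mechanics concrete, here is a minimal federated-averaging sketch in Python. It is illustrative only: the toy linear model, the `local_step` helper, and the simulated hospital datasets are assumptions for demonstration, not the systems discussed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_step(weights, X, y, lr=0.1):
    """One gradient step on a hospital's private data (mean-squared-error
    loss). Only the updated weights ever leave the institution."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

# Three hospitals, each holding private data it never shares.
true_w = np.array([1.5, -2.0])
hospitals = []
for _ in range(3):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    hospitals.append((X, y))

weights = np.zeros(2)  # shared global model
for _ in range(50):    # federated rounds
    # Each site trains locally on the current global weights...
    updates = [local_step(weights, X, y) for X, y in hospitals]
    # ...and a coordinator averages only the returned weights.
    weights = np.mean(updates, axis=0)

print(weights)  # approaches true_w without pooling any patient data
```

In practice the average is usually weighted by each site's sample size, and even shared model updates can leak information, so production systems add safeguards such as secure aggregation or differential privacy; the sketch shows only the core data-stays-local pattern.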
Another promising avenue is Swarm Learning, a decentralized machine learning framework that combines blockchain technology with edge computing. Proposed by Joachim Schultze and colleagues, this method enables global collaboration on medical data without centralizing sensitive information. Each node in the network contributes to model training while maintaining full control over its local data. Such innovations could pave the way for truly global AI research while upholding the highest standards of data security.
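In the same spirit, the sketch below illustrates the decentralized merging idea behind Swarm Learning, with all names and data again hypothetical. The published framework's blockchain layer, which elects a temporary leader and verifies contributions, cannot be captured in a few lines, so it is reduced to a comment; the contrast with federated learning is that no fixed central coordinator performs the merge.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_step(weights, X, y, lr=0.1):
    # One gradient step on a node's private data (mean-squared-error loss).
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

# Four peer nodes, each with private data and its own model copy.
true_w = np.array([1.5, -2.0])
nodes = []
for _ in range(4):
    X = rng.normal(size=(80, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=80)
    nodes.append({"data": (X, y), "weights": np.zeros(2)})

for _ in range(50):
    # 1. Every node trains on its own data; the data never move.
    for node in nodes:
        X, y = node["data"]
        node["weights"] = local_step(node["weights"], X, y)
    # 2. In real Swarm Learning, a blockchain smart contract would now
    #    elect a temporary leader and verify contributions. Here, each
    #    peer simply averages the parameters its peers broadcast.
    merged = np.mean([node["weights"] for node in nodes], axis=0)
    for node in nodes:
        node["weights"] = merged

print(nodes[0]["weights"])  # all peers converge on one shared model
```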
However, technology alone cannot resolve the legal ambiguities surrounding AI in medicine. When an AI-assisted diagnosis leads to patient harm, who is liable? The physician who relied on the recommendation? The hospital that deployed the system? The software developer who designed the algorithm?
Current regulatory frameworks are still playing catch-up. In China, the National Medical Products Administration (NMPA) classifies AI diagnostic software by risk level. Software that only provides auxiliary diagnostic suggestions is regulated as a Class II medical device, while systems that can automatically detect lesions and issue definitive diagnostic outputs are classified as Class III and subject to stricter oversight. This approach parallels the risk-based classification system of the U.S. Food and Drug Administration (FDA).
Yet, these frameworks assume a clear division of responsibility: the AI supports, the human decides. But as AI systems become more autonomous, this boundary blurs. The question is no longer whether AI can assist, but whether it should decide—and if so, under what conditions.
The World Health Organization’s 2021 guidelines on AI in health emphasize transparency and explainability as core ethical principles. Patients and providers have a right to understand how an AI system arrives at its conclusions. But here lies a fundamental trade-off: the most accurate AI models—particularly deep neural networks—are often the least interpretable. Simplifying a model to make it more explainable may reduce its performance, creating a dilemma between accuracy and accountability.
This is not merely a technical issue but a societal one. Public trust in AI depends on perceived fairness, reliability, and oversight. If clinicians cannot explain an AI’s recommendation, or if patients feel their data is being used without consent, adoption will stall. The authors argue that policymakers must step in to establish clear guidelines on data use, algorithmic transparency, and liability allocation.
They also call for greater national support for Level II AI research. Given the high risk and long development timelines, private investors may be reluctant to fund such projects. Public funding agencies, such as the National Natural Science Foundation of China—which supported this work—have a critical role to play in de-risking innovation and fostering high-impact research.
The path forward, then, is not solely technological. It is institutional, ethical, and collaborative. Success will require alignment between clinicians, data scientists, regulators, and patients. It will demand investment in data infrastructure, not just AI algorithms. And it will necessitate a cultural shift in how medicine views uncertainty, automation, and shared decision-making.
Looking ahead, the potential of AI in oncology remains vast. Imagine a future where every tumor biopsy is analyzed not just by a pathologist, but by an AI that detects subtle molecular signatures predictive of drug response. Where routine imaging scans are mined for hidden biomarkers, enabling early intervention before symptoms arise. Where global data networks continuously refine treatment protocols in real time, adapting to new evidence as it emerges.
This vision is not distant. It is being built today, one algorithm, one dataset, one policy at a time. But as Song Er-wei and his colleagues remind us, the most powerful technology is not the one that works in isolation, but the one that integrates seamlessly—and ethically—into the fabric of clinical care.
The challenge is not to build smarter machines, but to create a smarter, more resilient healthcare system—one that leverages AI not as a replacement for human judgment, but as a partner in the relentless pursuit of better outcomes for cancer patients.
The journey has just begun.
Reference: Song Er-wei, Shang Tong-rui, Chen Kai (Sun Yat-sen Memorial Hospital). Chinese Journal of Practical Surgery, 2021. DOI: 10.19538/j.cjps.issn1005-2208.2021.11.02