AI in Oncology: Where It Helps, Where It Falls Short, and What Comes Next

AI in oncology is reshaping imaging, pathology, radiotherapy, trial matching, and workflow efficiency. Learn where it helps, where it falls short, and what comes next.

Artificial intelligence (AI) has moved from conference hype to real oncology workflows, but the clinical reality is more nuanced than either evangelists or skeptics often suggest. In practice, AI in oncology is not a single technology or use case. It includes classical machine learning models trained on structured clinical data, deep learning systems trained on radiology and pathology images, natural language processing tools that parse notes and trial criteria, and newer generative models that summarize records or answer clinical questions. The National Cancer Institute (NCI) now describes AI applications across cancer biology, screening, diagnosis, drug discovery, surveillance, and health care delivery, while the Food and Drug Administration (FDA) maintains a growing list of AI-enabled medical devices authorized for marketing in the United States.

For physicians, that distinction matters. A model that flags suspicious regions on a prostate biopsy slide is not the same as a large language model summarizing a longitudinal oncology chart, and neither of those is the same as a multimodal system that tries to recommend therapy based on pathology, genomics, imaging, and prior treatment history. Lumping them together under the phrase “AI in oncology” creates confusion about evidence, safety, and readiness. The strongest current use cases tend to be narrow, well-defined, and supervised by clinicians. The weakest use cases tend to be open-ended, poorly governed, or deployed without clear validation in the population where they will be used. That pattern is consistent with the current FDA framework, oncology governance recommendations, and clinical implementation reviews.

That is why the right question for oncologists is not whether AI will matter. It already does. The better question is where AI in oncology is genuinely helping physicians today, where it still falls short, and what kind of evidence should be required before a tool becomes part of routine care. The answer is more practical than futuristic: AI works best when it reduces friction in bounded tasks, improves consistency in repetitive work, or surfaces clinically relevant information faster. It works worst when it is treated as a substitute for judgment in complex, high-stakes decisions.

AI in oncology is already in the clinic

The clearest sign that AI in oncology has entered clinical reality is regulatory and workflow presence, not speculative headlines. The FDA’s AI-enabled medical device list is intended to identify authorized AI-enabled devices in the United States, and the agency emphasizes that the list is not comprehensive. It also notes that future updates will explore ways to identify medical devices incorporating foundation models, including large language models and multimodal architectures. In parallel, NCI describes active AI use cases in cancer screening, detection, diagnosis, drug discovery, surveillance, and delivery of care.

Taken together, the current landscape suggests that the most mature tools in oncology are assistive rather than autonomous. They typically support image interpretation, highlight patterns that deserve human attention, automate parts of contouring or data extraction, or help clinicians navigate large information spaces such as trial eligibility criteria. That is very different from the popular image of AI as a replacement oncologist. Even among regulated tools, the FDA’s approach centers on lifecycle management, transparency, performance evaluation, and post-market monitoring, which underscores that these systems are products to be governed, not magic to be trusted.

This is also where physician expectations need calibration. In day-to-day practice, AI is most likely to deliver value when the task has a clear denominator and a measurable endpoint: detect a lesion, segment a structure, extract a biomarker, rank trial matches, summarize a chart, or prioritize a worklist. Once the task shifts toward open-ended reasoning, nuanced counseling, balancing competing patient values, or integrating rapidly changing evidence without supervision, the risk profile changes substantially. That distinction is central to how clinicians should evaluate any new AI tool that enters their workflow.

Where AI in oncology helps today

Imaging and screening

Radiology remains one of the most visible and mature domains for AI in oncology. NCI notes that AI is helping improve the speed, accuracy, and reliability of some screening and detection methods. It specifically highlights evidence that AI imaging algorithms can improve breast cancer detection on mammography and may also help predict long-term risk of invasive breast cancer. In other words, imaging AI is not only about finding what is already visible. In some settings, it is also being used to estimate risk from patterns that may be difficult for the human eye to quantify consistently.

The breast screening literature is one of the better examples of this transition from technical promise to clinical testing. Recent Lancet and Lancet Digital Health reports from the MASAI program have suggested that AI-supported mammography screening can increase cancer detection, reduce reading workload, and support earlier detection of clinically relevant breast cancer without an obvious penalty in false positives in the studied workflow. That does not mean every mammography practice should deploy the same product in the same way. It does mean that, among current oncology applications, screening mammography offers some of the strongest real-world evidence that carefully implemented AI can support physician performance in a high-volume task.

For MDs, the practical lesson is that imaging AI is most useful when it functions as a second set of eyes, a triage layer, or a workload redistribution tool. It is far less convincing when marketed as a stand-alone diagnostic authority. In oncology practice, sensitivity, specificity, interval cancer rates, false-positive burden, and downstream workflow effects matter more than an isolated area-under-the-curve number in a curated dataset. Radiology AI can help, but only when its operating characteristics are understood in the actual screening or diagnostic environment where it is being used.
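The point about operating characteristics can be made concrete with a small calculation. The sketch below applies Bayes' rule to show how the same sensitivity and specificity translate into very different positive predictive values once prevalence changes. The operating point and both prevalence figures are hypothetical values chosen for illustration; they are not drawn from any study or product.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical operating point (assumption, not a reported result).
SENSITIVITY = 0.90
SPECIFICITY = 0.95

# Hypothetical prevalences: a low-prevalence screening population versus
# an enriched, curated evaluation set.
SETTINGS = [("screening population", 0.005), ("enriched test set", 0.30)]

for label, prevalence in SETTINGS:
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{label}: prevalence={prevalence:.1%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Under these assumed numbers, the positive predictive value falls from roughly 89% in the enriched set to roughly 8% in the screening population, even though the model itself has not changed. That is one reason a curated-dataset metric says little about false-positive burden in a real screening program.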

Pathology and digital slide review

Pathology is another area where AI in oncology has moved beyond theory. NCI notes that the FDA has authorized AI-based software to help pathologists identify areas of prostate biopsy images that may contain cancer. The key word is “help.” These tools are assistive systems meant to augment review of digital slide images, not to eliminate pathologist oversight. That framing is important because pathology workflows involve not only detection, but also grading, context, sampling quality, clinicopathologic correlation, and communication of uncertainty.

The broader digital pathology literature is encouraging, particularly for classification, grading, outcome prediction, and even inference of molecular features from histology. At the same time, recent reviews emphasize that real-world deployment is limited by interpretability, clinical integration, and heterogeneity in tissue processing, staining, scanning, and local workflows. In practical terms, a model trained on beautifully curated slides from one system may perform differently on slides from another scanner, another lab, or another institution’s preanalytic pipeline. That is one reason why local validation remains essential before adoption.

For oncologists, AI-assisted pathology may become especially important at the interface between morphology and treatment selection. The next wave is not simply “find cancer on a slide.” It is using pathology-derived features to refine prognosis, predict benefit from therapy, or integrate tissue phenotypes with genomics and clinical data. That is promising, but the closer a pathology model gets to influencing treatment decisions, the higher the bar for external validation, calibration, subgroup analysis, and governance.

Radiation oncology and treatment planning

Radiation oncology offers one of the clearest examples of where AI in oncology can save physician time without pretending to replace expertise. Auto-contouring and segmentation tools can reduce the labor involved in delineating organs at risk and target volumes, a process that is time-consuming and subject to interobserver variability. A 2024 Radiation Oncology study comparing multiple commercial systems framed AI as a way to facilitate and standardize this work. Guidance from the Royal College of Radiologists likewise states that auto-contouring systems have the potential to improve quality and consistency while reducing pathway time.

But radiation oncology also illustrates why enthusiasm needs guardrails. The same guidance stresses that the clinician approving auto-contours is ultimately responsible for their clinical use, that reviewing AI-generated contours is a distinct skill from manual contouring, and that commissioning should use local data representative of the real-world cohort. In other words, speed is not the endpoint. Safe speed is. If the contours are wrong in a dosimetrically important region, the tool may create a new form of risk while appearing efficient on paper.
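One way a department might quantify agreement during commissioning is an overlap score between an auto-contour and a clinician-drawn contour on local cases. The sketch below computes the Dice similarity coefficient on tiny binary masks. The choice of Dice, the toy masks, and the `REVIEW_THRESHOLD` value are assumptions made for illustration; the guidance cited above does not prescribe a specific metric or cutoff.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two equally shaped binary masks (2D lists of 0/1)."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    overlap = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    # Two empty masks are treated as perfect agreement by convention.
    return 1.0 if total == 0 else 2 * overlap / total


# Toy 4x4 slice: 1 marks a voxel inside the contour (hypothetical data).
clinician_contour = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
auto_contour = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

REVIEW_THRESHOLD = 0.85  # hypothetical local cutoff, not a published standard

score = dice_coefficient(clinician_contour, auto_contour)
needs_closer_review = score < REVIEW_THRESHOLD
print(f"Dice={score:.3f}  needs_closer_review={needs_closer_review}")
```

An overlap score of this kind is volume-weighted, so a contour can score well while still being wrong in a small, dosimetrically important region. It is a screening aid for commissioning and audit, not a substitute for the clinician's review.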

This is a useful model for how physicians should think about AI more broadly. In oncology, a time-saving tool has value only if it preserves or improves safety, consistency, and downstream decision quality. Radiation planning tools can be high-yield because the human review step is obvious and the output is concrete. That is a much more favorable setup than open-ended language generation, where errors can sound fluent and still be clinically wrong.

Clinical trial matching and chart abstraction

One of the least glamorous but most useful forms of AI in oncology may be information retrieval. Trial matching is a good example. NIH researchers reported that their TrialGPT framework could identify relevant clinical trials, explain how a patient met eligibility criteria, and produce a ranked list of matches for clinician discussion. In comparative testing, TrialGPT reached nearly the same level of accuracy as clinicians on patient-criterion assessment and allowed clinicians to spend 40% less time screening patients in a pilot user study.

That matters because clinical trial access in oncology is frequently limited not by the absence of studies, but by the friction of finding the right study for the right patient at the right time. Manual matching across unstructured eligibility criteria is slow, expensive, and uneven. A well-governed AI layer that accelerates pre-screening without obscuring exclusion logic could improve both efficiency and access, especially in precision oncology where biomarker-defined studies are proliferating. NIH’s ongoing work to further assess fairness and real-world performance is exactly the kind of next step the field needs.

Chart abstraction is similarly promising, but less mature. The CORAL work from UCSF, published in NEJM AI, showed that large language models can extract parts of oncology history from notes, with GPT-4 performing best among tested models. Yet the same work found substantial room for improvement before these systems can be relied on for high-stakes extraction of disease course and treatment history. That is a meaningful takeaway for physicians: language models may be useful for first-pass summarization and research support, but they are not ready to be treated as infallible chart abstractors in oncology.

Documentation, summarization, and knowledge retrieval

In the near term, some of the biggest returns from AI in oncology may come from reducing administrative friction rather than making final treatment decisions. A summary of the NCCN 2025 conference highlighted record summarization and ambient listening tools as current applications already improving efficiency and reducing clinician burden. That framing is important because it aligns AI with a real pain point in oncology practice: the time required to navigate fragmented records, duplicated documentation, and information overload.

The American Society of Clinical Oncology (ASCO) is moving in a similar direction on the knowledge retrieval side. Its AI in Oncology initiative and ASCO Guidelines Assistant reflect an emerging model in which AI is used as an interface to evidence-based guidance rather than as an unconstrained recommender. For MDs, that is a more believable and safer role for generative systems: surface relevant guidance faster, summarize what is already established, and support workflow efficiency, while leaving final interpretation and patient-specific application to the clinician.

Where AI in oncology falls short

Technical performance does not equal clinical value

The biggest conceptual error in evaluating AI in oncology is assuming that a strong technical metric automatically means better care. It does not. A model may classify images accurately in a retrospective dataset and still fail to improve patient outcomes, reduce time to treatment, decrease unnecessary testing, or work reliably across sites. Reviews of clinical translation in oncology make this explicit: despite rapid growth in publications, only a minority of AI and machine learning models have been properly validated, and even fewer have become regulated products used routinely in clinical practice.

That gap matters because oncology is not a benchmark competition. Physicians care about whether a model changes diagnosis, treatment planning, toxicity mitigation, trial access, survival, quality of life, equity, or at least workflow in a meaningful way. Many studies stop at discrimination metrics and never answer the operational questions that determine whether a tool is worth implementing. For that reason, the field should be wary of impressive retrospective performance without prospective evidence, external validation, and implementation data.

Data quality, drift, and representativeness remain major barriers

Cancer care data are heterogeneous, incomplete, and locally shaped by workflow. Imaging protocols differ. Pathology staining and scanning differ. Documentation practices differ. Patient populations differ. Treatment standards evolve. All of that creates dataset shift, which is one of the central reasons why AI performance can erode after deployment. Reviews in digital pathology and oncology governance repeatedly point to data heterogeneity, clinical integration, and fairness as major barriers to broader adoption.
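Dataset shift can be monitored with simple distribution checks on model inputs. The sketch below computes a population stability index (PSI) for one numeric feature, comparing the development cohort with a deployment cohort. PSI is only one of several drift statistics and is not named in the reviews cited here. The feature (patient age), the bin edges, and both samples are hypothetical.

```python
import math


def bin_fractions(values, edges):
    """Fraction of values in each bin; the final bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            below_upper = v < edges[i + 1] or (is_last and v <= edges[-1])
            if edges[i] <= v and below_upper:
                counts[i] += 1
                break
    # A small floor avoids log(0) when a bin is empty.
    return [max(c / len(values), 1e-6) for c in counts]


def population_stability_index(reference, current, edges):
    """PSI between two samples of one feature, using shared bin edges."""
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical patient ages at model development and after deployment.
development_ages = [52, 58, 61, 63, 65, 67, 70, 72, 75, 80]
deployment_ages = [55, 66, 69, 71, 74, 76, 78, 81, 84, 88]
AGE_BIN_EDGES = [50, 60, 70, 80, 90]

psi = population_stability_index(development_ages, deployment_ages, AGE_BIN_EDGES)
print(f"PSI for age: {psi:.3f}")
```

A value of zero means the two binned distributions match, and larger values mean the deployment population looks less like the development population. What counts as actionable drift is a local governance decision, and an input-level check like this complements outcome-based monitoring rather than replacing it.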

These are not abstract concerns. They go directly to whether a model trained in one health system can be trusted in another, whether performance is stable across demographic subgroups, and whether a tool might inadvertently worsen disparities by performing best where data were most plentiful and clean. ASCO’s responsible use principles explicitly elevate equity and fairness, and a 2024 national survey of oncologists found that although most respondents felt oncologists should protect patients from biased AI, only 27.9% felt confident in their ability to identify poorly representative models. That is a major warning sign for routine deployment.

Explainability, consent, and accountability are unsettled

Even when an AI tool performs well, physicians still have to decide how much explanation is required before it can ethically shape care. In the JAMA Network Open survey of U.S. oncologists, 84.8% reported that AI-based clinical decision models needed to be explainable by oncologists to be used in clinic, and 81.4% supported patient consent for AI use in treatment decisions. Most respondents also believed oncologists should protect patients from biased AI, yet fewer than half viewed medico-legal problems from AI use as physicians’ responsibility alone.

Those findings capture a core tension in AI in oncology. Physicians are expected to use increasingly complex models, explain their role to patients, recognize bias, manage deference, and retain responsibility for care, even when the technical details of the model may be opaque or the tool may have been procured by a health system rather than selected by the treating oncologist. That is why governance cannot be reduced to “the vendor said it works.” ASCO’s principles and WHO’s guidance both emphasize transparency, informed stakeholders, fairness, accountability, and broader ethical governance.

Generative AI can still hallucinate, oversimplify, and go stale

Generative AI is the most talked-about branch of AI in oncology, but it is also the branch that most clearly demonstrates why fluent language is not the same as trustworthy reasoning. In a JAMA Network Open study of large language models answering medical oncology examination questions, the best model reached 85.0% accuracy, yet 81.8% of the incorrect answers were judged likely to cause moderate to severe harm if acted on in practice. That is not a small caveat. It is the central clinical issue.
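The two percentages above can be combined into a single rate per answer. The sketch below does that arithmetic, assuming both figures describe the same model's answers to the same question set. The function name is illustrative, and the result says nothing about how often real clinical queries resemble examination questions.

```python
def harmful_answer_rate(accuracy, harmful_share_of_errors):
    """Share of all answers that are both incorrect and judged potentially harmful."""
    error_rate = 1 - accuracy
    return error_rate * harmful_share_of_errors


# Figures quoted in the paragraph above.
ACCURACY = 0.850
HARMFUL_SHARE_OF_ERRORS = 0.818

rate = harmful_answer_rate(ACCURACY, HARMFUL_SHARE_OF_ERRORS)
print(f"About {rate * 100:.1f} of every 100 answers were incorrect and potentially harmful")
```

Under that assumption, roughly 12 of every 100 answers would be both incorrect and judged likely to cause moderate to severe harm, which is a more useful number for risk discussions than the headline accuracy alone.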

The problem is not just hallucination in the popular sense. It is also overconfidence, missing nuance, outdated knowledge, unsupported extrapolation, and failure to reason correctly through complex oncology scenarios. The CORAL work showed that even strong models still need improvement before they can reliably extract and reason over the complexity of cancer progress notes. More recent precision oncology work has also highlighted a specific weakness of general-purpose language models: their reliance on broad background knowledge makes it hard for them to stay current with niche biomarker-driven indications and rapidly evolving FDA approvals unless they are connected to updated, domain-specific retrieval systems.

For physicians, that means generative AI should currently be treated as a draft generator, search accelerator, or workflow assistant, not as an unsupervised source of final oncology recommendations. If a chatbot is involved in chart summary, trial matching, literature synthesis, or prior authorization support, someone with oncology expertise still needs to verify the output before it affects patient care.

Regulatory status does not replace local governance

Another common misconception is that regulatory authorization settles the clinical question. It does not. FDA authorization tells physicians something important about intended use, review pathway, and public documentation, but it does not remove the need for local validation, training, monitoring, and workflow design. The FDA’s software-as-a-medical-device framework emphasizes lifecycle management, and the agency has issued guidance and principles around good machine learning practice, predetermined change control plans, transparency, and lifecycle management for AI-enabled device software functions.

Radiation oncology guidance makes the same point from the implementation side: commissioning should use local data, ongoing surveillance should continue after go-live, and the clinician approving the output remains responsible for clinical use. That logic applies well beyond radiotherapy. If an oncology practice cannot explain how a tool was validated, who monitors for drift, how overrides are handled, and what happens when the model and physician disagree, then the tool is not ready for routine care regardless of how polished the sales demo appears.

What comes next

Multimodal models will shape the next phase of AI in oncology

The future of AI in oncology is likely to be multimodal rather than single-input. A 2025 Nature Cancer review describes ten hallmarks of AI contributions to precision oncology across prevention and diagnosis, treatment optimization, clinical trial design and matching, biomarker development, and new drug discovery. That is important because the hardest oncology questions rarely live in one data type. They span imaging, histology, genomics, laboratory trends, prior therapies, performance status, comorbidity, and narrative context.

In practical terms, the next major gains may come from systems that combine modalities to improve risk stratification, response prediction, monitoring, and treatment personalization. But multimodal power also increases the burden of validation. The more data streams a model consumes, the more potential failure modes it inherits: missing data, discordant data quality, evolving biomarker standards, local documentation idiosyncrasies, and hidden confounding variables. Multimodal models may be more useful, but they are not simpler to govern.

Oncology-specific copilots will matter more than general chatbots

The likely winners in clinical practice are not generic chatbots with broad internet knowledge. They are oncology-specific copilots connected to curated, current, and locally relevant sources. ASCO’s use of AI to help clinicians access evidence-based guidelines points in that direction. So does recent precision oncology research showing that context-augmented language models may overcome some of the limitations of general models in keeping pace with rapidly changing approvals and molecularly defined treatment decisions.

This is a crucial distinction for physician readers. The most credible future for generative AI in oncology is not autonomous decision-making. It is retrieval-augmented support: summarizing records, organizing molecular findings, surfacing relevant guidelines, explaining trial eligibility, and accelerating literature review while remaining tethered to verifiable sources. A recent framework for responsible LLM-driven clinical decision support in precision oncology reflects this shift, emphasizing education, governance, and practical safeguards rather than raw model capability alone.

Governance and regulation will become more specific

Regulation is also moving forward. The FDA now explicitly states that it will explore methods to identify and tag devices that incorporate foundation models, from LLMs to multimodal architectures, in future updates to its AI-enabled device list. It has also issued draft lifecycle guidance for AI-enabled device software functions and continues research on performance evaluation, uncertainty quantification, evolving devices, and postmarket monitoring. Separately, in January 2026, EMA and FDA announced common principles for good AI practice in the medicines lifecycle, covering evidence generation and monitoring from early development through post-authorization phases.

That shift matters for oncologists because AI is no longer just a devices conversation. It is becoming part of the broader medicines ecosystem, including drug development, evidence generation, and safety monitoring. As regulators become more specific, oncology programs will need better internal governance as well: multidisciplinary review, clearer procurement standards, subgroup performance requirements, monitoring dashboards, and clinician education on intended use and failure modes.

The field needs better prospective evidence

The next few years should not be judged by how many AI abstracts appear at major meetings. They should be judged by whether the field produces stronger prospective evidence. That includes external validation across institutions, subgroup performance analyses, calibration reporting, workflow measures, override patterns, patient-centered outcomes, and post-deployment monitoring for drift. The NCI, NIH, FDA, and major oncology societies are all pointing, in different ways, toward a more rigorous and less promotional phase of evaluation.

How physicians should evaluate AI in oncology before adoption

Start with the task, not the brand

When evaluating AI in oncology, physicians should first define the exact task the tool is supposed to assist. Is it detecting a lesion, segmenting a target, ranking trials, summarizing a chart, or suggesting treatment options? Narrow tasks are easier to validate and govern. Vague promises like “improves cancer care” are not clinically useful. FDA and oncology governance frameworks both implicitly favor clarity of intended use because safety, evaluation design, and monitoring all depend on it.

Demand external validation in a population like yours

A retrospective study from a single academic center is not enough for most oncology use cases. Ask whether the model was externally validated, whether the population resembles your practice, whether performance was reported across subgroups, and whether the tool was tested in the actual workflow where it will be used. This is especially important in pathology, imaging, and precision oncology, where local variation can materially affect performance.

Verify what the tool is and is not cleared or authorized to do

If a product influences diagnosis, treatment planning, or clinical management, physicians should know whether it is FDA-authorized, under what pathway, and for what intended use. Authorization is not the whole answer, but it is a basic due-diligence step. If the tool is an internal model or a non-device workflow assistant, the practice still needs a governance process that addresses oversight, privacy, quality assurance, and escalation when outputs conflict with clinician judgment.

Look for fairness, calibration, and drift monitoring

Discrimination metrics alone are insufficient. Clinicians should ask how the model performs across age groups, racial and ethnic groups, sex, disease stages, imaging platforms, pathology workflows, or sites of care. They should also ask how calibration is monitored over time and what triggers re-evaluation or rollback. ASCO’s principles and contemporary governance reviews make clear that fairness and accountability are not optional add-ons. They are central requirements for responsible use.
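The questions above can be turned into a routine report. The sketch below groups scored cases by subgroup, compares mean predicted risk with the observed event rate as a coarse calibration check, and reports sensitivity at a fixed threshold. The records, the subgroup labels, the 0.5 decision threshold, and the 0.10 calibration tolerance are all hypothetical; a real program would set them through its own governance process and use far larger samples.

```python
from collections import defaultdict


def subgroup_report(records, threshold=0.5):
    """Per-subgroup size, mean predicted risk, observed rate, and sensitivity."""
    groups = defaultdict(list)
    for record in records:
        groups[record["group"]].append(record)
    report = {}
    for name, rows in groups.items():
        positives = [r for r in rows if r["outcome"] == 1]
        detected = sum(1 for r in positives if r["risk"] >= threshold)
        report[name] = {
            "n": len(rows),
            "mean_predicted": sum(r["risk"] for r in rows) / len(rows),
            "observed_rate": len(positives) / len(rows),
            # Sensitivity is undefined when a subgroup has no positive cases.
            "sensitivity": detected / len(positives) if positives else None,
        }
    return report


def flag_miscalibrated(report, tolerance=0.10):
    """Subgroups whose mean predicted risk differs from the observed rate by more than tolerance."""
    return [
        name
        for name, stats in report.items()
        if abs(stats["mean_predicted"] - stats["observed_rate"]) > tolerance
    ]


# Hypothetical scored cases: model risk estimate and confirmed outcome (1 = event).
records = [
    {"group": "site_A", "risk": 0.9, "outcome": 1},
    {"group": "site_A", "risk": 0.8, "outcome": 1},
    {"group": "site_A", "risk": 0.2, "outcome": 0},
    {"group": "site_A", "risk": 0.1, "outcome": 0},
    {"group": "site_B", "risk": 0.4, "outcome": 1},
    {"group": "site_B", "risk": 0.7, "outcome": 1},
    {"group": "site_B", "risk": 0.3, "outcome": 1},
    {"group": "site_B", "risk": 0.2, "outcome": 0},
]

report = subgroup_report(records)
for name, stats in report.items():
    print(name, stats)
print("Flag for re-evaluation:", flag_miscalibrated(report))
```

In this toy data the model looks well calibrated at one site and underestimates risk at the other, where it also misses most true cases. A pooled metric across both sites would hide that pattern, which is why subgroup-level reporting and a predefined trigger for re-evaluation belong in the monitoring plan.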

Define the human review step before deployment

If no one can clearly state who reviews the output, when they review it, and how disagreements are handled, the workflow is not ready. This is one reason radiation oncology has a relatively practical playbook for AI implementation: the human review step is explicit. The same principle should apply to note summarization, trial matching, pathology assistance, and decision support. Human oversight should be designed into the workflow, not added after a problem occurs.

Measure outcomes that matter to oncology practice

Finally, oncology programs should evaluate whether the tool changes outcomes that matter: time to diagnosis, contouring time, chart review time, trial enrollment, treatment delays, unnecessary workup, physician burden, patient understanding, or downstream errors. A tool that slightly improves one benchmark but increases editing burden or introduces hidden risk may not deserve a place in practice. The most useful AI in oncology will usually be the tool that removes real friction without creating new ambiguity.

The bottom line

AI in oncology is no longer hypothetical. It is already helping physicians in imaging, pathology, radiotherapy, trial matching, record summarization, and other bounded tasks. The strongest current use cases are assistive, supervised, and workflow-specific. The weakest are broad, unsupervised, and overly confident in settings where evidence changes quickly and patient stakes are high.

For MDs, the right stance is neither reflexive adoption nor reflexive dismissal. It is disciplined clinical skepticism. Use AI where it demonstrably improves consistency, efficiency, or information retrieval. Be cautious when the task depends on nuanced reasoning, evolving evidence, or complex value judgments. And insist on validation, transparency, governance, and human accountability before any tool touches treatment decisions. That is the path by which AI in oncology becomes genuinely useful to physicians, rather than just impressive in a demo.

References:

National Cancer Institute. AI and Cancer
U.S. Food and Drug Administration. Artificial Intelligence in Software as a Medical Device
U.S. Food and Drug Administration. Artificial Intelligence-Enabled Medical Devices
The Lancet Digital Health. Application of artificial intelligence and digital tools in cancer pathology
The Lancet Digital Health. External validation of a digital pathology-based multimodal artificial intelligence-derived prognostic model in patients with advanced prostate cancer starting long-term androgen deprivation therapy: a post-hoc ancillary biomarker study of four phase 3
Radiation Oncology. Artificial intelligence contouring in radiotherapy for organs-at-risk and lymph node areas
The Royal College of Radiologists. Auto-contouring in radiotherapy
arXiv. CORAL: Expert-Curated medical Oncology Reports to Advance Language Model Inference
Journal of the National Comprehensive Cancer Network. Artificial Intelligence in Cancer Care: Opportunities, Challenges, and Governance
American Society of Clinical Oncology. AI and Oncology
American Society of Clinical Oncology. Introducing ASCO Guidelines Assistant
Nature Reviews Urology. The state of the art in artificial intelligence and digital pathology in prostate cancer
JAMA Network Open. Perspectives of Oncologists on the Ethical Implications of Using Artificial Intelligence for Cancer Care
JAMA Network Open. Performance of Large Language Models on Medical Oncology Examination Questions
