Suggested Citation: "6 Multimodal Applications of Artificial Intelligence." National Academies of Sciences, Engineering, and Medicine. 2025. Gilbert W. Beebe Symposium: AI and ML Applications in Radiation Therapy, Medical Diagnostics, and Radiation Occupational Health and Safety. Washington, DC: The National Academies Press. doi: 10.17226/29200.

6
Multimodal Applications of Artificial Intelligence

Artificial intelligence (AI) is such a broadly useful tool that its potential in radiation oncology, medical diagnostics, and occupational and public safety can seem almost limitless—and daunting to consider. With so many potential uses, determining which are most likely to be beneficial in the near future and which are not quite ready for the real world can be difficult. This session’s speakers took on the question of which potential applications of AI and machine learning (ML) seemed most likely to be providing value to these three areas in the near term to mid-term, with a particular emphasis on multimodal AI models.

FINDING USE CASES FOR MULTIMODAL AI IN RADIOLOGY

Nur Yildirim, assistant professor in the School of Data Science at the University of Virginia, described the results of a survey in which she asked radiologists and clinicians about how AI-based applications could most help them in their jobs.

Yildirim’s team conducted comprehensive research to identify the most clinically relevant AI use cases for radiology, focusing specifically on vision language model applications. The study involved interviews with seven radiologists and clinicians, four brainstorming sessions, and follow-up interviews with five radiologists and eight clinicians to evaluate proposed use cases.

She stated that the typical radiology workflow begins when a referring clinician requests patient imaging. The radiologist’s team generates the images, and the radiologist examines them, provides findings and impressions, and then sends a report back to the referring clinician to inform patient care and treatment decisions.

Yildirim’s research showcases multiple key AI use cases for radiology practice. The first use case involves AI-generated draft reports from radiology images. Clinicians and radiologists viewed this as highly valuable, particularly for saving time on multi-slice images and examinations outside their specialty areas. One radiologist noted the challenge of reviewing unfamiliar imaging types: “I might be a seasoned reporter for lung or cardiac, but . . . we’ll get a neck computed tomography (CT) image . . . it is extremely difficult.”

For this use case to succeed, Yildirim noted that AI is expected to deliver near-perfect performance, with clinicians accepting only a 5–10 percent error rate for corrections. They prefer short, standardized reporting formats with bullet points rather than prose. She noted that the optimal approach would have AI handle the findings section while radiologists maintain control over impressions. When accessed by clinicians, it is helpful for reports to be clearly labeled as preliminary, with potential applications for triage to help clinicians escalate suspicious cases.

The second use case focuses on clinical decision-support tools. She stated that rather than generic chatbot interfaces, clinicians strongly prefer tool-based interactions due to time constraints. They want specific contextual information tied to individual cases rather than general medical knowledge. This involves integration of image data with electronic health records, patient-specific information rather than generic knowledge, and workflow-specific assistance with separate interfaces for different specialties.

The third use case addresses patient history summarization. Both clinicians and radiologists spend significant time reviewing patients’ past images and reports to understand baseline conditions, changes over time, and current status. AI-powered summarization of a patient’s historical radiology reports and images could provide brief highlights of key events and findings. One clinician noted that having a summary of preexisting conditions would save a lot of time.

Several critical implementation guidelines emerge from this research. The most useful applications may be task-specific rather than attempts to create one model for everything. Clinicians think in terms of specific tools for specific tasks, and aligning system functionality with their mental models helps build trust. AI outputs could be presented in forms that fit seamlessly into existing clinical workflows; if radiologists spend more time working with AI-generated reports than with current methods, adoption will fail. Chatbots are not optimal for clinical environments, as clinicians prefer direct tool-based interactions that provide immediate actionable information.

Given the high failure rate of healthcare AI applications, Yildirim outlined four key risk mitigation strategies. First is involving clinicians throughout the entire development process, not just at the beginning or end. Second is carefully considering when, where, and how AI outputs are presented to ensure they align with clinical workflows and decision-making processes. Third is designing for entire healthcare teams rather than individual clinicians, considering how different roles interact and collaborate in patient care. Fourth is beginning with lower-risk applications that do not directly impact clinical decision making or patient care, such as information retrieval and presentation tools, before progressing to more critical applications.

The research team is developing a hyperlinked image-report dataset at the University of Virginia, which includes radiology reports, images, and links between them. This high-quality dataset will enable exploration of downstream AI applications and further research into practical implementations of the identified use cases.

The research emphasizes that successful AI implementation in radiology involves deep understanding of clinical workflows, specific rather than general solutions, and careful attention to how technology integrates with existing practices. Yildirim explained that the focus could be on augmenting rather than replacing clinical expertise, with particular attention to time-saving applications that enhance rather than complicate current workflows.

AI AND MULTIMODAL MODELING FOR LUNG CANCER TREATMENT

Jia Wu, associate professor in the Department of Thoracic/Head and Neck Medical Oncology at the MD Anderson Cancer Center, described stratifying lung cancer patients for immune checkpoint inhibitor therapy using radiomics applications in combination with omics data. He presented two case studies focused on lung cancer and then spoke about future directions for the field.

Wu’s research focuses on advancing precision oncology through AI, addressing the fundamental challenge of delivering the right drug in the right dose to the right patient at the right time. Currently, while multiple treatment options exist for cancer patients, medical oncologists lack reliable methods to determine which patients would benefit most from specific therapies. His work demonstrates how AI can bridge this gap through sophisticated predictive modeling frameworks.

The challenge of treatment variability became apparent through two contradictory studies involving early-stage non-small-cell lung cancer patients. In one trial, patients treated with stereotactic ablative radiotherapy combined with immunotherapy showed improved outcomes compared to radiation therapy alone (Chang et al., 2023). However, a similar trial using identical settings showed no improvement with the combined therapy. This discrepancy raised the critical question of why similar strategies work in some populations but not others, suggesting that patient selection may be the key factor determining treatment success.

To address this variability, Wu’s team developed a comprehensive predictive modeling framework using patients’ imaging data and clinical information. The process begins with radiomics processing and qualification, where researchers meticulously analyze pretreatment scans, segment tumors and blood vessels, and extract quantitative features from the tumor, background parenchymal tissue, and vascular structures. After incorporating standard clinical information such as age and gender, the team feeds these data into separate predictive models for radiation therapy alone and combined immunotherapy treatment.

The framework includes a sophisticated analysis approach involving counterfactual analysis, where patient treatments are hypothetically switched to calculate individualized treatment effects. This enables ML-guided patient stratification, which is then compared to random stratification to determine clinical benefits. The model successfully identified subgroups that did not benefit from adding immunotherapy and patients who received only radiation therapy but might have benefited from combined treatment.
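The counterfactual step described above can be sketched as follows. Everything here is illustrative: the synthetic features stand in for the radiomic and clinical variables, and simple least-squares models stand in for the team's actual predictive models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative cohort: rows are patients, columns stand in for radiomic and
# clinical features; "treated" marks the combined-therapy arm.
n, p = 200, 5
X = rng.normal(size=(n, p))
treated = rng.integers(0, 2, size=n).astype(bool)
# Simulated outcome with a heterogeneous treatment effect driven by feature 0.
outcome = X @ rng.normal(size=p) + treated * (0.5 + X[:, 0]) \
          + rng.normal(scale=0.1, size=n)

def fit_linear(X, y):
    """Least-squares fit with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Separate outcome models for each arm, as in the framework described above.
coef_rt = fit_linear(X[~treated], outcome[~treated])    # radiation alone
coef_combo = fit_linear(X[treated], outcome[treated])   # RT + immunotherapy

# Counterfactual switch: predict BOTH arms for every patient; the difference
# is the individualized treatment effect.
ite = predict(coef_combo, X) - predict(coef_rt, X)

# Stratify: patients predicted to benefit from adding immunotherapy.
benefits = ite > 0
print(f"predicted to benefit: {benefits.mean():.0%}")
```

Comparing this ML-guided stratification against a random split of the cohort is then a matter of contrasting outcomes within the `benefits` subgroups.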

Understanding model interpretability became crucial for clinical adoption. Wu’s team collaborated extensively with clinicians to decode what the AI models were identifying, examining blood vessels, tumors, intensity patterns, and surrounding tumor areas. They employed Shapley value analysis to determine which individual features the AI model used for clinical stratification and how these features interconnected. Radiation oncologists were shown patient information and AI predictions to help them understand the model’s decision-making process.

The second major application involved addressing medical oncology challenges in late-stage lung cancer treatment. Standard treatment for patients without driver gene mutations typically begins with immunotherapy, followed by additional therapies such as chemotherapy. However, oncologists currently have no reliable indicators for selecting additional therapies or determining treatment duration. Wu’s team approached this through a multicenter collaboration with Mayo Clinic, Stand Up To Cancer, and Dana-Farber that examined patient data including genomic information accessible to any clinician, rather than next-generation sequencing data.

The research revealed important insights about model uncertainty and performance variability. When building and training five individual models, researchers found significant uncertainty associated with each model, and performance fluctuated when applied to different patient cohorts. Their solution involved quantifying uncertainty and incorporating it into predictions, which significantly improved model performance without requiring complex deep learning approaches. This highlighted the lesson that no model is perfect, and even small changes in optimizers or feature selectors can dramatically alter predictions.
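A minimal sketch of the uncertainty idea, assuming an ensemble of five models trained on bootstrap resamples (the actual models and the precise way Wu's team incorporated uncertainty are not specified here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy cohort and outcome.
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(scale=0.3, size=300)

def fit(X, y):
    """Least-squares fit with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Five models trained on bootstrap resamples, mimicking the model-to-model
# variability observed when the same task is fit several ways.
coefs = []
for _ in range(5):
    idx = rng.integers(0, len(X), size=len(X))
    coefs.append(fit(X[idx], y[idx]))

A = np.column_stack([np.ones(len(X)), X])
preds = np.stack([A @ c for c in coefs])    # shape (5, n_patients)

# Quantify uncertainty as the across-model spread, reported with the mean.
pred_mean = preds.mean(axis=0)
pred_std = preds.std(axis=0)                # per-patient uncertainty

# Flag patients whose prediction is unstable across models.
uncertain = pred_std > np.percentile(pred_std, 90)
print(f"flagged as high-uncertainty: {uncertain.sum()} of {len(X)} patients")
```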

Looking toward the future, Wu emphasized the need for integrated multimodal approaches to cancer treatment. The challenge lies in the vast amount of scattered cancer patient data across different systems, including radiology scans, digital pathology slides, molecular data, omics data, and clinical notes. The primary obstacle is that many of these data, particularly various images, lack structure, making meaningful feature extraction difficult and expensive. Because no agreement currently exists on how to represent image features in clinically interpretable ways, extensive annotation of pathology slides and other materials is required.

The integration process involves several steps: identifying correlations among different data modalities, aligning them appropriately, and integrating these diverse data streams into unified models for clinical applications. Wu demonstrated this concept through a study of stage 4 metastatic non-small-cell lung cancer patients, predicting immune checkpoint inhibitor therapy benefits using CT images, laboratory data, and clinical information without relying on standard clinicopathological markers.

The multimodal model provided additional information beyond benchmark models based on comprehensive radiology, pathology, and genomic information. While the deep learning model’s predictions for overall survival and progression-free survival were not quite as accurate as the benchmark models individually, combining them significantly improved predictive performance and patient stratification capabilities.

Wu introduced the concept of “habitat imaging” to address tumor heterogeneity challenges. Cancer tumors contain subclones with varying aggressiveness and treatment resistance, potentially dominating treatment outcomes. His team developed methods to identify high-risk subregions using multimodal imaging by applying two-step clustering to align CT and positron emission tomography (PET) data at the population level and then mapping clusters back to individual scans to create intratumor heterogeneity maps.
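The two-step clustering can be sketched schematically. The two-column voxel features standing in for CT and PET intensities, and plain k-means, are assumptions for illustration, not the team's actual feature set or clustering method.

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Step 1 (population level): pool per-voxel (CT, PET) features from many
# patients and cluster them into shared "habitats".
population_voxels = np.vstack([rng.normal(loc=m, scale=0.3, size=(500, 2))
                               for m in ([0, 0], [2, 2], [4, 1])])
centers, _ = kmeans(population_voxels, k=3)

# Step 2 (patient level): map each new patient's voxels back to the nearest
# population-level cluster to build an intratumor heterogeneity map.
patient_voxels = rng.normal(loc=[2, 2], scale=0.8, size=(400, 2))
d = np.linalg.norm(patient_voxels[:, None, :] - centers[None, :, :], axis=2)
habitat_map = d.argmin(axis=1)              # habitat label per voxel

# Habitat composition can then feed a recurrence-risk stratifier.
fractions = np.bincount(habitat_map, minlength=3) / len(habitat_map)
print("habitat fractions:", fractions.round(2))
```

Because the cluster centers are learned once at the population level, habitat labels are comparable across patients, which is what makes risk stratification from the resulting maps meaningful.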

These habitat maps enable patient stratification into low-, medium-, and high-risk groups for cancer recurrence after surgery using pretreatment scans. Adding circulating tumor DNA data further increases prediction accuracy. The team also explored AI’s potential to predict PET scans from CT scans, finding that the synthetic PET scans closely resembled actual scans; experienced radiologists could distinguish them only by noting the synthetic versions’ superior sharpness.

Beyond oncology, Wu’s work extends to cardiovascular applications. Myocardial blood flow assessment typically involves expensive, complex cardiac PET scans that are difficult for patients to access. His team investigated whether electrocardiogram (EKG) data, which are cheaper and universally available, could predict myocardial blood flow using AI models. While some information remained uncaptured by EKG data, the approach offers a faster, more accessible alternative with reasonable accuracy.

He stated that additional applications include pathology data analysis for stratifying patient outcomes after immunotherapy and understanding lung adenocarcinoma precursors. The latter research identified TIM-3 as a potential target for preventing tumor progression, leading to a pioneering immunosuppression lung cancer trial. Early signals suggest that administering immunotherapy to precancerous conditions may intercept lung cancer development before it becomes invasive.

Wu ended his talk by emphasizing the potential of multimodal models incorporating medical histories, clinical notes, structural and functional images, genomics, proteomics, transcriptomics, and metabolomics. He feels that once researchers gain access to these diverse data sources and can apply meaningful analyses, they will be equipped to address clinical challenges and advance precision oncology toward the goal of personalized, optimized cancer treatment.

INTEGRATING MECHANISTIC AND ML MODELS TO ASSESS CAUSAL EFFECTS OF RADIOTHERAPY ON PATIENT OUTCOMES

Igor Shuryak, associate professor of radiation oncology at the Columbia University Irving Medical Center, spoke about enriching the linear quadratic model (which has been the basis of radiation oncology for many years) with omics data and using the techniques of ML to assess the effects of radiotherapy on patient outcomes.

Shuryak’s research addresses a fundamental challenge in radiation biology and oncology by combining traditional mathematical modeling with modern ML approaches. Mathematical modeling in this field has a long history, exemplified by the linear quadratic model, and these models draw from diverse data sources including animal studies, in vitro experiments, and human clinical data. However, traditional models are inherently simple and cannot incorporate multiple relevant features such as patient demographics, treatment and disease details, omics data, and imaging information. ML models, while capable of integrating multiple features and modalities to generate accurate predictions, have a perceived mystery to them that makes result interpretation significantly more difficult than with simple mechanistic models.

The integration of these two approaches aims to create more accurate and interpretable models by injecting concepts from simple models, such as the biologically effective dose from the linear quadratic model, into ML frameworks as engineered features. This integration improves ML model interpretability and can guide clinically actionable insights. By incorporating mechanistic elements, ML models benefit from a broader knowledge base rather than relying solely on their training datasets.

Shuryak’s specific example involves predicting outcomes for head and neck cancer using tabular clinical data, with plans to extend the work to multimodal analysis incorporating image data. The foundation for this work lies in decades of understanding about tumor repopulation, which has been recognized since the 1980s as a strong factor in radiotherapy outcomes for head and neck squamous cell carcinoma. Shortening radiotherapy course length helps reduce repopulation effects by limiting time for tumor cell proliferation, while treatment gaps have the opposite effect.

The Withers “hockey stick” model captures this phenomenon by assuming that accelerated repopulation begins at a fixed time point, typically 28 days after beginning radiotherapy, with a repopulation rate independent of cell-killing intensity (Withers et al., 1988). Shuryak refers to this as the dose-independent model. To improve biological plausibility while maintaining simplicity, his team hypothesized that repopulation might compensate for cell killing and suggested that more intense cell killing through larger doses could alter both repopulation onset time and rate. This led to development of a dose-dependent modification to test whether the model could predict population dynamics more accurately.
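Under the linear quadratic framework, the dose-independent ("hockey stick") correction is often expressed as a biologically effective dose that is penalized once treatment time exceeds the kick-off time. A minimal sketch with illustrative parameter values (not fitted values from Shuryak's work):

```python
import math

def bed_hockey_stick(n_fractions, dose_per_fx, total_time_days,
                     alpha=0.3, alpha_beta=10.0, t_k=28.0, t_pot=3.0):
    """Biologically effective dose (Gy) with dose-independent repopulation.

    Cell killing follows the linear quadratic model; repopulation starts at a
    fixed kick-off time t_k (days) and proceeds at a rate set by the potential
    doubling time t_pot, independent of how intense the cell killing is.
    Parameter values here are illustrative defaults.
    """
    kill_term = n_fractions * dose_per_fx * (1 + dose_per_fx / alpha_beta)
    repop_term = (math.log(2) / alpha) * max(0.0, total_time_days - t_k) / t_pot
    return kill_term - repop_term

# Shortening the course (same total dose, fewer days) raises the BED,
# because less treatment time is lost to accelerated repopulation.
standard = bed_hockey_stick(35, 2.0, total_time_days=46)
accelerated = bed_hockey_stick(35, 2.0, total_time_days=38)
print(f"standard: {standard:.1f} Gy, accelerated: {accelerated:.1f} Gy")
```

The dose-dependent modification tested by Shuryak's team would replace the fixed `t_k` and repopulation rate with functions of cell-killing intensity.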

The team tested both models against various datasets, including older clinical trials and the comprehensive RADCURE dataset from Princess Margaret Hospital, which contains data on more than 2,600 head and neck cancer patients with detailed radiotherapy and chemotherapy information, clinical variables, and long-term mortality outcomes. This analysis employed a two-step approach that combined mechanistic modeling concepts with ML techniques, using random survival forests for exploratory analysis followed by causal survival forests for focused causal analysis (Shuryak et al., 2018, 2019, 2024).

Both dose-independent and dose-dependent models incorporate identical cell-killing terms from the linear quadratic model but differ in their repopulation terms. To maintain simplicity, the team made various assumptions and simplifications, resulting in models with only five easily interpretable parameters each. Using RADCURE data and simple Cox regression on biologically effective dose, calculations showed that the dose was a significant survival predictor in the dose-dependent model but not the dose-independent model. When applying the more complex random survival forest model, which is nonlinear and makes fewer assumptions, similar trends emerged, with higher biologically effective dose associated with reduced mortality in both models.

Detailed analysis revealed that several variables were more influential than biologically effective dose, including age, human papillomavirus status, and smoking, which all demonstrated much larger effects on outcomes. This finding led the team to employ causal ML approaches to examine how biologically effective dose and other factors affected outcomes.

Shuryak noted that while ML is commonly used for predictive tasks, causal ML techniques exist and are rapidly evolving because exploring causality is scientifically important. Causal approaches offer advantages over typical correlational ML for personalized medicine, particularly in studying heterogeneous treatment effects and how patient demographics and disease details modify treatment effects. Additionally, he said, causal effects translate better than correlations to other datasets where data distributions and correlation structures may differ, and causal ML can work with observational clinical data, which are much more widely available than randomized controlled trial data, to provide causal insights about treatment effects.

Shuryak then said that unlike predictive tasks that involve only inputs and outputs, causal tasks incorporate three variable types: inputs, treatments, and outputs. The main objective of causal ML is quantifying causal effects of treatments on outputs, which Shuryak describes as an inference problem where the structure is assumed known but effect strengths are unknown—distinguishing it from the much more difficult causal discovery problem.

He said that specific causal ML techniques include double-debiased ML, which is doubly robust and can provide reliable causal effect estimates if either the treatment or outcome model is correctly specified. This approach involves modeling the treatment, modeling the outcome, and building a third model to relate residuals from the first two models, with this relationship interpreted as the causal effect.
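The residual-on-residual mechanics can be sketched on synthetic, confounded data. The linear nuisance models and the omission of cross-fitting are simplifications for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic confounded data: X drives both treatment and outcome, and the
# true causal effect of treatment on outcome is 2.0.
n = 2000
X = rng.normal(size=(n, 3))
treatment = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
outcome = 2.0 * treatment + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

def fit_predict(X, y):
    """Linear nuisance model: least-squares fit, in-sample predictions."""
    A = np.column_stack([np.ones(len(X)), X])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return A @ coef

# Step 1: model the treatment from covariates; Step 2: model the outcome.
res_t = treatment - fit_predict(X, treatment)   # treatment residuals
res_y = outcome - fit_predict(X, outcome)       # outcome residuals

# Step 3: relate the residuals; the slope is the causal-effect estimate.
effect = (res_t @ res_y) / (res_t @ res_t)
print(f"estimated causal effect: {effect:.2f}")   # near the true value 2.0
```

A naive regression of `outcome` on `treatment` alone would be biased upward by the shared dependence on `X`; residualizing both sides removes that confounding.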

The causal forest technique, originally developed by Stanford economists and econometricians (Athey et al., 2019), applies to medical contexts as well. Like double-debiased ML, the key step involves calculating residuals. It uses a random forest–like approach to split data into small parts, estimating local treatment effects within each through residual-versus-residual regression. The causal survival forest variant handles survival data, which are particularly challenging due to censoring that affects many patients in medical datasets.

When Shuryak’s group applied causal analyses to RADCURE data, they found that under both dose-independent and dose-dependent conditions, high biologically effective dose increased patients’ restricted mean survival time by 0.5 to 1.0 years and increased survival probability by 5–15 percent several years after treatment. Similar analysis of chemotherapy patients found that chemotherapy increased survival probability by 15 percent at both 3 and 5 years.

The team’s next step involves incorporating image data into the analyses. The result demonstrates that combining mechanistic mathematical modeling concepts with predictive and causal ML models enables detection of biologically effective dose effects that align with current knowledge, although the chemotherapy effect was larger than previously published trials suggested. This approach has significant potential for enhancing knowledge about treatment effects from non-randomized clinical data to complement randomized controlled trial analyses, generate new hypotheses, and support personalized medicine development.

The work represents a promising direction for radiation oncology research, bridging the gap between traditional mechanistic understanding and modern computational capabilities. By maintaining interpretability while leveraging the power of ML to handle complex, multidimensional data, Shuryak suggested that this integrated approach could significantly advance personalized treatment strategies and improve patient outcomes in head and neck cancer and potentially other malignancies.

THE PROMISE AND CHALLENGE OF DEEP LEARNING IN RADIATION RISK ASSESSMENTS

Zhenqiu Liu, senior scientist at the Radiation Effects Research Foundation (RERF) in Japan, spoke about the promise and the challenge of deep learning in radiation risk assessment.

As background, Liu explained that exposure to ionizing radiation from natural and manmade sources is inevitable, making accurate risk assessment essential for radiation protection. While the risks of high-dose radiation are well established, the health risks of low-dose radiation—less than 100 milligray (mGy)—remain uncertain. Although some epidemiologic studies provide evidence about low-dose risks, the findings remain controversial.

He noted that traditional approaches to calculating excess relative risk (ERR) rely on nonlinear parametric (NLP) models with Poisson loss that account for factors such as sex, age, and location. These models have been the standard for radiation risk estimates for the past 40 to 50 years due to their simplicity and ease of interpretation. However, parametric radiation dose-response models face significant challenges. They have been criticized for inadequately addressing uncertainties in the low-dose range, and age-related modifications to radiation risk are typically constrained by specific functional forms.

To address these limitations, Liu described how researchers, like himself, have begun exploring deep learning applications in radiation risk assessment. Deep learning, also known as deep neural networks (DNNs), offers certain advantages including greater accuracy and flexibility, although these models are notably difficult to interpret. The integration of deep learning into risk models may resolve some current limitations of traditional approaches.

The three basic types of DNNs include feed-forward neural networks, where information flows from input to output; recurrent neural networks, which model sequential data and are well suited to time-series analysis; and convolutional neural networks, which process spatial and temporal data and are widely used in image recognition. Neural networks are classified as “deep” when they contain multiple hidden layers, while those with only one hidden layer are termed “shallow.”

Liu provided examples of recent research that has applied deep learning to improve radiation risk estimates, particularly for low-dose radiation exposure. One study utilized two large datasets based on atomic bomb survivor information: one containing solid tumor incidence data from 17,448 cases among 105,427 survivors (Preston et al., 2007) and another with leukemia data from 491 cases among 113,011 survivors (Hsu et al., 2013).

The research teams employed a DNN with three hidden layers to estimate radiation risk. DNNs offer several advantages over traditional models: They are data-driven and model-free, approximating underlying functions without encoding specific functional forms. They are also inherently nonlinear and do not rely on predefined parametric settings, allowing them to infer dose-response relationships directly from the data.
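As an illustration of the model-free idea, a small three-hidden-layer network can recover a nonlinear curve directly from noisy samples. The architecture, activation function, and toy target below are assumptions for illustration, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "dose-response" data: a smooth nonlinear curve plus noise. The network
# must infer its shape from the data -- no functional form is encoded.
x = rng.uniform(0, 3, size=(512, 1))
y = np.sin(2 * x) + 0.05 * rng.normal(size=(512, 1))

# A small feed-forward network with three hidden layers.
sizes = [1, 16, 16, 16, 1]
W = [rng.normal(scale=1 / np.sqrt(m), size=(m, k))
     for m, k in zip(sizes, sizes[1:])]
b = [np.zeros((1, k)) for k in sizes[1:]]

def forward(x):
    """Return all layer activations, input first, prediction last."""
    acts = [x]
    for i in range(len(W) - 1):
        acts.append(np.tanh(acts[-1] @ W[i] + b[i]))   # hidden layers
    acts.append(acts[-1] @ W[-1] + b[-1])              # linear output
    return acts

lr = 0.05
for _ in range(5000):                                  # plain gradient descent
    acts = forward(x)
    grad = 2 * (acts[-1] - y) / len(x)                 # d(MSE)/d(output)
    for i in reversed(range(len(W))):
        gW = acts[i].T @ grad
        gb = grad.sum(axis=0, keepdims=True)
        # Propagate through tanh before updating this layer's weights.
        grad = (grad @ W[i].T) * (1 - acts[i] ** 2) if i > 0 else None
        W[i] -= lr * gW
        b[i] -= lr * gb

mse = float(np.mean((forward(x)[-1] - y) ** 2))
print(f"training MSE: {mse:.4f}")
```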

When comparing DNN performance with NLP models for solid tumor and leukemia incidence, the DNN performed comparably to the NLP in predicting leukemia incidence and somewhat better in predicting solid tumor incidence. However, the two models produced significantly different results for ERR despite very similar cancer incidence predictions. This finding suggests that accurate tumor incidence prediction does not guarantee precise radiation risk estimation, and since the actual ground truth remains unknown, determining which model provides superior ERR values is impossible. Similar differences between models were observed for ERR predictions at low doses.

To analyze and understand AI model predictions, researchers employed SHAP values (Shapley Additive Explanations). SHAP values decompose model predictions into additive contributions from different variables, providing clear measures of feature importance. While linear models assess variable importance through coefficient magnitude, SHAP values generalize this concept to any ML model, capturing both linear and nonlinear effects (Janzing et al., 2020; Lundberg and Lee, 2017).
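The additive decomposition can be demonstrated exactly on a toy model. The three-feature function and zero baseline below are illustrative; for real models, SHAP libraries approximate this computation efficiently.

```python
from itertools import combinations
from math import factorial

# A toy 3-feature model with an interaction term; Shapley values decompose
# its prediction into additive per-feature contributions vs. a baseline.
def model(z):
    a, b, c = z
    return 2 * a + b + a * c

def shapley(model, x, background):
    """Exact Shapley values: absent features revert to background values."""
    n = len(x)

    def value(subset):
        z = list(background)
        for j in subset:
            z[j] = x[j]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                weight = (factorial(size) * factorial(n - size - 1)
                          / factorial(n))
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi

background = [0.0, 0.0, 0.0]        # baseline input (e.g., feature means)
x = [1.0, 2.0, 3.0]
phi = shapley(model, x, background)

# Additivity: contributions sum to prediction minus baseline prediction.
total = sum(phi)
print(phi, total)
```

Note how the interaction term `a * c` is split evenly between features 0 and 2, exactly the behavior that lets SHAP capture nonlinear effects a coefficient-based importance measure would miss.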

Liu stated that an analysis of SHAP values for solid tumor rates and ERRs revealed notable differences between DNN and NLP models. For solid tumor ERRs, dose was by far the most important factor in the DNN, while age at exposure was most important in the NLP model (with dose ranking second). These differences reflect underlying model assumptions. The parametric model assumes that radiation affects tumor risk only through age and dose interactions, and that radiation risk depends on dose in linear or linear quadratic form while following exponential or power functions with age. In contrast, SHAP values for the DNN indicate that radiation contributes to tumor risk both independently and through interactions with baseline variables. After adjusting for baseline effects on tumor risk, the contribution of baseline variables to radiation-induced tumor risk becomes smaller, which aligns with intuitive expectations.

Liu noted several limitations in DNN adoption for radiation risk estimation. Identifying sources of baseline risk and effect modification in parametric models is challenging, raising questions about whether DNNs can automatically detect baseline risk and model effect modifications. Domain knowledge presents another challenge regarding whether modifications should be guided by radiation physics or biology. Balancing data-driven learning with domain expertise remains a key challenge. Additionally, deep learning models are intricate and less interpretable than NLP models, require substantial computing resources, and make confidence interval estimation difficult.

Estimating risks from low-dose radiation presents unique difficulties for several reasons. Liu outlined how tumor risks vary significantly among individuals, and radiation risks at low doses are likely heterogeneous. Standard parametric models struggle to detect statistically significant radiation risk at low doses. The research goal is to leverage DNN insights about individual risk heterogeneity to improve low-dose radiation risk assessment.

Liu discussed four candidate models that exist for evaluating ERRs as functions of low-dose exposure: supra-linear, linear no-threshold, threshold, and hormesis models, each with distinct advantages and disadvantages. Individual responses to radiation can be categorized into different groups. Tumor-sensitive individuals develop tumors and are influenced by various risk factors including radiation. Tumor-resistant individuals never develop tumors despite risk factor exposure. Some individuals may be tumor-sensitive but have not yet developed tumors. Tumor-sensitive individuals who have developed tumors are likely radiation-sensitive, while tumor-resistant individuals are likely radiation-resistant.
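The four candidate shapes can be written as simple functions of dose; the slopes and thresholds below are arbitrary illustrations, not fitted estimates.

```python
# Schematic forms of the four candidate low-dose ERR models. All parameter
# values are illustrative; dose d is in mGy.
def err_lnt(d, beta=0.005):
    """Linear no-threshold: excess risk proportional to dose at all doses."""
    return beta * d

def err_supralinear(d, beta=0.005, p=0.7):
    """Supra-linear: per-unit-dose risk is higher at low doses (p < 1)."""
    return beta * d ** p

def err_threshold(d, beta=0.005, d0=50.0):
    """Threshold: no excess risk below a threshold dose d0."""
    return beta * max(0.0, d - d0)

def err_hormesis(d, beta=0.005, d0=50.0, gamma=0.001):
    """Hormesis: a protective dip at low doses before risk turns positive."""
    return beta * max(0.0, d - d0) - gamma * min(d, d0)

for d in (10.0, 100.0):
    print(d, err_lnt(d), err_supralinear(d), err_threshold(d), err_hormesis(d))
```

Distinguishing these shapes empirically is hard precisely because, below roughly 100 mGy, their predictions differ by less than the uncertainty in the epidemiological data.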

Since individuals with leukemia are likely more radiation-sensitive, researchers developed a zero-truncated Poisson model using only subjects with leukemia. This approach detected statistically significant coefficients and ERRs for exposures in the 30–80 mGy range. Similar analysis of solid tumors yielded statistically significant coefficients and ERRs for exposures in the 90–100 mGy range.
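A zero-truncated Poisson model conditions the likelihood on observing at least one event, which is what restricting the analysis to subjects who developed tumors requires. The following is a minimal sketch with a constant rate and no dose covariate (Liu's actual analysis regressed risk on dose); the toy counts are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def zt_poisson_nll(lam, counts):
    """Negative log-likelihood of a zero-truncated Poisson with constant rate lam.

    log P(Y=y | Y>0) = -lam + y*log(lam) - log(y!) - log(1 - exp(-lam));
    the log(y!) term is constant in lam and dropped.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    return -(counts.sum() * np.log(lam) - n * lam - n * np.log1p(-np.exp(-lam)))

def fit_zt_poisson(counts):
    """MLE of the rate from counts observed only when y >= 1."""
    res = minimize_scalar(zt_poisson_nll, bounds=(1e-6, 50.0),
                          args=(counts,), method="bounded")
    return res.x

# Hypothetical tumor counts among subjects who developed at least one tumor
lam_hat = fit_zt_poisson([1, 2, 3, 2, 1, 3])
```

Because zeros are unobservable, the fitted rate is smaller than the sample mean: the MLE satisfies lam / (1 - exp(-lam)) = mean(y), so for a mean of 2.0 the estimate is roughly 1.59 rather than 2.0.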

Liu believes that DNN models may provide new insights into radiation risk assessment, with SHAP values helping to identify both radiation-sensitive and radiation-resistant population groups. Accounting for tumor heterogeneity could enhance low-dose radiation risk assessment for both leukemia and solid tumors, because dose-response relationships at low doses may vary significantly across tumor types and individuals. Identifying radiation-sensitive groups from epidemiological data alone is challenging, since tumor sensitivity does not always correlate with radiation sensitivity, and any low-dose threshold that exists may vary by tumor type and individual characteristics. Integrating epidemiological data with insights from radiation biology and physics may improve the identification of radiation-sensitive and radiation-resistant groups, and personalized radiation risk assessment holds significant promise for advancing the field.
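The idea behind the SHAP values Liu mentioned is the Shapley value from cooperative game theory: a feature's attribution is its marginal contribution to the prediction, averaged over all orderings of the other features. Practical SHAP libraries approximate this for large models; the stdlib-only sketch below computes it exactly for a toy prediction function, with a hypothetical function `f` and reference input standing in for a trained DNN and a baseline subject:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, ref):
    """Exact Shapley attributions for prediction f(x) relative to a reference input."""
    n = len(x)

    def val(S):
        # Features in coalition S take their observed value; the rest take the reference value
        z = [x[i] if i in S else ref[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Shapley kernel weight |S|! (n - |S| - 1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                contrib += w * (val(set(S) | {i}) - val(set(S)))
        phi.append(contrib)
    return phi

# For a linear model the attributions recover the coefficients times the feature offsets
phi = shapley_values(lambda z: 2 * z[0] + 3 * z[1], x=[1, 1], ref=[0, 0])
```

In the epidemiological setting sketched by Liu, per-subject attributions of this kind are what allow grouping individuals by how strongly the dose feature drives their predicted risk.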

DISCUSSION

The discussion was moderated by Anyi Li, chief of computer service at Memorial Sloan Kettering Cancer Center, and Ceferino Obcemea, program director of medical physics at the Radiation Research Program at the National Cancer Institute. Li opened the discussion by asking Yildirim about the difficulties of interviewing diverse potential users of AI technology with varying backgrounds and perspectives. Yildirim acknowledged the challenges but explained her approach: “You’re just trying to get the initial reactions of people to understand the clinical utility and their sort of overall acceptance around the potential use case or an application.” She emphasized that her preliminary study aimed to identify promising directions, noting the importance of including participants with varying AI familiarity levels to avoid bias.

Obcemea raised concerns about data quality in multimodal modeling, questioning whether acceptable data quality could be maintained across institutions with varying resources. Wu responded that data quality issues are complex, particularly in medical applications. He explained that when clinicians ask how much data he needs, his answer of “millions, if possible” is met with skepticism. Additionally, evolving treatments create compatibility issues: datasets curated before immunotherapy existed may not reflect current practices. Shuryak added that his head and neck cancer dataset required extensive manual curation despite comprising only several thousand cases. He noted that models trained on U.S. datasets do not translate well to European countries such as Poland, possibly because of population differences in risk factors.

When Obcemea asked about the generalizability of models using historical data given changed treatment techniques since the 1980s, Shuryak replied that doses have remained relatively stable, making this less concerning for his work.

Addressing an audience question about labeling practices for multimodal data across various sources (radiology, pathology, genomics, etc.), Yildirim emphasized the importance of building tools with labeling in mind and ensuring clinician participation. She noted that current healthcare systems are often old enterprise platforms that do not facilitate easy labeling.


Wu stressed the value of letting experts do their work first, learning from their processes, and then potentially introducing helpful tools. He shared his experience hiring five foreign-trained radiologists to create what may be the world’s largest set of annotated whole-body CT scans (5,000 cases over 5 years), noting that an average radiologist lacks time for such detailed annotation work. Shuryak agreed, emphasizing the need for adaptation to existing clinical workflows. Liu noted different challenges at RERF, where he works with historical data and focuses on standardizing formats and building databases to link data sources.

When asked about generating synthetic PET images from CT data, Wu acknowledged the difficulty in understanding how anatomic data can predict function. Despite publication challenges due to reviewer skepticism, he noted that “anyone can take the code from our GitHub and try it; we test it and we find out [that it is] surprisingly really something good.” He suspects AI discovers hidden patterns, similar to radiomics predicting gene mutations from scans.

Wu’s team is scaling up to collect roughly half a million PET/CT scans from about 50,000 patients at MD Anderson Cancer Center. His team plans to embed this in clinical practice, generating synthetic PET scans from existing CT scans and comparing them with actual PET results. Disagreements between synthetic and real scans might indicate biological differences or new phenotypes rather than model failures.

Radiologist Don Frush asked about responsibility assignment when AI causes adverse patient outcomes. Liu stated that while adverse events are “everyone’s responsibility,” ultimately doctors decide whether to use these technologies and would have to take responsibility for interpreting and conveying the result. Wu emphasized that AI tools aren’t magic and can make erroneous predictions with incorrect uncertainty estimates. He advocated for keeping humans in the loop rather than moving to full autonomy; he warned that radiologists might become complacent after AI succeeds in many cases, leading to errors when they stop paying attention. Yildirim ended the session by stating that humans should always remain in the decision-making loop, as clinicians excel at decision making and knowing when something isn’t right. She sees the real value of AI in its time savings and cognitive load reduction through tasks like “drafting initial findings” rather than in clinical decision making, leaving more time for clinicians to do what they do best.

REFERENCES

Athey, S., J. Tibshirani, and S. Wager. 2019. Generalized random forests. Annals of Statistics 47(2):1148–1178.

Chang, J. Y., S. H. Lin, W. Dong, Z. Liao, S. J. Gandhi, C. M. Gay, J. Zhang, S. G. Chun, Y. Y. Elamin, F. V. Fossella, G. Blumenschein, T. Cascone, X. Le, J. V. Pozadzides, A. Tsao, V. Verma, J. W. Welsh, A. B. Chen, M. Altan, R. J. Mehran, A. A. Vaporciyan, S. G. Swisher, P. A. Balter, J. Fujimoto, I. I. Wistuba, L. Feng, J. J. Lee, and J. V. Heymach. 2023. Stereotactic ablative radiotherapy with or without immunotherapy for early-stage or isolated lung parenchymal recurrent node-negative non-small-cell lung cancer: An open-label, randomised, phase 2 trial. Lancet 402(10405):871–881.

Hsu, W. L., D. L. Preston, M. Soda, H. Sugiyama, S. Funamoto, K. Kodama, A. Kimura, N. Kamada, H. Dohy, M. Tomonaga, M. Iwanaga, Y. Miyazaki, H. M. Cullings, A. Suyama, K. Ozasa, R. E. Shore, and K. Mabuchi. 2013. The incidence of leukemia, lymphoma and multiple myeloma among atomic bomb survivors: 1950–2001. Radiation Research 179(3):361–382.

Janzing, D., L. Minorics, and P. Bloebaum. 2020. Feature relevance quantification in explainable AI: A causal problem. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics 108:2907–2916.

Lundberg, S. M., and S.-I. Lee. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 (NIPS 2017). Red Hook, NY: Curran Associates Inc.

Preston, D. L., E. Ron, S. Tokuoka, S. Funamoto, N. Nishi, M. Soda, K. Mabuchi, and K. Kodama. 2007. Solid cancer incidence in atomic bomb survivors: 1958–1998. Radiation Research 168(1):1–64.


Shuryak, I., E. J. Hall, and D. J. Brenner. 2018. Dose dependence of accelerated repopulation in head and neck cancer: Supporting evidence and clinical implications. Radiotherapy & Oncology 127(1):20–26.

Shuryak, I., E. J. Hall, and D. J. Brenner. 2019. Optimized hypofractionation can markedly improve tumor control and decrease late effects for head and neck cancer. International Journal of Radiation Oncology, Biology, Physics 104(2):272–278.

Shuryak, I., E. Wang, and D. J. Brenner. 2024. Understanding the impact of radiotherapy fractionation on overall survival in a large head and neck squamous cell carcinoma dataset: A comprehensive approach combining mechanistic and machine learning models. Frontiers in Oncology 14:1422211.

Withers, H. R., J. M. Taylor, and B. Maciejewski. 1988. The hazard of accelerated tumor clonogen repopulation during radiotherapy. Acta Oncologica 27(2):131–146.

Next Chapter: 7 Bias, Ethics, and Regulatory Issues