The maturation of biology into the field of biotechnology and the precise manipulation of biological form and function have accelerated considerably in recent years, largely driven by improvements in measurement, editing, and engineering techniques such as directed evolution and CRISPR/Cas9 genetic engineering. Computational tools such as cloud computing and, in recent years, artificial intelligence (AI) have added another dimension as drivers of scientific and technological progress.
These advances have also elevated biotechnology to a U.S. emerging technology priority that is critical to national security. In 2019, the U.S. Department of Defense established biotechnology as an enterprise modernization priority, emphasizing the importance of its responsible development (U.S. Department of Defense, 2018; Titus, van Opstal, and Rozo, 2020), and the White House has placed biotechnology on the U.S. Critical and Emerging Technology list (Fast Track Action Subcommittee on Critical and Emerging Technologies, 2024). In parallel, the COVID-19 pandemic caused devastating economic and health damage to countries around the world and placed an intense spotlight on the risks of emerging infectious diseases. The rapid development of vaccines and therapeutics to prevent and treat COVID-19 is a testament to the importance of biotechnology in the 21st century. The release of ChatGPT in 2022 launched an international race to advance the capabilities of modern AI across every field, leading to
the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence in 2023.1
The rapid parallel expansion of capabilities in AI and biotechnology, set against the backdrop of the COVID-19 pandemic, has renewed concerns that the convergence of these two technologies may pose significant biosecurity risks. At the same time, however, these technologies may present opportunities to mitigate those risks. As a result, the 2023 Executive Order on AI called for an in-depth exploration to assess the biosecurity concerns and mitigation opportunities of the use of AI in the life sciences. As this report outlines, assessment of both the potential benefits and risks of AI applications in the life sciences depends on the available data, the scientific underpinnings of the biology, and the technical capabilities of AI models. See Box 1-1 for the statement of task.
Computational approaches to understanding and applying biology are not new, having been used now for the better part of four decades (Deshpande et al., 2024). AI applications in the life sciences promise to enhance and accelerate research capabilities to understand biological systems, given the increasing availability of biological data, technologies such as sequencing to generate multi-omics data, and advanced computing power. The convergence of AI and the life sciences is a relatively new area. The foundation of modern AI, or deep learning based on neural networks, emerged in 2012, and the application of AI to the life sciences lagged by a few years. Early work applying AI tools to study biology demonstrated promising results, but it was the 2021 release of AlphaFold and RoseTTAFold, which computationally met the 50-year-old challenge of predicting protein structure from an amino acid sequence, that captured the most attention. In 2024, the Nobel Prize in Chemistry was awarded to David Baker for computational protein design and to Demis Hassabis and John M. Jumper for protein structure prediction.2 In recent years, new tools to study and manipulate biology as well as AI biological models (see Appendix A) have increased rapidly across diverse applications, albeit with variable performance.
The expansive and pervasive nature of AI in the life sciences has generated debate as to just how powerful these tools are for enabling a wider range of users to conduct biological experiments and intentionally engineer new forms and functions. The challenge posed in assessing the risks and benefits of AI in the life sciences is largely driven by the expansive definition of AI and the pace of progress in both fields. At the same time, a principal concern is whether bad actors can recreate existing biological agents or create new agents with previously unknown forms and functions or with tailored properties. This biosecurity concern existed before AI arrived on the scene, and the question now is whether and how AI changes it.
___________________
1 Exec. Order No. 14110, 88 Fed. Reg. 75191 (October 30, 2023). See https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/ (accessed October 10, 2024).
2 See https://www.nobelprize.org/prizes/chemistry/2024/press-release/ (accessed October 9, 2024).
An additional dimension of uncertainty is just how capable AI biological models are at designing and predicting novel functions based on currently available computational, data, and experimental resources as well as future capabilities. This uncertainty in the AI threat model has given rise to heightened concern from the biosecurity and national security communities. In addition, questions remain as to whether AI lowers the barriers for actors to create a biothreat, such as a novel functional and pathogenic virus, or to enhance the harmful traits of an existing infectious agent. These are the types of questions that were addressed by the 2018 framework for synthetic biology (NASEM, 2018) and that now need to be revisited to account for recent advances in AI tools.
At the same time, AI tools are being used by practitioners every day to develop new medical countermeasures (MCMs) and pharmaceutical interventions for a wide range of illnesses, from cancer to infectious diseases (Lee, Bubeck, and Petro, 2023; Wong et al., 2024; Xu et al., 2024). Given the emerging capabilities of AI tools, can researchers develop a new vaccine for an emerging infectious disease threat faster than ever before? Can these tools enable public health and biosecurity practitioners to identify and respond to biosecurity incidents more rapidly and efficiently?
With each new advance that aids in manipulating biological systems (e.g., genetic engineering techniques, AI-enabled tools), there has been a concomitant struggle to extract the benefits of the technology for human health, agriculture, and the economy while preventing the misuse of those advances to cause harm. Owing to the dual-use nature of biological knowledge and tools, constant monitoring and evaluation of the potential risks and benefits of biotechnological advances will be needed, along with tailoring of existing governance mechanisms and development of new ones to reduce the potential for misuse or accidents while still allowing beneficial advances that promote health and grow the economy (Titus and Russell, 2023).
Compounding the difficulty of preventing harm is the wide range of potential actors to deter—from nation-states to organized non-state actors to individuals—all with a wide range of motivations, resources, technical know-how, and capabilities. A wide range of potential pathogens and toxins already exists and could be deployed for harm anywhere in the world without any further engineering. While many pathogens that have been
developed historically as weapons do not readily spread from person to person, such as Bacillus anthracis (the causative agent of anthrax), many regulated pathogens, designated as Biological Select Agents and Toxins in the United States,3 arguably have that potential, and infectious disease agents famously do not respect national boundaries. The experiments and biological tools that could be used to modify pathogens to make them more challenging to respond to (e.g., by altering transmissibility or host range) are exquisitely dual-use in that the same tools and experiments can be used to help develop countermeasures and assist in the response. Often, basic biological research improves our understanding of biological systems in small increments, making it very difficult to decide a priori which pieces of information should be classified or protected. Moreover, most, if not all, basic research in the discipline falls under the fundamental research exclusion rules.
Finally, there are the significant known and unknown biological threats from nature, including those that stem from human-made perturbations of nature, such as the wildlife trade (Gómez and Aguirre, 2008); alteration of human–animal interactions due to urbanization and population growth (Gibb et al., 2020); and climate change–driven expansion of environments in which potential disease vectors can thrive (Ebi et al., 2017). The latter is an increasing concern especially regarding arboviruses (i.e., viruses that replicate in and are transmitted to humans by arthropods [insects and ticks], including West Nile virus, yellow fever virus, and Zika virus) but also regarding protozoans such as plasmodia, some of which cause malaria. The historical record is full of misery and changed fortunes as a result of communicable diseases, and changing climate is likely to affect the frequency of spillovers in the modern era as well. Some have argued that there are upward of 320,000 viruses in nature that have the potential to infect humans (Anthony et al., 2013), and this number does not consider threats to animals other than humans or to plants, which could be devastating to the economy. Thus, under the appropriate circumstances, nature may pose many more threats to animal (including human) and plant health than do the biosafety and biosecurity concerns surrounding research laboratories.
Countering these biological threats and benefiting economically while curtailing the inappropriate use of biotechnology for harm is no small feat. Overlapping governance structures aim to balance these at-times conflicting priorities, and often they are criticized as either being too open—lowering barriers to misuse or accident—or too stringent—hampering economic growth of new biotechnologies, generation of new knowledge, and the ability to counter emerging and re-emerging natural threats to health and safety. The balance is often determined by how these relative risks are perceived by the person or group empowered to make these decisions, as there can be no unassailably objective formula to predict the future and determine where resources are best placed. Natural disease events happen all the time. Deliberate and accidental events have been rare, but they remain possible and could have catastrophic consequences. It is the responsibility of policymakers and scientists to understand and prepare for these occurrences.
___________________
3 See https://www.selectagents.gov/sat/list.htm (accessed October 10, 2024).
Some regulatory and oversight guidelines specifically related to AI and the life sciences4 have been generated, though this is a relatively new policy space. Executive Order 14110 on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence5 contains sections related to the use of AI in biological applications. About a year later, the Memorandum on Advancing the United States’ Leadership in Artificial Intelligence; Harnessing Artificial Intelligence to Fulfill National Security Objectives; and Fostering the Safety, Security, and Trustworthiness of Artificial Intelligence6 (hereafter referred to as the National Security Memorandum on AI) was released. The sections specifically relevant to AI and biology are listed in Table 1-1.
___________________
4 Policies and regulatory frameworks that govern research and development in AI and the life sciences, separately, will inevitably overlap with and impact how AI and the life sciences are governed and regulated. For the purposes of this report, we focus primarily on the policies that apply specifically to AI and the life sciences.
5 Exec. Order No. 14110, 88 Fed. Reg. 75191 (October 30, 2023). See https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence (accessed October 10, 2024).
6 See https://bidenwhitehouse.archives.gov/briefing-room/presidential-actions/2024/10/24/memorandum-on-advancing-the-united-states-leadership-in-artificial-intelligence-harnessing-artificial-intelligence-to-fulfill-national-security-objectives-and-fostering-the-safety-security/ (accessed November 3, 2024).
TABLE 1-1 Summary of National Security Memorandum on AI Sections Relevant to Biology
| Task | Section | Agency |
|---|---|---|
| Facilitate testing of frontier AI models; establish capability to lead testing and risk assessment related to biosecurity | 3.3(c) | U.S. AI Safety Institute (AISI) (National Institute of Standards and Technology [NIST]/Commerce) |
| Pursue voluntary preliminary testing of at least two frontier AI models that includes assessing models’ capabilities to accelerate bioweapons development | 3.3(e)i | AISI |
| Issue guidance on measuring capabilities relevant to the risk that AI models could enable the development of bioweapons; develop mitigation measures and testing efficacy of safety and security | 3.3(e)ii, A-E | AISI |
| Develop a roadmap for future classified evaluations of advanced AI models’ capacity to generate or exacerbate deliberate chemical and biological threats | 3.3(g)i, A-B | U.S. Department of Energy (DOE), U.S. Department of Homeland Security (DHS), and AISI, in consultation with the U.S. Department of Defense (DoD) and other relevant agencies |
| Establish a pilot project to provide expertise, infrastructure, and facilities capable of conducting classified tests in this area | 3.3(g)i, C | DOE |
| Support efforts to utilize high-performance computing resources and AI systems to enhance biosafety and biosecurity in development of AI systems trained on biological and chemical data | 3.3(g)ii, A-E | DoD, U.S. Department of Health and Human Services (HHS), DOE, DHS, National Science Foundation (NSF), and other relevant agencies |
| Incorporate guidance by AISI outlined in subsection 3.3(e) for agencies that develop publicly available dual-use foundation AI biological and chemical models | 3.3(g)iii | Relevant agencies |
| Convene academic research institutions and scientific publishers to develop voluntary best practices and standards for publishing computational biological and chemical models, datasets, and approaches, including those that use AI | 3.3(g)iv | NSF, DoD, Commerce (AISI within NIST), HHS, DOE, other relevant agencies, Office of Science and Technology Policy (OSTP) |
| Develop guidance promoting the benefits of and mitigating the risks associated with in silico biological and chemical research | 3.3(g)v | OSTP, National Security Council, Office of Pandemic Preparedness and Response Policy, in consultation with relevant agencies |
SOURCE: Adapted from “Memorandum on Advancing the United States’ Leadership in Artificial Intelligence; Harnessing Artificial Intelligence to Fulfill National Security Objectives; and Fostering the Safety, Security, and Trustworthiness of Artificial Intelligence,” The White House, October 24, 2024, https://bidenwhitehouse.archives.gov/briefing-room/presidential-actions/2024/10/24/memorandum-on-advancing-the-united-states-leadership-in-artificial-intelligence-harnessing-artificial-intelligence-to-fulfill-national-security-objectives-and-fostering-the-safety-security/.
AI-enabled biological tools are an emerging area of research and development in the life sciences. The increasing prevalence of AI tools in the life sciences and their rapid development have brought into focus both the promise of their beneficial uses and the concern that they may increase biosecurity risks. The National Academies of Sciences, Engineering, and Medicine established the Committee on Assessing and Navigating Biosecurity Concerns and Benefits of Artificial Intelligence Use in the Life Sciences to conduct this study (see Appendix C for biographical information). The statement of task provided to the committee is in Box 1-1.
The charge to the committee is focused on the ways in which AI-enabled biological tools trained on biological data may impact the design of transmissible biological agents that could lead to epidemic- or pandemic-scale consequences or how they could be leveraged for mitigation strategies and other beneficial applications. In a public briefing held on June 21, 2024, the study sponsors stated their interest in the study providing a balanced understanding of both the benefits and risks.7 In interpreting the statement
___________________
7 Presentation to the committee by representatives from the U.S. Department of Defense, National Security Council, and Office of Science and Technology Policy, June 21, 2024. See Appendix B.
of task, the committee considered it worthwhile also to examine the capabilities of current AI-enabled biological tools with respect to designing transmissible biological threats that may pose a more limited or local scale of impact than that of an epidemic or pandemic. General-purpose AI trained on data beyond biological data, such as chatbots or general large language models (LLMs; e.g., ChatGPT, Claude, Gemini, Llama), is considered outside the scope of this study. Scientific LLMs that are trained or fine-tuned on biological data are, however, within scope. The committee also considered future developments in utilizing general-purpose LLMs as interfaces or assistants with other biological computational tools. Finally, the committee acknowledged that nonbiological AI-driven technologies may increase biosecurity risks unrelated to biological design, but these too are beyond the scope of this report.
The committee reviewed relevant literature and heard from experts with a broad range of expertise and perspectives as part of the information-gathering process. Three public meetings were held from August
2024 to October 2024 and covered topics such as the current state of AI-enabled biological tools, understanding the science of pathogenesis of transmissible biological agents, AI-driven MCM development, biological data resources, and high-performance computing (see Appendix B for agendas of public sessions).
The committee did not access classified information in considering the questions posed in the statement of task, and the resulting report is unclassified. As such, the report focuses on the capabilities of AI biological models and the state of the science and is not intended to be a threat assessment. Instead, the study utilized a framework for assessing potential biodefense concerns developed in a 2018 National Academies report, Biodefense in the Age of Synthetic Biology (NASEM, 2018).
In the current report, the committee adopts the term "AI-enabled biological tools" to refer broadly to applications used to interact with AI models that are trained on biological data. The committee also distinguishes between biology-specific tools and general AI tools that have broad domain applications. References to biological data in this report are limited to publicly available data on biomolecules such as DNA, RNA, and proteins, or on large sets of biomolecules known collectively as omics data (e.g., genomics, transcriptomics, proteomics, metabolomics). Unless stated otherwise, biological data as used in the report do not include health or medical data.
This report is divided into five chapters that examine, from different angles, the biosecurity risks and benefits of AI applied to the biological sciences and potential mitigations. Chapter 2 examines the capability uplift provided by AI-enabled biological tools across the Design-Build-Test-Learn cycle of the synthetic biology process and discusses future developments that may help increase ∆AI. It also provides a detailed discussion of the considerations for AI-enabled design with respect to differing biological complexities, from simple molecules to fully replicative infectious agents. Chapter 3 discusses how AI-enabled biological tools could uniquely enable biological design at different levels of biological complexity in harmful applications. AI-enabled biological tools also have vast potential to enhance biosecurity, and Chapter 4 covers the ways AI can be leveraged to bolster biosurveillance and accelerate the development of MCMs in response to biological threats. Finally, because AI model innovation will rely heavily on access to data, Chapter 5 explores the importance of high-quality biological data for training AI models and the need to optimize and coordinate data infrastructure and access to resources. The chapter also discusses the potential vulnerabilities of biological datasets used for training and various approaches for model evaluation and mitigation.
Anthony, S. J., J. H. Epstein, K. A. Murray, I. Navarrete-Macias, C. M. Zambrana-Torrelio, A. Solovyov, R. Ojeda-Flores, N. C. Arrigo, A. Islam, S. A. Khan, P. Hosseini, T. L. Bogich, K. J. Olival, M. D. Sanchez-Leon, W. B. Karesh, T. Goldstein, S. P. Luby, S. S. Morse, J. A. K. Mazet, P. Daszak, and W. I. Lipkin. 2013. “A strategy to estimate unknown viral diversity in mammals.” mBio 4 (5):e00598–13. https://doi.org/10.1128/mbio.00598-13.
Deshpande, D., K. Chhugani, T. Ramesh, M. Pellegrini, S. Shiffman, M. S. Abedalthagafi, S. Alqahtani, J. Ye, X. S. Liu, J. T. Leek, A. Brazma, R. A. Ophoff, G. Rao, A. J. Butte, J. H. Moore, V. Katritch, and S. Mangul. 2024. “The evolution of computational research in a data-centric world.” Cell 187 (17):4449–4457. https://doi.org/10.1016/j.cell.2024.07.045.
Ebi, K. L., N. H. Ogden, J. C. Semenza, and A. Woodward. 2017. “Detecting and attributing health burdens to climate change.” Environmental Health Perspectives 125 (8):085004. https://doi.org/10.1289/EHP1509.
Fast Track Action Subcommittee on Critical and Emerging Technologies. 2024. Critical and Emerging Technologies List Update. Office of Science and Technology Policy. https://www.govinfo.gov/content/pkg/CMR-PREX23-00185928/pdf/CMR-PREX23-00185928.pdf (accessed October 23, 2024).
Gibb, R., D. W. Redding, K. Q. Chin, C. A. Donnelly, T. M. Blackburn, T. Newbold, and K. E. Jones. 2020. “Zoonotic host diversity increases in human-dominated ecosystems.” Nature 584 (7821):398–402. https://doi.org/10.1038/s41586-020-2562-8.
Gómez, A., and A. A. Aguirre. 2008. "Infectious diseases and the illegal wildlife trade." Annals of the New York Academy of Sciences 1149 (1):16–19. https://doi.org/10.1196/annals.1428.046.
Lee, P., S. Bubeck, and J. Petro. 2023. “Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine.” New England Journal of Medicine 388 (13):1233–1239. https://doi.org/10.1056/NEJMsr2214184.
NASEM (National Academies of Sciences, Engineering, and Medicine). 2018. Biodefense in the Age of Synthetic Biology. Washington, DC: The National Academies Press. https://doi.org/10.17226/24890.
Titus, A., and A. Russell. 2023. “The promise and peril of artificial intelligence—Violet teaming offers a balanced path forward.” arXiv. https://doi.org/10.48550/arXiv.2308.14253.
Titus, A. J., E. van Opstal, and M. Rozo. 2020. “Biotechnology in defense of economic and national security.” Health Security 18 (4):310–312. https://doi.org/10.1089/hs.2020.0007.
U.S. Department of Defense. 2018. Summary of the 2018 National Defense Strategy of the United States of America. https://dod.defense.gov/portals/1/documents/pubs/2018-national-defense-strategy-summary.pdf (accessed November 17, 2024).
Wong, F., E. J. Zheng, J. A. Valeri, N. M. Donghia, M. N. Anahtar, S. Omori, A. Li, A. Cubillos-Ruiz, A. Krishnan, W. Jin, A. L. Manson, J. Friedrichs, R. Helbig, B. Hajian, D. K. Fiejtek, F. F. Wagner, H. H. Soutter, A. M. Earl, J. M. Stokes, L. D. Renner, and J. J. Collins. 2024. “Discovery of a structural class of antibiotics with explainable deep learning.” Nature 626 (7997):177–185. https://doi.org/10.1038/s41586-023-06887-8.
Xu, H., N. Usuyama, J. Bagga, S. Zhang, R. Rao, T. Naumann, C. Wong, Z. Gero, J. González, Y. Gu, Y. Xu, M. Wei, W. Wang, S. Ma, F. Wei, J. Yang, C. Li, J. Gao, J. Rosemon, T. Bower, S. Lee, R. Weerasinghe, B. J. Wright, A. Robicsek, B. Piening, C. Bifulco, S. Wang, and H. Poon. 2024. “A whole-slide foundation model for digital pathology from real-world data.” Nature 630 (8015):181–188. https://doi.org/10.1038/s41586-024-07441-w.