CHAPTER 9 Perspectives on AI from Across the Disciplines
NAS/NAE/NAM Working Group Members
Authored by the working group participants who are members of the National Academy of Sciences (NAS), the National Academy of Engineering (NAE), or the National Academy of Medicine (NAM), the perspective pieces included in this chapter responded to the conveners’ request to briefly describe how AI had or could transform their disciplines. We include an edited digest of the responses they delivered at the two Sunnylands convenings. Academy members are listed in alphabetical order followed by their respective NAS, NAE, or NAM section(s).
—Kathleen Hall Jamieson, William Kearney, and Anne-Marie Mazza
David Baltimore (NAS, NAM)
Microbial Biology
Biochemistry, Cellular and Developmental Biology, Medical Microbiology and Immunology, and Genetics
Biology has been undergoing a continual revolution since I began working in biology in 1960. And the depth of that revolution, call it every five years, is astounding. The things that we think about today and the things that we do today bear no relation to what we did five years ago, ten years ago, 20 years ago. We’ve had to adapt to this continuing revolutionary behavior because it’s exciting and because each revolution generates a new depth of understanding.
Now the latest involves melding AI into what we built over these many years, and the results have been astounding. AlphaFold, which allows us to predict fairly accurately the structure of proteins from sequence alone, something we dreamed of doing, is itself one of these revolutionary moments. So we in biological research have thought a lot about how you control something moving at this extraordinary pace. Most of you will be aware of the Asilomar process of many years ago, where we established a procedure for taking on a revolutionary methodology that might have downsides as well as upsides. That set the stage for the more recent adoption of genome editing into our portfolio of techniques, with its potential to change the inheritance of our whole species. I hope that thinking about these precedents will be on the agenda of this meeting, and that biologists can be a little helpful with the experience that we’ve had in thinking about regulatory issues and the other meta issues that allow science to move forward.
Vinton G. Cerf (NAS, NAE)
Computer and Information Sciences
Computer Science and Engineering
In 1943, Warren McCulloch and Walter Pitts proposed a mathematical model of the artificial neuron. In 1957, Frank Rosenblatt built on this concept to implement the perceptron, a neural network with three layers capable of classifying groups of objects that are linearly separable (e.g., by drawing straight lines in a two-dimensional space). More complex separation functions required more layers, as researchers discovered in subsequent work.
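A minimal sketch of what such a classifier does, with a perceptron-style update rule written in Python (the toy data points and the number of training passes are invented for illustration):

import numpy as np

# Two hypothetical groups of points that are linearly separable in two dimensions.
X = np.array([[2.0, 1.0], [1.5, 2.0], [3.0, 3.0], [-1.0, -2.0], [-2.0, -1.5], [-3.0, -0.5]])
y = np.array([1, 1, 1, -1, -1, -1])

w = np.zeros(2)   # weights of the separating line
b = 0.0           # bias term
for _ in range(10):                          # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:    # point is misclassified: nudge the line toward it
            w += yi * xi
            b += yi

print(w, b)   # w·x + b = 0 is the straight line separating the two groups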
The term artificial intelligence (AI) was coined at a time when computers were still relatively new (the mid-1950s) and had been called “artificial brains” by some. There was a kind of hubris in this nomenclature. Researchers began with heuristic programs that seemed to exhibit intelligent behavior some of the time. A more codified version of AI emerged, called expert systems, that used a programming structure based on if-then-else logic. For example, “if this symptom is present, then there is x probability of some diagnosis, else check for a different symptom.”
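A single rule of that if-then-else kind might look like the following minimal sketch in Python; the symptom names and probabilities are invented for illustration, not drawn from any real expert system:

def diagnose(symptoms):
    # If this symptom is present, then there is x probability of some diagnosis,
    # else check for a different symptom.
    if "fever" in symptoms:
        return ("possible infection", 0.7)
    elif "cough" in symptoms:
        return ("possible bronchitis", 0.4)
    else:
        return ("no rule applies", None)

print(diagnose({"fever", "headache"}))   # ('possible infection', 0.7)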
In the 1980s, some researchers returned to multilayer neural networks to solve more complex classification problems. The primary computations required to adjust the weights of each neuron are forms of matrix multiplication. With the repurposing of graphics processing units, whose original purpose was the fast rendering of textured triangles, to do fast matrix multiplications, neural networks became capable of surprisingly effective applications including speech recognition, translation, image classification and, ultimately, text and multimodal synthesis in response to a prompt. At Google, even more specialized hardware, called tensor processing units (TPUs), has been developed.
This generative form of AI has become topic A in the last few years, leading to a great deal of hyperbolic speculation about the capability of these specialized computing systems we now call large language models (LLMs). Some of them can write software. Others can distinguish and classify images. Others can synthesize voices, videos, and images from text prompts with considerable creativity. It is not hard to become excited to see systems like ChatGPT producing poetry like this haiku, which resulted from a prompt: “write a haiku about a rose in Shakespearean style”:
ChatGPT:
In fairest garden,
Sweet rose blooms ‘neath summer’s gaze,
Beauty’s fragrant blush.
This may be compared to one of Shakespeare’s related sonnets written in 1609:
From fairest creatures we desire increase,
That thereby beauty’s rose might never die,
But as the riper should by time decease,
His tender heir might bear his memory:
But thou, contracted to thine own bright eyes,
Feed’st thy light’s flame with self-substantial fuel,
Making a famine where abundance lies,
Thyself thy foe, to thy sweet self too cruel:
Thou that art now the world’s fresh ornament,
And only herald to the gaudy spring,
Within thine own bud buriest thy content,
And tender churl mak’st waste in niggarding:
Pity the world, or else this glutton be,
To eat the world’s due, by the grave and thee.1
Many more compelling examples could be offered. Some of these LLMs can produce software. Others generate text, sound, imagery, or video based on text prompts. Others write essays or provide advice in response to queries.
One might reasonably wonder, “Why are these artifacts so seemingly creative, knowledgeable, and intelligent?” The training of the neural networks involves the ingestion of large quantities of text that has been tokenized. A token can be a word or a phrase. A high-dimensional model is created to capture the probability of a word occurring after the input of a prompt or a line of text. At the start of training, an LLM is formed using random weights associated with the neurons of a multilayer neural network, so the model begins with a fairly poor representation of the statistical relationships among the many (sometimes billions of) tokens. The training consists of presenting the model with partial sentences and asking it to “fill in the blanks.” In what is called back-propagation, good responses, for some value of good, reinforce the weights that produced them, while bad responses cause the weights to be adjusted to be less likely to produce the undesired response.
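As a deliberately tiny sketch of this fill-in-the-blank training, the loop below trains a single-layer next-token predictor in Python; the eight-word corpus, the learning rate, and the single weight matrix standing in for a full multilayer network are all invented for illustration:

import numpy as np

corpus = "the rose is red the rose is sweet".split()   # hypothetical toy corpus; each word is a token
vocab = sorted(set(corpus))
ix = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, V))   # weights start out random, as in the text

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Show the model a token and ask it to fill in the next one; good guesses are reinforced,
# bad guesses make the weights less likely to produce them (the gradient step below plays
# the role that back-propagation plays in a real multilayer network).
for _ in range(200):
    for prev, nxt in zip(corpus[:-1], corpus[1:]):
        p = softmax(W[ix[prev]])        # predicted probabilities for the next token
        grad = p.copy()
        grad[ix[nxt]] -= 1.0            # compare the prediction with the observed continuation
        W[ix[prev]] -= 0.1 * grad       # adjust the weights accordingly

print(vocab[int(np.argmax(W[ix["rose"]]))])   # most likely token after "rose" (here, "is")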
A text LLM is essentially a statistical reflection of the text that it was trained on. It could be thought of as a compressed representation of the ingested text. It should not be surprising that this statistical representation produces conceptually coherent output that mimics human discourse and even reasoning. After all, the content used to train the model had meaning that was expressed more or less coherently. The statistical model of which words might reasonably follow preceding words contains some of that knowledge. If the source material is grammatically correct and uses a broad vocabulary of words, it can mimic human discourse in convincing ways.
It is precisely because of this convincing mimicry that one is led to imagine that the LLM (or “bot”) is nearly sentient. Of course, it is nothing of the sort. It is simply a generative system that is driven in part by the coherent expression of fact and belief contained in the training material.
Because of its statistical nature, a generative LLM can also produce counterfactual output in response to prompts. The training data often lack indications of context, so the generated text may string together words drawn from different contexts and thereby produce false assertions. This phenomenon is sometimes referred to as hallucination. We are still some distance from understanding how to curb this tendency.
Perhaps even more disturbing is that these generative systems produce generally very good-quality sentences. These often sound very anthropomorphic: “I am just a chatbot.” The self-reference imbues the system with the verisimilitude of humanity and self-consciousness. Users of these systems sometimes see the responses as empathetic and often credit them with a social awareness that the LLM does not deserve. Of course it sounds human; it was derived from the expression of human discourse and writing!
These systems produce the illusion of human discourse and are often extremely convincing, even when completely wrong. We are learning to use them in myriad ways but should be wary of being misled by the glib responses to our prompts. Critical thinking is our friend in the use of these artifacts. At some point, perhaps there will also be norms, rules, and even regulations to protect users from taking advice that sounds authoritative but is dangerously wrong. We have a lot to learn about these complex artifacts and meanwhile should be wary of their application for anything that might be high risk.
Joseph S. Francisco (NAS)
Chemistry
In the field of chemistry, AI is being increasingly embraced by both publishers and researchers. Its application has surged significantly in recent years, revolutionizing various aspects of journal operations. AI is now instrumental in enhancing manuscript quality by aiding authors in refining their work with greater precision. Moreover, AI’s role extends into the production process, a development that many, including myself, were previously unaware of. However, this rapid integration of AI also brings with it a few growing concerns.
One major concern is the role of AI in aiding paper mills. AI can help paper mills produce papers that evade detection, making it challenging for journals to identify submissions originating from paper mills. The papers produced by paper mills often lack real data, have manipulated images, and have authors without institutional emails or public records, who are hard to trace.
AI tools are being used by publishers to detect paper mill submissions by reviewing visual content and sub-images to identify discrepancies. Such tools can flag duplicated and manipulated images and figures before publication, enabling publishers to correct unintentional errors or reject manipulated manuscripts. Publishers are actively working to detect these kinds of submissions to maintain the integrity of the research they publish in their journals.
In chemistry, analytical chemistry and biochemistry lead in integrating AI into their research compared to other subfields. However, areas like organic synthetic chemistry have not yet seen AI’s influence. Despite this, the field holds tremendous potential for AI to facilitate the discovery of new molecules.
An emerging issue in organic chemistry involves the use of AI to generate synthetic procedures. The question is whether AI can reliably produce procedures to synthesize molecules. Additionally, if a novice chemist follows an AI-generated procedure without adequate experience, it could lead to dangerous situations. To mitigate this risk, we need to ask upfront questions: Should there be post hoc filtering for the synthesizability of AI-generated results? If a procedure is generated by AI without validation, who is held accountable—the AI, the user, or the journal? This presents significant chemical safety issues that have not yet been fully addressed.
AI systems are typically trained on representative datasets, learning from these datasets to formulate predictions based on observed patterns or to generate new data. Consequently, AI models require accurate and readily accessible datasets. However, the reliability of the databases used to train AI remains a significant concern. Many databases that AI systems rely on lack reliability, even though some dependable databases do exist. The success rate of various databases and libraries used to train AI is not well established. An important benchmark in this context is the number of synthetic steps required to produce molecules generated by AI. This emerging issue in synthetic chemistry might explain why fewer researchers are integrating AI into this area.
Barbara J. Grosz (NAE)
Computer Science and Engineering
Generative AI models have been changing computer science in the various ways John Hennessy describes for engineering fields at large, and their ability to help researchers find relevant prior work has the advantages and challenges he notes. More profoundly, generative models are providing new ways of interacting with computer systems, and, as they have proven capable of producing useful segments of code, they are radically changing the ways programmers work. Computer science education is changing as a result.
AI research in natural-language processing has had as its goals understanding people’s linguistic capacities and building systems that could match those capacities. It aimed, in part, to enable systems to participate in dialogues similar to those that occur when people talk with one another. Generative AI methods have enabled stunning successes in natural-language processing. In myriad ways, dialogues with systems based on these models now help people more easily use computer systems to find information and accomplish tasks across a broad range of domains. Though carried out in the languages people ordinarily speak, and thus more natural than programming, these dialogues lack certain features of human-to-human dialogue. A kind of guided, sometimes collaborative, search for an answer, suggestion, or solution, they are a mixture that is best captured by the new phrase (and job opportunity), prompt engineering: the prompts are in a natural language, but the need for engineering is a symptom of the distinction.
The engineering must be good engineering; for that, anyone who does prompt engineering needs to learn not only effectiveness and efficiency but also ways to judge the quality of the results. Years ago, a theoretical physicist railed at me that computing was making his students less competent modelers. They “just coded,” without questioning the answers they got back; they trusted the computer and had not developed intuitions that immediately made them consider whether the answers it generated made sense. For any complex computing system, it is hard to know whether a program does what one intends and expects it to do. Our current inability to understand why generative models produce the answers they do, and the hallucinations for which they are well known, exacerbate the problem of knowing whether the code they produce actually correctly performs the functions a user intends. Computer scientists are as susceptible to pro-automation bias—the assumption that if a computer produces an answer, it is right—as others. Generative AI thus raises a critical new challenge for AI researchers, that of verifying the results that prompt dialogues produce. Meeting this challenge is likely to require expertise from several other areas of computer science, for instance, work on program verification and on interaction techniques. Notably, the new methods that are developed could be useful far beyond programming and computer science.
Computer science education is evolving in light of these new AI capabilities. Students, like professional programmers, are now using generative AI systems to code for them. They differ from professional programmers, though, in the amount of experience they have programming without such support. How will they develop intuitions for detecting if the AI system has provided good code? What new skills do they need to learn for debugging and testing? The powers that generative AI has released make it even more important for computing researchers, developers, and systems’ deployers to consider not only what systems they could build, but what systems they should build and the right way to design them, taking account of their potential impacts on communities and societies as well as individuals. Teaching the skills to reason about such matters is also beginning to become part of computer science education in some institutions.
John L. Hennessy (NAS, NAE)
Computer and Information Sciences
Computer Science and Engineering
We find ourselves in an interesting and fast-moving era. As Eric Horvitz and Tom Mitchell have discussed in their survey chapter, the emergence of deep learning has created a discontinuity in the capabilities of AI systems. Many new engineering faculty members across a wide variety of disciplines include machine learning in the description of their research. We are seeing an incredible revolution in engineering in which these machine learning techniques are going to be used as scouts to find novel approaches to problems and as tools to narrow the solution space, particularly for complex, high-dimensional optimization problems.
For example, researchers exploring new battery structures might use deep-learning techniques to search for materials that avoid some of the downsides of lithium. Another researcher might explore new methods to capture methane. One of the most amazing applications I have seen is the use of machine learning to understand turbulence and turbulent flow. Turbulence is a classically hard problem that has resisted most of our numerical attempts. A breakthrough in analyzing and understanding turbulent flow would have applications in the design of wind turbines, automobiles, trains, and planes, as well as applications in other areas.
Of course, these deep-learning systems can be joined with traditional computational methods, as AlphaFold does. AlphaFold isn’t just an AI system. It uses computational techniques as well. Melding these techniques together allows a researcher to combine the strengths of each. The deep-learning system may work better to determine the overall structure of a protein, while computational techniques may be more useful at fine-tuning the structure.
For engineers, finding the general structure of a solution is only step one of a process to realize a product that efficiently solves a real problem, which in the end is what drives engineers. Of course, you must worry whether an AI system is guiding the researchers in the right direction. Just as in other applications of machine learning, verification of the accuracy of predictions will be important, and that will likely require human intelligence for some time to come.
Eric Horvitz (NAE)
Computer Science and Engineering
Reflecting on the current state of AI, I find myself immersed in two interrelated realms: the scientific advancements of AI and their societal impacts. We are in an exciting period for AI, with the capabilities of neural network models rising faster than our understanding of the principles underlying the emergent behaviors we are observing. These advancements have stimulated scientific curiosity and catalyzed new directions for AI research, bringing novel questions, methods, energy, and intensity to colleagues and teams that I collaborate with. Simultaneously, the rapid diffusion of AI tools into everyday life has deepened my sense of responsibility regarding the short- and long-term societal influences of AI technologies. I have invested increasing time and resources in reflecting on and addressing potential disruptions, ethical concerns, and the opportunities AI presents in various realms.
Scientific Journey
I was drawn to do my doctoral work in AI as the best path forward for gaining an understanding of the mysteries of human cognition. Close colleagues and I contributed to the ignition of a probabilistic revolution in AI, moving away from the dominant logic-based methods of the time, and working to advance the development of AI systems based on a foundation of probability and utility theory. The axioms of probability were extended in the 1940s to inferences about taking ideal actions in the world via the axioms of utility theory, as first formulated by von Neumann and Morgenstern. Probability and utility theories form a widely assumed and celebrated set of principles that have been considered a normative basis for rational reasoning and decision-making. There are multiple challenges with building AI systems in accordance with these principles, including computational complexity. A long-term complaint in AI was that the normative basis was unrealistic in terms of the requirements for computational resources. I focused during my doctoral efforts and for many years later on developing principles and models of bounded rationality built on a foundation of probability and utility that could enable systems with limited computational resources to perform well amid the complexity of the open world. The work included the development of formal mechanisms for guiding evidence gathering and inference. Other teams explored numerous other approaches for leveraging probability in representations and reasoning. This shift to a rationalist approach to AI—harnessing a normative foundation of probability and utility—became central in advancing machine learning, perception, reasoning, and decision-making.2 The approach enabled the community to build systems that could address real-world challenges, such as providing recommendations on medical diagnoses and decisions. The rationalist approach provided clear semantics and a strong theoretical foundation for building systems operating on understandable and sound principles.
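As a minimal worked illustration of the normative principle described here (choose the action with the highest expected utility), written in Python with hypothetical actions, probabilities, and utilities invented for the example:

# Expected utility of an action = sum over its outcomes of probability × utility.
actions = {
    "treat":    [(0.8, 0.9), (0.2, 0.3)],   # (probability, utility) pairs for each outcome
    "wait":     [(0.5, 1.0), (0.5, 0.2)],
    "run test": [(0.9, 0.7), (0.1, 0.6)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, round(expected_utility(actions[best]), 2))   # 'treat' 0.78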
Recent advancements in neural network models mark a significant inflection point in AI’s trajectory.3 Impressive capabilities and rates of improvement are seen in vision, speech recognition, and language understanding benchmarks. Generative AI has recently emerged with models being built at increasing scales demonstrating surprising powers in generating language, images, video, and molecules. Neural-network models are being harnessed in numerous areas, including the sciences. For example, advances in predicting protein structure and drug design are accelerating research in the biosciences, including efforts to design new therapeutics.
Despite the excitement, we grapple with the relationship of neural models to prior advances. In distinction to the clarity of previous work based on the rationalist approach, much of the detailed operation of generative models remains a mystery. Neural networks have thrust us into empirical studies of these large-scale systems, akin to methodologies for probing and studying nervous systems.4 This jump, from a successful multidecade trajectory of advances with rationalist approaches in AI to the mysteries of neural networks, frames intriguing opportunities to pursue significant questions that remain unanswered. We face a critical scientific challenge of bridging the gap between empirical observations of the behavior of neural networks and foundational principles of well-understood theories of inference and action. I hope to see bridges constructed over the next decade.
Societal Implications and Responsibilities
I believe that AI scientists and engineers have a critically important role and responsibility to identify and share technical developments that have implications for people—and society more broadly. Responsibilities include informing and engaging with multiple stakeholders across domains and sectors and working to broaden awareness and participation. This work involves being available for expert consultations, organizing and participating in special meetings and engagements around milestone developments, and establishing committees, organizations, and initiatives for tracking, guiding, and communicating about AI advances over time.
Fifteen years ago, AI was beginning to make its way into real-world applications as I assumed the presidency of the Association for the Advancement of Artificial Intelligence (AAAI). I themed my presidency “AI in the Open World,” highlighting the need to develop AI systems that could perform robustly and in a trustworthy manner on real-world tasks, and also our responsibility to understand and address the potential societal impacts of the AI systems that we build.5 To explore societal influences, I commissioned the AAAI Presidential Panel on Long-Term AI Futures. This initiative culminated in a retreat at Asilomar in 2009, chosen for its symbolic connection to the historic meeting on recombinant DNA.6 The clear value of the discussions and collaborations at the AAAI Asilomar retreat and pre-meetings sparked the establishment five years later of the One Hundred Year Study on AI at Stanford, which was created to bring experts together every five years to observe, synthesize, and provide assessments and guidance in the spirit of the AAAI Asilomar meeting.7 The study is endowed to continue this process for the life span of Stanford University. Projects of the study include the creation of faster-paced analyses, including the AI Index, an annual assessment of AI capabilities and influences.8
Beyond recurrent studies by experts, the ubiquity of AI’s influences requires that diverse voices participate and help to guide the development and use of AI systems. AI scientists have a responsibility to organize, alert, and educate a spectrum of stakeholders—as well as to establish venues for listening and responding. In 2016, AI scientists from industry, academia, and nonprofit research centers cofounded the Partnership on AI, bringing together stakeholders from industry, academia, and civil society to foster discussions and analyses and to make recommendations on the responsible advancement of AI.9 As the founding chair, I’ve observed the power of bringing scientists together with policymakers, civil liberties experts, and a broad spectrum of civil society organizations. While still in its first decade, the Partnership on AI has already made significant contributions to multiparty collaboration on key topics.
With potential fast-paced developments, AI scientists may need to engage quickly at times and bring diverse expertise to the table as early as possible when new capabilities and issues arise. Given the behaviors I saw in our internal studies of an early prerelease version of GPT-4 in August of 2022, I felt it important to gain permission to share the confidential prerelease model with experts across disciplines. This initiative led to the AI Anthology effort, which provides multiple viewpoints on how the new capabilities might be best leveraged for human flourishing.10
AI scientists need also to inform and provide guidance to government agencies and leaders about technical advancements with AI and work with policymakers on steps forward. It has been an honor to be invited to testify on AI at both open hearings and closed sessions of Congress11—and to have opportunities to engage with senior leadership at the White House and colleagues via my role as a member of the President’s Council of Advisors on Science and Technology (PCAST).
These diverse projects, engagements, and organizational efforts are examples of AI scientists’ responsibilities to engage and inform across sectors, to work to broaden awareness and participation, and to promote research on AI’s responsibilities, ensuring that we include multiple voices in assessments and decisions, and that we stay ahead of the innovation wave with technical, sociotechnical, and regulatory advancements.
Moving Forward
Looking ahead, the interplay between AI’s scientific advancements and societal impacts will become even more critical. We urgently need to grow our scientific understanding of the operation of systems built on neural network methodologies. Better scientific understandings will help us to shape the development and application of safe, reliable, fair, and understandable AI methods. We need to complement curiosity-driven research and the thrill of scientific breakthroughs in AI foundations with investments in technology and policy to understand, shape, and regulate influences of the technologies on people and society. This work includes ongoing study spanning technology, design, and psychology of human-AI interaction.12
The potential benefits of AI are immense—from accelerating scientific discovery to improving education and raising the quality of health care outcomes. However, we have to consider recognized risks, particularly with information and media integrity, biosecurity, fairness and equity, safety and reliability, and privacy and security. We must also stay on top of “deep currents” of more complex interactions of AI with culture and society, such as how these systems may change and disrupt—in costly and in valuable ways—education, the creative arts, scientific discovery, jobs, and the economy. We must work to monitor and come to better understandings of the subtle but potentially powerful influences of AI applications on the human psyche, including the impacts on our human dignity and agency.13 Outcomes need not be dominated by situations and equilibria reached via laissez-faire flows of technology into society. With the maturation of AI and its applications, we have opportunities to manage and guide the technology with foresight and responsibility.
The current state of AI is marked by fast-paced progress and significant challenges. As a scientist driven by curiosity about human cognition and devoted to reaching understandings of computational principles of intelligence, I’m excited by potential AI discoveries, machinery, and new applications on the horizon. At the same time, I am cautious and concerned about the influences of AI innovations on people and society. We need to make investments in steering AI’s development to promote human well-being and societal progress. Through continued scientific exploration and a thoughtful, inclusive, and multidisciplinary approach to applications and influences, we can leverage AI as a force for good, advancing our understandings of the scientific foundations of intelligence and enriching human society. AI scientists, with their unique insights, must lead at the frontier, providing awareness of developments and implications and a commitment to engage with the public, civil society organizations, government leaders and agencies, and experts across various fields to address these responsibilities and to help shape AI’s future.
Kathleen Hall Jamieson (NAS)
Social and Political Sciences
Three of our retreatants—Barbara Grosz, Mary Gray, and John Hennessy—played important roles in shaping the germinal National Academies of Sciences, Engineering, and Medicine report, “Fostering Responsible Computing Research: Foundations and Practices,” that grounds our deliberations. By observing that “The social and behavioral sciences provide methods for identifying the morally relevant actors, environments, and interactions in a sociotechnical system,” that report draws attention to the role that the social and behavioral sciences should play in framing discussions and decision-making about generative AI.14
It is the behavioral and social sciences, for example, that remind us that our language and our frames embed assumptions about ethics and equity about which we are largely unaware, a point made in Chapter 7 by Shobita Parthasarathy and Jared Katzman. Raw data, for instance, are not “raw” but rather the product of choices and the values of those who frame the research questions, privilege some methods over others, and in the process determine what is and is not considered evidence and proof. At the same time, conventionalizing the language of “artificial intelligence” risks changing our sense of what it means to say that someone or something is intelligent.
Social scientists who focus on human interaction and the ways in which humans act within social and political structures are grappling with such questions as: How does what we humans know, how we know it, and how we interact with each other and make sense of our worlds change if AI is layered atop the dispositions that humans have to deceive, distort, and act on their fears and venal impulses?
FactCheck.org, which I cofounded, was premised on the idea that journalists could arbitrate disputes about “fact” by turning to evidence in impartial trusted sources such as the Bureau of Labor Statistics and the National Academies that honor scientific norms and have generated reliable knowledge in the past. That common knowledge could in turn help ground deliberation and governance. However, in an AI world, someone who seeks out the National Academy of Sciences’ website may find a hyperrealistic but fake site featuring a supposed President Marcia McNutt, who looks, sounds, and seems more like Marcia than Marcia herself but is promulgating pseudo-science. How can factcheckers or the public tell that the deepfake is not deep reality? I wrote a book on how Russian trolls and hackers helped elect a president in 2016. If you add the currently available AI technologies to their equation, the Russians would have succeeded to an even greater extent because their efforts probably would have gone undetected.
As AI scientists are developing ways to identify and constrain AI-generated content, social and behavioral scientists are among those probing its impact both for good and ill on democratic systems and informed voting as well as on how and what we know and how we interact with each other and with these new technologies.
Marcia K. McNutt (NAS, NAE)
Geology
Earth Resources Engineering
In my view, there is hardly a field that has benefited more from AI, yet is also more imperiled by it, than the environmental sciences. The benefits have come because so much of this universe is inaccessible to humans or accessible only at great cost and peril.
Deep space exploration was one of the earliest applications of an AI precursor called “automated planning and scheduling.” These smart systems used sensors on space probes to allow an unmanned vehicle to make its own operational decisions based on what it was learning from its own instrumentation, without having to endure the delay of sending data back to Earth for a human to make the decision.
Deep sea exploration followed suit and delivered an even higher-payoff application. While underwater exploration can be conducted directly by humans, it comes only with much sacrifice. Deep-diving human-occupied submersibles are cold, cramped, and uncomfortable for any length of time, and their use is further limited by high cost and extremely limited range. Remotely operated vehicles are more affordable but require an umbilical-cord tether to provide power from a surface ship and control from a ship-based pilot, because the ocean is opaque to electronic message transmission. However, the tether restricts the spatial extent of the mission. Conversely, automated planning and scheduling installed in autonomous (untethered) underwater vehicles totally revolutionized our opportunities to explore the deep sea, both in the cost and in the complexity of missions. AI-guided vehicles can make their own decisions, execute complex search patterns in all dimensions, collect data and samples, know when the mission is accomplished, and then return home loaded with data and samples. Humans no longer need to be involved in real time. AI-guided autonomous vehicles have greatly reduced the cost of exploration of hostile environments and increased the scientific return.
These systems were likely the forerunners of today’s automated driving routines, except that there was no safety issue in deep space or the deep ocean. If the vehicle misidentified something and ran into it, no one was going to die, as can happen with cars on congested roads.
Other areas of the environmental sciences are benefiting from AI beyond exploration of Earth and space. In meteorology, AI is able to produce more accurate weather forecasts and track dangerous storms like hurricanes. AI could also predict the impact of some interventions on climate change. Using the same advances that allow AI systems to distinguish faces, AI is now regularly used to identify plants, animals, and other natural features from photos. This capability has been a boon to citizen science, for example, in improving the accuracy of annual bird counts.
On the negative side, I am concerned about the impacts on the environmental sciences of very successful and convincing fakeries, especially in terms of climate science. So much is at stake with our response to the current climate crisis that big money will be invested in trying to debunk climate science and in arguing that interventions are not worthwhile. Our ability to detect when AI has been used in malevolent ways to overturn what is strong scientific consensus is constantly being challenged by more convincing fakes.
Saul Perlmutter (NAS)
Physics
In physics, cosmology, and astrophysics, some of the more frequent AI applications that we’ve seen have to do with speeding up simulations. Simulations have become a large part of so much science nowadays. A fast AI mimic of a simulation makes it possible to hunt for rare solution spaces that you would never have considered when the simulation itself was slow.
This also means that AI changes a lot about how we do statistics. Over the years, we’ve moved toward more and more Monte Carlo–style statistics, where you mimic the system that you’re working with, do many renditions of it, and let that give you the statistical contours, rather than calculating them from first principles. This is another real advantage of having a much faster technique for simulation. Statistics plays a huge role in the sciences, one that we don’t usually talk about (unless statistics is your field!). It’s a hidden-in-plain-sight important tool, and I think it’s going to change dramatically with this AI capability.
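A minimal sketch of that Monte Carlo style in Python, assuming a hypothetical fast mimic of a simulation (here just noisy data from a known straight-line model) that is rerun many times to read off the statistical contours of a fitted parameter:

import numpy as np

rng = np.random.default_rng(1)

def fast_mimic(slope, noise_scale=0.5, n=50):
    # Stand-in for a fast AI surrogate of an expensive simulation.
    x = np.linspace(0.0, 1.0, n)
    return x, slope * x + rng.normal(scale=noise_scale, size=n)

# Many renditions of the mimicked experiment give the contours directly,
# rather than deriving them from first principles.
fitted_slopes = []
for _ in range(2000):
    x, y = fast_mimic(slope=2.0)
    fitted_slopes.append(np.polyfit(x, y, 1)[0])   # slope fitted to this rendition

fitted_slopes = np.array(fitted_slopes)
print(fitted_slopes.mean(), np.percentile(fitted_slopes, [16, 84]))   # central value and 1-sigma range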
Large language models also can enable advances that combine different fields, because they make it easier to do cross-disciplinary translations. I’ve already found myself in meetings with people from different fields during which I quickly looked up terminologies, acronyms, and jargon that they’re using in those fields, and this allowed me to be part of the cross-disciplinary conversation within a matter of seconds. Previously, you would have to go back and find this whole body of knowledge. That’s going to be an important game changer since so many scientific developments have to do with working at the edges between different domains in different fields.
Similarly, AI offers more fluid data wrangling. So much data science involves getting data from point A to point B in a form that you can use, and we’re finding that these AI systems are very helpful in making that possible. You can read entire datasets without looking up the manuals. AI can explain to you what every column is, and it actually does a good job in giving you a structure to be able to work with data that you might otherwise never have accessed.
Finally, people can take more mathematically sophisticated approaches, because you can hand off entire mathematical derivations much as a calculator helps you with an arithmetic problem. And so that makes some activities much faster.
I don’t yet know about the idea of using AI to stimulate new ideas, like feeding the AI a bunch of papers and asking it “What’s missing here?” I can’t tell yet whether this is already something that’s becoming useful or whether it’s something that we might expect to become useful in the next generation. Here, a big concern is that we don’t want to get into idea feedback loops where the AI is training on material that comes out of people working with the previous generation of AI. We want to make sure that we don’t inadvertently feed our AI-generated material back into the AI training.
William H. Press (NAS)
Computer and Information Sciences
AI will be transformative, but I am not waiting for the transformation. Right now, I use it every day for quick facts and a range of administrative and programming tasks that I would once have characterized as frustrating, fussy, or boring. (Of course, I always check the results.) Here are a few of my recent prompts:
1. “From various journals I have cut and pasted a bunch of references below for a paper I am writing. Please convert them all to PNAS format.”
2. “Where did the funding of the Einstein Foundation in Berlin originally come from? I want to be sure that it is not money from a controversial industrial source. Please check your answer against reputable Web sources.”
3. “What serious human diseases are thought to already have been endemic in the Native American population in pre-Columbian times?”
4. “Give specific names of good reviewers for a paper that builds a large NN model (not an LLM) for predicting results from a large combinatorial biology experiment? It’s similar to, but different from, drug-discovery, so I want people with broader ML and NN experience. I especially want names of junior faculty at good universities.”
5. “In Python with Numpy, if I write something like neww = oldd[3:6,10:15], does neww point to data within oldd, or is a copy made?”
6. “I have a Jupyter notebook named mynotebook.ipynb . In Python, how can I extract the text of a particular cell and then reformat it to LaTeX format? The cell I want begins with the comment #ThisCellPlease.”
7. “In Python with Pandas and Numpy, I have a dataframe df. All entries are small integers. I want to make a large crosstabulation where each column is expanded to its number of unique values. So, the crosstabulation will be an N by N matrix where N is the sum of the number of uniques for each column. How do I do this? Code only, please, no explanations.”
8. “I have a very big numerical dataframe and want to fit it with a Gaussian copula, and then generate synthetic rows from the fitted model. Show me PyTorch code for doing this efficiently on a CUDA GPU.”
9. “I have an HTML and PHP page that uses Google’s Recaptcha v2 like this:
$recaptcha = $_POST["g-recaptcha-response"];
$secret_key = 'my-secret-key';
$url = 'https://www.google.com/recaptcha/api/siteverify?secret=' . $secret_key . '&response=' . $recaptcha;
$response = file_get_contents($url);
$response = json_decode($response);
What would the code be to upgrade this to Google’s Recaptcha v3?”
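Prompt 5 above, for example, asks about behavior that is easy to check directly; a minimal sketch of the answer (basic NumPy slicing returns a view onto the original array, not a copy):

import numpy as np

oldd = np.arange(200).reshape(10, 20)
neww = oldd[3:6, 10:15]     # basic slicing: neww is a view onto oldd's data

neww[0, 0] = -1             # writing through the view...
print(oldd[3, 10])          # ...changes oldd as well (prints -1)
print(neww.base is oldd)    # True: neww shares oldd's memory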
Jeannette M. Wing (NAE)
Computer Science and Engineering
We are witnessing unfettered growth in the deployment of AI systems in critical domains such as autonomous vehicles, criminal justice, education, health care, and public safety, where decisions taken by AI agents directly impact human lives. This growth underlines the need for computer scientists to understand and harness this technology better.
We need a scientific understanding of why today’s AI models work so well. We do not know their mathematical properties. We do not know how to quantify or predict their behavior. We do not know how to explain why an AI model produces one result and not another. Small perturbations to input data can lead to wildly different outcomes. When will adding more compute and more data to build larger models hit a wall? Experimentation in AI is far ahead of any kind of theoretical understanding.
We need trustworthy AI.15 How can we trust decisions made by AI models to be accurate, fair, reliable, robust, safe, and secure, especially under adversarial attack? One approach is to use formal methods, based on mathematical logics and symbolic reasoning, to provide provable guarantees about AI systems. Formal methods applied to AI would require probabilistic reasoning and characterizing verifiable properties of real-world data.
AI raises new ethical issues. The Belmont Principles of beneficence, justice, and respect for persons are a good starting point for AI. They need to be lifted to operate on groups of individuals, not only on individuals. Finally, we need to revisit the codes of conduct in all professions that incorporate the use of AI.
Michael Witherell (NAS)
Physics
I am speaking as a leader of Berkeley Lab, where I have the privilege of leading 1,600 scientists working in a wide range of science and technology. As part of my job, I’ve had the joy of reading published impactful research on applying machine learning techniques in cosmology; particle and nuclear physics; materials science; synthetic biology; matter genomics; environmental biology; geoscience; climate modeling; and smart grid, water treatment, and accelerator operations.
AI has had a transformative effect across all these fields of science, but much of the effort is invested in developing stable, robust, interpretable methods that can be explained, exposed, and verified under the skeptical scrutiny of researchers in these fields. That is, in fact, where much of the work has gone. Often the primary barrier to accelerating R&D using AI is not the computing power available but rather the size and quality of the experimental or computational datasets available for training the models.
As an example, most of the data on local ecosystems were collected in small projects, producing specialized datasets in many areas that cannot support meta-analysis in general, let alone AI, unless we make them interoperable. Several groups around the country are working on projects to integrate these datasets, including one at Berkeley Lab.
I would like to offer another quick example that has recently been in the news. Researchers have developed fast, agile, and reliable weather models using AI that offer an unprecedented level of high-resolution information. Such models could provide improved guidance to prepare communities for extreme weather events. In a very short time, these models have gotten to the point that their results are as reliable as those of the traditional models and can be run much faster. The new models produce a range of scenarios, each one taking less than two seconds, which is several orders of magnitude faster than existing models. One can now create huge ensembles of predicted weather outcomes, greatly increasing the ability to forecast low-probability, high-impact events. Consider the lives that could be saved if such models can be made very reliable and if the predictions they make can be communicated in a way that is trusted by the public.
Most of the AI-enabled advances in research to date have been accomplished with special purpose models. Because the remarkable general-purpose large AI models are so new, we still need to understand their full potential to accelerate scientific research. If we consider the fields of science in which the data is not personal data, the principal risk is that an apparent discovery might be due to an artifact or a hallucination by the model. How do you show that this new type of black hole is real and not something that was manufactured by the model? One must closely embed computer scientists with physical scientists, biologists, and climate scientists from the beginning of the research project. By working together as an integrated team they can develop analytic tools that will produce verifiable results able to stand up to rigorous scrutiny by the scientific community.
Finally, although many of these areas do not work with human data, they still can have complex and sensitive interactions that have ethical and societal implications. For example, consider a system for detecting, measuring, and reporting methane leaks using satellite data and ground-based observations, all integrated with AI. This is a really important problem with great significance for the global community. Who is to be trusted with the design and use of such a system? An oil and gas company, a consortium of utilities, the US Department of Energy? The governance of such a system is critical in making sure it serves all of us well.
Notes
1. William Shakespeare, Sonnet 1, 1609.
2. John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior (Princeton, NJ: Princeton University Press, 1944).
3. Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, et al., “Sparks of Artificial General Intelligence: Early Experiments with GPT-4,” arXiv preprint arXiv:2303.12712, March 22, 2023, https://arxiv.org/abs/2303.12712.
4. Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov, “Locating and Editing Factual Associations in GPT,” Advances in Neural Information Processing Systems (2022), 17359–17372; Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, et al., “In-Context Learning and Induction Heads,” arXiv:2209.11895, September 24, 2022, https://arxiv.org/abs/2209.11895; Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, et al., “Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models,” arXiv (2023).
5. Eric Horvitz, “AI in the Open World, Presidential Lecture,” speech at the AAAI National Conference, Chicago, Illinois, July 2008, https://erichorvitz.com/AAAI_Presidential%20Address_Eric_Horvitz.pdf.
6. “AAAI Presidential Panel on Long-Term AI Futures,” Association for the Advancement of Artificial Intelligence, 2009, https://aaai.org/about-aaai/aaai-presidential-panel-on-long-term-ai-futures-2008-2009.
7. Eric Horvitz, “One Hundred Year Study on AI: Reflections and Framing,” Stanford University, 2014, https://ai100.stanford.edu/about/reflections-and-framing.
8. The AI Index Annual Report 2024 (Palo Alto, CA: Stanford University, 2024), https://aiindex.stanford.edu/report/.
9. Partnership on AI, https://partnershiponai.org.
10. Eric Horvitz, “Reflections on AI and the Future of Human Flourishing,” AI Anthology (2023), https://unlocked.microsoft.com/ai-anthology/eric-horvitz.
11. Eric Horvitz, Reflections on the Status and Future of Artificial Intelligence, Testimony Before the United States Senate, Hearing on the Dawn of Artificial Intelligence, Committee on Commerce Subcommittee on Space, Science, and Competitiveness (November 30, 2016), testimony: https://erichorvitz.com/Senate_Testimony_Eric_Horvitz.pdf; video: https://youtube.com/watch?v=fl-uYVnsEKc; Eric Horvitz, AI and Cybersecurity: Rising Challenges and Promising Directions, Hearing on AI Applications to Operations in Cyberspace before the Subcommittee on Cybersecurity of the Senate Armed Services Committee, 117th Cong., May 3, 2022, https://erichorvitz.com/Testimony_Senate_AI_Cybersecurity_Eric_Horvitz.pdf.
12. Abigail Sellen and Eric Horvitz, “The Rise of the AI Co-Pilot: Lessons for Design from Aviation and Beyond,” Communications of the Association for Computing Machinery 67, no. 7 (June 28, 2024): 18–23, https://doi.org/10.1145/3637865.
13. Eric Horvitz, Vincent Conitzer, Sheila McIlraith, and Peter Stone, “Now, Later, and Lasting: 10 Priorities for AI Research, Policy, and Practice,” Communications of the Association for Computing Machinery 67, no. 6 (May 7, 2024): 39–40, https://doi.org/10.1145/3637866.
14. National Academies of Sciences, Engineering, and Medicine, Fostering Responsible Computing Research: Foundations and Practices (National Academies Press eBooks, 2022), https://doi.org/10.17226/26507.
15. Jeannette M. Wing, “Trustworthy AI,” Communications of the ACM 64, no. 10 (October 2021): 64–71, https://doi.org/10.1145/3448248.