Two of the summit’s sessions were devoted to offering some context to the overall topic of finding ways to collaborate on research data. Speakers addressed a broad range of subjects, from current collaborative research projects to non-data issues that need to be considered when developing research data collaborations. This chapter describes those presentations and the takeaways from each.
Martin Halbert, the science advisor for public access for the National Science Foundation (NSF), used the EarthCube family of projects as a case study to illustrate what is required to create successful research data collaborations.
Halbert began his presentation with a mention of Vannevar Bush’s classic 1945 report, Science, the Endless Frontier, which is generally recognized as responsible for setting in motion the federal government’s support of scientific research after the Second World War. In the report, Bush argued that restricting access to the most recent research results is counterproductive, as it hinders the progress of scientists and their ability to build on the work of others. “It’s for this reason that policies and practices to advance collaborative research data sharing are and should be a key pillar of contemporary science,” Halbert said.
He added that data sharing does not happen without effort, emphasizing the costs involved in cleaning the data, documenting the structure of the data, and maintaining the data in accessible repositories. “Almost every decision involved in sharing research data entails cost-benefit trade-offs of this sort,” he said, “and it’s no surprise that it’s so intensely debated today.”
When making decisions on how best to share research, it is also important to consider the principles of justice, equity, diversity, and inclusion, or JEDI, Halbert added. Integrating these principles into research data practices and collaborations, he noted, presents an additional challenge.
Determining the best collaborative practices that consider these principles can be difficult, at least partly because of the variation across scientific disciplines in research practices as well as in practices for collecting and maintaining data. “The key priorities in one discipline may not be an issue or even exist in another discipline because of these variations,” he said, indicating that success in sharing research data in one field might not be generalizable to other fields. Still, it can be valuable to examine such successes and lessons learned. Halbert discussed a case study of the EarthCube family of projects.
EarthCube refers to a series of initiatives and constituent projects funded primarily by NSF’s Directorate for Geosciences and Office of Advanced Cyberinfrastructure over the past decade involving data-intensive endeavors in geoscience. The initiative has created an extended community of more than 2,000 researchers and technologists who have engaged in hundreds of collaborative scientific projects. The impetus for the initiative was the idea that funding collaborative research at the intersection of geosciences and cyberinfrastructure would lead to a broad range of discoveries not only in the geosciences and adjacent interdisciplinary spaces but also in the understanding of generalizable cross-cutting issues in data-intensive science. Halbert noted that this is precisely what happened.
“The key lessons learned in the EarthCube program are effectively a litany of hard-won and valuable—but often inconvenient—truths about collaborative research data practices at scale,” he continued. EarthCube demonstrated that effective data-intensive research requires orchestrated and well-funded efforts by talented individuals from a range of sectors with different skill sets. “The inherently complex nature of cross-cutting research today,” he said, “requires teamwork by not only disciplinary scientific experts but also technologists, information specialists, science writers, program evaluators, and other experts in various operational areas.” In turn, this teamwork requires the careful development of functional organizational structures, rather than the more typical scattershot approach of relying solely on volunteer work through committees. There is a need for leadership commitments and informed planning in order for the collaborative endeavor to set and achieve goals in reasonable time frames. Collaborative research practices also require building a reliable cyberinfrastructure that is periodically refreshed and maintained by dedicated staff, as well as communities of researchers and other interested stakeholders who interact through effectively managed events and interfaces. There needs to be a commitment to the principle that everyone involved is respectful of other perspectives and committed to the success of the extended group rather than to promoting their own self-interest. Careful attention must be given to the incentive structures established, recognizing the diverse interests of the various groups involved in the collaboration. The governance processes created for the project need to prioritize and incorporate the JEDI principles. Finally, sustainability planning needs to be part of the process from the outset, with clear and reasonable expectations for continuity and, where necessary, procedures for phasing out the processes.
“None of this is cheap or easy,” Halbert said. “All of these inconvenient truths require funding to address, and that again comes back to trade-offs. Funding all these elements that may seem to be more boring or mundane—rather than another shiny research toy—usually results in people clamoring to fund everything. But that isn’t possible. We have limited funding; we can’t fund everything we would like to. If we want to realize the true power of collaborative research data, we have to be willing to forgo other things that are also desirable.” Thus, he continued, the decision-making process here must be fundamentally about finding the right balance in expectations. “When we do, the results are wondrous.”
David Hart, professor of public policy at the Schar School of Policy and Government at George Mason University, introduced the forthcoming Foundation for Energy Security and Innovation (FESI).1 When established, FESI will join the ranks of agency-affiliated foundations, which are
___________________
1 FESI was subsequently established in the months following the summit in 2024.
nonprofit partners to federal agencies that carry out various activities to help the agencies achieve their missions. FESI will be affiliated with the Department of Energy (DOE).
Hart highlighted its potential for research data collaborations and encouraged interested summit participants to share ideas with the Friends of FESI: “I would say the main objective [of FESI] is to help the agency move more quickly than it could otherwise,” he said, “and also to hear things that might not otherwise [be heard].” The foundation, similar to other foundations such as the Foundation for the National Institutes of Health (FNIH) and the Foundation for Food and Agricultural Research (FFAR), is authorized by congressional legislation, so it has an “official” imprimatur. It is intended to be flexible and nimble and to act on ideas from inside or outside the affiliated agency, and it has the potential to reduce the costs of collaboration with private-sector and philanthropic partners.
Although the various agency-affiliated foundations have similarities, there are also differences. The National Park Foundation collaborates with the National Park Service to do projects that the park service wants to do but does not have the funding to do. The FFAR works with the Department of Agriculture to facilitate more competitive research. The FNIH, which works with the National Institutes of Health, is the one that most closely parallels what FESI is intended to do. Specifically, FNIH focuses on delivering therapies, tests, and other tools to providers—bridging the well-known “valley of death” between research and practice. This is an approach that the DOE could benefit from adopting.
FESI’s establishment was mandated in the 2022 CHIPS and Science Act (P.L. 117-167), legislation primarily focused on strengthening the U.S. semiconductor industry, with provisions that also support broader scientific initiatives. Prior to the appointment of FESI’s inaugural board of directors in May 2024, Hart worked with a group called Friends of FESI, established by the Federation of American Scientists and the Information Technology and Innovation Foundation’s Center for Clean Energy Innovation. Friends of FESI helped to get FESI off the ground and identify projects that could be reviewed by the board once it was established.
FESI’s projects could be in any of the many areas that DOE is involved in, from basic research to national security, but what Hart is particularly interested in is the commercialization of technologies that can address the climate crisis, including decarbonization technologies. Developing these technologies will require the sharing of data from both public and proprietary sources, he said. “We ultimately need the market to drive technologies
that are going to be solutions for climate, but there’s a public dimension at the beginning of any innovation, and there are public datasets that would be useful to all kinds of innovators.” He added that a variety of questions remain to be addressed, and this summit was designed to explore some of them. For example: What are the norms that might govern those datasets? How much of the data will be considered proprietary? How much becomes public? FESI and DOE will be interested mostly in generating public datasets, but these public datasets can motivate private activity.
According to the CHIPS and Science Act, FESI, similar to other agency-affiliated foundations, will be a 501(c)(3) organization. The legislation outlines certain requirements for FESI’s board, Hart said, prompting the energy secretary to carefully consider the selection of board members, to ensure that all relevant interests are adequately represented. FESI will also need skilled staff, he added—“a staff that understands how the agency works, how private industry works, how philanthropy works, because its function is to be able to pull all these things together into projects.”
Hart added that Friends of FESI had not come up with many data projects at the time the summit was being organized, which is why he was eager to speak. “There have to be a lot of opportunities to build projects around data,” he said. “They have the chance to be a lot less expensive than building nuclear reactors or other things that FESI might contribute to. So, I hope to inspire you a little bit.”
As a point of comparison, Hart noted that the FNIH is now involved in some very large projects in the $100 million to $1 billion range. “And so, this is what we’re shooting for,” he said, “maybe not next year, but in less than 30 years, which is about how old the FNIH is.” FESI has a congressional appropriation, but the ultimate goal is to leverage public support by identifying projects that will attract industry and philanthropic support.
Hart offered four possible projects that FESI could undertake. For the first, he pointed to the geothermal industry, which is just getting off the ground and does not yet have a standardized data platform. FESI could convene industry and philanthropy to identify data needs, develop data standards, and build public-use datasets that would advance DOE’s mission by reducing project risks and accelerating project development. The data would be of two main types: The first would be data on subsurface resources for the industry—the locations of potential hydrothermal sites and their characteristics. The second would be data on surface-level resources. How easy would it be to plug into the electrical grid? What are the communities like where potential hydrothermal projects might be developed? Much of
that information is already out there, some of it easily accessible and some not. How can all that data be gathered and put into some kind of standardized format that will be useful to the industry? That would be a useful data-related project for FESI to take on, Hart said, and there are probably many other industries that have this kind of challenge.
In the second example, Hart spoke about FESI helping facilitate markets for new green technologies, which often face adoption challenges due to misalignments between customer needs and seller priorities. One example is green hydrogen, produced by the electrolysis of water, which has the potential to make a significant contribution to decarbonizing energy systems. Projects that catalyze new markets through the development of competitive bidding processes for sale of green hydrogen products would require new data resources. Data-focused FESI projects could play a similar role in facilitating markets for other decarbonized commodities such as clean steel, clean cement, and carbon dioxide that is captured during power generation or manufacturing and then stored underground. “So, this is another area where a nongovernmental entity that is one step removed but shares the same mission as the DOE might contribute,” Hart said.
A third potential area for FESI to be involved in is decarbonizing the PVC (polyvinyl chloride) value chain. PVC is a major chemical product whose production releases large amounts of carbon. Some experts in the industry who had access to proprietary information were able to compile information about what each plant contributes to this value chain—that is, what kind of energy it uses and what its emissions are. However, according to Hart, there are gaps in the data needed to accurately model energy systems and emissions. Filling in these gaps would provide an opportunity for FESI. Specifically, FESI could “support the development of improvements to models on all scales to better represent technological innovations and their impacts on the provision of energy services, greenhouse gas emissions, and other environmental and social impacts.” In addition, given how much related work is going on in academic labs around the world, FESI could play an intermediary role in bringing together stakeholders from industry, government, and academia.
The fourth example that Hart offered was related to the Energy Information Administration, which is a part of the DOE that provides near real-time data to various markets, including oil, natural gas, and electric power markets. Working with DOE, the private sector, and philanthropy, FESI could support the development of real-time indicators that would allow energy market participants and technological innovators to identify risks
and opportunities with much greater precision than is possible today. In this case, FESI would play a sandbox role, acting as an incubator for new data sources that the government could later integrate into its processes. In his vision, Hart said, FESI, like FNIH, will be funded mostly by philanthropy and the private sector.
“So, this is my plea to you,” Hart continued. “Help us build this toolbox. I’ve only scratched the surface of what the potential uses are.” He asked anyone interested in FESI to get in touch with him with ideas for potential data-related projects. “We’re trying to just stimulate as much momentum as we can for this.”
Casey Weston, who co-leads LinkedIn’s Data for Impact program, described how that program makes aggregated cuts of LinkedIn data available to various partners as an example of how private-sector entities can play a role in research data collaborations.
Weston spoke about the role that the private sector can play in fostering collaborative synergies and advancing research data practices as well as creating opportunities for cross-sector collaboration. LinkedIn’s Data for Impact program brings together data that LinkedIn collects and then makes aggregated cuts of those data available to various partners that can make use of the data. The program began with a collaboration between LinkedIn and the World Bank, after which the company began working with other multilateral organizations, Weston said. “But it’s expanded to involve working with governments, nonprofits, and other private-sector actors through the industry-led partnership, to think about how cuts of our data can be useful in research, program design, and policy implementation.”
Data for Impact engages in three basic types of data-sharing projects, he said. The first involves collaborative, highly engaged research with partner organizations that requires some research capacity from LinkedIn. In such cases the outside organization has a specific research question. For example, a pending project with NSF is evaluating the outcomes of innovation investments at the local level. This type of project is resource-intensive for LinkedIn and thus high risk, he explained, and so the program’s capacity to do that sort of project is limited.
In the second type of data-sharing project, LinkedIn provides specific cuts of data from a data menu that it has developed for partners. For example, Data for Impact provides data that can be used to inform the World
Bank’s investigation of gender inequalities by industry in the labor market in Argentina. In this case, the data from LinkedIn are intended to complement and supplement the administrative data that the World Bank already has to help better understand and tackle challenges.
The third type of data-sharing project involves monitoring and surveillance data—for instance, working with governments to deliver data on a regular basis. One example is a project with the German statistical authority, where LinkedIn delivers labor market data every month, which the authority has found useful as a complement to its own data. “It’s useful because it’s faster than any administrative data source,” Weston said. “Two days after the month ends, we know who’s updated their LinkedIn profile, and that’s our statistic.”
One issue with the LinkedIn datasets, he noted, is that not everyone is on LinkedIn, which speaks to one of the major challenges of using privately sourced data for research, program design, or policy implementation—such data are not as unbiased as one would hope. Because the LinkedIn data are not fully representative of the entire population, Weston said that he often engages in collaborative thinking with partners about how the data could be improved and how recognized biases can be accounted for. In particular, he added, they try to bring in as many voices as they can, particularly from populations that are known to be underrepresented on LinkedIn.
Costs represent another challenge, Weston said. LinkedIn absorbs all the costs of the program, and some requests from partners require time from data scientists that Weston has to lobby for. Ensuring the privacy of LinkedIn users, protecting LinkedIn’s business interests and maintaining sufficient data quality all entail costs. One of Weston’s tasks is to reduce the costs of the data-sharing process as much as possible because that is critical to the sustainability of the program.
In closing, Weston offered some thoughts about how to prepare private and public organizations to work together more effectively in data sharing, specifically about designing an enabling environment that facilitates this type of work. Often, he said, he finds himself engaging with the leader of a project who is asking for a specific type of data. For example, it might be someone interested in Turkey’s recovery from an earthquake and how employment there has suffered because of that earthquake and hoping that LinkedIn data might be able to provide some insights. He appreciates being able to help, he said, but the problem is “I spend almost all of my time explaining the exact same problems and challenges to partners, which
is not so effective for me.” It is also not effective for the partners if they spend a lot of time working to get and use the data, only to realize that, because of the limitations on what LinkedIn can share, it will not meet their needs. It is crucial, he said, to create a system that accounts for both the very high value of private-sector data and its limitations, “so that we skip forward in that process and can work collaboratively a little bit faster and more efficiently.”
Stefaan Verhulst, co-founder of the Governance Laboratory (GovLab) and a research professor at the Center for Urban Science and Progress at New York University’s Tandon School of Engineering, spoke about the challenges of achieving a match between the supply of data and the demand for data.
Verhulst began his remarks by describing GovLab’s mission as using new technologies to transform how decisions are made, such as by making connections between governments and outside experts who can help inform those decisions. Typically, he added, GovLab focuses on two types of assets: people and data. “How do we connect with people in new ways? How do we make sure that we understand people’s experiences and their expertise and bring that to the decision cycle?” GovLab has done a lot of work on how to become more data driven in decision-making. In recent years, there has been a tremendous increase in available data, not only in the amount of data but in the type of data as well. At the same time, there has been an increase in demand for data. The problem is, he said, that society has not figured out how to match supply with demand—that is, the data being generated are not always the data that scientists and decision-makers need to do their jobs effectively.
Ideally, Verhulst continued, there needs to be a match between the supply of and demand for data that satisfies four characteristics: it should be systematic, sustainable, rapid, and responsible. He explained, “Responsibility is not just about protecting data, it’s also responsibility to provide access to data when there is a clear need and a clear public interest at stake.” Achieving such a match is not easy, Verhulst said. “Otherwise, we would not have these sessions here today.” And making progress toward such a match will require work on five foundational issues.
The first, he said, is becoming more sophisticated on the demand side. Too often, he deals with groups of people who say they need data, but when
he asks them about the specific questions they want to answer, they do not provide clear responses or justifications. It is crucial that people have a clear idea of which questions they want answered because that helps specify the data that will be useful. Without a more sophisticated demand side, a great deal of time and effort gets wasted. “We actually quite often will get access to data and then realize it’s actually not really that important for what we actually care about,” Verhulst said.
He added, “We need to make sure that it’s not just researchers formulating the questions.” A variety of stakeholders need to be involved to ensure that the questions to be answered are relevant beyond the research context. “It’s not about data equity, it’s about question equity. Who actually formulates the questions is equally important to who has access to the data.”
Verhulst suggested that summit participants address the following issue: how to prioritize the questions that matter. Individuals always tend to think that the questions they are interested in are the ones that matter most, so what sort of process could consider the various objectives and trade-offs and settle on priorities?
The second issue concerns the supply side of data. “We need to professionalize data stewardship,” Verhulst said. Noting that many of the summit attendees had worked on data stewardship in the research sector, he said that it is also important to have data stewardship within the private and public sectors. “We need to invest in data stewardship as a profession,” he continued. “We need chief data stewards that complement chief data officers.” Improving data stewardship, he said, could lead to increased sophistication in how people think about accessing data for reuse.
The third area of interest is consent. This is an issue particularly in cases where data are collected for one purpose but used for another, he said, mentioning as an example the program that Weston described in which LinkedIn data are made available to other users. “Consent has not caught up in a world where we really want to reuse data for public interest purposes,” Verhulst said. It is a fundamental principle that should be respected, but he suggested that it should also be complemented with what he called a “social license,” which would involve getting sign-off for data use from the communities from which the data are drawn. “We need to start thinking about how to get a social license for reuse, so that we understand the expectations and the preferences of communities—as opposed to individuals—with regard to how data are being reused,” he said.
The fourth issue involves governance. “A lot of this space is still guided and governed by data-sharing agreements,” Verhulst said. “We cannot just
always have the lawyers come up with and reinvent data-sharing agreements. We actually need to have and streamline data-sharing agreements in order to unlock the data that are made available.”
The final issue is sustainability. His group is currently examining how to measure the value and the opportunity cost of data. “I always make the bad joke that we talk a lot about data, but we don’t have a lot of data about data,” he said. “And we don’t really have a good methodology to understand the cost, nor do we have a good methodology to understand the value if we want to do a cost-benefit analysis.” Being able to calculate those values would make it easier to advocate for data and make it clear what is being missed out on if certain data are not available. Finally, he said, there are many uncertainties about the costs of collecting data and making them available in a form that is maximally useful.
Jason T. Black, associate professor at the School of Business and Industry at Florida A&M University in Tallahassee, described his experiences in bringing more data science opportunities to Historically Black Colleges and Universities (HBCUs) through the HBCU Data Science Consortium. Those experiences offer many lessons in how to bring more students from underrepresented minorities into the area of data science.
Black had been working with the South Big Data Innovation Hub, a program funded by NSF to foster research and collaboration among institutions in the southeastern United States. When that program put out a call for proposals for seed grants aimed at fostering more collaborative educational opportunities in data science, he and his colleagues began discussing ways they could bring more data science opportunities to HBCUs.
As Black explained, most HBCUs are private institutions, and many of them are small. Some have significant research infrastructures, while others have none. However, he continued, students at colleges and universities today, including at HBCUs, are increasingly interested in data science and want more exposure to it. This is in part because it gives them a competitive advantage, as many companies are looking to recruit students who have that background and skill set. Many HBCUs are trying to start data science programs or get involved in data science initiatives, he said. “However, they may not have the infrastructure, they may not have the connections, they may not have access to resources.”
Florida A&M is one of the larger HBCUs, and, Black said, it is on the cusp of Research I (R1) status and just moved into the top 100 research institutions in the United States—so it has the sort of research infrastructure in place that most HBCUs do not. Black and his colleagues came up with an idea for an NSF 1-year seed grant to establish a network for educational and research collaborations, with the particular goal of allowing HBCUs to get access to data science infrastructure that would otherwise be unavailable to them.
This led to the formation of the HBCU Data Science Consortium. It is based on four pillars—education, research, industry, and inclusion. The consortium’s first deliverable was a monthly speaker series, and it held an inaugural workshop in the spring of 2021, with about 20 HBCU representatives attending as well as people from industry and research labs. Much of the discussion at the workshop was aimed at starting conversations and collaborations with and among the HBCUs.
One notable outcome, Black said, came in the consortium’s first year, when five smaller HBCUs—Delaware State University, Clark Atlanta University, Morehouse College, Alabama State University, and Stillman College—were offered small grants of $10,000 each. “The goal was to have them start to put together something that they can build from around data science,” he said. “We saw that those institutions were able to do things like create courses, they were able to start bootcamps, they were able to put together curriculum that they’re now implementing at their schools.” Those programs are still in place, he continued, and it is a very exciting result with a promising future.
Since that time, the HBCU Data Science Consortium has moved on from its 1-year seed grant and has become a nonprofit organization. It has also continued to hold an annual workshop each March, now called the HBCU Data Science Celebration. Black invited the summit attendees to the workshop: “We’d love to have as many people there as possible that are willing and interested in working with HBCUs and trying to establish collaborations around data science.”
Stephanie Carroll, a citizen of the Native Village of Kluti-Kaah in Alaska, an associate professor of public health at the University of Arizona, and the director of the Collaboratory for Indigenous Data Governance, spoke about recent successes and challenges related to working with data from
Indigenous peoples as well as some lessons that can be drawn from those experiences.
Carroll began by saying that in addition to seeing such data through a JEDI lens, she and the people she works with at the Collaboratory for Indigenous Data Governance also bring a rights perspective, which “allows us to enter into the conversation and bring leadership and design.” Therefore, they are not only advancing their interests in and their ability to access and use data but are also working to create policies that are based on rights, which could then be applied to other communities and collectives as well.
She noted that she co-led the creation of the CARE Principles for Indigenous Data Governance, where CARE refers to Collective benefit, Authority to control, Responsibility, and Ethics. The principles, which were introduced in November 2018, have remained timely, Carroll said, and interest in and uptake of them have grown dramatically over the past 5 years. “The heart of the CARE principles is the A, which is authority to control,” she continued, “and that means that we need Indigenous leadership—not only people at the table, but Indigenous design.” This in turn has implications not just for policies but also for the cyberinfrastructure. “From Indigenous perspectives, you bring a multiplicity of ways of knowing to the table,” she explained, “so the ideas are coming together in ways that they might not if you just had the usual way that we do business around policy and design work.”
Carroll spoke about the U.S. Indigenous Data Sovereignty Network and the Global Indigenous Data Alliance, both of which she was involved in founding. She said that funding has been difficult to obtain and that much of the groups’ work has been done “by moxie and duct tape.” Their approach has been to get small amounts of funding from the National Institutes of Health, NSF, and the Henry Luce Foundation, and “patch it together . . . so that we can work together as Indigenous data sovereignty networks but then, more broadly, with all of the partners that we have.”
She shared that she often finds that when she is talking with potential funders or potential project partners, they will ask her about collaborators such as the American Geophysical Union or the California Digital Library. “We have those connections,” she said. “We just need the money to move forward.”
Carroll finds that few federal agencies or foundations have Indigenous people in decision-making positions, and this creates difficulty in getting funding. “Time and time again, we hear things like, ‘Oh, your project looked great. It just didn’t reach the bar.’ Or, ‘It was right under the funding
level.’ And so how do we make sure that we have [Indigenous] people who are reviewing, and we have people in the program officer seats and we have people in leadership positions within these organizations who can push and move things through?”
Concerning education, Carroll said that the networks are planning to create training on cultural, ethical, legal, and social issues related to data in addition to the standard training in the areas of libraries, software, and data. The search for funding for that work is ongoing.
Finally, regarding publishing, she said that the Indigenous data networks are working with the American Geophysical Union to create a group to come up with new guidelines for publishers and authors that will implement the CARE principles and other relevant elements of Indigenous peoples’ rights within publishing. They have already seen some uptake of these principles, as certain editors are flexible with topics, word usage, and even how the authors report their names.
John Havens, executive director of the Institute of Electrical and Electronics Engineers (IEEE) Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems, made the case that efforts involving large amounts of research data must take into account both ecological flourishing and human well-being. In advancing modern technology, he argued, it is crucial to prioritize both.
He began by pointing to Pope Francis’s 2015 encyclical letter Laudato Si’ (Praise Be to You), in which the Pope warns against the environmental degradation of the Earth. Likening the home of humanity to a sister with whom one shares a life or a mother who offers a loving embrace, Francis calls on all people to take action to save the planet. Then, just a week before the workshop, the Pope issued Laudate Deum (Praise God), a follow-up to the earlier letter in which he warned more strongly against the risks of global climate change and said that humanity’s efforts since the previous message had not been sufficient.2 He also argued that the paradigm underlying the development of
___________________
2 For the English version of Laudato Si’, see https://www.vatican.va/content/francesco/en/encyclicals/documents/papa-francesco_20150524_enciclica-laudato-si.html, and for the English version of Laudate Deum (Praise God), see https://www.vatican.va/content/francesco/en/apost_exhortations/documents/20231004-laudate-deum.html.
technology is focused largely on growth and productivity, with too little attention paid to non-technological concerns such as the environment.
The rapid growth of artificial intelligence (AI) applications illustrates the tension that can arise between technology and the environment. The increase in AI workloads has driven a rapid increase in water usage: between 2021 and 2022, Microsoft’s data-center water usage increased by 34 percent and Google’s by 20 percent, most of which could be attributed to the computing required for AI applications (Wheatley, 2023), with ChatGPT accounting for much of the increase. Researchers at the University of California, Riverside, estimated that a single ChatGPT conversation of 25 to 50 questions consumes half a liter of water and that training the first version of ChatGPT required 85,000 gallons of water (Li et al., 2023).
A significant amount of the nation’s water supply is held in underground aquifers, and a recent article in The New York Times documented how aquifers across the country are being drained much faster than they can be replenished (Rojanasakul et al., 2023). It is a classic case of the “tragedy of the commons”: the water in an aquifer belongs not to any one individual or entity but to anyone who can drill down and pull it up. The benefits of using that water accrue to the few who have drilled wells, while the costs are borne by all, since an overdrawn aquifer may eventually be drained completely, leaving no one able to use it.
The key point, Havens said, is that it is important to involve systems thinkers in a design from the beginning, and that when accounting for the costs and benefits of a technology, double materiality needs to be considered. This is the concept that “risks and opportunities can be material from both a financial and nonfinancial perspective” and that “companies and financial institutions must manage and take responsibility for the actual and potential adverse impacts of their decisions on people, society and the environment” (Deloitte, n.d.). “That aquifer is being used for many things,” Havens said. “People drinking water, animals needing water, agriculture, and for that company to be able to have their company for a long time, let’s make sure we take care of the aquifer.”
Another lesson, he said, is that such harms are much less likely to occur when the design of something considers the points of view of all people who may be affected by the technology. Often, he said, women, Indigenous people, and marginalized people are not in the room.
Havens spoke of an IEEE document, Ethically Aligned Design: A Vision for Prioritizing Human Well-Being with Autonomous and Intelligent Systems
(IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, 2017), which he helped coauthor. When the coauthors asked for feedback after the release of the first draft, they received praise for the document but also the criticism that it did not consider non-Western perspectives. Where were non-Western ethics? Where was Confucianism? Where were Shinto traditions? Where was anything not based on Plato, Aristotle, and other Western thinkers? If IEEE wanted to present itself as a global organization, it needed non-Western input. “It was the biggest gift we’d ever gotten,” Havens said. Later, after revisions, the document’s introduction was translated into Japanese, Arabic, Thai, and other languages. “I can’t say how honored I was that people so appreciated what we were trying to do with AI and ethics in 2016 but then also say, I have to bring this to the language of my people.”
Another IEEE publication, Strong Sustainability by Design, argues that it is important to incorporate “strong sustainability” logic into design to fight global warming and to inspire nature-positive solutions using technology (IEEE Standards Association, 2024). The introduction to the publication says that the goal should be eudaimonia, “a practice elucidated by Aristotle that defines human well-being, both at the individual and collective level, as the highest virtue for a society. Translated roughly as ‘flourishing,’ the benefits of eudaimonia begin with conscious contemplation, where ethical considerations help define how we wish to live.” The first draft included the idea of eudaimonia, Havens said, but several Indigenous readers pointed out that the subtitle referred to “prioritizing human well-being” but said nothing about the planet. “And they were right,” Havens said. “Human well-being has a lot of facets, but it also requires … that nature be honored, that we as individuals and organizations recognize and respect planetary boundaries, the role and limits of natural capital.”
Havens concluded that the key to a better future will be putting the planet and people first, with metrics to measure that, and then asking what sorts of innovation will make it possible and how such technology can be created. “If we know we can say we’re not just not harming the planet, but we’re improving the long-term flourishing of the planet—that is the accountability as responsible thought leaders we need to have from systems-level thinking like with the aquifers, et cetera, or else we won’t have a planet that’s livable pretty soon.”
Deloitte. n.d. The challenge of double materiality: Sustainability reporting at a crossroads. https://www2.deloitte.com/cn/en/pages/hot-topics/topics/climate-and-sustainability/dcca/thought-leadership/the-challenge-of-double-materiality.html (accessed March 28, 2024).
IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. 2017. Ethically aligned design: A vision for prioritizing human well-being with autonomous and intelligent systems. Version 2. https://standards.ieee.org/wp-content/uploads/import/documents/other/ead_v2.pdf (accessed March 28, 2024).
IEEE Standards Association. 2024. Strong sustainability by design: Prioritizing ecosystem and human flourishing with technology-based solutions. https://sagroups.ieee.org/planetpositive2030/our-work/ (accessed March 28, 2024).
Li, P., J. Yang, M. A. Islam, and S. Ren. 2023. Making AI less “thirsty”: Uncovering and addressing the secret water footprint of AI models. arXiv 2304.03271 [cs.LG].
Rojanasakul, M., C. Flavelle, B. Migliozzi, and E. Murray. 2023. America is using up its groundwater like there’s no tomorrow. The New York Times, August 28. https://www.nytimes.com/interactive/2023/08/28/climate/groundwater-drying-climate-change.html (accessed March 27, 2024).
Wheatley, M. 2023. Report: Data centers guzzling enormous amounts of water to cool generative AI servers. SiliconANGLE. https://siliconangle.com/2023/09/10/report-data-centers-guzzling-enormous-amounts-water-cool-generative-ai-servers/ (accessed March 27, 2024).