NATIONAL ACADEMIES Sciences Engineering Medicine

Future Directions for Social and Behavioral Science Methodologies in the Next Decade
Proceedings of a Workshop—in Brief

Convened September 25–26, 2024


A planning committee of the Committee on National Statistics (CNSTAT) of the National Academies of Sciences, Engineering, and Medicine (the National Academies) convened an in-person and virtual workshop to bring together a broad group of experts to explore methodological and analytical frontiers in the social and behavioral sciences. The intent was to identify approaches that deserve more research attention and are expected to benefit more than one discipline. The workshop was designed to consider developments in methods and approaches, such as artificial intelligence (AI) and machine learning for data mining, along with causal and spatial analysis.

INTRODUCTION

Planning committee chair Kristen Olson (University of Nebraska–Lincoln) welcomed participants and explained that speakers would provide their insights and ideas about the current state and development potential for innovative analytical, statistical, and computational methods that can be used across multiple social and behavioral science disciplines. She said the planning committee, whose members moderated the sessions, wanted participants to think about where methodological innovations are likely to head in the next five to ten years. The planning committee, Olson noted, generated far more topics than could be covered in two days and used multiple methods, including extensive list generation, coding the list for themes, and rating and ranking possible topics, to narrow down the sessions. The final agenda covered seven topics: new data sources; new study designs; causal inference; new directions in spatial analysis; AI and data analysis; sensors, apps, and other technologies; and data protection and dissemination.1

Melissa Chiu (director, CNSTAT) explained that the National Science Foundation’s Methodology, Measurement, and Statistics program periodically conducts an environmental scan of new and emerging statistical methods, models, and innovative approaches for the future of research in the social, behavioral, and economic sciences. She noted that the workshop provided a unique opportunity for thought leaders from diverse disciplines to come together and share their insights.

New Data Sources

Kathleen Cagney (University of Michigan) introduced the session. The presentations centered on the integration of surveys with administrative and nonprobability data, leveraging AI in data collection, and addressing privacy challenges, with an emphasis on balancing trade-offs.

__________________

1 See https://www.nationalacademies.org/our-work/future-directions-for-social-and-behavioral-science-methodologies-in-the-next-decade-a-workshop

Andrew Mercer (Pew Research Center) highlighted significant quality issues with online opt-in panels, such as “bogus respondents” providing unreliable data. He called for advanced validation techniques, including leveraging generative AI for respondent verification and integrating administrative data to improve participant sampling accuracy. Mercer emphasized the importance of probability-based samples, despite declining response rates, for providing benchmarks to correct biases in nonprobability data. The balance between improving accuracy and reducing costs remains, in his view, a critical area for further exploration.

Michael Elliott (University of Michigan) presented frameworks for combining probability and nonprobability samples, leveraging quasi-randomization and doubly robust models to address selection biases. These approaches depend on high-quality auxiliary data and assumptions of ignorability, which are often challenging to meet. Elliott called for extending these methods to support causal inference and small-area estimation, while developing sensitivity analyses to address the failure of key assumptions. A critical future direction, in his opinion, is to improve pseudo-weight generation and refine Bayesian Additive Regression Tree (BART) applications for these mixed datasets.
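To make the quasi-randomization idea concrete, the sketch below illustrates pseudo-weighting a nonprobability sample against a reference probability sample. It is a minimal illustration, not Elliott's implementation; the simulated data, covariate, and weight construction are assumptions made for the example.

```python
# Minimal pseudo-weighting sketch (illustrative, not Elliott's method):
# estimate each nonprobability case's propensity of being in the opt-in
# sample relative to the population, then weight by the inverse odds.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Reference probability sample (with design weights) and a nonprobability
# sample, sharing one auxiliary covariate x (e.g., age); values are simulated.
n_prob, n_np = 1000, 2000
x_prob = rng.normal(50, 15, n_prob)           # roughly population-like
w_prob = np.full(n_prob, 100.0)               # design weights
x_np = rng.normal(35, 10, n_np)               # opt-in panel skews younger

# Stack the samples; z = 1 marks nonprobability cases. Probability cases
# carry their design weights when fitting the propensity model.
x = np.concatenate([x_prob, x_np]).reshape(-1, 1)
z = np.concatenate([np.zeros(n_prob), np.ones(n_np)])
fit_w = np.concatenate([w_prob, np.ones(n_np)])

model = LogisticRegression().fit(x, z, sample_weight=fit_w)
p = model.predict_proba(x[n_prob:])[:, 1]     # estimated membership propensities

# Pseudo-weights proportional to the inverse odds of nonprobability membership.
pseudo_w = (1 - p) / p
print(f"Unweighted mean of x in the opt-in sample: {x_np.mean():.1f}")
print(f"Pseudo-weighted mean of x: {np.average(x_np, weights=pseudo_w):.1f}")
```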

Budhendra Bhaduri (Oak Ridge National Laboratory) demonstrated how remote sensing can enhance socioeconomic and behavioral research by providing high-resolution imagery, digital exhaust (e.g., social media), and biometric data. Applications include neighborhood mapping and population density analysis. Challenges include ensuring data quality, addressing privacy concerns, and adapting models to diverse geographies. He emphasized that integrating remotely sensed data with traditional sources could provide actionable insights into urban planning, disaster resilience, and socioeconomic differences.

Nancy Potok (NAPx Consulting) highlighted challenges to federal statistical agencies’ traditional survey designs from lower response rates and declining budgets.2 She felt the erosion of these data is a real problem for social science research, which needs benchmarks. Otherwise, reliance on nonrepresentative data and biased sources will increase, undermining trust and data utility. Potok underscored the need for short- and long-term investments in integrating survey, administrative, and open data, which can add depth and breadth for public policy (e.g., combining socioeconomic status with climate data) and improve data quality. She also highlighted gaps in privacy protection, advocating for user accountability and penalties for misuse of granular data. Potok called for research into reconciling longitudinal datasets with evolving data collection methods and into using machine learning algorithms for analyzing complex combined datasets to help establish causality.

Potok and other participants stressed the critical need to establish data quality standards, which are at the heart of scientific integrity, and communication frameworks and collaborative communities to guide interdisciplinary research. Potok also commented that additional research is needed into the ethical use of sensing and other data for which informed consent is not possible, and ways to communicate quality and other concerns to policymakers and the public.

New Study Designs

Fred Oswald (Rice University) introduced the session, which featured presentations on adaptive intervention designs, megastudy designs, and AI–human interaction designs to bolster experimental social and behavioral science research.

Inbal Nahum-Shani (University of Michigan) explained adaptive interventions, which dynamically tailor treatment based on participants’ responses over time. These interventions rely on two primary frameworks: Sequential Multiple Assignment Randomized Trials for slower, human-delivered adaptations, and Micro-Randomized Trials for rapid, digital interventions. Hybrid experimental designs integrate both frameworks, enabling interventions across multiple time scales and modes—for example, combining human-delivered counseling with real-time digital messaging for substance abuse interventions. Nahum-Shani commented that two key research priorities are to identify optimal sequencing strategies and to understand heterogeneity in treatment effects, particularly for sustained engagement in complex health behaviors.
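As a concrete, highly simplified illustration of the micro-randomized framework described above, the sketch below simulates repeated within-person randomization of a digital prompt and estimates the average proximal effect. All quantities are invented for the example and do not come from Nahum-Shani's studies.

```python
# Micro-randomized trial sketch (illustrative): each participant is randomized
# to receive a digital prompt at every decision point, and the proximal effect
# is the mean difference in a near-term outcome between prompted and
# unprompted occasions.
import numpy as np

rng = np.random.default_rng(1)
n_people, n_points = 200, 30

prompt = rng.binomial(1, 0.5, size=(n_people, n_points))    # 50/50 randomization
person_effect = rng.normal(0.5, 0.3, size=(n_people, 1))    # heterogeneous response
# Proximal outcome, e.g., minutes of activity in the hour after the decision point.
outcome = 10 + person_effect * prompt + rng.normal(0, 2, size=prompt.shape)

proximal_effect = outcome[prompt == 1].mean() - outcome[prompt == 0].mean()
print(f"Estimated average proximal effect of the prompt: {proximal_effect:.2f} minutes")
```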

__________________

2 See https://www.amstat.org/policy-and-advocacy/the-nation’s-data-at-risk-meeting-american′s-information-needs-for-the-21st-century

Katherine Milkman (University of Pennsylvania) discussed the megastudy approach, in which a large-scale experimental framework encompasses multiple sub-experiments conducted simultaneously to test a variety of interventions to achieve a specified outcome. This method can enhance comparability, reduce costs, and accelerate scientific discovery, she said. Examples provided involved nudges that could increase activities such as gym attendance and seeking vaccinations. Findings to date reveal the importance of repeated nudges, micro-incentives, and clear language on where the nudges are coming from. Challenges include forecasting intervention efficacy and managing multiple hypothesis tests appropriately. In Milkman’s view, future work should focus on refining prediction models to identify “what works for whom” and ensuring findings are generalizable across populations.
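One analytic challenge Milkman named, managing many hypothesis tests within a single megastudy, can be illustrated with a brief sketch. The simulated arms, effect sizes, and outcome are assumptions for the example only.

```python
# Megastudy-style multiple testing sketch (illustrative): many intervention
# arms are compared against a shared control, and a false-discovery-rate
# correction is applied across the resulting p-values.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
n_per_arm, n_arms = 1000, 20

control = rng.binomial(1, 0.30, n_per_arm)                  # e.g., baseline gym attendance
true_lifts = np.r_[np.full(3, 0.05), np.zeros(n_arms - 3)]  # only 3 arms truly work
p_values = []
for lift in true_lifts:
    arm = rng.binomial(1, 0.30 + lift, n_per_arm)
    p_values.append(stats.ttest_ind(arm, control).pvalue)   # simple two-sample test

reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print("Arms flagged after Benjamini-Hochberg correction:", np.where(reject)[0])
```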

Thomas Costello (American University) highlighted the transformative role of generative AI in behavioral science experiments. AI models, such as chatbots, can act as “confederates” in interactive studies, delivering treatments, adapting responses based on participants’ input, and monitoring for data quality. This innovation unlocks new paradigms for studying persuasion, learning, and social interactions at scale. He emphasized that AI–human interactions can provide high ecological validity but pose challenges, such as biases in AI responses, difficulty in causal inference, and reproducibility concerns as AI models evolve. Research, in Costello’s view, should focus on developing statistical tools to analyze high-dimensional treatments and on addressing the ethical considerations these designs raise.

The discussant, Stephanie Coffey (U.S. Census Bureau), underscored the cross-cutting themes of the session, noting that technological advancements enable new methods for treatment delivery, behavior measurement, and intervention evaluation. Key areas for future research, in her opinion, include generalizing findings across contexts, integrating machine learning for adaptive experimentation, and addressing challenges in causal inference.

Open discussion raised several critical questions. For adaptive interventions, participants debated the potential for “backfire effects,” where nudging or correcting beliefs may instead entrench them, stressing that identifying predictors of heterogeneous responses is essential. For megastudies, many participants raised issues around intervention selection, scalability, and forecasting accuracy. AI–human experiments prompted questions about participants’ trust in AI, the reproducibility of results as models improve, and whether AI interventions may outperform human-delivered treatments in certain contexts. For experiments intended to change long-standing behaviors, such as how respondents manage chronic illnesses, several participants affirmed the importance of sustained engagement.

Causal Inference

Rodrigo Pinto (University of South Florida) introduced the session, which featured presentations on approaches to enhancing causal inference frameworks, using machine learning to address treatment effect variability, advances in sensitivity analysis, obtaining additional measures for econometric models from new data sources, and leveraging modern computational tools. Several speakers and participants emphasized the need for new tools and methods to be user-friendly.

Tyler VanderWeele (Harvard University) underscored the importance of distinguishing between association and causation through robust study designs. He emphasized the need for longitudinal studies over cross-sectional studies to ensure that causal interpretations are valid. A significant area of concern is reliance on overly simplified psychosocial constructs; for instance, life satisfaction indicators may not always align with a single unidimensional latent variable. VanderWeele advocated for considering indicator-specific causal analysis when structural assumptions fail. Future research priorities, in his view, include developing additional tests to investigate structural latent factors, and expanding methods that reconcile psychosocial measurements with causal inference frameworks.

Jennie Brand (University of California, Los Angeles) highlighted the growing importance of understanding treatment effect heterogeneity—how different groups respond differently to the same intervention. She demonstrated the utility of causal trees and machine learning approaches to uncover previously unrecognized heterogeneity. For instance, studies on college completion revealed that marginalized groups, such as those with lower parental income or school disadvantage, tended to experience the largest gains in being able to avoid low-wage work. Brand emphasized the need to combine machine learning’s flexibility with causal inference rigor, particularly using methods like cross-validation to prevent overfitting. As she noted, a key research frontier is using machine learning to automate the identification of response heterogeneity while ensuring interpretability and relevance for social policy.
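The sketch below gives a simplified flavor of using tree-based learners to surface treatment effect heterogeneity of the kind Brand described. It uses a basic “T-learner” with regression trees rather than honest causal trees, and the covariate, treatment, and effect pattern are simulated assumptions.

```python
# Heterogeneous treatment effect sketch (illustrative T-learner, not an honest
# causal tree): fit separate outcome models for treated and untreated units,
# then difference their predictions to estimate individual-level effects.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
n = 5000
parental_income = rng.normal(50, 20, n)                   # illustrative covariate
treat = rng.binomial(1, 0.5, n)                           # e.g., completing college
effect = np.clip(1.5 - 0.02 * parental_income, 0, None)   # larger gains at lower income
outcome = 10 + 0.05 * parental_income + effect * treat + rng.normal(0, 1, n)

X = parental_income.reshape(-1, 1)
m1 = DecisionTreeRegressor(max_depth=3).fit(X[treat == 1], outcome[treat == 1])
m0 = DecisionTreeRegressor(max_depth=3).fit(X[treat == 0], outcome[treat == 0])
cate = m1.predict(X) - m0.predict(X)                      # estimated individual effects

low, high = parental_income < 30, parental_income > 70
print(f"Estimated effect, lower-income group:  {cate[low].mean():.2f}")
print(f"Estimated effect, higher-income group: {cate[high].mean():.2f}")
```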

Carlos Cinelli (University of Washington) addressed the limitations of strong assumptions in causal inference, particularly the challenge of unobserved confounding. He presented advancements in sensitivity analysis to quantify the robustness of causal estimates to assumption violations. One major development is a general nonparametric framework for assessing omitted variable bias using interpretable sensitivity parameters, such as R-squared bounds. Cinelli argued for automating sensitivity analysis through algorithmic tools capable of handling complex causal models, which would free researchers to focus more on theoretical insights.
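The following sketch shows one R-squared-style sensitivity statistic in the spirit of the work Cinelli described: a “robustness value” computed from a regression t-statistic, indicating how strong an unobserved confounder would have to be to explain an estimate away. The formula follows the published Cinelli–Hazlett parameterization, but the numbers plugged in are purely illustrative.

```python
# Omitted-variable-bias sensitivity sketch: the robustness value is the partial
# R^2 a confounder would need with both treatment and outcome to drive the
# estimated effect to zero.
import numpy as np

def robustness_value(t_stat: float, dof: int) -> float:
    """Robustness value (q = 1) from a coefficient's t-statistic and residual
    degrees of freedom, following Cinelli & Hazlett's R^2-based parameterization."""
    f2 = (t_stat / np.sqrt(dof)) ** 2        # partial Cohen's f^2 of the treatment
    return 0.5 * (np.sqrt(f2 ** 2 + 4 * f2) - f2)

# Illustrative inputs: t = 4.6 on 780 residual degrees of freedom.
rv = robustness_value(t_stat=4.6, dof=780)
print(f"Robustness value: {rv:.3f} -- a confounder explaining less than "
      f"{rv:.1%} of residual variance in both treatment and outcome "
      f"cannot fully account for the estimated effect.")
```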

Pinto emphasized the role of new measures in improving economic behavior models. Traditional econometric analyses often rely on restrictive assumptions due to limited observational data. He highlighted the value of novel data sources, such as subjective expectations, social norms, and parental beliefs, to enrich behavioral models. For example, introducing belief distortions into child development models allows researchers to better capture under- or over-investment dynamics. Furthermore, Pinto discussed the potential of combining instrumental variable identification strategies with behavioral economic theory to enhance causal interpretations. In addition, the integration of machine learning techniques with traditional econometric tools can advance forecasting, automate model selection, and identify heterogeneous treatment effects, ultimately improving policy evaluations.

A recurring theme among the speakers and participants was the need for computational tools to automate complex causal analyses. Cinelli and Pinto both emphasized that automation could reduce the burden of deriving identification strategies and sensitivity bounds, enabling broader adoption of advanced causal methods. However, challenges remain in scaling existing algorithms, particularly for nonparametric and high-dimensional models. Shaowen Wang (University of Illinois Urbana-Champaign) noted that lack of access to computational resources may limit the application of such methods as large-scale meta-analyses or networked systems. Brand and Wang further emphasized that researchers need to strike a balance between methodological rigor and usability, developing user-friendly software tools and pedagogical resources to integrate advanced methods into social science practice and expand access to them.

New Directions in Spatial Analysis

Shaowen Wang introduced this session, which addressed two main themes: (a) the effects of geography on human behavior, health, and social outcomes and (b) advances in spatial analysis methods, use of large-scale data, and approaches for constructing social spatial networks. The session underscored the growing importance of geographic methods in addressing complex societal challenges, paving the way for more nuanced and effective policymaking.

Stewart Fotheringham (Arizona State University) presented on the importance of geographical context in influencing human decisions, independent of individual attributes like age or income. Using such approaches as multilevel modeling and geographically weighted regression, he demonstrated how location-specific factors accounted for a significant portion of variation in voting preferences. In Fotheringham’s view, a research priority is to incorporate geographical context into behavioral models to improve inference accuracy and policy relevance.
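A minimal sketch of the geographically weighted regression idea Fotheringham described appears below: the same regression is refit at each location with observations down-weighted by distance, so coefficients can vary over space. The coordinates, covariate, and bandwidth are simulated assumptions.

```python
# Geographically weighted regression sketch (illustrative): local weighted
# least squares with a Gaussian distance kernel, letting the slope vary by place.
import numpy as np

rng = np.random.default_rng(4)
n = 500
coords = rng.uniform(0, 100, size=(n, 2))          # point locations
x = rng.normal(size=n)                             # e.g., a local covariate
local_slope = 1 + coords[:, 0] / 100               # effect strengthens west to east
y = 2 + local_slope * x + rng.normal(0, 0.5, n)

def gwr_coefs(i: int, bandwidth: float = 20.0) -> np.ndarray:
    """Local intercept and slope at location i under a Gaussian kernel."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = np.exp(-((d / bandwidth) ** 2))
    X = np.column_stack([np.ones(n), x])
    XtW = X.T * w                                  # X^T W
    return np.linalg.solve(XtW @ X, XtW @ y)

west = gwr_coefs(int(coords[:, 0].argmin()))
east = gwr_coefs(int(coords[:, 0].argmax()))
print(f"Local slope in the west: {west[1]:.2f}; in the east: {east[1]:.2f}")
```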

Harvey Miller (Ohio State University) explored how urban sustainability and public health can benefit from mesogeography, an intermediate approach between micro- and macrogeography that balances reductionist and aggregate perspectives. He proposed the use of urban observatories to continuously monitor and analyze data on cities as complex systems, which could enable informed interventions in areas such as transportation and health. Key research priorities, in Miller’s opinion, include experimenting with real-world and simulated data to understand spatial dynamics and developing tools to address uncertainties in geographic data, while integrating local context and heterogeneity.

Clio Andris (Georgia Tech) highlighted the need to advance methods for analyzing spatial social networks (SSNs), which extend across people and geographies, to understand how physical space influences social relationships. Examples include mapping the travels of the 9/11 terrorist network and rural social networks along the Amazon River and its tributaries. She introduced techniques such as EdgeScan and NDScan for detecting network hotspots and mapping relationships. Future research, in Andris’ view, could usefully explore developing null hypotheses for SSN structures; identifying geographic features that foster or hinder social connections; and designing efficient tools to assess network centrality and local connectivity, with implications for urban planning.

Eric Chyn (University of Texas at Austin) presented insights on place-based effects on economic and health outcomes, leveraging large-scale administrative and historical datasets. As an example, analysis of large-scale displacement events, such as Hurricane Katrina, revealed that relocating to different geographic contexts can help improve health outcomes for vulnerable populations. He emphasized the importance of new data sources, such as machine-learning-generated historical records, for uncovering long-term spatial effects. Chyn advocated for enhanced administrative data sharing and sustained funding to support these efforts.

Artificial Intelligence and Data Analysis

John Eltinge (U.S. Census Bureau) introduced the session, which explored emerging methods, opportunities, and challenges in leveraging AI for research and practical applications. The discussions centered on how AI and modern statistical tools can enable more efficient, scalable, and personalized interventions, while also addressing key limitations pertaining to causal inference, data integration, and ethics.

Susan Athey (Stanford University) spoke on AI and machine learning as research tools in social science. AI can help with empirical analysis of predictive features, treatments, and outcomes and in setting up experiments, such as assigning treatments, activating interventions (e.g., through chatbots), and presenting alternate versions of images and texts to subjects. She provided examples of secondary analyses where AI algorithms determined population groups for which Medicaid coverage was more effective in reducing blood pressure and types of workers, jobs, and communities where job losses had the most impact on earnings losses. Athey then turned to examples of how machine learning could be used in analysis—for example, using pre-trained language models to predict the helpfulness of online product reviews or to predict ideology preferences in media articles. She posed questions for future research, including the pros and cons of off-the-shelf or custom AI models; interactions between model size and data size and heterogeneity; the role of fine-tuning and how it should be done; incorporating richer but messier input data; and richer confounders or controls to produce richer outcomes.
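As a small illustration of the last kind of application Athey mentioned, the sketch below uses a pre-trained sentence-embedding model as a feature extractor for predicting whether reviews are marked helpful. It assumes the sentence-transformers library is available; the model name, toy reviews, and labels are illustrative and not drawn from any study cited here.

```python
# Sketch of using a pre-trained language model as a feature extractor for a
# downstream prediction task (illustrative data and labels).
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

reviews = [
    "Clear instructions, battery lasts all week, would buy again.",
    "Bad.",
    "Arrived broken and support never answered my emails.",
    "Love it!!!",
]
helpful = [1, 0, 1, 0]                              # toy helpfulness labels

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # small pre-trained model
X = encoder.encode(reviews)                         # fixed-length text embeddings
clf = LogisticRegression(max_iter=1000).fit(X, helpful)

new_review = ["Detailed review that lists pros and cons with measurements."]
print(clf.predict(encoder.encode(new_review)))      # predicted helpfulness label
```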

Chris Bail (Duke University) posed the question of whether generative AI can improve social science. He discussed potential applications, such as automated text analysis, synthetic surveys, synthetic experiments, using generative agent-based models (e.g., in experiments on hearing the other side of an issue), and blending synthetic and human experiments. Bail identified challenges such as bias and replicability, ethical issues, and the known proclivity of AI to hallucinate (i.e., produce false or nonsensical output). He also noted the high computational costs (in terms of energy use) of training and running AI algorithms.

Seth Spielman (Microsoft, Inc.) discussed the future of AI in social science research. He stated that his presentation would not review studies but instead explore fundamental ideas. Spielman focused on three such ideas in his remarks. First, he asserted that AI algorithms are not databases and should not be used to simply regurgitate facts. Second, Spielman said that AI’s ability to reason over heterogeneous inputs and output well-reasoned descriptors of a place, person, or other entity at scale is fundamentally new and that AI-generated data hold particular promise for analyzing complex latent constructs (e.g., “vulnerability,” “resilience,” or “well-being”). Robust validation of AI-generated data is essential because the “ground truth” for complex latent variables often does not exist. Third, he asserted that the boundary between AI and data is increasingly blurry, citing the example of the extensive amount of imputation for item nonresponse for income data in the American Community Survey (ACS). Consequently, much of what is considered first-class data is model-dependent, even official statistics such as measured poverty that feature in fund allocation formulas (although these models are easier to understand than an AI system with billions of parameters). Spielman called for further research on methods of using AI to generate data.

Frauke Kreuter (University of Maryland) discussed the presentations from the perspective of survey research. She noted the need to address bias and lack of transparency in AI methods and the need for team science and benchmarks. Kreuter said that another way to think of generative AI is as an assistant rather than a replacement for a survey program. Here, AI can help with coding, translation of legacy code, and ideation.

In the open discussion, Athey noted the variation in optimism about the use of new tools. She also stressed the need for researchers to simulate their data from prototyping or piloting before embarking on expensive field work. AI can help by analyzing data from multiple pilots that test question wording and the need for a question. Bail said open-source infrastructure is necessary to help researchers address bias in human input to AI and in AI models themselves.

Sensors, Apps, and Other Technologies

Fred Oswald (Rice University) introduced the session. The presentations covered a range of data sources and multimode approaches for enhancing surveys and qualitative information.

Akane Sano (Rice University) discussed multimodal sensing and modeling techniques for health care applications. She asked how to take advantage of daily life moment-to-moment and other data (from wearables, text messages, videos, etc.) to develop disease markers, predictors, and other information that can help tailor individual interventions and treatments. In her view, the ideal dataset would be long-term and multimodal; cover many demographic and geographical groups; include critical rare events; and be consistent in data formats and streams.

Sano identified a large number of data collection, modeling, feedback loop, and deployment challenges and focused on five of them. She noted challenges in the large amount of energy required to run machine learning algorithms for sensing data, as well as the noisiness and missingness in the data themselves. For inference, Sano noted the expense and time required to label enough sensed data to enable an algorithm to properly encode the rest, also noting the biases that all too often can infect algorithms. She asked workshop participants to consider whether and how to design actionable, personalized, and adaptive feedback that is safe, reliable, sustainable, and acceptable to users. Sano stressed the need for research environments that support integrating and comparing datasets from multiple studies and that can accelerate developing and testing the effectiveness and safety of models.

Leah Christian (NORC at the University of Chicago) explored the uses of texting in surveys. She outlined the advantages of texting: it is ubiquitous, and using it to contact respondents can speed up data collection, reduce costs, and reduce response bias. There are some legal and technical constraints (e.g., cell phone carriers can flag texts as spam). Christian reported on research that compared reminding respondents through texting or with a postcard. Both reminders raised completion rates; text reminders were less expensive, but postcard reminders brought in a wider set of demographic groups. Text invitations and early text reminders boosted completion rates. She provided best practices for texting as an integral part of mixed-mode surveys.

Sunshine Hillygus (Duke University) considered the future of video interviewing and noted that innovation for innovation’s sake carries risks. She reported that in a lab experiment, face-to-face and video interviews tended to be of higher quality (e.g., lower item nonresponse rates) than online responses but produced greater social desirability bias. In the American National Election Studies during the COVID-19 pandemic lockdown, offering video lowered response rates compared with online responses. Respondents experienced both technical and nontechnical problems with video (e.g., disliking technology), and participation rates varied widely by education and political party. Hillygus suggested that video was best used in a mixed-mode design and that further research was needed on optimizing the participant experience and modeling response propensities.

Florian Keusch (University of Maryland and University of Mannheim) discussed the collection of digital trace data (e.g., activity on Instagram) via data donation. Digital trace data are high frequency, in-the-moment information, collected without increasing burden on participants, unaffected by measurement itself, and scalable. Methods for collecting digital trace data, however, have drawbacks—for example, some social media platforms are limiting the use of APIs; collaboration with industry can compromise independence; and getting users to install meters and apps to continuously track such behaviors as web browsing can raise privacy concerns and make it hard to achieve unbiased coverage and participation. He then introduced data donation, which takes advantage of the European Union’s General Data Protection Regulation Articles 15 (right of access by the data subject) and 20 (right to data portability). Keusch cited several examples, such as 9th-graders donating data on their preferred social media activity. It appears that people are quite willing to donate data but that a large share (up to 50%) drop out during the donation period because the process is cumbersome. So far, few studies have assessed nonparticipation bias. In his opinion, research is needed on communicating the value of data donation to participants, integrating data donation into the survey infrastructure, keeping track of changing platforms, and combining digital trace data with self-reports.

Cameron McPhee (SSRS) discussed the presentations from the point of view of a traditional survey methodologist, who views each new data source and collection mode as a new source of error. She is concerned about error correlating with outcomes—for example, people’s willingness to use video technology or donate digital data is not random. In addition, McPhee said research is needed on how to address item missingness for the future of mixed-mode, mixed-source survey designs. It is also essential to have human data for small geographic areas and demographic groups. She asked how the possibility of feedback can be used to help respondents and encourage cooperation.

Some speakers and workshop participants noted the benefit of more research on respondent preferences for different modes of data collection—for example, some people are fine with video, but many people do not want to be on camera or online. Preferences may also interact with the topic—for example, people may not want to discuss their vote on camera. It is important, participants said, to distinguish useful innovations from innovation for innovation’s sake.

Data Protection and Dissemination

John Eltinge (U.S. Census Bureau) introduced the session. Presentations addressed inference with nonprobability surveys, privacy and protection methods and designs, future directions for post-survey adjustments, and protecting confidentiality in official statistics.

Qixuan Chen (Columbia University) addressed the challenge of using nonprobability samples (e.g., convenience, quota, network samples) for inference: such samples are not representative, but they are attractive because they are easy, cheap, and fast to conduct. She said that the use of additional information, such as administrative records or high-quality probability surveys, is essential for inference. Methods for integration include inverse propensity or calibration weighting; regularized regression prediction; and leveraging high-dimensional auxiliary variables with new methods, such as BART. However, it may not be possible to release high-dimensional auxiliary data because of confidentiality risks. Also, heterogeneity among data sources poses significant challenges to data integration, and workflow and software tools are needed to facilitate integration. There may be ways to improve the design of nonprobability surveys to improve inference from the outset, Chen noted.
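The sketch below illustrates the simplest member of the calibration-weighting family Chen listed, post-stratification to external benchmark shares. The age categories, benchmark shares, and outcome rates are all invented for the example.

```python
# Calibration (post-stratification) sketch: reweight a convenience sample so
# that weighted covariate shares match external population benchmarks.
import numpy as np

rng = np.random.default_rng(5)
n = 3000
# Convenience sample skewed toward younger respondents (illustrative shares).
age_group = rng.choice(["18-34", "35-64", "65+"], size=n, p=[0.55, 0.35, 0.10])
rates = np.where(age_group == "18-34", 0.2,
                 np.where(age_group == "35-64", 0.4, 0.6))
outcome = rng.binomial(1, rates)                    # e.g., holds some opinion

benchmark = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}   # population shares
weights = np.ones(n)
for group, target in benchmark.items():
    mask = age_group == group
    weights[mask] = target / mask.mean()            # ratio of target to sample share

print(f"Unweighted estimate: {outcome.mean():.3f}; "
      f"calibrated estimate: {np.average(outcome, weights=weights):.3f}")
```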

Ruobin Gong (Rutgers University) addressed new methods and designs to limit disclosure risk for research data—specifically differential privacy (DP) and its variants. She said that such methods have evolved considerably over the past decade and have been applied for such uses as protecting tabulation files from the 2020 U.S. Census and Israel’s National Registry of Live Births. Gong said that current DP algorithms are well suited for one-time simple univariate statistics, but struggle to produce accurate regression parameter estimates and associated confidence intervals. Also, designing formally private mechanisms for public files intended for multiple downstream uses is challenging. In her view, research priorities are to develop formally private disclosure avoidance methods suited to the reality of official statistics; workable ways of assessing privacy-usability tradeoffs; and ways to integrate rigid formal privacy standards with other legal requirements (e.g., the Health Insurance Portability and Accountability Act [HIPAA]) and with user expectations.
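For readers unfamiliar with differential privacy, the sketch below shows its simplest building block, the Laplace mechanism for a count query; this matches the kind of one-time univariate release Gong said current DP algorithms handle well. The epsilon values and count are illustrative, not a recommended policy.

```python
# Laplace mechanism sketch: release a count with epsilon-differential privacy.
# A count query has sensitivity 1 (adding or removing one person changes it by 1),
# so noise is drawn from Laplace(0, 1/epsilon).
import numpy as np

rng = np.random.default_rng(6)

def dp_count(true_count: int, epsilon: float) -> float:
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

true_count = 4213                     # e.g., persons in one tabulation cell
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon = {eps:>4}: released count ~ {dp_count(true_count, eps):.1f}")
```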

Natalie Shlomo (University of Manchester) covered challenges and research priorities for the future of surveys, use of generative AI in surveys, multi-source statistics (“blended data”), and dissemination and confidentiality guarantees. She said the world of surveys and survey dissemination is changing fast with both opportunities and challenges (e.g., response rate declines, increasing costs, nonprobability surveys becoming more attractive, increasing disclosure risks and privacy concerns). Shlomo said that probability surveys are essential and can often benefit from rotational panel designs, adaptive designs, and mixed data collection modes. Nonprobability online surveys can be useful for hard-to-capture populations. She felt that official statistics needed to be more accepting of model-based estimation, data integration, and small-area estimation. Generative AI has the potential to improve questionnaire development, take the place of interviewers, offer automatic coding and web scraping, and produce data imputations and synthetic data. Multisource statistics offer advantages, and a research priority is to develop appropriate quality assessments. Shlomo said that public-use files are becoming more high risk, and research is needed on disclosure avoidance methods and ways to expand use of data enclaves, including the provision of synthetic data to allow researchers to pretest their analyses.

Matt Williams (RTI) discussed the presentations, emphasizing the complexity of survey designs that integrate different types of data and the challenges posed by new privacy protection algorithms. Regarding disclosure avoidance, he speculated that we may be too reliant on statistical tools to rescue us, even though differentially private algorithms, for example, cannot handle the ACS, the Survey of Income and Program Participation, or any survey that is not completely random. Williams wondered about policy and societal levers that could help and whether it is an impossible burden on statistical agencies to try to prevent all possible privacy exposures, which leads to lower-quality data. He wondered how to incentivize reporting of inappropriate use of statistical data.

Several speakers and workshop participants raised the need to think about integrating data sources from the beginning and for an overarching framework within which to make design choices. They proposed research on which variables raise common integration issues across surveys and which are particular to the topic.

Suggestions for Future Research

Over the course of the workshop, many participants made suggestions for research to foster appropriate use in the social, behavioral, and economic sciences of the many new methods and technologies presented during the workshop’s two days. These suggestions are summarized in Box 1.

BOX 1
Suggestions for Future Research on Innovative Directions for Social and Behavioral Science Methodologies in the Next Decade

New Data Sources
  • Explore the balance between improving accuracy and reducing costs in online opt-in nonprobability panels and develop advanced validation techniques using generative AI, administrative data, and probability-based samples (Mercer, Elliott, Chen)
  • Research ways to integrate remote sensing with traditional data (Bhaduri)
  • Pursue efforts to develop data quality standards for complex combined datasets and explore ethical use of sensing and other data for which informed consent is not feasible (Potok)
New Study Designs
  • Identify optimal strategies for sequencing interventions in hybrid experimental designs that use human-delivered and digital interventions (Nahum-Shani)
  • Refine prediction models to identify “what interventions work for whom”—treatment effect heterogeneity—in the context of a megastudy approach with multiple experiments within an overall framework and/or assisted by machine learning (Milkman, Brand)
  • Develop statistical tools to analyze high-dimensional treatments in which AI models (e.g., chatbots) assist interactive studies (Costello)
Causal Inference
  • Explore robust study designs (e.g., longitudinal data) to help ensure that causal interpretations are strongly supported (VanderWeele)
  • Pursue promising advancements in sensitivity analysis to quantify how robust causal estimates are to assumption violations (Cinelli)
  • Explore novel data sources, such as subjective expectations, social norms, and parental beliefs, to enrich behavioral models (Pinto)
New Directions in Spatial Analysis
  • Incorporate geographical context into behavioral models to improve inference accuracy and policy relevance (Fotheringham)
  • Adopt a mesogeography framework (in between micro and macro frameworks) to study complex urban issues, such as transportation and health (Miller)
  • Experiment with real-world and simulated data to understand spatial dynamics and develop tools to address uncertainties in geographic data, while integrating local context and heterogeneity (Miller)
  • Explore analysis of spatial social networks and design efficient tools to assess network centrality and local connectivity, which have implications for urban planning (Andris)
  • Investigate new data sources, such as machine-learning-generated historical records, for uncovering long-term spatial effects on economic and health outcomes (Chyn)
Artificial Intelligence (AI) and Data Analysis
  • Explore how AI can help with empirical analysis of predictive features, treatments, and outcomes and in setting up experiments, such as assigning treatments, activating interventions (e.g., through chatbots), and presenting alternate versions of images and texts to subjects (Athey)
  • Explore how AI and machine learning can assist with secondary analyses (Athey)
  • Evaluate the benefits (e.g., generating new data, experiments) and costs (e.g., energy consumption, hallucinations) of generative AI in social science research (Bail, Spielman)
Sensors, Apps, and Other Technologies
  • Explore multimodal sensing and modeling techniques for applications in health care and other behavioral fields (Sano)
  • Pursue the uses of texting in surveys (Christian)
  • Explore the advantages and limitations of video interviewing (Hillygus)
  • Determine the viability of obtaining digital trace data via donation from respondents (the data owners) (Keusch)
  • Understand error properties in the data from mixed-mode, mixed-data source surveys and, from that, how to address missingness in the data (McPhee)
Data Protection and Dissemination
  • Develop formally private disclosure avoidance methods suited to the reality of official statistics, workable ways of assessing privacy-usability tradeoffs, and ways to integrate rigid formal privacy standards with other legal requirements (e.g., Health Insurance Portability and Accountability Act) and with user expectations (Gong)
  • Explore ways to make more effective use of data enclaves, given the increased disclosure risks of public data (Shlomo)
  • Explore ways for statistical agencies and data users to share responsibility for disclosure avoidance and, in this context, assess the risks it is reasonable to expect statistical agencies to guard against, given the need for high-quality data (Williams)

Overarching Themes

In addition to suggestions for research, four crosscutting themes emerged from the many workshop presentations.

Role of AI—Although the agenda intentionally included only one session on AI with a narrow scope, nearly all the presentations mentioned AI in one form or another. Many presenters expressed the need for broad and deep research to determine appropriate uses of AI in the social, behavioral, and economic sciences.

Role of Theory—Kristen Olson, in concluding remarks, emphasized the importance of using theory to drive research questions. In turn, she indicated that research questions needed to drive methods and cautioned researchers not to fall into either the trap of innovation for innovation’s sake or the pursuit of low-cost methods without consideration of data relevance and quality.

Heterogeneity—Many presenters emphasized the role of heterogeneity of population and spatial groups, with the consequence that “one size does not fit all” with respect to data collection and analysis methods in the social sciences.

Research Infrastructure—Given the many disciplines using AI for social science data collection and analysis, and the new tools being developed in different spheres, many presenters pointed to the utility of a robust infrastructure to foster fruitful research. This infrastructure would include such elements as a common language across fields and methodologies; data quality standards and error frameworks for mixed-mode and mixed-sources study designs; user-friendly, open source collection and analysis tools and documentation; information on what works in what contexts and what does not; pooling of studies to determine costs and benefits of new methods; guidelines for ethical use of new methods, including addressing respondents’ concerns; and regular convenings across disciplines.

DISCLAIMER This Proceedings of a Workshop—in Brief was prepared by Constance F. Citro as a factual summary of what occurred at the workshop. The statements made are those of the rapporteur or individual workshop participants and do not necessarily represent the views of all workshop participants; the committee; or the National Academies of Sciences, Engineering, and Medicine.

REVIEWERS To ensure that it meets institutional standards for quality and objectivity, this Proceedings of a Workshop—in Brief was reviewed by Frederick L. Oswald, Rice University. We also thank staff member Kelly Robbins for reading and providing helpful comments on this manuscript. Kirsten Sampson Snyder, National Academies of Sciences, Engineering, and Medicine, served as the review coordinator.

PLANNING COMMITTEE MEMBERS Kristen M. Olson, University of Nebraska–Lincoln; Abdullah M. Almaatouq, MIT Sloan School of Management; Kathleen A. Cagney, University of Michigan; John L. Eltinge, U.S. Census Bureau; Frederick L. Oswald, Rice University; Rodrigo Pinto, University of South Florida; Shaowen Wang, University of Illinois Urbana-Champaign

STAFF Daniel Cork, Celeste Stone, Anthony Mann, Committee on National Statistics

SPONSORS This workshop was supported by a grant from the National Science Foundation to the National Academy of Sciences (SES-2217307). Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.

For additional information regarding the workshop, visit: https://www.nationalacademies.org/our-work/future-directions-for-social-and-behavioral-science-methodologies-in-the-next-decade-a-workshop

SUGGESTED CITATION National Academies of Sciences, Engineering, and Medicine. 2025. Future Directions for Social and Behavioral Science Methodologies in the Next Decade: Proceedings of a Workshop—in Brief. Washington, DC: National Academies Press. https://doi.org/10.17226/29083.

Division of Behavioral and Social Sciences and Education

Copyright 2025 by the National Academy of Sciences. All rights reserved.

NATIONAL ACADEMIES Sciences Engineering Medicine The National Academies provide independent, trustworthy advice that advances solutions to society’s most complex challenges.