-
Measuring Social Media Network Effects
Authors:
Sinan Aral,
Seth G Benzell,
Avinash Collis,
Christos Nicolaides
Abstract:
We use representative, incentive-compatible online choice experiments involving 19,923 Facebook, Instagram, LinkedIn, and X users in the US to provide the first large-scale, empirical measurement of local network effects in the digital economy. Our analysis reveals social media platform value ranges from $78 to $101 per consumer, per month, on average, and that 20-34% of that value is explained by local network effects. We also find 1) stronger ties are more valuable on Facebook and Instagram, while weaker ties are more valuable on LinkedIn and X; 2) connections known through work are most valuable on LinkedIn and least valuable on Facebook, and people looking for work value LinkedIn significantly more and Facebook significantly less than people not looking for work; 3) men value connections to women on social media significantly more than they value connections to other men, particularly on Instagram, Facebook and X, while women value connections to men and women equally; 4) white consumers value relationships with other white consumers significantly more than they value relationships with non-white consumers on Facebook while, on Instagram, connections to alters eighteen years old or younger are valued significantly more than any other age group, two patterns not seen on any other platform. Social media platforms individually generate between $53B and $215B in consumer surplus per year in the US alone. These results suggest social media generates significant value, local network effects drive a substantial fraction of that value and that these effects vary across platforms, consumers, and connections.
Submitted 6 July, 2025;
originally announced July 2025.
-
Are Crypto Ecosystems (De)centralizing? A Framework for Longitudinal Analysis
Authors:
Harang Ju,
Ehsan Valavi,
Madhav Kumar,
Sinan Aral
Abstract:
Blockchain technology relies on decentralization to resist faults and attacks while operating without trusted intermediaries. Although industry experts have touted decentralization as central to their promise and disruptive potential, it is still unclear whether the crypto ecosystems built around blockchains are becoming more or less decentralized over time. As crypto plays an increasing role in facilitating economic transactions and peer-to-peer interactions, measuring their decentralization becomes even more essential. We thus propose a systematic framework for measuring the decentralization of crypto ecosystems over time and compare commonly used decentralization metrics. We applied this framework to seven prominent crypto ecosystems, across five distinct subsystems and across their lifetime for over 15 years. Our analysis revealed that while crypto has largely become more decentralized over time, recent trends show a shift toward centralization in the consensus layer, NFT marketplaces, and developers. Our framework and results inform researchers, policymakers, and practitioners about the design, regulation, and implementation of crypto ecosystems and provide a systematic, replicable foundation for future studies.
Submitted 2 June, 2025;
originally announced June 2025.
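The abstract above mentions comparing commonly used decentralization metrics without naming them. As a rough illustration only, the sketch below computes three measures that are often used for this purpose (the Nakamoto coefficient, the Herfindahl-Hirschman index, and the Gini coefficient) over a hypothetical list of per-entity resource shares. The choice of metrics, the function names, and the sample shares are assumptions for illustration, not the paper's framework.

```python
from typing import Sequence


def nakamoto_coefficient(shares: Sequence[float], threshold: float = 0.5) -> int:
    """Smallest number of entities whose combined share exceeds `threshold`."""
    total = sum(shares)
    running = 0.0
    # Accumulate from the largest holder downward.
    for count, share in enumerate(sorted(shares, reverse=True), start=1):
        running += share
        if running > threshold * total:
            return count
    return len(shares)


def hhi(shares: Sequence[float]) -> float:
    """Herfindahl-Hirschman index: sum of squared normalized shares."""
    total = sum(shares)
    return sum((s / total) ** 2 for s in shares)


def gini(shares: Sequence[float]) -> float:
    """Gini coefficient of the share distribution (0 means perfectly equal)."""
    xs = sorted(shares)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical shares of some consensus resource held by four entities.
example_shares = [50, 30, 10, 10]
print(nakamoto_coefficient(example_shares))
print(hhi(example_shares))
print(gini(example_shares))
```

A higher Nakamoto coefficient indicates more decentralization, whereas higher HHI and Gini values indicate more concentration, so the three measures need not rank ecosystems identically.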
-
Explaining Sustained Blockchain Decentralization with Quasi-Experiments: Resource Flexibility of Consensus Mechanisms
Authors:
Harang Ju,
Madhav Kumar,
Ehsan Valavi,
Sinan Aral
Abstract:
Decentralization is a fundamental design element of the Web3 economy. Blockchains and distributed consensus mechanisms are touted as fault-tolerant, attack-resistant, and collusion-proof because they are decentralized. Recent analyses, however, find some blockchains are decentralized, others are centralized, and that there are trends towards both centralization and decentralization in the blockchain economy. Despite the importance and variability of decentralization across blockchains, we still know little about what enables or constrains blockchain decentralization. We hypothesize that the resource flexibility of consensus mechanisms is a key enabler of the sustained decentralization of blockchain networks. We test this hypothesis using three quasi-experimental shocks -- policy-related, infrastructure-related, and technical -- to resources used in consensus. We find strong suggestive evidence that the resource flexibility of consensus mechanisms enables sustained blockchain decentralization and discuss the implications for the design, regulation, and implementation of blockchains.
Submitted 30 May, 2025;
originally announced May 2025.
-
Human Trust in AI Search: A Large-Scale Experiment
Authors:
Haiwen Li,
Sinan Aral
Abstract:
Large Language Models (LLMs) increasingly power generative search engines which, in turn, drive human information seeking and decision making at scale. The extent to which humans trust generative artificial intelligence (GenAI) can therefore influence what we buy, how we vote and our health. Unfortunately, no work establishes the causal effect of generative search designs on human trust. Here we execute ~12,000 search queries across seven countries, generating ~80,000 real-time GenAI and traditional search results, to understand the extent of current global exposure to GenAI search. We then use a preregistered, randomized experiment on a large study sample representative of the U.S. population to show that while participants trust GenAI search less than traditional search on average, reference links and citations significantly increase trust in GenAI, even when those links and citations are incorrect or hallucinated. Uncertainty highlighting, which reveals GenAI's confidence in its own conclusions, makes us less willing to trust and share generative information whether that confidence is high or low. Positive social feedback increases trust in GenAI while negative feedback reduces trust. These results imply that GenAI designs can increase trust in inaccurate and hallucinated information and reduce trust when GenAI's certainty is made explicit. Trust in GenAI varies by topic and with users' demographics, education, industry employment and GenAI experience, revealing which sub-populations are most vulnerable to GenAI misrepresentations. Trust, in turn, predicts behavior, as those who trust GenAI more click more and spend less time evaluating GenAI search results. These findings suggest directions for GenAI design to safely and productively address the AI "trust gap."
Submitted 8 April, 2025;
originally announced April 2025.
-
Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance
Authors:
Harang Ju,
Sinan Aral
Abstract:
To uncover how AI agents change productivity, performance, and work processes, we introduce MindMeld: an experimentation platform enabling humans and AI agents to collaborate in integrative workspaces. In a large-scale marketing experiment on the platform, 2,310 participants were randomly assigned to human-human and human-AI teams, with randomized AI personality traits. The teams exchanged 183,691 messages, and created 63,656 image edits, 1,960,095 ad copy edits, and 10,375 AI-generated images while producing 11,138 ads for a large think tank. Analysis of fine-grained communication, collaboration, and workflow logs revealed that collaborating with AI agents increased communication by 137% and allowed humans to focus 23% more on text and image content generation messaging and 20% less on direct text editing. Humans on human-AI teams sent 23% fewer social messages, creating 60% greater productivity per worker and higher-quality ad copy. In contrast, human-human teams produced higher-quality images, suggesting that AI agents require fine-tuning for multimodal workflows. AI personality prompt randomization revealed that AI traits can complement human personalities to enhance collaboration. For example, conscientious humans paired with open AI agents improved image quality, while extroverted humans paired with conscientious AI agents reduced the quality of text, images, and clicks. In field tests of ad campaigns with ~5M impressions, ads with higher image quality produced by human collaborations and higher text quality produced by AI collaborations performed significantly better on click-through rate and cost per click metrics. Overall, ads created by human-AI teams performed similarly to those created by human-human teams. Together, these results suggest AI agents can improve teamwork and productivity, especially when tuned to complement human traits.
Submitted 23 March, 2025;
originally announced March 2025.
-
Advancing AI Negotiations: New Theory and Evidence from a Large-Scale Autonomous Negotiations Competition
Authors:
Michelle Vaccaro,
Michael Caosun,
Harang Ju,
Sinan Aral,
Jared R. Curhan
Abstract:
We conducted an International AI Negotiation Competition in which participants designed and refined prompts for AI negotiation agents. We then facilitated over 180,000 negotiations between these agents across multiple scenarios with diverse characteristics and objectives. Our findings revealed that principles from human negotiation theory remain crucial even in AI-AI contexts. Surprisingly, warmth--a traditionally human relationship-building trait--was consistently associated with superior outcomes across all key performance metrics. Dominant agents, meanwhile, were especially effective at claiming value. Our analysis also revealed unique dynamics in AI-AI negotiations not fully explained by existing theory, including AI-specific technical strategies like chain-of-thought reasoning, prompt injection, and strategic concealment. When we applied natural language processing (NLP) methods to the full transcripts of all negotiations we found positivity, gratitude and question-asking (associated with warmth) were strongly associated with reaching deals as well as objective and subjective value, whereas conversation lengths (associated with dominance) were strongly associated with impasses. The results suggest the need to establish a new theory of AI negotiation, which integrates classic negotiation theory with AI-specific negotiation theories to better understand autonomous negotiations and optimize agent performance.
Submitted 7 July, 2025; v1 submitted 8 March, 2025;
originally announced March 2025.
-
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
Authors:
Matthew DosSantos DiSorbo,
Harang Ju,
Sinan Aral
Abstract:
Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI systems, which make decisions in complex, real-world contexts. Unfortunately, while their generative capabilities are well-documented, their decision-making processes remain poorly understood. This is particularly evident when models are handling exceptions, a critical and challenging aspect of decision-making made relevant by the inherent incompleteness of contracts. Here we demonstrate that LLMs, even ones that excel at reasoning, deviate significantly from human judgments because they adhere strictly to policies, even when such adherence is impractical, suboptimal, or even counterproductive. We then evaluate three approaches to tuning AI agents to handle exceptions: ethical framework prompting, chain-of-thought reasoning, and supervised fine-tuning. We find that while ethical framework prompting fails and chain-of-thought prompting provides only slight improvements, supervised fine-tuning, specifically with human explanations, yields markedly better results. Surprisingly, in our experiments, supervised fine-tuning even enabled models to generalize human-like decision-making to novel scenarios, demonstrating transfer learning of human-aligned decision-making across contexts. Furthermore, fine-tuning with explanations, not just labels, was critical for alignment, suggesting that aligning LLMs with human judgment requires explicit training on how decisions are made, not just which decisions are made. These findings highlight the need to address LLMs' shortcomings in handling exceptions in order to guide the development of agentic AI toward models that can effectively align with human judgment and simultaneously adapt to novel contexts.
Submitted 4 March, 2025;
originally announced March 2025.
-
Modeling Dynamic User Interests: A Neural Matrix Factorization Approach
Authors:
Paramveer Dhillon,
Sinan Aral
Abstract:
In recent years, there has been significant interest in understanding users' online content consumption patterns. But the unstructured, high-dimensional, and dynamic nature of such data makes extracting valuable insights challenging. Here we propose a model that combines the simplicity of matrix factorization with the flexibility of neural networks to efficiently extract nonlinear patterns from massive text data collections relevant to consumers' online consumption patterns. Our model decomposes a user's content consumption journey into nonlinear user and content factors that are used to model their dynamic interests. This natural decomposition allows us to summarize each user's content consumption journey with a dynamic probabilistic weighting over a set of underlying content attributes. The model is fast to estimate, easy to interpret and can harness external data sources as an empirical prior. These advantages make our method well suited to the challenges posed by modern datasets. We use our model to understand the dynamic news consumption interests of Boston Globe readers over five years. Thorough qualitative studies, including a crowdsourced evaluation, highlight our model's ability to accurately identify nuanced and coherent consumption patterns. These results are supported by our model's superior and robust predictive performance over several competitive baseline methods.
Submitted 19 September, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
Targeting for long-term outcomes
Authors:
Jeremy Yang,
Dean Eckles,
Paramveer Dhillon,
Sinan Aral
Abstract:
Decision makers often want to target interventions so as to maximize an outcome that is observed only in the long term. This typically requires delaying decisions until the outcome is observed or relying on simple short-term proxies for the long-term outcome. Here we build on the statistical surrogacy and policy learning literatures to impute the missing long-term outcomes and then approximate the optimal targeting policy on the imputed outcomes via a doubly-robust approach. We first show that conditions for the validity of average treatment effect estimation with imputed outcomes are also sufficient for valid policy evaluation and optimization; furthermore, these conditions can be somewhat relaxed for policy optimization. We apply our approach in two large-scale proactive churn management experiments at The Boston Globe by targeting optimal discounts to its digital subscribers with the aim of maximizing long-term revenue. Using the first experiment, we evaluate this approach empirically by comparing the policy learned using imputed outcomes with a policy learned on the ground-truth, long-term outcomes. The performance of these two policies is statistically indistinguishable, and we rule out large losses from relying on surrogates. Our approach also outperforms a policy learned on short-term proxies for the long-term outcome. In a second field experiment, we implement the optimal targeting policy with additional randomized exploration, which allows us to update the optimal policy for future subscribers. Over three years, our approach had a net-positive revenue impact in the range of $4-5 million compared to the status quo.
Submitted 9 April, 2022; v1 submitted 29 October, 2020;
originally announced October 2020.
-
The Engagement-Diversity Connection: Evidence from a Field Experiment on Spotify
Authors:
David Holtz,
Benjamin Carterette,
Praveen Chandar,
Zahra Nazari,
Henriette Cramer,
Sinan Aral
Abstract:
It remains unknown whether personalized recommendations increase or decrease the diversity of content people consume. We present results from a randomized field experiment on Spotify testing the effect of personalized recommendations on consumption diversity. In the experiment, both control and treatment users were given podcast recommendations, with the sole aim of increasing podcast consumption. Treatment users' recommendations were personalized based on their music listening history, whereas control users were recommended popular podcasts among users in their demographic group. We find that, on average, the treatment increased podcast streams by 28.90%. However, the treatment also decreased the average individual-level diversity of podcast streams by 11.51%, and increased the aggregate diversity of podcast streams by 5.96%, indicating that personalized recommendations have the potential to create patterns of consumption that are homogeneous within and diverse across users, a pattern reflecting Balkanization. Our results provide evidence of an "engagement-diversity trade-off" when recommendations are optimized solely to drive consumption: while personalized recommendations increase user engagement, they also affect the diversity of consumed content. This shift in consumption diversity can affect user retention and lifetime value, and impact the optimal strategy for content producers. We also observe evidence that our treatment affected streams from sections of Spotify's app not directly affected by the experiment, suggesting that exposure to personalized recommendations can affect the content that users consume organically. We believe these findings highlight the need for academics and practitioners to continue investing in personalization methods that explicitly take into account the diversity of content recommended.
Submitted 17 March, 2020;
originally announced March 2020.
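The contrast in the abstract above between individual-level and aggregate diversity can be made concrete with a toy calculation. The abstract does not say how diversity was measured, so the sketch below assumes Shannon entropy over hypothetical podcast categories; the user names, categories, and function names are all invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def shannon_entropy(items: List[str]) -> float:
    """Shannon entropy (in nats) of the category distribution in `items`."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def individual_and_aggregate_diversity(
    streams_by_user: Dict[str, List[str]],
) -> Tuple[float, float]:
    """Return (mean per-user entropy, entropy of all streams pooled together)."""
    per_user = [shannon_entropy(s) for s in streams_by_user.values()]
    individual = sum(per_user) / len(per_user)
    pooled = [item for s in streams_by_user.values() for item in s]
    return individual, shannon_entropy(pooled)


# Hypothetical "personalized" arm: each user stays inside one niche.
personalized = {
    "u1": ["comedy"] * 4,
    "u2": ["news"] * 4,
    "u3": ["sports"] * 4,
}
# Hypothetical "popular" arm: every user streams the same two categories.
popular = {
    "u1": ["comedy", "news", "comedy", "news"],
    "u2": ["comedy", "news", "comedy", "news"],
    "u3": ["comedy", "news", "comedy", "news"],
}

print(individual_and_aggregate_diversity(personalized))
print(individual_and_aggregate_diversity(popular))
```

In this toy data the personalized arm has lower individual-level diversity and higher aggregate diversity than the popular arm, which is the qualitative pattern the abstract describes as homogeneous within and diverse across users.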
-
Scalable bundling via dense product embeddings
Authors:
Madhav Kumar,
Dean Eckles,
Sinan Aral
Abstract:
Bundling, the practice of jointly selling two or more products at a discount, is a widely used strategy in industry and a well-examined concept in academia. Historically, the focus has been on theoretical studies in the context of monopolistic firms and assumed product relationships, e.g., complementarity in usage. We develop a new machine-learning-driven methodology for designing bundles in a large-scale, cross-category retail setting. We leverage historical purchases and consideration sets created from clickstream data to generate dense continuous representations of products called embeddings. We then put minimal structure on these embeddings and develop heuristics for complementarity and substitutability among products. Subsequently, we use the heuristics to create multiple bundles for each product and test their performance using a field experiment with a large retailer. We combine the results from the experiment with product embeddings using a hierarchical model that maps bundle features to their purchase likelihood, as measured by the add-to-cart rate. We find that our embeddings-based heuristics are strong predictors of bundle success, robust across product categories, and generalize well to the retailer's entire assortment.
Submitted 31 January, 2020;
originally announced February 2020.
-
Selection Effects in Online Sharing: Consequences for Peer Adoption
Authors:
Sean J. Taylor,
Eytan Bakshy,
Sinan Aral
Abstract:
Most models of social contagion take peer exposure to be a corollary of adoption, yet in many settings, the visibility of one's adoption behavior happens through a separate decision process. In online systems, product designers can define how peer exposure mechanisms work: adoption behaviors can be shared in a passive, automatic fashion, or occur through explicit, active sharing. The consequences of these mechanisms are of substantial practical and theoretical interest: passive sharing may increase total peer exposure but active sharing may expose higher quality products to peers who are more likely to adopt.
We examine selection effects in online sharing through a large-scale field experiment on Facebook that randomizes whether or not adopters share Offers (coupons) in a passive manner. We derive and estimate a joint discrete choice model of adopters' sharing decisions and their peers' adoption decisions. Our results show that active sharing enables a selection effect that exposes peers who are more likely to adopt than the population exposed under passive sharing.
We decompose the selection effect into two distinct mechanisms: active sharers expose peers to higher quality products, and the peers they share with are more likely to adopt independently of product quality. Simulation results show that the user-level mechanism comprises the bulk of the selection effect. The study's findings are among the first to address downstream peer effects induced by online sharing mechanisms, and can inform design in settings where a surplus of sharing could be viewed as costly.
Submitted 12 November, 2013;
originally announced November 2013.