Showing 1–29 of 29 results for author: Bakker, M

Searching in archive cs.
  1. arXiv:2506.24118  [pdf, ps, other]

    cs.CY cs.SI

    Scaling Human Judgment in Community Notes with LLMs

    Authors: Haiwen Li, Soham De, Manon Revel, Andreas Haupt, Brad Miller, Keith Coleman, Jay Baxter, Martin Saveski, Michiel A. Bakker

    Abstract: This paper argues for a new paradigm for Community Notes in the LLM era: an open ecosystem where both humans and LLMs can write notes, and the decision of which notes are helpful enough to show remains in the hands of humans. This approach can accelerate the delivery of notes, while maintaining trust and legitimacy through Community Notes' foundational principle: A community of diverse human rater…

    Submitted 30 June, 2025; originally announced June 2025.

  2. arXiv:2505.20067  [pdf, ps, other]

    cs.SI cs.AI cs.CY

    Community Moderation and the New Epistemology of Fact Checking on Social Media

    Authors: Isabelle Augenstein, Michiel Bakker, Tanmoy Chakraborty, David Corney, Emilio Ferrara, Iryna Gurevych, Scott Hale, Eduard Hovy, Heng Ji, Irene Larraz, Filippo Menczer, Preslav Nakov, Paolo Papotti, Dhruv Sahnan, Greta Warren, Giovanni Zagni

    Abstract: Social media platforms have traditionally relied on internal moderation teams and partnerships with independent fact-checking organizations to identify and flag misleading content. Recently, however, platforms including X (formerly Twitter) and Meta have shifted towards community-driven content moderation by launching their own versions of crowd-sourced fact-checking -- Community Notes. If effecti…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: 1 Figure, 2 tables

  3. arXiv:2503.15484  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Value Profiles for Encoding Human Variation

    Authors: Taylor Sorensen, Pushkar Mishra, Roma Patel, Michael Henry Tessler, Michiel Bakker, Georgina Evans, Iason Gabriel, Noah Goodman, Verena Rieser

    Abstract: Modelling human variation in rating tasks is crucial for enabling AI systems for personalization, pluralistic model alignment, and computational social science. We propose representing individuals using value profiles -- natural language descriptions of underlying values compressed from in-context demonstrations -- along with a steerable decoder model to estimate ratings conditioned on a value pro…

    Submitted 19 March, 2025; originally announced March 2025.

  4. Using Collective Dialogues and AI to Find Common Ground Between Israeli and Palestinian Peacebuilders

    Authors: Andrew Konya, Luke Thorburn, Wasim Almasri, Oded Adomi Leshem, Ariel D. Procaccia, Lisa Schirch, Michiel A. Bakker

    Abstract: A growing body of work has shown that AI-assisted methods -- leveraging large language models, social choice methods, and collective dialogues -- can help navigate polarization and surface common ground in controlled lab settings. But what can these approaches contribute in real-world contexts? We present a case study applying these techniques to find common ground between Israeli and Palestinian…

    Submitted 19 June, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: Accepted at FAccT 2025

  5. arXiv:2502.13410  [pdf, other]

    cs.GT cs.AI econ.TH

    Tell Me Why: Incentivizing Explanations

    Authors: Siddarth Srinivasan, Ezra Karger, Michiel Bakker, Yiling Chen

    Abstract: Common sense suggests that when individuals explain why they believe something, we can arrive at more accurate conclusions than when they simply state what they believe. Yet, there is no known mechanism that provides incentives to elicit explanations for beliefs from agents. This likely stems from the fact that standard Bayesian models make assumptions (like conditional independence of signals) th…

    Submitted 18 February, 2025; originally announced February 2025.

  6. arXiv:2502.09369  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Language Agents as Digital Representatives in Collective Decision-Making

    Authors: Daniel Jarrett, Miruna Pîslar, Michiel A. Bakker, Michael Henry Tessler, Raphael Köster, Jan Balaguer, Romuald Elie, Christopher Summerfield, Andrea Tacchetti

    Abstract: Consider the process of collective decision-making, in which a group of individuals interactively select a preferred outcome from among a universe of alternatives. In this context, "representation" is the activity of making an individual's preferences present in the process via participation by a proxy agent -- i.e. their "representative". To this end, learned models of human behavior have the pot…

    Submitted 13 February, 2025; originally announced February 2025.

  7. arXiv:2412.09988  [pdf]

    cs.CY cs.AI

    AI and the Future of Digital Public Squares

    Authors: Beth Goldberg, Diana Acosta-Navas, Michiel Bakker, Ian Beacock, Matt Botvinick, Prateek Buch, Renée DiResta, Nandika Donthi, Nathanael Fast, Ravi Iyer, Zaria Jalan, Andrew Konya, Grace Kwak Danciu, Hélène Landemore, Alice Marwick, Carl Miller, Aviv Ovadya, Emily Saltz, Lisa Schirch, Dalit Shalom, Divya Siddarth, Felix Sieker, Christopher Small, Jonathan Stray, Audrey Tang, et al. (2 additional authors not shown)

    Abstract: Two substantial technological advances have reshaped the public square in recent decades: first with the advent of the internet and second with the recent introduction of large language models (LLMs). LLMs offer opportunities for a paradigm shift towards more decentralized, participatory online spaces that can be used to facilitate deliberative dialogues at scale, but also create risks of exacerba…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: 40 pages, 5 figures

  8. arXiv:2411.09222  [pdf, ps, other]

    cs.CY

    Democratic AI is Possible. The Democracy Levels Framework Shows How It Might Work

    Authors: Aviv Ovadya, Kyle Redman, Luke Thorburn, Quan Ze Chen, Oliver Smith, Flynn Devine, Andrew Konya, Smitha Milli, Manon Revel, K. J. Kevin Feng, Amy X. Zhang, Bilva Chandra, Michiel A. Bakker, Atoosa Kasirzadeh

    Abstract: This position paper argues that effectively "democratizing AI" requires democratic governance and alignment of AI, and that this is particularly valuable for decisions with systemic societal impacts. Initial steps -- such as Meta's Community Forums and Anthropic's Collective Constitutional AI -- have illustrated a promising direction, where democratic processes could be used to meaningfully improv…

    Submitted 18 June, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: 31 pages. Accepted to the position paper track at ICML 2025. A previous version was presented at the Pluralistic Alignment Workshop at NeurIPS 2024. For ongoing work, see: https://democracylevels.org

  9. arXiv:2411.06116  [pdf, other]

    cs.SI

    Supernotes: Driving Consensus in Crowd-Sourced Fact-Checking

    Authors: Soham De, Michiel A. Bakker, Jay Baxter, Martin Saveski

    Abstract: X's Community Notes, a crowd-sourced fact-checking system, allows users to annotate potentially misleading posts. Notes rated as helpful by a diverse set of users are prominently displayed below the original post. While demonstrably effective at reducing misinformation's impact when notes are displayed, there is an opportunity for notes to appear on many more posts: for 91% of posts where at least…

    Submitted 9 November, 2024; originally announced November 2024.

    Comments: 11 pages, 10 figures (including appendix)

  10. arXiv:2410.21944  [pdf, other]

    cs.HC

    Evaluating Perceptual Deviations in Video See-Through Head-Mounted Displays while Utilizing Physical Touchscreens

    Authors: Rudy De-Xin de Lange, Roemer Martin Bien Bakker, Tanja Johanna Juliana Bos

    Abstract: Extended reality technology has become a useful tool in many applications, but still suffers from visual deviations that can hamper the utility of the technology. This paper discusses the types of persisting visual deviations experienced when observing the natural world through video see-through head-mounted displays. A generalizable method to measure the effect of these deviations on real-world i…

    Submitted 29 October, 2024; originally announced October 2024.

    Comments: 10 pages. Preprint. A shortened 4-page version of this paper was accepted to the IEEE ISMAR2024 poster track

  11. arXiv:2409.06729  [pdf]

    cs.CY cs.AI

    How will advanced AI systems impact democracy?

    Authors: Christopher Summerfield, Lisa Argyle, Michiel Bakker, Teddy Collins, Esin Durmus, Tyna Eloundou, Iason Gabriel, Deep Ganguli, Kobi Hackenburg, Gillian Hadfield, Luke Hewitt, Saffron Huang, Helene Landemore, Nahema Marchal, Aviv Ovadya, Ariel Procaccia, Mathias Risse, Bruce Schneier, Elizabeth Seger, Divya Siddarth, Henrik Skaug Sætra, MH Tessler, Matthew Botvinick

    Abstract: Advanced AI systems capable of generating humanlike text and multimodal content are now widely available. In this paper, we discuss the impacts that generative artificial intelligence may have on democratic processes. We consider the consequences of AI for citizens' ability to make informed choices about political representatives and issues (epistemic impacts). We ask how AI might be used to desta…

    Submitted 27 August, 2024; originally announced September 2024.

    Comments: 25 pages

  12. arXiv:2211.15006  [pdf, other]

    cs.LG cs.CL

    Fine-tuning language models to find agreement among humans with diverse preferences

    Authors: Michiel A. Bakker, Martin J. Chadwick, Hannah R. Sheahan, Michael Henry Tessler, Lucy Campbell-Gillingham, Jan Balaguer, Nat McAleese, Amelia Glaese, John Aslanides, Matthew M. Botvinick, Christopher Summerfield

    Abstract: Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might…

    Submitted 27 November, 2022; originally announced November 2022.

  13. arXiv:2110.11404  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    Statistical discrimination in learning agents

    Authors: Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo

    Abstract: Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics. One primary example is statistical discrimination -- selecting social partners based not on their underlying attributes, but on readily perceptible characteristics that covary with their suitability for the task at ha…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: 29 pages, 10 figures

    MSC Class: 68T07 (Primary) 91A26; 91-10; 93A16 (Secondary)

    ACM Class: I.2.11; I.2.0

  14. arXiv:2104.11017  [pdf]

    eess.IV cs.CV cs.LG

    Multi-task Semi-supervised Learning for Pulmonary Lobe Segmentation

    Authors: Jingnan Jia, Zhiwei Zhai, M. Els Bakker, I. Hernandez Giron, Marius Staring, Berend C. Stoel

    Abstract: Pulmonary lobe segmentation is an important preprocessing task for the analysis of lung diseases. Traditional methods relying on fissure detection or other anatomical features, such as the distribution of pulmonary vessels and airways, could provide reasonably accurate lobe segmentations. Deep learning based methods can outperform these traditional approaches, but require large datasets. Deep mult…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: 4 pages, to be published in ISBI 2021

  15. arXiv:2104.04991  [pdf, other]

    cs.CV

    Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval

    Authors: Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

    Abstract: Accurately matching visual and textual data in cross-modal retrieval has been widely studied in the multimedia community. To address these challenges posited by the heterogeneity gap and the semantic gap, we propose integrating Shannon information theory and adversarial learning. In terms of the heterogeneity gap, we integrate modality classification and information entropy maximization adversaria…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: Accepted by Pattern Recognition

  16. arXiv:2103.12462  [pdf, other]

    cs.CV

    Lifelong Person Re-Identification via Adaptive Knowledge Accumulation

    Authors: Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

    Abstract: Person ReID methods always learn through a stationary domain that is fixed by the choice of a given dataset. In many contexts (e.g., lifelong learning), those methods are ineffective because the domain is continually changing in which case incremental learning over multiple domains is required potentially. In this work we explore a new and challenging ReID task, namely lifelong person re-identific…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: 10 pages, 5 figures, Accepted by CVPR2021

  17. arXiv:2102.06911  [pdf, other]

    cs.MA cs.AI

    Modelling Cooperation in Network Games with Spatio-Temporal Complexity

    Authors: Michiel A. Bakker, Richard Everett, Laura Weidinger, Iason Gabriel, William S. Isaac, Joel Z. Leibo, Edward Hughes

    Abstract: The real world is awash with multi-agent problems that require collective action by self-interested agents, from the routing of packets across a computer network to the management of irrigation systems. Such systems have local incentives for individuals, whose behavior has an impact on the global outcome for the group. Given appropriate mechanisms describing agent interaction, groups may achieve s…

    Submitted 13 February, 2021; originally announced February 2021.

    Comments: AAMAS 2021

  18. arXiv:2012.03820  [pdf, other]

    cs.CV

    Self-supervised asymmetric deep hashing with margin-scalable constraint

    Authors: Zhengyang Yu, Song Wu, Zhihao Dou, Erwin M. Bakker

    Abstract: Due to its effectivity and efficiency, deep hashing approaches are widely used for large-scale visual search. However, it is still challenging to produce compact and discriminative hash codes for images associated with multiple semantics for two main reasons, 1) similarity constraints designed in most of the existing methods are based upon an oversimplified similarity assignment (i.e., 0 for instan…

    Submitted 23 July, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

  19. arXiv:2010.08020  [pdf, other]

    cs.CV

    On the Exploration of Incremental Learning for Fine-grained Image Retrieval

    Authors: Wei Chen, Yu Liu, Weiping Wang, Tinne Tuytelaars, Erwin M. Bakker, Michael Lew

    Abstract: In this paper, we consider the problem of fine-grained image retrieval in an incremental setting, when new categories are added over time. On the one hand, repeatedly training the representation on the extended dataset is time-consuming. On the other hand, fine-tuning the learned representation only with the new classes leads to catastrophic forgetting. To this end, we propose an incremental learn…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: BMVC2020

  20. arXiv:2008.02520  [pdf, other]

    cs.CV

    Dual Gaussian-based Variational Subspace Disentanglement for Visible-Infrared Person Re-Identification

    Authors: Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

    Abstract: Visible-infrared person re-identification (VI-ReID) is a challenging and essential task in night-time intelligent surveillance systems. Except for the intra-modality variance that RGB-RGB person re-identification mainly overcomes, VI-ReID suffers from additional inter-modality variance caused by the inherent heterogeneous gap. To solve the problem, we present a carefully designed dual Gaussian-bas…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted by ACM MM 2020 poster. 12 pages, 10 appendixes

  21. arXiv:2003.14412  [pdf, other]

    cs.CR cs.CY

    Assessing Disease Exposure Risk with Location Data: A Proposal for Cryptographic Preservation of Privacy

    Authors: Alex Berke, Michiel Bakker, Praneeth Vepakomma, Kent Larson, Alex 'Sandy' Pentland

    Abstract: Governments and researchers around the world are implementing digital contact tracing solutions to stem the spread of infectious disease, namely COVID-19. Many of these solutions threaten individual rights and privacy. Our goal is to break past the false dichotomy of effective versus privacy-preserving contact tracing. We offer an alternative approach to assess and communicate users' risk of expos…

    Submitted 8 April, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

  22. arXiv:1910.13983  [pdf, other]

    cs.LG cs.CY stat.ML

    DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning

    Authors: Michiel A. Bakker, Duy Patrick Tu, Humberto Riverón Valdés, Krishna P. Gummadi, Kush R. Varshney, Adrian Weller, Alex Pentland

    Abstract: We introduce a framework for dynamic adversarial discovery of information (DADI), motivated by a scenario where information (a feature set) is used by third parties with unknown objectives. We train a reinforcement learning agent to sequentially acquire a subset of the information while balancing accuracy and fairness of predictors downstream. Based on the set of already acquired features, the age…

    Submitted 30 October, 2019; originally announced October 2019.

    Comments: Accepted at NeurIPS 2019 HCML Workshop

  23. arXiv:1905.10688  [pdf, other]

    cs.LG cs.DB cs.IR stat.ML

    Sherlock: A Deep Learning Approach to Semantic Data Type Detection

    Authors: Madelon Hulsebos, Kevin Hu, Michiel Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Çağatay Demiralp, César Hidalgo

    Abstract: Correctly detecting the semantic type of data columns is crucial for data science tasks such as automated data cleaning, schema matching, and data discovery. Existing data preparation and analysis systems rely on dictionary lookups and regular expression matching to detect semantic types. However, these matching-based approaches often are not robust to dirty data and only detect a limited number o…

    Submitted 25 May, 2019; originally announced May 2019.

    Comments: KDD'19

  24. arXiv:1905.04616  [pdf, other]

    cs.HC cs.DB cs.LG

    VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository

    Authors: Kevin Hu, Neil Gaikwad, Michiel Bakker, Madelon Hulsebos, Emanuel Zgraggen, César Hidalgo, Tim Kraska, Guoliang Li, Arvind Satyanarayan, Çağatay Demiralp

    Abstract: Researchers currently rely on ad hoc datasets to train automated visualization tools and evaluate the effectiveness of visualization designs. These exemplars often lack the characteristics of real-world datasets, and their one-off nature makes it difficult to compare different techniques. In this paper, we present VizNet: a large-scale corpus of over 31 million datasets compiled from open data rep…

    Submitted 11 May, 2019; originally announced May 2019.

    Comments: CHI'19

  25. arXiv:1810.00031  [pdf, other]

    cs.CY cs.AI cs.LG stat.AP

    Active Fairness in Algorithmic Decision Making

    Authors: Alejandro Noriega-Campero, Michiel A. Bakker, Bernardo Garcia-Bulle, Alex Pentland

    Abstract: Society increasingly relies on machine learning models for automated decision making. Yet, efficiency gains from automation have come paired with concern for algorithmic discrimination that can systematize inequality. Recent work has proposed optimal post-processing methods that randomize classification decisions for a fraction of individuals, in order to achieve fairness measures related to parit…

    Submitted 7 November, 2018; v1 submitted 28 September, 2018; originally announced October 2018.

  26. arXiv:1808.04819  [pdf, other]

    cs.HC cs.AI cs.LG

    VizML: A Machine Learning Approach to Visualization Recommendation

    Authors: Kevin Z. Hu, Michiel A. Bakker, Stephen Li, Tim Kraska, César A. Hidalgo

    Abstract: Data visualization should be accessible for all analysts with data, not just the few with technical expertise. Visualization recommender systems aim to lower the barrier to exploring basic visualizations by automatically generating results for analysts to search and select, rather than manually specify. Here, we demonstrate a novel machine learning-based approach to visualization recommendation th…

    Submitted 14 August, 2018; originally announced August 2018.

  27. arXiv:1212.2438  [pdf, other]

    eess.SY cs.SE math.DS physics.chem-ph q-bio.MN

    Model-order reduction of biochemical reaction networks

    Authors: Shodhan Rao, Arjan van der Schaft, Karen van Eunen, Barbara M. Bakker, Bayu Jayawardhana

    Abstract: In this paper we propose a model-order reduction method for chemical reaction networks governed by general enzyme kinetics, including the mass-action and Michaelis-Menten kinetics. The model-order reduction method is based on the Kron reduction of the weighted Laplacian matrix which describes the graph structure of complexes in the chemical reaction network. We apply our method to a yeast glycolys…

    Submitted 11 December, 2012; originally announced December 2012.

    Comments: 7 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:1211.6643, arXiv:1110.6078

  28. arXiv:1105.6060  [pdf]

    cs.CV

    Alignment of Microtubule Imagery

    Authors: Feiyang Yu, Ard Oerlemans, Erwin M. Bakker

    Abstract: This work discusses preliminary work aimed at simulating and visualizing the growth process of a tiny structure inside the cell: the microtubule. Difficulty of recording the process lies in the fact that the tissue preparation method for electronic microscopes is highly destructive to live cells. Here in this paper, our approach is to take pictures of microtubules at different time slots and then…

    Submitted 30 May, 2011; originally announced May 2011.

  29. arXiv:0805.3897  [pdf, ps, other]

    cs.PF

    SPARK00: A Benchmark Package for the Compiler Evaluation of Irregular/Sparse Codes

    Authors: H. L. A. van der Spek, E. M. Bakker, H. A. G. Wijshoff

    Abstract: We propose a set of benchmarks that specifically targets a major cause of performance degradation in high performance computing platforms: irregular access patterns. These benchmarks are meant to be used to assess the performance of optimizing compilers on codes with a varying degree of irregular access. The irregularity caused by the use of pointers and indirection arrays is a major challenge f…

    Submitted 26 May, 2008; originally announced May 2008.