Showing 1–11 of 11 results for author: Dubois, M

  1. arXiv:2507.03772

    cs.LG stat.ML

    Skewed Score: A statistical framework to assess autograders

    Authors: Magda Dubois, Harry Coppock, Mario Giulianelli, Timo Flesch, Lennart Luettgau, Cozmin Ududec

    Abstract: The evaluation of large language model (LLM) outputs is increasingly performed by other LLMs, a setup commonly known as "LLM-as-a-judge", or autograders. While autograders offer a scalable alternative to human evaluation, they have shown mixed reliability and may exhibit systematic biases, depending on response type, scoring methodology, domain specificity, or other factors. Here we propose a stat…

    Submitted 9 July, 2025; v1 submitted 4 July, 2025; originally announced July 2025.

  2. arXiv:2507.03409

    cs.AI

    Lessons from a Chimp: AI "Scheming" and the Quest for Ape Language

    Authors: Christopher Summerfield, Lennart Luettgau, Magda Dubois, Hannah Rose Kirk, Kobi Hackenburg, Catherine Fist, Katarina Slama, Nicola Ding, Rebecca Anselmetti, Andrew Strait, Mario Giulianelli, Cozmin Ududec

    Abstract: We examine recent research that asks whether current AI systems may be developing a capacity for "scheming" (covertly and strategically pursuing misaligned goals). We compare current research practices in this field to those adopted in the 1970s to test whether non-human primates could master natural language. We argue that there are lessons to be learned from that historical research endeavour, w…

    Submitted 4 July, 2025; originally announced July 2025.

  3. arXiv:2505.05602

    cs.AI stat.AP

    HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics

    Authors: Lennart Luettgau, Harry Coppock, Magda Dubois, Christopher Summerfield, Cozmin Ududec

    Abstract: As Large Language Models (LLMs) and other AI systems evolve, robustly estimating their capabilities from inherently stochastic outputs while systematically quantifying uncertainty in these estimates becomes increasingly important. Further, advanced AI evaluations often have a nested hierarchical structure, exhibit high levels of complexity, and come with high costs in testing the most advanced AI…

    Submitted 8 July, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Comments: 23 pages, 9 figures

  4. arXiv:2409.07615

    cs.CL

    MOSAIC: Multiple Observers Spotting AI Content

    Authors: Matthieu Dubois, François Yvon, Pablo Piantanida

    Abstract: The dissemination of Large Language Models (LLMs), trained at scale, and endowed with powerful text-generating abilities, has made it easier for all to produce harmful, toxic, faked or forged content. In response, various proposals have been made to automatically discriminate artificially generated from human-written texts, typically framing the problem as a binary classification problem. Early ap…

    Submitted 11 June, 2025; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: ACL 2025 Findings, code can be found at https://github.com/BaggerOfWords/MOSAIC

  5. arXiv:2311.04909

    cs.DL

    Streetlight Effect in Post-Publication Peer Review: Are Open Access Publications More Scrutinized?

    Authors: Abdelghani Maddi, Emmanuel Monneau, Catherine Gaspare, Floriana Gargiulo, Michel Dubois

    Abstract: The Streetlight Effect represents an observation bias that occurs when individuals search for something only where it is easiest to look. Despite the significant development of Post-Publication Peer Review (PPPR) in recent years, facilitated in part by platforms such as PubPeer, existing literature has not examined whether PPPR is affected by this type of bias. In other words, if the PPPR mainly c…

    Submitted 23 October, 2023; originally announced November 2023.

  6. arXiv:2310.01046

    physics.soc-ph cs.SI

    Epistemic integration and social segregation of AI in neuroscience

    Authors: Sylvain Fontaine, Floriana Gargiulo, Michel Dubois, Paola Tubaro

    Abstract: In recent years, Artificial Intelligence (AI) shows a spectacular ability of insertion inside a variety of disciplines which use it for scientific advancements and which sometimes improve it for their conceptual and methodological needs. According to the transverse science framework originally conceived by Shinn and Joerges, AI can be seen as an instrument which is progressively acquiring a univer…

    Submitted 6 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  7. arXiv:2306.10484

    eess.IV cs.CV

    The STOIC2021 COVID-19 AI challenge: applying reusable training methodologies to private data

    Authors: Luuk H. Boulogne, Julian Lorenz, Daniel Kienzle, Robin Schon, Katja Ludwig, Rainer Lienhart, Simon Jegou, Guang Li, Cong Chen, Qi Wang, Derik Shi, Mayug Maniparambil, Dominik Muller, Silvan Mertes, Niklas Schroter, Fabio Hellmann, Miriam Elia, Ine Dirks, Matias Nicolas Bossa, Abel Diaz Berenguer, Tanmoy Mukherjee, Jef Vandemeulebroucke, Hichem Sahli, Nikos Deligiannis, Panagiotis Gonidakis, et al. (13 additional authors not shown)

    Abstract: Challenges drive the state-of-the-art of automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions remains absent. This study implements the Type Three (T3) challenge format, which allows for training solutions on private data and guarantees reusable training m…

    Submitted 25 June, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

  8. arXiv:1609.03734

    cs.CR

    Hacking of the AES with Boolean Functions

    Authors: Michel Dubois, Eric Filiol

    Abstract: One of the major issues of cryptography is the cryptanalysis of cipher algorithms. Cryptanalysis is the study of methods for obtaining the meaning of encrypted information, without access to the secret information that is normally required. Some mechanisms for breaking codes include differential cryptanalysis, advanced statistics and brute-force. Recent works also attempt to use algebraic tools…

    Submitted 13 September, 2016; originally announced September 2016.

    Comments: Submitted to FORmal methods for Security Engineering - ForSE 2017; 25 pages

  9. arXiv:1403.5370

    stat.ML cs.CV cs.LG

    Using n-grams models for visual semantic place recognition

    Authors: Mathieu Dubois, Frenoux Emmanuelle, Philippe Tarroux

    Abstract: The aim of this paper is to present a new method for visual place recognition. Our system combines global image characterization and visual words, which allows to use efficient Bayesian filtering methods to integrate several images. More precisely, we extend the classical HMM model with techniques inspired by the field of Natural Language Processing. This paper presents our system and the Bayesian…

    Submitted 21 March, 2014; originally announced March 2014.

    Comments: VISAPP (2013)

  10. Assessment of a percutaneous iliosacral screw insertion simulator

    Authors: J. Tonetti, L. Vadcard, P. Girard, M. Dubois, P. Merloz, Jocelyne Troccaz

    Abstract: BACKGROUND: Navigational simulator use for specialized training purposes is rather uncommon in orthopaedic and trauma surgery. However, it reveals providing a valuable tool to train orthopaedic surgeons and help them to plan complex surgical procedures. PURPOSE: This work's objective was to assess educational efficiency of a path simulator under fluoroscopic guidance applied to sacroiliac joint…

    Submitted 12 October, 2009; originally announced October 2009.

    Journal ref: Orthop Traumatol Surg Res (2009) epub ahead of print

  11. arXiv:0712.2168

    cs.HC

    Study of conditions of use of E-services accessible to visually disabled persons

    Authors: Marc-Eric Bobiller-Chaumon, Michel Dubois, Françoise Sandoz-Guermond

    Abstract: The aim of this paper is to determine the expectations that French-speaking disabled persons have for electronic administrative sites (utility). At the same time, it is a matter of identifying the difficulties of use that the manipulation of these E-services poses concretely for blind people (usability) and of evaluating the psychosocial impacts on the way of life of these people with specific n…

    Submitted 13 December, 2007; originally announced December 2007.

    Comments: 4 pages, available at http://ceur-ws.org/Vol-285

    Journal ref: In CEUR Workshop Proceedings - DEGAS'07: Workshop of Design & Evaluation of e-Government Applications and services, Rio de Janeiro, Brazil (2006)