Search | arXiv e-print repository

LLaMandement: Large Language Models for Summarization of French Legislative Proposals

Authors: Joseph Gesnouin, Yannis Tannier, Christophe Gomes Da Silva, Hatim Tapory, Camille Brier, Hugo Simon, Raphael Rozenberg, Hermann Woehrel, Mehdi El Yakaabi, Thomas Binder, Guillaume Marie, Emilie Caron, Mathile Nogueira, Thomas Fontas, Laure Puydebois, Marie Theophile, Stephane Morandi, Mael Petit, David Creissac, Pauline Ennouchy, Elise Valetoux, Celine Visade, Severine Balloux, Emmanuel Cortes, Pierre-Etienne Devineau , et al. (3 additional authors not shown)

Abstract: This report introduces LLaMandement, a state-of-the-art Large Language Model, fine-tuned by the French government and designed to enhance the efficiency and efficacy of processing parliamentary sessions (including the production of bench memoranda and documents required for interministerial meetings) by generating neutral summaries of legislative proposals. Addressing the administrative challenges… ▽ More This report introduces LLaMandement, a state-of-the-art Large Language Model, fine-tuned by the French government and designed to enhance the efficiency and efficacy of processing parliamentary sessions (including the production of bench memoranda and documents required for interministerial meetings) by generating neutral summaries of legislative proposals. Addressing the administrative challenges of manually processing a growing volume of legislative amendments, LLaMandement stands as a significant legal technological milestone, providing a solution that exceeds the scalability of traditional human efforts while matching the robustness of a specialized legal drafter. We release all our fine-tuned models and training data to the community. △ Less

Submitted 29 January, 2024; originally announced January 2024.

Comments: 21 pages, 9 figures

arXiv:2007.12882 [pdf, ps, other]

A finite sample analysis of the benign overfitting phenomenon for ridge function estimation

Authors: Emmanuel Caron, Stephane Chretien

Abstract: Recent extensive numerical experiments in high scale machine learning have allowed to uncover a quite counterintuitive phase transition, as a function of the ratio between the sample size and the number of parameters in the model. As the number of parameters $p$ approaches the sample size $n$, the generalisation error increases, but surprisingly, it starts decreasing again past the threshold… ▽ More Recent extensive numerical experiments in high scale machine learning have allowed to uncover a quite counterintuitive phase transition, as a function of the ratio between the sample size and the number of parameters in the model. As the number of parameters $p$ approaches the sample size $n$, the generalisation error increases, but surprisingly, it starts decreasing again past the threshold $p=n$. This phenomenon, brought to the theoretical community attention in \cite{belkin2019reconciling}, has been thoroughly investigated lately, more specifically for simpler models than deep neural networks, such as the linear model when the parameter is taken to be the minimum norm solution to the least-squares problem, firstly in the asymptotic regime when $p$ and $n$ tend to infinity, see e.g. \cite{hastie2019surprises}, and recently in the finite dimensional regime and more specifically for linear models \cite{bartlett2020benign}, \cite{tsigler2020benign}, \cite{lecue2022geometrical}. In the present paper, we propose a finite sample analysis of non-linear models of \textit{ridge} type, where we investigate the \textit{overparametrised regime} of the double descent phenomenon for both the \textit{estimation problem} and the \textit{prediction} problem. Our results provide a precise analysis of the distance of the best estimator from the true parameter as well as a generalisation bound which complements recent works of \cite{bartlett2020benign} and \cite{chinot2020benign}. Our analysis is based on tools closely related to the continuous Newton method \cite{neuberger2007continuous} and a refined quantitative analysis of the performance in prediction of the minimum $\ell_2$-norm solution. △ Less

Submitted 12 January, 2024; v1 submitted 25 July, 2020; originally announced July 2020.

Comments: New section on generalisation added

arXiv:1207.1337 [pdf, other]

Optimization in a Self-Stabilizing Service Discovery Framework for Large Scale Systems

Authors: Eddy Caron, Florent Chuffart, Anissa Lamani, Franck Petit

Abstract: Ability to find and get services is a key requirement in the development of large-scale distributed sys- tems. We consider dynamic and unstable environments, namely Peer-to-Peer (P2P) systems. In previous work, we designed a service discovery solution called Distributed Lexicographic Placement Table (DLPT), based on a hierar- chical overlay structure. A self-stabilizing version was given using the… ▽ More Ability to find and get services is a key requirement in the development of large-scale distributed sys- tems. We consider dynamic and unstable environments, namely Peer-to-Peer (P2P) systems. In previous work, we designed a service discovery solution called Distributed Lexicographic Placement Table (DLPT), based on a hierar- chical overlay structure. A self-stabilizing version was given using the Propagation of Information with Feedback (PIF) paradigm. In this paper, we introduce the self-stabilizing COPIF (for Collaborative PIF) scheme. An algo- rithm is provided with its correctness proof. We use this approach to improve a distributed P2P framework designed for the services discovery. Significantly efficient experimental results are presented. △ Less

Submitted 5 July, 2012; originally announced July 2012.

Comments: (2012)

arXiv:1005.3918 [pdf, ps, other]

doi 10.1063/1.3462722

Cosmological Simulations on a Grid of Computers

Authors: Benjamin Depardon, Eddy Caron, Frédéric Desprez, Jérémy Blaizot, Hélène M. Courtois

Abstract: The work presented in this paper aims at restricting the input parameter values of the semi-analytical model used in GALICS and MOMAF, so as to derive which parameters influence the most the results, e.g., star formation, feedback and halo recycling efficiencies, etc. Our approach is to proceed empirically: we run lots of simulations and derive the correct ranges of values. The computation time ne… ▽ More The work presented in this paper aims at restricting the input parameter values of the semi-analytical model used in GALICS and MOMAF, so as to derive which parameters influence the most the results, e.g., star formation, feedback and halo recycling efficiencies, etc. Our approach is to proceed empirically: we run lots of simulations and derive the correct ranges of values. The computation time needed is so large, that we need to run on a grid of computers. Hence, we model GALICS and MOMAF execution time and output files size, and run the simulation using a grid middleware: DIET. All the complexity of accessing resources, scheduling simulations and managing data is harnessed by DIET and hidden behind a web portal accessible to the users. △ Less

Submitted 21 May, 2010; originally announced May 2010.

Comments: Accepted and Published in AIP Conference Proceedings 1241, 2010, pages 816-825

Journal ref: AIP Conference Proceedings , 2010, 1241, pages 816-825

arXiv:0911.2327 [pdf, other]

doi 10.4204/EPTCS.9.9

An Intuitive Automated Modelling Interface for Systems Biology

Authors: Ozan Kahramanoğullari, Luca Cardelli, Emmanuelle Caron

Abstract: We introduce a natural language interface for building stochastic pi calculus models of biological systems. In this language, complex constructs describing biochemical events are built from basic primitives of association, dissociation and transformation. This language thus allows us to model biochemical systems modularly by describing their dynamics in a narrative-style language, while making a… ▽ More We introduce a natural language interface for building stochastic pi calculus models of biological systems. In this language, complex constructs describing biochemical events are built from basic primitives of association, dissociation and transformation. This language thus allows us to model biochemical systems modularly by describing their dynamics in a narrative-style language, while making amendments, refinements and extensions on the models easy. We demonstrate the language on a model of Fc-gamma receptor phosphorylation during phagocytosis. We provide a tool implementation of the translation into a stochastic pi calculus language, Microsoft Research's SPiM. △ Less

Submitted 12 November, 2009; originally announced November 2009.

Journal ref: EPTCS 9, 2009, pp. 73-86

Showing 1–5 of 5 results for author: Caron, E