Showing 1–16 of 16 results for author: Adam, S

  1. arXiv:2502.14907  [pdf, other]

    cs.CL cs.AI

    GneissWeb: Preparing High Quality Data for LLMs at Scale

    Authors: Hajar Emami Gohari, Swanand Ravindra Kadhe, Syed Yousaf Shah, Constantin Adam, Abdulhamid Adebayo, Praneet Adusumilli, Farhan Ahmed, Nathalie Baracaldo Angel, Santosh Borse, Yuan-Chi Chang, Xuan-Hong Dang, Nirmit Desai, Ravital Eres, Ran Iwamoto, Alexei Karve, Yan Koyfman, Wei-Han Lee, Changchang Liu, Boris Lublinsky, Takuyo Ohko, Pablo Pesce, Maroun Touma, Shiqiang Wang, Shalisha Witherspoon, Herbert Woisetschlager, David Wood, et al. (6 additional authors not shown)

    Abstract: Data quantity and quality play a vital role in determining the performance of Large Language Models (LLMs). High-quality data, in particular, can significantly boost the LLM's ability to generalize on a wide range of downstream tasks. Large pre-training datasets for leading LLMs remain inaccessible to the public, whereas many open datasets are small in size (less than 5 trillion tokens), limiting…

    Submitted 18 February, 2025; originally announced February 2025.

  2. arXiv:2410.01661  [pdf, other]

    cs.AI cs.FL

    Finding path and cycle counting formulae in graphs with Deep Reinforcement Learning

    Authors: Jason Piquenot, Maxime Bérar, Pierre Héroux, Jean-Yves Ramel, Romain Raveaux, Sébastien Adam

    Abstract: This paper presents Grammar Reinforcement Learning (GRL), a reinforcement learning algorithm that uses Monte Carlo Tree Search (MCTS) and a transformer architecture that models a Pushdown Automaton (PDA) within a context-free grammar (CFG) framework. Taking as use case the problem of efficiently counting paths and cycles in graphs, a key challenge in network analysis, computer science, biology, an…

    Submitted 23 January, 2025; v1 submitted 2 October, 2024; originally announced October 2024.

  3. arXiv:2407.12370  [pdf, other]

    cs.LG

    Temporal receptive field in dynamic graph learning: A comprehensive analysis

    Authors: Yannis Karmim, Leshanshui Yang, Raphaël Fournier S'Niehotta, Clément Chatelain, Sébastien Adam, Nicolas Thome

    Abstract: Dynamic link prediction is a critical task in the analysis of evolving networks, with applications ranging from recommender systems to economic exchanges. However, the concept of the temporal receptive field, which refers to the temporal context that models use for making predictions, has been largely overlooked and insufficiently analyzed in existing research. In this study, we present a comprehe…

    Submitted 19 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Journal ref: MLG Workshop at ECML-PKDD, Sep 2024, Vilnius, Lithuania

  4. arXiv:2403.02931  [pdf]

    cs.CY

    Improving the quality of individual-level online information tracking: challenges of existing approaches and introduction of a new content- and long-tail sensitive academic solution

    Authors: Silke Adam, Mykola Makhortykh, Michaela Maier, Viktor Aigenseer, Aleksandra Urman, Teresa Gil Lopez, Clara Christner, Ernesto de León, Roberto Ulloa

    Abstract: This article evaluates the quality of data collection in individual-level desktop information tracking used in the social sciences and shows that the existing approaches face sampling issues, validity issues due to the lack of content-level data and their disregard of the variety of devices and long-tail consumption patterns as well as transparency and privacy issues. To overcome some of these pro…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 73 pages

  5. arXiv:2304.05729  [pdf, other]

    cs.LG

    Dynamic Graph Representation Learning with Neural Networks: A Survey

    Authors: Leshanshui Yang, Sébastien Adam, Clément Chatelain

    Abstract: In recent years, Dynamic Graph (DG) representations have been increasingly used for modeling dynamic systems due to their ability to integrate both topological and temporal information in a compact representation. Dynamic graphs make it possible to efficiently handle applications such as social network prediction, recommender systems, traffic forecasting or electroencephalography analysis, that cannot be ad…

    Submitted 12 April, 2023; originally announced April 2023.

  6. arXiv:2303.01590  [pdf, other]

    cs.LG cs.CL

    Technical report: Graph Neural Networks go Grammatical

    Authors: Jason Piquenot, Aldo Moscatelli, Maxime Bérar, Pierre Héroux, Romain Raveaux, Jean-Yves Ramel, Sébastien Adam

    Abstract: This paper introduces a framework for formally establishing a connection between a portion of an algebraic language and a Graph Neural Network (GNN). The framework leverages Context-Free Grammars (CFG) to organize algebraic operations into generative rules that can be translated into a GNN layer model. As CFGs derived directly from a language tend to contain redundancies in their rules and variabl…

    Submitted 4 October, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: 24 pages, 11 figures

  7. arXiv:2207.00489  [pdf]

    cs.CL cs.CY

    Panning for gold: Lessons learned from the platform-agnostic automated detection of political content in textual data

    Authors: Mykola Makhortykh, Ernesto de León, Aleksandra Urman, Clara Christner, Maryna Sydorova, Silke Adam, Michaela Maier, Teresa Gil-Lopez

    Abstract: The growing availability of data about online information behaviour enables new possibilities for political communication research. However, the volume and variety of these data makes them difficult to analyse and prompts the need for developing automated content approaches relying on a broad range of natural language processing techniques (e.g. machine learning- or neural network-based ones). In…

    Submitted 1 July, 2022; originally announced July 2022.

  8. arXiv:2204.02241  [pdf, other]

    cs.LG cs.AI

    A Set Membership Approach to Discovering Feature Relevance and Explaining Neural Classifier Decisions

    Authors: Stavros P. Adam, Aristidis C. Likas

    Abstract: Neural classifiers are non linear systems providing decisions on the classes of patterns, for a given problem they have learned. The output computed by a classifier for each pattern constitutes an approximation of the output of some unknown function, mapping pattern data to their respective classes. The lack of knowledge of such a function along with the complexity of neural classifiers, especiall…

    Submitted 4 June, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Revised description in Section 2 (The Proposed Approach); results unchanged

  9. arXiv:2106.04319  [pdf, other]

    cs.LG cs.AI

    Breaking the Limits of Message Passing Graph Neural Networks

    Authors: Muhammet Balcilar, Pierre Héroux, Benoit Gaüzère, Pascal Vasseur, Sébastien Adam, Paul Honeine

    Abstract: Since the Message Passing (Graph) Neural Networks (MPNNs) have a linear complexity with respect to the number of nodes when applied to sparse graphs, they have been widely implemented and still raise a lot of interest even though their theoretical expressive power is limited to the first order Weisfeiler-Lehman test (1-WL). In this paper, we show that if the graph convolution supports are designed…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: 18 pages, 6 figures

    MSC Class: 68T07

    ACM Class: I.5.0; I.2.0

    Journal ref: The Thirty-eighth International Conference on Machine Learning, ICML2021

  10. arXiv:2012.12578  [pdf]

    cs.DC

    Soap serialization effect on communication nodes and protocols

    Authors: Ali Baba Dauda, Mohammed Sani Adam, Muhammad Ahmad Mustapha, Audu Musa Mabu, Suleiman Mustafa

    Abstract: Although serialization improves the transmission of data through utilization of bandwidth, its impact on communication systems is not fully accounted for. This research used Simple Object Access Protocol (SOAP) Web services to exchange serialized and normal messages via Hypertext Transfer Protocol (HTTP) and Java Message Service (JMS). We implemented two web services as server and client endp…

    Submitted 23 December, 2020; originally announced December 2020.

    Comments: 13 pages

  11. arXiv:2003.11702  [pdf, other]

    cs.LG stat.ML

    Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks

    Authors: Muhammet Balcilar, Guillaume Renton, Pierre Heroux, Benoit Gauzere, Sebastien Adam, Paul Honeine

    Abstract: This paper aims at revisiting Graph Convolutional Neural Networks by bridging the gap between spectral and spatial design of graph convolutions. We theoretically demonstrate some equivalence of the graph convolution process regardless of whether it is designed in the spatial or the spectral domain. The obtained general framework allows a spectral analysis of the most popular ConvGNNs, explaining thei…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: 24 pages, 8 figures, preprint

    MSC Class: 68T05

    ACM Class: I.5.2

  12. arXiv:1709.01867  [pdf, other]

    cs.LG stat.ML

    Neural Networks Regularization Through Class-wise Invariant Representation Learning

    Authors: Soufiane Belharbi, Clément Chatelain, Romain Hérault, Sébastien Adam

    Abstract: Training deep neural networks is known to require a large number of training samples. However, in many applications only few training samples are available. In this work, we tackle the issue of training neural networks for classification task when few training samples are available. We attempt to solve this issue by proposing a new regularization term that constrains the hidden layers of a network…

    Submitted 22 December, 2017; v1 submitted 6 September, 2017; originally announced September 2017.

    Comments: Submitted to ELSEVIER, 13 pages, 5 figures

  13. arXiv:1505.05740  [pdf, other]

    cs.DS cs.CV

    Graph edit distance: a new binary linear programming formulation

    Authors: Julien Lerouge, Zeina Abu-Aisheh, Romain Raveaux, Pierre Héroux, Sébastien Adam

    Abstract: Graph edit distance (GED) is a powerful and flexible graph matching paradigm that can be used to address different tasks in structural pattern recognition, machine learning, and data mining. In this paper, some new binary linear programming formulations for computing the exact GED between two graphs are proposed. A major strength of the formulations lies in their genericity since the GED can be co…

    Submitted 21 May, 2015; originally announced May 2015.

  14. arXiv:1504.07550  [pdf, other]

    cs.LG stat.ML

    Deep Neural Networks Regularization for Structured Output Prediction

    Authors: Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam

    Abstract: A deep neural network model is a powerful framework for learning representations. Usually, it is used to learn the relation $x \to y$ by exploiting the regularities in the input $x$. In structured output prediction problems, $y$ is multi-dimensional and structural relations often exist between the dimensions. The motivation of this work is to learn the output dependencies that may lie in the outpu…

    Submitted 30 October, 2017; v1 submitted 28 April, 2015; originally announced April 2015.

    Comments: Submitted to Neurocomputing, 8 figures

  15. arXiv:1412.4714  [pdf, other]

    cs.RO cs.SE

    Towards Interactive, Incremental Programming of ROS Nodes

    Authors: Sorin Adam, Ulrik Pagh Schultz

    Abstract: Writing software for controlling robots is a complex task, usually demanding command of many programming languages and requiring significant experimentation. We believe that a bottom-up development process that complements traditional component- and MDSD-based approaches can facilitate experimentation. We propose the use of an internal DSL providing both a tool to interactively create ROS nodes an…

    Submitted 15 December, 2014; originally announced December 2014.

    Comments: Presented at DSLRob 2014 (arXiv:cs/1411.7148)

    Report number: DSLRob/2014/05

  16. arXiv:cs/0303004  [pdf, ps, other]

    math.NA cs.MS physics.comp-ph

    Reliability Conditions in Quadrature Algorithms

    Authors: Gh. Adam, S. Adam, N. M. Plakida

    Abstract: The detection of insufficiently resolved or ill-conditioned integrand structures is critical for the reliability assessment of the quadrature rule outputs. We discuss a method of analysis of the profile of the integrand at the quadrature knots which allows inferences approaching the theoretical 100% rate of success, under error estimate sharpening. The proposed procedure is of the highest intere…

    Submitted 6 March, 2003; originally announced March 2003.

    Comments: 23 pages, 8 figures, 1 table, LaTeX2e, elsart.cls macro added, submitted to Computer Physics Communications

    Report number: E17-2002-205 (JINR Dubna preprint, Sept. 2002; preliminary version of this paper)

    ACM Class: G.4; G.1.4; G.1.0; J.2; D.2.4