Showing 1–14 of 14 results for author: Lan, N

  1. arXiv:2505.13398

    cs.LG cs.CL

    A Minimum Description Length Approach to Regularization in Neural Networks

    Authors: Matan Abudy, Orr Well, Emmanuel Chemla, Roni Katzir, Nur Lan

    Abstract: State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at approximations instead. We show that the choice of regularization method plays a crucial role: when trained on formal languages with standard regularization ($L_1$, $L_2$, or none), expressive architec…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 9 pages

  2. Visual Environment-Interactive Planning for Embodied Complex-Question Answering

    Authors: Ning Lan, Baoshan Ou, Xuemei Xie, Guangming Shi

    Abstract: This study focuses on the Embodied Complex-Question Answering task, in which an embodied robot needs to understand human questions with intricate structures and abstract semantics. The core of this task lies in making appropriate plans based on the perception of the visual environment. Existing methods often generate plans in a once-for-all manner, i.e., one-step planning. Such an approach relies on lar…

    Submitted 1 April, 2025; originally announced April 2025.

  3. arXiv:2502.07687

    cs.CL

    Large Language Models as Proxies for Theories of Human Linguistic Cognition

    Authors: Imry Ziv, Nur Lan, Emmanuel Chemla, Roni Katzir

    Abstract: We consider the possible role of current large language models (LLMs) in the study of human linguistic cognition. We focus on the use of such models as proxies for theories of cognition that are relatively linguistically-neutral in their representations and learning but differ from current LLMs in key ways. We illustrate this potential use of LLMs as proxies for theories of cognition in the contex…

    Submitted 11 February, 2025; originally announced February 2025.

  4. arXiv:2406.12620

    cs.CL

    What Makes Two Language Models Think Alike?

    Authors: Jeanne Salle, Louis Jalouzot, Nur Lan, Emmanuel Chemla, Yair Lakretz

    Abstract: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. The approach provides a feature-based comparison of how any two layers of any two models represent linguistic information. We apply the method to BERT, GPT-2 and Mamba. Unlike previous…

    Submitted 24 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  5. The Invariant Rauch-Tung-Striebel Smoother

    Authors: Niels van der Laan, Mitchell Cohen, Jonathan Arsenault, James Richard Forbes

    Abstract: This paper presents an invariant Rauch-Tung-Striebel (IRTS) smoother applicable to systems with states that are an element of a matrix Lie group. In particular, the extended Rauch-Tung-Striebel (RTS) smoother is adapted to work within a matrix Lie group framework. The main advantage of the invariant RTS (IRTS) smoother is that the linearization of the process and measurement models is independent…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures, published in Robotics and Automation Letters

    Journal ref: IEEE Robotics and Automation Letters, vol. 5, no. 4, pp 5067-5074, June 2020

  6. arXiv:2402.11608

    cs.CL

    Metric-Learning Encoding Models Identify Processing Profiles of Linguistic Features in BERT's Representations

    Authors: Louis Jalouzot, Robin Sobczyk, Bastien Lhopitallier, Jeanne Salle, Nur Lan, Emmanuel Chemla, Yair Lakretz

    Abstract: We introduce Metric-Learning Encoding Models (MLEMs) as a new approach to understand how neural systems represent the theoretical features of the objects they process. As a proof-of-concept, we apply MLEMs to neural representations extracted from BERT, and track a wide variety of linguistic features (e.g., tense, subject person, clause type, clause embedding). We find that: (1) linguistic features…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 17 pages, 13 figures

  7. arXiv:2402.10013

    cs.CL cs.FL

    Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length

    Authors: Nur Lan, Emmanuel Chemla, Roni Katzir

    Abstract: Neural networks offer good approximation to many tasks but consistently fail to reach perfect generalization, even when theoretical work shows that such perfect solutions can be expressed by certain architectures. Using the task of formal language learning, we focus on one simple formal language and show that the theoretically correct solution is in fact not an optimum of commonly used objectives…

    Submitted 6 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures, 3 appendix pages

  8. arXiv:2311.06518

    cs.LG cs.CL

    Minimum Description Length Hopfield Networks

    Authors: Matan Abudy, Nur Lan, Emmanuel Chemla, Roni Katzir

    Abstract: Associative memory architectures are designed for memorization but also offer, through their retrieval method, a form of generalization to unseen inputs: stored memories can be seen as prototypes from this point of view. Focusing on Modern Hopfield Networks (MHN), we show that a large memorization capacity undermines the generalization opportunity. We offer a solution to better optimize this trade…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: 4 pages, Associative Memory & Hopfield Networks Workshop at NeurIPS2023

  9. arXiv:2308.08253

    cs.CL

    Benchmarking Neural Network Generalization for Grammar Induction

    Authors: Nur Lan, Emmanuel Chemla, Roni Katzir

    Abstract: How well do neural networks generalize? Even for grammar induction tasks, where the target generalization is fully known, previous works have left the question open, testing very limited ranges beyond the training set and using different success criteria. We provide a measure of neural network generalization based on fully specified formal languages. Given a model and a formal grammar, the method…

    Submitted 25 August, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 10 pages, 4 figures, 2 tables. Conference: Learning with Small Data 2023

  10. arXiv:2203.00129

    eess.IV cs.CV

    BlazeNeo: Blazing fast polyp segmentation and neoplasm detection

    Authors: Nguyen Sy An, Phan Ngoc Lan, Dao Viet Hang, Dao Van Long, Tran Quang Trung, Nguyen Thi Thuy, Dinh Viet Sang

    Abstract: In recent years, computer-aided automatic polyp segmentation and neoplasm detection have been an emerging topic in medical image analysis, providing valuable support to colonoscopy procedures. Attention has been paid to improving the accuracy of polyp detection and segmentation. However, not much focus has been given to latency and throughput for performing these tasks on dedicated devices, whic…

    Submitted 28 February, 2022; originally announced March 2022.

  11. arXiv:2111.00600

    cs.CL

    Minimum Description Length Recurrent Neural Networks

    Authors: Nur Lan, Michal Geyer, Emmanuel Chemla, Roni Katzir

    Abstract: We train neural networks to optimize a Minimum Description Length score, i.e., to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as $a^nb^n$, $a^nb^nc^n$, $a^nb^{2n}$, $a^nb^mc^{n+m}$, and they perfor…

    Submitted 31 March, 2022; v1 submitted 31 October, 2021; originally announced November 2021.

    Comments: 15 pages

  12. arXiv:2107.05023

    eess.IV cs.CV

    NeoUNet: Towards accurate colon polyp segmentation and neoplasm detection

    Authors: Phan Ngoc Lan, Nguyen Sy An, Dao Viet Hang, Dao Van Long, Tran Quang Trung, Nguyen Thi Thuy, Dinh Viet Sang

    Abstract: Automatic polyp segmentation has proven to be immensely helpful for endoscopy procedures, reducing the missing rate of adenoma detection for endoscopists while increasing efficiency. However, classifying a polyp as being neoplasm or not and segmenting it at the pixel level is still a challenging task for doctors to perform in a limited time. In this work, we propose a fine-grained formulation for…

    Submitted 11 July, 2021; originally announced July 2021.

  13. arXiv:2105.00402

    eess.IV cs.CV

    AG-CUResNeSt: A Novel Method for Colon Polyp Segmentation

    Authors: Dinh Viet Sang, Tran Quang Chung, Phan Ngoc Lan, Dao Viet Hang, Dao Van Long, Nguyen Thi Thuy

    Abstract: Colorectal cancer is among the most common malignancies and can develop from high-risk colon polyps. Colonoscopy is an effective screening tool to detect and remove polyps, especially in the case of precancerous lesions. However, the missing rate in clinical practice is relatively high due to many factors. The procedure could benefit greatly from using AI models for automatic polyp segmentation, w…

    Submitted 1 March, 2022; v1 submitted 2 May, 2021; originally announced May 2021.

  14. arXiv:2005.00110

    cs.CL cs.AI cs.MA

    On the Spontaneous Emergence of Discrete and Compositional Signals

    Authors: Nur Geffen Lan, Emmanuel Chemla, Shane Steinert-Threlkeld

    Abstract: We propose a general framework to study language emergence through signaling games with neural agents. Using a continuous latent space, we are able to (i) train using backpropagation, (ii) show that discrete messages nonetheless naturally emerge. We explore whether categorical perception effects follow and show that the messages are not compositional.

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: ACL 2020