Showing 1–8 of 8 results for author: Farrugia-Roberts, M

  1. arXiv:2507.03068  [pdf, ps, other]

    cs.LG

    Mitigating Goal Misgeneralization with Minimax Regret

    Authors: Karim Abdel Sadek, Matthew Farrugia-Roberts, Usman Anwar, Hannah Erlebach, Christian Schroeder de Witt, David Krueger, Michael Dennis

    Abstract: Safe generalization in reinforcement learning requires not only that a learned policy acts capably in new situations, but also that it uses its capabilities towards the pursuit of the designer's intended goal. The latter requirement may fail when a proxy goal incentivizes similar behavior to the intended goal within the training environment, but not in novel deployment environments. This creates t…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: Published at RLC 2025, 11 pages

  2. arXiv:2502.05475  [pdf, other]

    cs.LG

    You Are What You Eat -- AI Alignment Requires Understanding How Data Shapes Structure and Generalisation

    Authors: Simon Pepin Lehalleur, Jesse Hoogland, Matthew Farrugia-Roberts, Susan Wei, Alexander Gietelink Oldenziel, George Wang, Liam Carroll, Daniel Murfet

    Abstract: In this position paper, we argue that understanding the relation between structure in the data distribution and structure in trained models is central to AI alignment. First, we discuss how two neural networks can have equivalent performance on the training set but compute their outputs in essentially different ways and thus generalise differently. For this reason, standard testing and evaluation…

    Submitted 8 February, 2025; originally announced February 2025.

  3. arXiv:2501.17745  [pdf, other]

    cs.LG

    Dynamics of Transient Structure in In-Context Linear Regression Transformers

    Authors: Liam Carroll, Jesse Hoogland, Matthew Farrugia-Roberts, Daniel Murfet

    Abstract: Modern deep neural networks display striking examples of rich internal computational structure. Uncovering principles governing the development of such structure is a priority for the science of deep learning. In this paper, we explore the transient ridge phenomenon: when transformers are trained on in-context linear regression tasks with intermediate task diversity, they initially behave like rid…

    Submitted 31 January, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

    Comments: 37 pages, 27 figures

  4. arXiv:2402.02364  [pdf, other]

    cs.LG cs.AI cs.CL

    Loss Landscape Degeneracy Drives Stagewise Development in Transformers

    Authors: Jesse Hoogland, George Wang, Matthew Farrugia-Roberts, Liam Carroll, Susan Wei, Daniel Murfet

    Abstract: Deep learning involves navigating a high-dimensional loss landscape over the neural network parameter space. Over the course of training, complex computational structures form and re-form inside the neural network, leading to shifts in input/output behavior. It is a priority for the science of deep learning to uncover principles governing the development of neural network structure and behavior. D…

    Submitted 13 February, 2025; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Material on essential dynamics from v1 of this preprint has been removed from v2 and developed in arXiv:2501.17745

  5. arXiv:2306.02834  [pdf, other]

    cs.LG cs.CC

    Proximity to Losslessly Compressible Parameters

    Authors: Matthew Farrugia-Roberts

    Abstract: To better understand complexity in neural networks, we theoretically investigate the idealised phenomenon of lossless network compressibility, whereby an identical function can be implemented with fewer hidden units. In the setting of single-hidden-layer hyperbolic tangent networks, we define the rank of a parameter as the minimum number of hidden units required to implement the same function. We…

    Submitted 23 May, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: 9 pages paper, 33 pages total, 10 figures, 4 tables

  6. arXiv:2305.05089  [pdf, other]

    cs.NE cs.LG

    Functional Equivalence and Path Connectivity of Reducible Hyperbolic Tangent Networks

    Authors: Matthew Farrugia-Roberts

    Abstract: Understanding the learning process of artificial neural networks requires clarifying the structure of the parameter space within which learning takes place. A neural network parameter's functional equivalence class is the set of parameters implementing the same input-output function. For many architectures, almost all parameters have a simple and well-documented functional equivalence class. Howe…

    Submitted 7 June, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: 15 pages, 3 figures

  7. Teaching Simple Constructive Proofs with Haskell Programs

    Authors: Matthew Farrugia-Roberts, Bryn Jeffries, Harald Søndergaard

    Abstract: In recent years we have explored using Haskell alongside a traditional mathematical formalism in our large-enrolment university course on topics including logic and formal languages, aiming to offer our students a programming perspective on these mathematical topics. We have found it possible to offer almost all formative and summative assessment through an interactive learning platform, using Has…

    Submitted 26 July, 2022; originally announced August 2022.

    Comments: In Proceedings TFPIE 2021/22, arXiv:2207.11600

    ACM Class: D.1.1; F.1.1; K.3.2

    Journal ref: EPTCS 363, 2022, pp. 54-73

  8. arXiv:2203.07475  [pdf, other]

    cs.LG cs.AI stat.ML

    Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

    Authors: Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave

    Abstract: It is often very challenging to manually design reward functions for complex, real-world tasks. To solve this, one can instead use reward learning to infer a reward function from data. However, there are often multiple reward functions that fit the data equally well, even in the infinite-data limit. This means that the reward function is only partially identifiable. In this work, we formally chara…

    Submitted 7 June, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: ICML 2023. 9 pages main paper, 26 pages total, 3 figures

    ACM Class: I.2.6