Showing 1–14 of 14 results for author: Hartley, M

Searching in archive cs.
  1. arXiv:2412.16187

    cs.LG cs.AI cs.CL cs.DS cs.PF

    HashEvict: A Pre-Attention KV Cache Eviction Strategy using Locality-Sensitive Hashing

    Authors: Minghui Liu, Tahseen Rabbani, Tony O'Halloran, Ananth Sankaralingam, Mary-Anne Hartley, Furong Huang, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Transformer-based large language models (LLMs) use the key-value (KV) cache to significantly accelerate inference by storing the key and value embeddings of past tokens. However, this cache consumes significant GPU memory. In this work, we introduce HashEvict, an algorithm that uses locality-sensitive hashing (LSH) to compress the KV cache. HashEvict quickly locates tokens in the cache that are co…

    Submitted 4 June, 2025; v1 submitted 13 December, 2024; originally announced December 2024.

    Comments: 10 pages, 6 figures, 2 tables

  2. arXiv:2410.13203

    cs.LG cs.AI

    TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering

    Authors: Al Zadid Sultan Bin Habib, Kesheng Wang, Mary-Anne Hartley, Gianfranco Doretto, Donald A. Adjeroh

    Abstract: Effective analysis of tabular data still poses a significant problem in deep learning, mainly because features in tabular datasets are often heterogeneous and have different levels of relevance. This work introduces TabSeq, a novel framework for the sequential ordering of features, addressing the vital necessity to optimize the learning process. Features are not always equally informative, and for…

    Submitted 21 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: This paper has been accepted for presentation at the 27th International Conference on Pattern Recognition (ICPR 2024) in Kolkata, India

  3. arXiv:2409.13720

    eess.IV cs.CV

    Efficient Classification of Histopathology Images

    Authors: Mohammad Iqbal Nouyed, Mary-Anne Hartley, Gianfranco Doretto, Donald A. Adjeroh

    Abstract: This work addresses how to efficiently classify challenging histopathology images, such as gigapixel whole-slide images for cancer diagnostics with image-level annotation. We use images with annotated tumor regions to identify a set of tumor patches and a set of benign patches in a cancerous slide. Due to the variable nature of region of interest the tumor positive regions may refer to an extreme…

    Submitted 8 September, 2024; originally announced September 2024.

    Comments: 12 pages, 2 figures, Accepted paper for the 27th International Conference on Pattern Recognition (ICPR) 2024

  4. arXiv:2405.14020

    cs.LG cs.AI

    Unlearning Information Bottleneck: Machine Unlearning of Systematic Patterns and Biases

    Authors: Ling Han, Hao Huang, Dustin Scheinost, Mary-Anne Hartley, María Rodríguez Martínez

    Abstract: Effective adaptation to distribution shifts in training data is pivotal for sustaining robustness in neural networks, especially when removing specific biases or outdated information, a process known as machine unlearning. Traditional approaches typically assume that data variations are random, which makes it difficult to adjust the model parameters accurately to remove patterns and characteristic…

    Submitted 22 May, 2024; originally announced May 2024.

  5. arXiv:2403.08124

    cs.LG cs.AI cs.CR

    Towards Independence Criterion in Machine Unlearning of Features and Labels

    Authors: Ling Han, Nanqing Luo, Hao Huang, Jing Chen, Mary-Anne Hartley

    Abstract: This work delves into the complexities of machine unlearning in the face of distributional shifts, particularly focusing on the challenges posed by non-uniform feature and label removal. With the advent of regulations like the GDPR emphasizing data privacy and the right to be forgotten, machine learning models face the daunting task of unlearning sensitive information without compromising their in…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 10 pages, 1 figure

    ACM Class: I.2.6

  6. arXiv:2402.06318

    cs.LG

    TimEHR: Image-based Time Series Generation for Electronic Health Records

    Authors: Hojjat Karami, Mary-Anne Hartley, David Atienza, Anisoara Ionescu

    Abstract: Time series in Electronic Health Records (EHRs) present unique challenges for generative models, such as irregular sampling, missing values, and high dimensionality. In this paper, we propose a novel generative adversarial network (GAN) model, TimEHR, to generate time series data from EHRs. In particular, TimEHR treats time series as images and is based on two conditional GANs. The first GAN gener…

    Submitted 9 February, 2024; originally announced February 2024.

  7. arXiv:2311.16079

    cs.CL cs.AI cs.LG

    MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

    Authors: Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad, Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol, Mary-Anne Hartley, Martin Jaggi, Antoine Bosselut

    Abstract: Large language models (LLMs) can potentially democratize access to medical knowledge. While many efforts have been made to harness and improve LLMs' medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities. In this work, we improve access to large-scale medical LLMs by rele…

    Submitted 27 November, 2023; originally announced November 2023.

  8. arXiv:2309.14118

    cs.LG

    MultiModN- Multimodal, Multi-Task, Interpretable Modular Networks

    Authors: Vinitra Swamy, Malika Satayeva, Jibril Frej, Thierry Bossy, Thijs Vogels, Martin Jaggi, Tanja Käser, Mary-Anne Hartley

    Abstract: Predicting multiple real-world tasks in a single model often requires a particularly diverse feature space. Multimodal (MM) models aim to extract the synergistic predictive potential of multiple data types to create a shared feature space with aligned semantic meaning across inputs of drastically varying sizes (i.e. images, text, sound). Most current MM architectures fuse these representations in…

    Submitted 6 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted as a full paper at NeurIPS 2023 in New Orleans, USA

  9. arXiv:2211.06637

    cs.LG

    Modular Clinical Decision Support Networks (MoDN) -- Updatable, Interpretable, and Portable Predictions for Evolving Clinical Environments

    Authors: Cécile Trottet, Thijs Vogels, Martin Jaggi, Mary-Anne Hartley

    Abstract: Data-driven Clinical Decision Support Systems (CDSS) have the potential to improve and standardise care with personalised probabilistic guidance. However, the size of data required necessitates collaborative learning from analogous CDSS's, which are often unsharable or imperfectly interoperable (IIO), meaning their feature sets are not perfectly overlapping. We propose Modular Clinical Decision Su…

    Submitted 12 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 9 pages

  10. arXiv:2110.12946

    cs.LG cs.IR stat.ML

    Optimal Model Averaging: Towards Personalized Collaborative Learning

    Authors: Felix Grimberg, Mary-Anne Hartley, Sai P. Karimireddy, Martin Jaggi

    Abstract: In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node. One such approach is weighted averaging between a locally trained model and the global model. In this theoretical work, we study weighted model averaging for arbitrary scalar mean estimation problems under minimal assumptions…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: 9 pages (12 pages incl. references and appendix), 1 figure, Best Paper at International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2021 (FL-ICML'21) (https://web.archive.org/web/20210908135923/http://federated-learning.org/fl-icml-2021/ICML%202021%20Best%20Paper.pdf)

  11. arXiv:2110.06978

    cs.LG

    WAFFLE: Weighted Averaging for Personalized Federated Learning

    Authors: Martin Beaussart, Felix Grimberg, Mary-Anne Hartley, Martin Jaggi

    Abstract: In federated learning, model personalization can be a very effective strategy to deal with heterogeneous training data across clients. We introduce WAFFLE (Weighted Averaging For Federated LEarning), a personalized collaborative machine learning algorithm that leverages stochastic control variates for faster convergence. WAFFLE uses the Euclidean distance between clients' updates to weigh their in…

    Submitted 13 December, 2021; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021 Workshop on New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership

  12. arXiv:2107.06580

    cs.LG cs.DC

    IFedAvg: Interpretable Data-Interoperability for Federated Learning

    Authors: David Roschewitz, Mary-Anne Hartley, Luca Corinzia, Martin Jaggi

    Abstract: Recently, the ever-growing demand for privacy-oriented machine learning has motivated researchers to develop federated and decentralized learning techniques, allowing individual clients to train models collaboratively without disclosing their private datasets. However, widespread adoption has been limited in domains relying on high levels of user trust, where assessment of data compatibility is es…

    Submitted 14 July, 2021; originally announced July 2021.

  13. arXiv:1708.02654

    cs.AI cs.GL cs.LO

    Cheryl's Birthday

    Authors: Hans van Ditmarsch, Michael Ian Hartley, Barteld Kooi, Jonathan Welton, Joseph B. W. Yeo

    Abstract: We present four logic puzzles and after that their solutions. Joseph Yeo designed 'Cheryl's Birthday'. Mike Hartley came up with a novel solution for 'One Hundred Prisoners and a Light Bulb'. Jonathan Welton designed 'A Blind Guess' and 'Abby's Birthday'. Hans van Ditmarsch and Barteld Kooi authored the puzzlebook 'One Hundred Prisoners and a Light Bulb' that contains other knowledge puzzles, and…

    Submitted 27 July, 2017; originally announced August 2017.

    Comments: In Proceedings TARK 2017, arXiv:1707.08250

    Journal ref: EPTCS 251, 2017, pp. 1-9

  14. arXiv:1511.04285

    cs.RO

    Kilombo: a Kilobot simulator to enable effective research in swarm robotics

    Authors: Fredrik Jansson, Matthew Hartley, Martin Hinsch, Ivica Slavkov, Noemí Carranza, Tjelvar S. G. Olsson, Roland M. Dries, Johanna H. Grönqvist, Athanasius F. M. Marée, James Sharpe, Jaap A. Kaandorp, Verônica A. Grieneisen

    Abstract: The Kilobot is a widely used platform for investigation of swarm robotics. Physical Kilobots are slow moving and require frequent recalibration and charging, which significantly slows down the development cycle. Simulators can speed up the process of testing, exploring and hypothesis generation, but usually require time consuming and error-prone translation of code between simulator and robot. Mor…

    Submitted 9 May, 2016; v1 submitted 13 November, 2015; originally announced November 2015.