Showing 1–23 of 23 results for author: de Jong, M

Searching in archive cs.
  1. arXiv:2503.15190  [pdf, ps, other]

    cs.LG

    Learning Topology Actions for Power Grid Control: A Graph-Based Soft-Label Imitation Learning Approach

    Authors: Mohamed Hassouna, Clara Holzhüter, Malte Lehna, Matthijs de Jong, Jan Viebahn, Bernhard Sick, Christoph Scholz

    Abstract: The rising proportion of renewable energy in the electricity mix introduces significant operational challenges for power grid operators. Effective power grid management demands adaptive decision-making strategies capable of handling dynamic conditions. With the increase in complexity, more and more Deep Learning (DL) approaches have been proposed to find suitable grid topologies for congestion man…

    Submitted 18 June, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

    Comments: Accepted at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML) - Applied Data Science Track

  2. arXiv:2501.07186  [pdf, other]

    cs.LG cs.AI stat.ML

    Generalizable Graph Neural Networks for Robust Power Grid Topology Control

    Authors: Matthijs de Jong, Jan Viebahn, Yuliya Shapovalova

    Abstract: The energy transition necessitates new congestion management methods. One such method is controlling the grid topology with machine learning (ML). This approach has gained popularity following the Learning to Run a Power Network (L2RPN) competitions. Graph neural networks (GNNs) are a class of ML models that reflect graph structure in their computation, which makes them suitable for power grid mod…

    Submitted 18 February, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

  3. arXiv:2407.19865  [pdf, other]

    cs.AI cs.LG eess.SY

    Imitation Learning for Intra-Day Power Grid Operation through Topology Actions

    Authors: Matthijs de Jong, Jan Viebahn, Yuliya Shapovalova

    Abstract: Power grid operation is becoming increasingly complex due to the increase in generation of renewable energy. The recent series of Learning To Run a Power Network (L2RPN) competitions have encouraged the use of artificial agents to assist human dispatchers in operating power grids. In this paper we study the performance of imitation learning for day-ahead power grid operation through topology actio…

    Submitted 18 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: To be presented at the Machine Learning for Sustainable Power Systems 2024 workshop and to be published in the corresponding Springer Communications in Computer and Information Science proceedings

  4. arXiv:2407.07896  [pdf, other]

    physics.optics cond-mat.mes-hall cs.LG physics.app-ph physics.space-ph

    Pentagonal Photonic Crystal Mirrors: Scalable Lightsails with Enhanced Acceleration via Neural Topology Optimization

    Authors: L. Norder, S. Yin, M. J. de Jong, F. Stallone, H. Aydogmus, P. M. Sberna, M. A. Bessa, R. A. Norte

    Abstract: The Starshot Breakthrough Initiative aims to send one-gram microchip probes to Alpha Centauri within 20 years, using gram-scale lightsails propelled by laser-based radiation pressure, reaching velocities nearing a fifth of light speed. This mission requires lightsail materials that challenge the fundamentals of nanotechnology, requiring innovations in optics, material science and structural engine…

    Submitted 10 July, 2024; originally announced July 2024.

  5. arXiv:2308.14903  [pdf, other]

    cs.CL

    MEMORY-VQ: Compression for Tractable Internet-Scale Memory

    Authors: Yury Zemlyanskiy, Michiel de Jong, Luke Vilnis, Santiago Ontañón, William W. Cohen, Sumit Sanghai, Joshua Ainslie

    Abstract: Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world. Memory-based methods like LUMEN pre-compute token representations for retrieved passages to drastically speed up inference. However, memory also leads to much greater storage requirements from storing pre-computed representations. We propose MEMORY-VQ, a new method to reduce stor…

    Submitted 28 August, 2023; originally announced August 2023.

  6. arXiv:2306.10231  [pdf, other]

    cs.CL cs.AI cs.LG

    GLIMMER: generalized late-interaction memory reranker

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie

    Abstract: Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text. Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes memory and updates memory representations on the fly with a smaller live encoder. We propose GLIMMER, which improves on this approach th…

    Submitted 16 June, 2023; originally announced June 2023.

  7. arXiv:2305.13245  [pdf, other]

    cs.CL cs.LG

    GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

    Authors: Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai

    Abstract: Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference. We (1) propose a recipe for uptraining existing multi-head language model checkpoints into models with MQA using 5% of original pre-training compute, and…

    Submitted 23 December, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023. Added to related work

  8. arXiv:2303.09752  [pdf, other]

    cs.CL cs.LG

    CoLT5: Faster Long-Range Transformers with Conditional Computation

    Authors: Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

    Abstract: Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. However, not all tokens are equally important, especially for longer documents. We propose CoLT5, a long-input Transformer model that builds on this in…

    Submitted 23 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at EMNLP 2023

  9. arXiv:2301.10448  [pdf, other]

    cs.CL cs.AI cs.LG

    Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Joshua Ainslie, Sumit Sanghai, Fei Sha, William Cohen

    Abstract: Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks. However, they are also expensive, due to the need to encode a large number of retrieved passages. Some work avoids this cost by pre-encoding a text corpus into a memory and retrieving dense representations directly. However, pre-encoding memory incurs…

    Submitted 2 June, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: ICML 2023

  10. arXiv:2212.08153  [pdf, other]

    cs.CL cs.AI cs.LG

    FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference

    Authors: Michiel de Jong, Yury Zemlyanskiy, Joshua Ainslie, Nicholas FitzGerald, Sumit Sanghai, Fei Sha, William Cohen

    Abstract: Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state-of-the-art on many knowledge-intensive NLP tasks. However, the architecture used for FiD was chosen by making minimal modifications to a standard T5 model, which our analysis shows to be highly suboptimal for a retrieval-augmented model. In particular, FiD allocates the bulk of FLOPs to the encoder, while…

    Submitted 2 June, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL Findings 2023

  11. arXiv:2209.14899  [pdf, other]

    cs.CL

    Generate-and-Retrieve: use your predictions to improve retrieval for semantic parsing

    Authors: Yury Zemlyanskiy, Michiel de Jong, Joshua Ainslie, Panupong Pasupat, Peter Shaw, Linlu Qiu, Sumit Sanghai, Fei Sha

    Abstract: A common recent approach to semantic parsing augments sequence-to-sequence models by retrieving and appending a set of training samples, called exemplars. The effectiveness of this recipe is limited by the ability to retrieve informative exemplars that help produce the correct parse, which is especially challenging in low-resource settings. Existing retrieval is commonly based on similarity of que…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: To appear in the proceedings of COLING 2022

  12. arXiv:2207.00630  [pdf, other]

    cs.AI

    QA Is the New KR: Question-Answer Pairs as Knowledge Bases

    Authors: Wenhu Chen, William W. Cohen, Michiel De Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting

    Abstract: In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking. We argue that the proposed type of KB has many of the key advantages of a traditional symbolic KB: in particular, it consists of small modular components, which can be combined compositionally to answer complex queries, including relational queri…

    Submitted 1 July, 2022; originally announced July 2022.

  13. arXiv:2204.04581  [pdf, other]

    cs.CL cs.AI cs.LG

    Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering

    Authors: Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen

    Abstract: Retrieval augmented language models have recently become the standard for knowledge intensive tasks. Rather than relying purely on latent semantics within the parameters of large neural models, these methods enlist a semi-parametric memory to encode an index of knowledge for the model to retrieve over. Most prior work has employed text passages as the unit of knowledge, which has high coverage at…

    Submitted 23 January, 2023; v1 submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted by EACL 2023

  14. arXiv:2110.06176  [pdf, other]

    cs.CL cs.AI cs.LG

    Mention Memory: incorporating textual knowledge into Transformers through entity mention attention

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Fei Sha, William Cohen

    Abstract: Natural language understanding tasks such as open-domain question answering often require retrieving and assimilating factual information from multiple sources. We propose to address this problem by integrating a semi-parametric representation of a large text corpus into a Transformer model as a source of factual knowledge. Specifically, our method represents knowledge with `mention memory', a tab…

    Submitted 19 April, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

  15. arXiv:2108.04809  [pdf, other]

    cond-mat.mes-hall cs.LG physics.app-ph

    Spiderweb nanomechanical resonators via Bayesian optimization: inspired by nature and guided by machine learning

    Authors: Dongil Shin, Andrea Cupertino, Matthijs H. J. de Jong, Peter G. Steeneken, Miguel A. Bessa, Richard A. Norte

    Abstract: From ultra-sensitive detectors of fundamental forces to quantum networks and sensors, mechanical resonators are enabling next-generation technologies to operate in room temperature environments. Currently, silicon nitride nanoresonators stand as a leading microchip platform in these advances by allowing for mechanical resonators whose motion is remarkably isolated from ambient thermal noise. Howev…

    Submitted 13 December, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

    Journal ref: Shin, D., Cupertino, A., de Jong, M. H. J., Steeneken, P. G., Bessa, M. A., Norte, R. A., Spiderweb Nanomechanical Resonators via Bayesian Optimization: Inspired by Nature and Guided by Machine Learning. Adv. Mater. 2021, 2106248

  16. arXiv:2107.01979  [pdf, other]

    cs.LG cs.CR stat.AP

    Machine Learning for Fraud Detection in E-Commerce: A Research Agenda

    Authors: Niek Tax, Kees Jan de Vries, Mathijs de Jong, Nikoleta Dosoula, Bram van den Akker, Jon Smith, Olivier Thuong, Lucas Bernardi

    Abstract: Fraud detection and prevention play an important part in ensuring the sustained operation of any e-commerce business. Machine learning (ML) often plays an important role in these anti-fraud operations, but the organizational context in which these ML models operate cannot be ignored. In this paper, we take an organization-centric view on the topic of fraud detection by formulating an operational m…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: Accepted and to appear in the proceedings of the KDD 2021 co-located workshop: the 2nd International Workshop on Deployable Machine Learning for Security Defense (MLHat)

  17. arXiv:2106.01607  [pdf, other]

    cs.LG cs.CL cs.CV

    Grounding Complex Navigational Instructions Using Scene Graphs

    Authors: Michiel de Jong, Satyapriya Krishna, Anuva Agarwal

    Abstract: Training a reinforcement learning agent to carry out natural language instructions is limited by the available supervision, i.e. knowing when the instruction has been carried out. We adapt the CLEVR visual question answering dataset to generate complex natural language navigation instructions and accompanying scene graphs, yielding an environment-agnostic supervised dataset. To demonstrate the use…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: arXiv admin note: text overlap with arXiv:1706.07230 by other authors

  18. arXiv:2105.04241  [pdf, other]

    cs.CL cs.LG

    ReadTwice: Reading Very Large Documents with Memories

    Authors: Yury Zemlyanskiy, Joshua Ainslie, Michiel de Jong, Philip Pham, Ilya Eckstein, Fei Sha

    Abstract: Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs such as books or article collections. We propose ReadTwice, a simple and effective technique that combines several strengths of prior approaches to model long-range dependencies with Transformers. The main idea is to read text in small segments, in parallel, summarizi…

    Submitted 11 May, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: To appear in the proceedings of NAACL 2021

  19. arXiv:1907.02597  [pdf, ps, other]

    cs.MS cs.PL

    Multi-dimensional interpolations in C++

    Authors: Maarten de Jong

    Abstract: A C++ software design is presented that can be used to interpolate data in any number of dimensions. The design is based on a combination of templates of functional collections of elements and so-called type lists. The design allows for different search methodologies and interpolation techniques in each dimension. It is also possible to expand and reduce the number of dimensions, to interpolate co…

    Submitted 3 July, 2019; originally announced July 2019.

  20. arXiv:1906.06805  [pdf, other]

    cs.LG stat.ML

    Neural Theorem Provers Do Not Learn Rules Without Exploration

    Authors: Michiel de Jong, Fei Sha

    Abstract: Neural symbolic processing aims to combine the generalization of logical learning approaches and the performance of neural networks. The Neural Theorem Proving (NTP) model by Rocktaschel et al (2017) learns embeddings for concepts and performs logical unification. While NTP is promising and effective in predicting facts accurately, we have little knowledge how well it can extract true relationship…

    Submitted 16 June, 2019; originally announced June 2019.

  21. arXiv:1812.02253  [pdf, other]

    cs.CL

    Weighted Global Normalization for Multiple Choice Reading Comprehension over Long Documents

    Authors: Aditi Chaudhary, Bhargavi Paranjape, Michiel de Jong

    Abstract: Motivated by recent evidence pointing out the fragility of high-performing span prediction models, we direct our attention to multiple choice reading comprehension. In particular, this work introduces a novel method for improving answer selection on long documents through weighted global normalization of predictions over portions of the documents. We show that applying our method to a span predict…

    Submitted 25 November, 2021; v1 submitted 5 December, 2018; originally announced December 2018.

  22. The Governance of Risks in Ridesharing: A Revelatory Case from Singapore

    Authors: Yanwei Li, Araz Taeihagh, Martin de Jong

    Abstract: Recently we have witnessed the worldwide adoption of many different types of innovative technologies, such as crowdsourcing, ridesharing, open and big data, aiming at delivering public services more efficiently and effectively. Among them, ridesharing has received substantial attention from decision-makers around the world. Because of the multitude of currently understood or potentially unknown ri…

    Submitted 21 May, 2018; originally announced May 2018.

    Journal ref: Energies 11, no. 5: 1277 (2018)

  23. arXiv:1708.06293  [pdf, ps, other]

    cs.OH

    Neville's algorithm revisited

    Authors: M. de Jong

    Abstract: Neville's algorithm is known to provide an efficient and numerically stable solution for polynomial interpolations. In this paper, an extension of this algorithm is presented which includes the derivatives of the interpolating polynomial.

    Submitted 16 August, 2017; originally announced August 2017.

    Comments: 3 pages