Showing 1–10 of 10 results for author: Kazakov, D

Searching in archive cs.
  1. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  2. arXiv:2404.18134  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Learning Fairer Representations with FairVIC

    Authors: Charmaine Barker, Daniel Bethell, Dimitar Kazakov

    Abstract: Mitigating bias in automated decision-making systems, particularly in deep learning models, is a critical challenge due to nuanced definitions of fairness, dataset-specific biases, and the inherent trade-off between fairness and accuracy. To address these issues, we introduce FairVIC, an innovative approach that enhances fairness in neural networks by integrating variance, invariance, and covarian…

    Submitted 3 February, 2025; v1 submitted 28 April, 2024; originally announced April 2024.

  3. arXiv:2305.06166  [pdf, other]

    cs.CL

    ChatGPT as a Text Simplification Tool to Remove Bias

    Authors: Charmaine Barker, Dimitar Kazakov

    Abstract: The presence of specific linguistic signals particular to a certain sub-group of people can be picked up by language models during training. If the model begins to associate specific language with a distinct group, any decisions made based upon this language would hold a strong correlation to a decision based upon their protected characteristic, leading to possible discrimination. We explore a pot…

    Submitted 1 June, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  4. arXiv:2102.09337  [pdf, other]

    cs.LG cs.AI cs.NI

    Reinforcement Learning for Datacenter Congestion Control

    Authors: Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, Shie Mannor

    Abstract: We approach the task of network congestion control in datacenters using Reinforcement Learning (RL). Successful congestion control algorithms can dramatically improve latency and overall network throughput. Until today, no such learning-based algorithms have shown practical potential in this domain. Evidently, the most popular recent deployments rely on rule-based heuristics that are tested on a p…

    Submitted 29 June, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: Presented at IAAI 2022

  5. arXiv:2005.05613  [pdf, other]

    cs.NE

    Unified Framework for the Adaptive Operator Selection of Discrete Parameters

    Authors: Mudita Sharma, Manuel Lopez-Ibanez, Dimitar Kazakov

    Abstract: We conduct an exhaustive survey of adaptive selection of operators (AOS) in Evolutionary Algorithms (EAs). We simplified the AOS structure by adding more components to the framework to build upon the existing categorisation of AOS methods. In addition to simplifying, we looked at the commonality among AOS methods from literature to generalise them. Each component is presented with a number of alte…

    Submitted 12 May, 2020; originally announced May 2020.

  6. arXiv:1905.11382  [pdf, other]

    cs.LG cs.AI stat.ML

    State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations

    Authors: Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Denis Kazakov, Yoshua Bengio, Michael C. Mozer

    Abstract: Machine learning promises methods that generalize well from finite labeled data. However, the brittleness of existing neural net approaches is revealed by notable failures, such as the existence of adversarial examples that are misclassified despite being nearly identical to a training example, or the inability of recurrent sequence-processing nets to stay on track without teacher forcing. We intr…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: ICML 2019 [full oral]. arXiv admin note: text overlap with arXiv:1805.08394

  7. arXiv:1905.08006  [pdf, other]

    cs.NE

    Deep Reinforcement Learning Based Parameter Control in Differential Evolution

    Authors: Mudita Sharma, Alexandros Komninos, Manuel Lopez Ibanez, Dimitar Kazakov

    Abstract: Adaptive Operator Selection (AOS) is an approach that controls discrete parameters of an Evolutionary Algorithm (EA) during the run. In this paper, we propose an AOS method based on Double Deep Q-Learning (DDQN), a Deep Reinforcement Learning method, to control the mutation strategies of Differential Evolution (DE). The application of DDQN to DE requires two phases. First, a neural network is trai…

    Submitted 20 May, 2019; originally announced May 2019.

  8. arXiv:1805.08394  [pdf, other]

    cs.NE

    State-Denoised Recurrent Neural Networks

    Authors: Michael C. Mozer, Denis Kazakov, Robert V. Lindsey

    Abstract: Recurrent neural networks (RNNs) are difficult to train on sequence processing tasks, not only because input noise may be amplified through feedback, but also because any inaccuracy in the weights has similar consequences as input noise. We describe a method for denoising the hidden state during training to achieve more robust representations thereby improving generalization performance. Attractor…

    Submitted 28 May, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

  9. arXiv:1710.04110  [pdf, other]

    cs.NE cs.LG

    Discrete Event, Continuous Time RNNs

    Authors: Michael C. Mozer, Denis Kazakov, Robert V. Lindsey

    Abstract: We investigate recurrent neural network architectures for event-sequence processing. Event sequences, characterized by discrete observations stamped with continuous-valued times of occurrence, are challenging due to the potentially wide dynamic range of relevant time scales as well as interactions between time scales. We describe four forms of inductive bias that should benefit architectures for e…

    Submitted 11 October, 2017; originally announced October 2017.

    Comments: 21 pages

    ACM Class: I.2.6

  10. arXiv:1210.5118  [pdf, other]

    cs.DS cs.AI

    Creating a level playing field for all symbols in a discretization

    Authors: Matthew Butler, Dimitar Kazakov

    Abstract: In time series analysis research there is a strong interest in discrete representations of real valued data streams. One approach that emerged over a decade ago and is still considered state-of-the-art is the Symbolic Aggregate Approximation algorithm. This discretization algorithm was the first symbolic approach that mapped a real-valued time series to a symbolic representation that was guarantee…

    Submitted 18 October, 2012; originally announced October 2012.