-
Humanity's Last Exam
Authors:
Long Phan,
Alice Gatti,
Ziwen Han,
Nathaniel Li,
Josephina Hu,
Hugh Zhang,
Chen Bo Calvin Zhang,
Mohamed Shaaban,
John Ling,
Sean Shi,
Michael Choi,
Anish Agrawal,
Arnav Chopra,
Adam Khoja,
Ryan Kim,
Richard Ren,
Jason Hausenloy,
Oliver Zhang,
Mantas Mazeika,
Dmitry Dodonov,
Tung Nguyen,
Jaeho Lee,
Daron Anderson,
Mikhail Doroshenko,
Alun Cennyth Stokes, et al. (1084 additional authors not shown)
Abstract:
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.
Submitted 19 April, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
Algebraic and geometric properties of homeomorphism groups of ordinals
Authors:
Megha Bhat,
Rongdao Chen,
Adityo Mamun,
Ariana Verbanac,
Eric Vergo,
Nicholas G. Vlamis
Abstract:
We study the homeomorphism groups of ordinals equipped with their order topology, focusing on successor ordinals whose limit capacity is also a successor. This is a rich family of groups that has connections to both permutation groups and homeomorphism groups of manifolds.
For ordinals of Cantor–Bendixson degree one, we prove that the homeomorphism group is strongly distorted and uniformly perfect, and we classify its normal generators. As a corollary, we recover and provide a new proof of the classical result that the subgroup of finite permutations in the symmetric group on a countably infinite set is the maximal proper normal subgroup.
For ordinals of higher Cantor–Bendixson degree, we establish a semi-direct product decomposition of the (pure) homeomorphism group. When the limit capacity is one, we further compute the abelianizations and determine normal generating sets of minimal cardinality for these groups.
Submitted 22 December, 2024;
originally announced December 2024.
-
Applying the Iterative Development Process: The Creation of Fractal Emergence
Authors:
Christopher R. H. Hanusa,
Eric Vergo
Abstract:
The iterative development process is a framework used to design products and applications across a wide range of domains. It centers around building prototypes, testing them, and updating based on the test results. We discuss how we applied this technique to create Fractal Emergence, an interactive piece of mathematical art.
Submitted 3 May, 2024;
originally announced May 2024.
-
Two-Disk Compound Symmetry Groups
Authors:
Robert A. Hearn,
William Kretschmer,
Tomas Rokicki,
Benjamin Streeter,
Eric Vergo
Abstract:
Symmetry is at the heart of much of mathematics, physics, and art. Traditional geometric symmetry groups are defined in terms of isometries of the ambient space of a shape or pattern. If we slightly generalize this notion to allow the isometries to operate on overlapping but non-identical metric spaces, we obtain what we call compound symmetry groups. A natural example is that of the groups generated by discrete rotations of overlapping disks in the plane. Investigation of these groups reveals a new family of fractals, as well as a rich structure that is intriguing both mathematically and artistically. We report on our initial investigations.
Submitted 24 February, 2023;
originally announced February 2023.