Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives
Authors:
Elizabeth Seger,
Noemi Dreksler,
Richard Moulange,
Emily Dardaman,
Jonas Schuett,
K. Wei,
Christoph Winter,
Mackenzie Arnold,
Seán Ó hÉigeartaigh,
Anton Korinek,
Markus Anderljung,
Ben Bucknall,
Alan Chan,
Eoghan Stafford,
Leonie Koessler,
Aviv Ovadya,
Ben Garfinkel,
Emma Bluemke,
Michael Aird,
Patrick Levermore,
Julian Hazell,
Abhishek Gupta
Abstract:
Recent decisions by leading AI labs either to open-source their models or to restrict access to them have sparked debate about whether, and how, increasingly capable AI models should be shared. Open-sourcing in AI typically refers to making model architecture and weights freely and publicly accessible for anyone to modify, study, build on, and use. This offers advantages such as enabling external oversight, accelerating progress, and decentralizing control over AI development and use. However, it also presents a growing potential for misuse and unintended consequences. This paper examines the risks and benefits of open-sourcing highly capable foundation models. While open-sourcing has historically provided substantial net benefits for most software and AI development processes, we argue that for some highly capable foundation models likely to be developed in the near future, open-sourcing may pose sufficiently extreme risks to outweigh the benefits. In such cases, highly capable foundation models should not be open-sourced, at least not initially. Alternative strategies, including non-open-source model sharing options, are explored. The paper concludes with recommendations for developers, standard-setting bodies, and governments for establishing safe and responsible model sharing practices and preserving open-source benefits where safe.
Submitted 29 September, 2023;
originally announced November 2023.
Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU
Authors:
Filippo Mantovani,
Marta Garcia-Gasulla,
José Gracia,
Esteban Stafford,
Fabio Banchelli,
Marc Josep-Fabrego,
Joel Criado-Ledesma,
Mathias Nachtmann
Abstract:
In this paper, we analyze the performance and energy consumption of an Arm-based high-performance computing (HPC) system developed within the European project Mont-Blanc 3. This system, called Dibona, has been integrated by ATOS/Bull and is powered by Marvell's latest CPU, the ThunderX2. This is the same CPU that powers the Astra supercomputer, the first Arm-based supercomputer to enter the Top500, in November 2018. Our study ranges from micro-benchmarks to large production codes. We include an interdisciplinary evaluation of three scientific applications (a finite-element fluid dynamics code, a smoothed particle hydrodynamics code, and a lattice Boltzmann code) and the Graph 500 benchmark, focusing on parallel and energy efficiency as well as studying their scalability up to thousands of Armv8 cores. For comparison, we run the same tests on state-of-the-art x86 nodes included in Dibona and on the Tier-0 supercomputer MareNostrum4. Our experiments show that the ThunderX2 delivers 25% lower performance on average, mainly due to its small vector unit, although this is somewhat compensated by its 30% wider links between the CPU and the main memory. We found that the software ecosystem of the Armv8 architecture is comparable to the one available for Intel. Our results also show that the ThunderX2 delivers similar or better energy-to-solution and scalability, indicating that Arm-based chips are legitimate contenders in the market of next-generation HPC systems.
Submitted 10 July, 2020; v1 submitted 9 July, 2020;
originally announced July 2020.