-
Julia in HEP
Authors:
Graeme Andrew Stewart,
Alexander Moreno BriceƱo,
Philippe Gras,
Benedikt Hegner,
Uwe Hernandez Acosta,
Tamas Gal,
Jerry Ling,
Pere Mato,
Mikhail Mikhasenko,
Oliver Schulz,
Sam Skipsey
Abstract:
Julia is a mature general-purpose programming language, with a large ecosystem of libraries and more than 12000 third-party packages, which specifically targets scientific computing. As a language, Julia is as dynamic, interactive, and accessible as Python with NumPy, but achieves run-time performance on par with C/C++. In this paper, we describe the state of adoption of Julia in HEP, where moment…
▽ More
Julia is a mature general-purpose programming language, with a large ecosystem of libraries and more than 12000 third-party packages, which specifically targets scientific computing. As a language, Julia is as dynamic, interactive, and accessible as Python with NumPy, but achieves run-time performance on par with C/C++. In this paper, we describe the state of adoption of Julia in HEP, where momentum has been gathering over a number of years.
HEP-oriented Julia packages can already, via UnROOT.jl, read HEP's major file formats, including TTree and RNTuple. Interfaces to some of HEP's major software packages, such as through Geant4.jl, are available too. Jet reconstruction algorithms in Julia show excellent performance. A number of full HEP analyses have been performed in Julia.
We show how, as the support for HEP has matured, developments have benefited from Julia's core design choices, which makes reuse from and integration with other packages easy. In particular, libraries developed outside HEP for plotting, statistics, fitting, and scientific machine learning are extremely useful.
We believe that the powerful combination of flexibility and speed, the wide selection of scientific programming tools, and support for all modern programming paradigms and tools, make Julia the ideal choice for a future language in HEP.
△ Less
Submitted 11 March, 2025;
originally announced March 2025.
-
A Roadmap for HEP Software and Computing R&D for the 2020s
Authors:
Johannes Albrecht,
Antonio Augusto Alves Jr,
Guilherme Amadio,
Giuseppe Andronico,
Nguyen Anh-Ky,
Laurent Aphecetche,
John Apostolakis,
Makoto Asai,
Luca Atzori,
Marian Babik,
Giuseppe Bagliesi,
Marilena Bandieramonte,
Sunanda Banerjee,
Martin Barisits,
Lothar A. T. Bauerdick,
Stefano Belforte,
Douglas Benjamin,
Catrin Bernius,
Wahid Bhimji,
Riccardo Maria Bianchi,
Ian Bird,
Catherine Biscarat,
Jakob Blomer,
Kenneth Bloom,
Tommaso Boccali
, et al. (285 additional authors not shown)
Abstract:
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the shear amounts of data to be recorded. In planning for…
▽ More
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the shear amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
△ Less
Submitted 19 December, 2018; v1 submitted 18 December, 2017;
originally announced December 2017.
-
Enabling Object Storage via shims for Grid Middleware
Authors:
Samuel Cadellin Skipsey,
Shaun De Witt,
Alastair Dewhurst,
David Britton,
Gareth Roy,
David Crooks
Abstract:
The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requi…
▽ More
The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for "POSIX-like" filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present evaluation of, and first-deployment experiences with, (for example) Xrootd-Ceph interfaces for direct object-store access, as part of an initiative within GridPP\cite{GridPP} hosted at RAL. Additionally, we discuss the tradeoffs and experience of developing plugins for the currently-popular {\it Ceph} parallel distributed filesystem for the GFAL2 access layer, at Glasgow.
△ Less
Submitted 30 October, 2015;
originally announced December 2015.
-
Extending DIRAC File Management with Erasure-Coding for efficient storage
Authors:
Samuel Cadellin Skipsey,
Paulin Todev,
David Britton,
David Crooks,
Gareth Roy
Abstract:
The state of the art in Grid style data management is to achieve increased resilience of data via multiple complete replicas of data files across multiple storage endpoints. While this is effective, it is not the most space-efficient approach to resilience, especially when the reliability of individual storage endpoints is sufficiently high that only a few will be inactive at any point in time. We…
▽ More
The state of the art in Grid style data management is to achieve increased resilience of data via multiple complete replicas of data files across multiple storage endpoints. While this is effective, it is not the most space-efficient approach to resilience, especially when the reliability of individual storage endpoints is sufficiently high that only a few will be inactive at any point in time. We report on work performed as part of GridPP\cite{GridPP}, extending the Dirac File Catalogue and file management interface to allow the placement of erasure-coded files: each file distributed as N identically-sized chunks of data striped across a vector of storage endpoints, encoded such that any M chunks can be lost and the original file can be reconstructed. The tools developed are transparent to the user, and, as well as allowing up and downloading of data to Grid storage, also provide the possibility of parallelising access across all of the distributed chunks at once, improving data transfer and IO performance. We expect this approach to be of most interest to smaller VOs, who have tighter bounds on the storage available to them, but larger (WLCG) VOs may be interested as their total data increases during Run 2. We provide an analysis of the costs and benefits of the approach, along with future development and implementation plans in this area. In general, overheads for multiple file transfers provide the largest issue for competitiveness of this approach at present.
△ Less
Submitted 30 October, 2015;
originally announced October 2015.
-
Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration
Authors:
Samuel C Skipsey,
Wahid Bhimji,
Mike Kenyon
Abstract:
Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O bound services within the typical LHC Tier-2 - in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can b…
▽ More
Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O bound services within the typical LHC Tier-2 - in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can both exhibit I/O limited efficiency. We estimate the effectiveness of affordable SSDs in the context of worker nodes, on a large Tier-2 production setup using both low level tools and real LHC I/O intensive data analysis jobs comparing and contrasting with high performance spinning disk based solutions. We consider the applicability of each solution in the context of its price/performance metrics, with an eye on the pragmatic issues facing Tier-2 provision and upgrades
△ Less
Submitted 15 February, 2011;
originally announced February 2011.
-
Optimised access to user analysis data using the gLite DPM
Authors:
Sam Skipsey,
Greig Cowan,
Mike Kenyon,
Stuart Purdie,
Graeme Stewart
Abstract:
The ScotGrid distributed Tier-2 now provides more that 4MSI2K and 500TB for LHC computing, which is spread across three sites at Durham, Edinburgh and Glasgow. Tier-2 sites have a dual role to play in the computing models of the LHC VOs. Firstly, their CPU resources are used for the generation of Monte Carlo event data. Secondly, the end user analysis data is distributed across the grid to the s…
▽ More
The ScotGrid distributed Tier-2 now provides more that 4MSI2K and 500TB for LHC computing, which is spread across three sites at Durham, Edinburgh and Glasgow. Tier-2 sites have a dual role to play in the computing models of the LHC VOs. Firstly, their CPU resources are used for the generation of Monte Carlo event data. Secondly, the end user analysis data is distributed across the grid to the site's storage system and held on disk ready for processing by physicists' analysis jobs. In this paper we show how we have designed the ScotGrid storage and data management resources in order to optimise access by physicists to LHC data. Within ScotGrid, all sites use the gLite DPM storage manager middleware. Using the EGEE grid to submit real ATLAS analysis code to process VO data stored on the ScotGrid sites, we present an analysis of the performance of the architecture at one site, and procedures that may be undertaken to improve such. The results will be presented from the point of view of the end user (in terms of number of events processed/second) and from the point of view of the site, which wishes to minimise load and the impact that analysis activity has on other users of the system.
△ Less
Submitted 23 October, 2009;
originally announced October 2009.
-
ScotGrid: Providing an Effective Distributed Tier-2 in the LHC Era
Authors:
Sam Skipsey,
David Ambrose-Griffith,
Greig Cowan,
Mike Kenyon,
Orlando Richards,
Phil Roffe,
Graeme Stewart
Abstract:
ScotGrid is a distributed Tier-2 centre in the UK with sites in Durham, Edinburgh and Glasgow. ScotGrid has undergone a huge expansion in hardware in anticipation of the LHC and now provides more than 4MSI2K and 500TB to the LHC VOs. Scaling up to this level of provision has brought many challenges to the Tier-2 and we show in this paper how we have adopted new methods of organising the centres,…
▽ More
ScotGrid is a distributed Tier-2 centre in the UK with sites in Durham, Edinburgh and Glasgow. ScotGrid has undergone a huge expansion in hardware in anticipation of the LHC and now provides more than 4MSI2K and 500TB to the LHC VOs. Scaling up to this level of provision has brought many challenges to the Tier-2 and we show in this paper how we have adopted new methods of organising the centres, from fabric management and monitoring to remote management of sites to management and operational procedures, to meet these challenges. We describe how we have coped with different operational models at the sites, where Glagsow and Durham sites are managed "in house" but resources at Edinburgh are managed as a central university resource. This required the adoption of a different fabric management model at Edinburgh and a special engagement with the cluster managers. Challenges arose from the different job models of local and grid submission that required special attention to resolve. We show how ScotGrid has successfully provided an infrastructure for ATLAS and LHCb Monte Carlo production. Special attention has been paid to ensuring that user analysis functions efficiently, which has required optimisation of local storage and networking to cope with the demands of user analysis. Finally, although these Tier-2 resources are pledged to the whole VO, we have established close links with our local physics user communities as being the best way to ensure that the Tier-2 functions effectively as a part of the LHC grid computing framework..
△ Less
Submitted 23 October, 2009;
originally announced October 2009.