-
ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization
Authors:
Ilka Antcheva,
Maarten Ballintijn,
Bertrand Bellenot,
Marek Biskup,
Rene Brun,
Nenad Buncic,
Philippe Canal,
Diego Casadei,
Olivier Couet,
Valery Fine,
Leandro Franco,
Gerardo Ganis,
Andrei Gheata,
David Gonzalez Maline,
Masaharu Goto,
Jan Iwaszkiewicz,
Anna Kreshuk,
Diego Marcos Segura,
Richard Maunder,
Lorenzo Moneta,
Axel Naumann,
Eddy Offermann,
Valeriy Onuchin,
Suzanne Panacek,
Fons Rademakers,
et al. (2 additional authors not shown)
Abstract:
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.
Submitted 31 August, 2015;
originally announced August 2015.
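The binning of one-dimensional data that the abstract above attributes to the histogram classes can be illustrated with a minimal standalone C++ sketch. It does not use ROOT: the `Histogram1D` type and its members are hypothetical names chosen for illustration, and the fixed-width binning with separate underflow and overflow counters is an assumed design, not a description of ROOT's own implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical fixed-width 1D histogram (illustration only, not a ROOT class).
struct Histogram1D {
    double lo, hi;             // axis range [lo, hi)
    std::vector<long> counts;  // one counter per bin
    long underflow = 0;        // entries below lo
    long overflow = 0;         // entries at or above hi

    Histogram1D(std::size_t nbins, double lo_, double hi_)
        : lo(lo_), hi(hi_), counts(nbins, 0) {
        assert(nbins > 0 && hi_ > lo_);
    }

    // Increment the bin that contains x; NaN values are ignored.
    void fill(double x) {
        if (std::isnan(x)) return;
        if (x < lo) { ++underflow; return; }
        if (x >= hi) { ++overflow; return; }
        std::size_t bin = static_cast<std::size_t>(
            (x - lo) / (hi - lo) * static_cast<double>(counts.size()));
        if (bin >= counts.size()) bin = counts.size() - 1;  // guard against rounding
        ++counts[bin];
    }
};
```

With this sketch, constructing `Histogram1D h(4, 0.0, 4.0)` and calling `h.fill(x)` for each value accumulates per-bin counts in `h.counts`, while out-of-range values are tallied in `h.underflow` and `h.overflow`.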
-
JFIT: a framework to obtain combined experimental results through joint fits
Authors:
Eli Ben-Haim,
René Brun,
Bertrand Echenard,
Thomas E. Latham
Abstract:
A master-worker architecture is presented for obtaining combined experimental results through joint fits of datasets from several experiments. The design of the architecture allows such joint fits to be performed while keeping the data separated, in its original format, and using independent fitting environments. This allows the benefits of joint fits, such as ensuring that correlations are correctly taken into account and better determination of nuisance parameters, to be harnessed without the need to reformat data samples or to rewrite existing fitting code. The Jfit framework is a C++ implementation of this idea in the Laura++ package, using dedicated classes of the ROOT package. We present the Jfit framework, give instructions for its use, and demonstrate its functionalities with concrete examples.
Submitted 7 April, 2018; v1 submitted 17 September, 2014;
originally announced September 2014.
-
Vectorising the detector geometry to optimize particle transport
Authors:
John Apostolakis,
René Brun,
Federico Carminati,
Andrei Gheata,
Sandro Wenzel
Abstract:
Among the components contributing to particle transport, geometry navigation is an important consumer of CPU cycles. The tasks performed to get answers to "basic" queries such as locating a point within a geometry hierarchy or computing accurately the distance to the next boundary can become very compute-intensive for complex detector setups. So far, the existing geometry algorithms employ mainly scalar optimisation strategies (voxelization, caching) to reduce their CPU consumption. In this paper, we would like to take a different approach and investigate how geometry navigation can benefit from the vector instruction set extensions that are one of the primary sources of performance enhancements on current and future hardware. While on paper this form of microparallelism promises increasing performance opportunities, applying this technology to the highly hierarchical and multiply branched geometry code is a difficult challenge. We refer to the current work done to vectorise an important part of the critical navigation algorithms in the ROOT geometry library. Starting from a short critical discussion about the programming model, we present the current status and first benchmark results of the vectorisation of some elementary geometry shape algorithms. On the path towards a full vector-based geometry navigator, we also investigate the performance benefits in connecting these elementary functions together to develop algorithms which are entirely based on the flow of vector-data. To this end, we discuss core components of a simple vector navigator that is tested and evaluated on a toy detector setup.
Submitted 3 December, 2013;
originally announced December 2013.
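The idea of processing many points through an elementary shape algorithm at once, as described in the abstract above, can be sketched in standalone C++. This is only an illustration under stated assumptions: `PointBatch`, `Box` and `containsBatch` are hypothetical names, the shape is an origin-centred axis-aligned box, and the code is not taken from the ROOT geometry library. The structure-of-arrays layout and the branch-free loop body make the loop a candidate for compiler auto-vectorisation; whether it is actually vectorised depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays batch: each coordinate is stored contiguously.
struct PointBatch {
    std::vector<double> x, y, z;
};

// Hypothetical axis-aligned box centred at the origin with half-lengths dx, dy, dz.
struct Box {
    double dx, dy, dz;
};

// For every point i, store 1 in inside[i] if the point lies in the box, else 0.
void containsBatch(const Box& box, const PointBatch& p,
                   std::vector<unsigned char>& inside) {
    const std::size_t n = p.x.size();
    assert(p.y.size() == n && p.z.size() == n);
    inside.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Branch-free body: three comparisons combined with a bitwise AND.
        const int inX = std::fabs(p.x[i]) <= box.dx;
        const int inY = std::fabs(p.y[i]) <= box.dy;
        const int inZ = std::fabs(p.z[i]) <= box.dz;
        inside[i] = static_cast<unsigned char>(inX & inY & inZ);
    }
}
```

A scalar navigator would instead test one point at a time with early-exit branches; the batch form trades those early exits for uniform work per point, which is what makes it amenable to vector instructions.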
-
A Geometrical Modeller for HEP
Authors:
R. Brun,
A. Gheata,
M. Gheata,
For ALICE off-line collaboration
Abstract:
Geometrical modelling generally provides the geometrical description of a special structure and a set of services to "navigate" through its structure. HEP geometrical modellers are designed to handle high-complexity detector geometries and they are usually embedded within simulation MC frameworks. The fact that these frameworks greatly depend on their specific geometrical tools makes simulation applications hardly portable to MCs other than the one they were designed for. The ALICE Off-line Project, in collaboration with the ROOT team, is proposing a multi-purpose geometrical modeller for HEP that is integrated within a virtual MC infrastructure. This tool has been optimised for performance with the geometry setups of several HEP experiments and provides a single representation for the geometry used by different applications such as simulation, reconstruction or event display.
Submitted 20 June, 2003;
originally announced June 2003.
-
The PROOF Distributed Parallel Analysis Framework based on ROOT
Authors:
Maarten Ballintijn,
Rene Brun,
Fons Rademakers,
Gunther Roland
Abstract:
The development of the Parallel ROOT Facility, PROOF, enables a physicist to analyze and understand much larger data sets on a shorter time scale. It makes use of the inherent parallelism in event data and implements an architecture that optimizes I/O and CPU utilization in heterogeneous clusters with distributed storage. The system provides transparent and interactive access to gigabytes today. Being part of the ROOT framework, PROOF inherits the benefits of a performant object storage system and a wealth of statistical and visualization tools. This paper describes the key principles of the PROOF architecture and the implementation of the system. We will illustrate its features using a simple example and present measurements of the scalability of the system. Finally, we will discuss how PROOF can be interfaced with, and make use of, the different Grid solutions.
Submitted 13 June, 2003;
originally announced June 2003.