-
The ATLAS EventIndex: a BigData catalogue for all ATLAS experiment events
Authors:
Dario Barberis,
Igor Aleksandrov,
Evgeny Alexandrov,
Zbigniew Baranowski,
Luca Canali,
Elizaveta Cherepanova,
Gancho Dimitrov,
Andrea Favareto,
Alvaro Fernandez Casani,
Elizabeth J. Gallas,
Carlos Garcia Montoro,
Santiago Gonzalez de la Hoz,
Julius Hrivnac,
Alexander Iakovlev,
Andrei Kazymov,
Mikhail Mineev,
Fedor Prokoshin,
Grigori Rybkin,
Jose Salt,
Javier Sanchez,
Roman Sorokoletov,
Rainer Toebbicke,
Petya Vasileva,
Miguel Villaplana Perez,
Ruijun Yuan
Abstract:
The ATLAS EventIndex system comprises the catalogue of all events collected, processed or generated by the ATLAS experiment at the CERN LHC accelerator, and all associated software tools to collect, store and query this information. ATLAS records several billion particle interactions every year of operation, processes them for analysis and generates even larger simulated data samples; a global cat…
▽ More
The ATLAS EventIndex system comprises the catalogue of all events collected, processed or generated by the ATLAS experiment at the CERN LHC accelerator, and all associated software tools to collect, store and query this information. ATLAS records several billion particle interactions every year of operation, processes them for analysis and generates even larger simulated data samples; a global catalogue is needed to keep track of the location of each event record and be able to search and retrieve specific events for in-depth investigations. Each EventIndex record includes summary information on the event itself and the pointers to the files containing the full event. Most components of the EventIndex system are implemented using BigData open-source tools. This paper describes the architectural choices and their evolution in time, as well as the past, current and foreseen future implementations of all EventIndex components.
△ Less
Submitted 12 March, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
POOL File Catalog, Collection and Metadata Components
Authors:
C. Cioffi,
S. Eckmann,
M. Girone,
J. Hrivnac,
D. Malon,
H. Schmuecker,
A. Vaniachine,
J. Wojcieszuk,
Z. Xie
Abstract:
The POOL project is the common persistency framework for the LHC experiments to store petabytes of experiment data and metadata in a distributed and grid enabled way. POOL is a hybrid event store consisting of a data streaming layer and a relational layer. This paper describes the design of file catalog, collection and metadata components which are not part of the data streaming layer of POOL an…
▽ More
The POOL project is the common persistency framework for the LHC experiments to store petabytes of experiment data and metadata in a distributed and grid enabled way. POOL is a hybrid event store consisting of a data streaming layer and a relational layer. This paper describes the design of file catalog, collection and metadata components which are not part of the data streaming layer of POOL and outlines how POOL aims to provide transparent and efficient data access for a wide range of environments and use cases - ranging from a large production site down to a single disconnected laptops. The file catalog is the central POOL component translating logical data references to physical data files in a grid environment. POOL collections with their associated metadata provide an abstract way of accessing experiment data via their logical grouping into sets of related data objects.
△ Less
Submitted 13 June, 2003;
originally announced June 2003.
-
Transparent Persistence with Java Data Objects
Authors:
Julius Hrivnac
Abstract:
Flexible and performant Persistency Service is a necessary component of any HEP Software Framework. The building of a modular, non-intrusive and performant persistency component have been shown to be very difficult task. In the past, it was very often necessary to sacrifice modularity to achieve acceptable performance. This resulted in the strong dependency of the overall Frameworks on their Per…
▽ More
Flexible and performant Persistency Service is a necessary component of any HEP Software Framework. The building of a modular, non-intrusive and performant persistency component have been shown to be very difficult task. In the past, it was very often necessary to sacrifice modularity to achieve acceptable performance. This resulted in the strong dependency of the overall Frameworks on their Persistency subsystems.
Recent development in software technology has made possible to build a Persistency Service which can be transparently used from other Frameworks. Such Service doesn't force a strong architectural constraints on the overall Framework Architecture, while satisfying high performance requirements. Java Data Object standard (JDO) has been already implemented for almost all major databases. It provides truly transparent persistency for any Java object (both internal and external). Objects in other languages can be handled via transparent proxies. Being only a thin layer on top of a used database, JDO doesn't introduce any significant performance degradation. Also Aspect-Oriented Programming (AOP) makes possible to treat persistency as an orthogonal Aspect of the Application Framework, without polluting it with persistence-specific concepts.
All these techniques have been developed primarily (or only) for the Java environment. It is, however, possible to interface them transparently to Frameworks built in other languages, like for example C++.
Fully functional prototypes of flexible and non-intrusive persistency modules have been build for several other packages, as for example FreeHEP AIDA and LCG Pool AttributeSet (package Indicium).
△ Less
Submitted 2 June, 2003;
originally announced June 2003.
-
GraXML - Modular Geometric Modeler
Authors:
Julius Hrivnac
Abstract:
Many entities managed by HEP Software Frameworks represent spatial (3-dimensional) real objects. Effective definition, manipulation and visualization of such objects is an indispensable functionality.
GraXML is a modular Geometric Modeling toolkit capable of processing geometric data of various kinds (detector geometry, event geometry) from different sources and delivering them in ways suitabl…
▽ More
Many entities managed by HEP Software Frameworks represent spatial (3-dimensional) real objects. Effective definition, manipulation and visualization of such objects is an indispensable functionality.
GraXML is a modular Geometric Modeling toolkit capable of processing geometric data of various kinds (detector geometry, event geometry) from different sources and delivering them in ways suitable for further use. Geometric data are first modeled in one of the Generic Models. Those Models are then used to populate powerful Geometric Model based on the Java3D technology. While Java3D has been originally created just to provide visualization of 3D objects, its light weight and high functionality allow an effective reuse as a general geometric component. This is possible also thanks to a large overlap between graphical and general geometric functionality and modular design of Java3D itself. Its graphical functionalities also allow a natural visualization of all manipulated elements.
All these techniques have been developed primarily (or only) for the Java environment. It is, however, possible to interface them transparently to Frameworks built in other languages, like for example C++.
The GraXML toolkit has been tested with data from several sources, as for example ATLAS and ALICE detector description and ATLAS event data. Prototypes for other sources, like Geometry Description Markup Language (GDML) exist too and interface to any other source is easy to add.
△ Less
Submitted 2 June, 2003;
originally announced June 2003.