
doi 10.1007/s41781-017-0001-9

HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

Authors: Burt Holzman, Lothar A. T. Bauerdick, Brian Bockelman, Dave Dykstra, Ian Fisk, Stuart Fuess, Gabriele Garzoglio, Maria Girone, Oliver Gutsche, Dirk Hufnagel, Hyunwoo Kim, Robert Kennedy, Nicolo Magini, David Mason, Panagiotis Spentzouris, Anthony Tiradani, Steve Timm, Eric W. Vaandering

Abstract: Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. In addition, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.

Submitted 29 September, 2017; originally announced October 2017.

Comments: 15 pages, 9 figures

Journal ref: Comput Softw Big Sci (2017) 1:1

arXiv:1510.08545

High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

Authors: Salman Habib, Robert Roser, Tom LeCompte, Zach Marshall, Anders Borgland, Brett Viren, Peter Nugent, Makoto Asai, Lothar Bauerdick, Hal Finkel, Steve Gottlieb, Stefan Hoeche, Paul Sheldon, Jean-Luc Vay, Peter Elmer, Michael Kirby, Simon Patton, Maxim Potekhin, Brian Yanny, Paolo Calafiura, Eli Dart, Oliver Gutsche, Taku Izubuchi, Adam Lyon, Don Petravick

Abstract: Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material.

Submitted 28 October, 2015; originally announced October 2015.

Comments: 72 pages

arXiv:1507.07430

doi 10.1088/1742-6596/664/3/032010

Designing Computing System Architecture and Models for the HL-LHC era

Authors: Lothar Bauerdick, Brian Bockelman, Peter Elmer, Stephen Gowdy, Matevz Tadel, Frank Wuerthwein

Abstract: This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade.

Submitted 20 July, 2015; originally announced July 2015.

Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

arXiv:cs/0305066

The CMS Integration Grid Testbed

Authors: Gregory E. Graham, M. Anzar Afaq, Shafqat Aziz, L. A. T. Bauerdick, Michael Ernst, Joseph Kaiser, Natalia Ratnikova, Hans Wenzel, Yujun Wu, Erik Aslakson, Julian Bunn, Saima Iqbal, Iosif Legrand, Harvey Newman, Suresh Singh, Conrad Steenberg, James Branson, Ian Fisk, James Letts, Adam Arbree, Paul Avery, Dimitri Bourilkov, Richard Cavanaugh, Jorge Rodriguez, Suchindra Kategari, et al. (5 additional authors not shown)

Abstract: The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2 hardware at the following sites: the California Institute of Technology, Fermi National Accelerator Laboratory, the University of California at San Diego, and the University of Florida at Gainesville. The IGT runs jobs using the Globus Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is managed using VO management scripts from the European Data Grid (EDG). Gridwide monitoring is accomplished using local tools such as Ganglia interfaced into the Globus Metadata Directory Service (MDS) and the agent based Mona Lisa. Domain specific software is packaged and installed using the Distribution After Release (DAR) tool of CMS, while middleware under the auspices of the Virtual Data Toolkit (VDT) is distributed using Pacman. During a continuous two month span in Fall of 2002, over 1 million official CMS GEANT based Monte Carlo events were generated and returned to CERN for analysis while being demonstrated at SC2002. In this paper, we describe the process that led to one of the world's first continuously available, functioning grids.

Submitted 10 June, 2003; v1 submitted 30 May, 2003; originally announced May 2003.

Comments: CHEP 2003 MOCT010

ACM Class: A.0; C.2.4

Journal ref: eConfC0303241:MOCT010B,2003

arXiv:cs/0104008

doi 10.1016/S0010-4655(01)00162-X

Event Indexing Systems for Efficient Selection and Analysis of HERA Data

Authors: L. A. T. Bauerdick, Adrian Fox-Murphy, Tobias Haas, Stefan Stonjek, Enrico Tassi

Abstract: The design and implementation of two software systems introduced to improve the efficiency of offline analysis of event data taken with the ZEUS Detector at the HERA electron-proton collider at DESY are presented. Two different approaches were made, one using a set of event directories and the other using a tag database based on a commercial object-oriented database management system. These are described and compared. Both systems provide quick direct access to individual collision events in a sequential data store of several terabytes, and they both considerably improve the event analysis efficiency. In particular the tag database provides a very flexible selection mechanism and can dramatically reduce the computing time needed to extract small subsamples from the total event sample. Gains as large as a factor 20 have been obtained.

Submitted 3 April, 2001; originally announced April 2001.

Comments: Accepted for publication in Computer Physics Communications

Report number: DESY 01-045

ACM Class: H.2.4; H.3.1; H.3.3; H.3.4; J.2; H.2.8

Journal ref: Comput.Phys.Commun. 137 (2001) 236-246

Showing 1–5 of 5 results for author: Bauerdick, L