Showing 1–6 of 6 results for author: Codreanu, V

  1. arXiv:2107.11832  [pdf, other]

    cs.DC

    A Holistic Analysis of Datacenter Operations: Resource Usage, Energy, and Workload Characterization -- Extended Technical Report

    Authors: Laurens Versluis, Mehmet Cetin, Caspar Greeven, Kristian Laursen, Damian Podareanu, Valeriu Codreanu, Alexandru Uta, Alexandru Iosup

    Abstract: Improving datacenter operations is vital for the digital society. We posit that doing so requires our community to shift, from operational aspects taken in isolation to holistic analysis of datacenter resources, energy, and workloads. In turn, this shift will require new analysis methods, and open-access, FAIR datasets with fine temporal and spatial granularity. We leverage in this work one of the…

    Submitted 25 July, 2021; originally announced July 2021.

  2. arXiv:2103.10142  [pdf, other]

    physics.data-an cs.AI hep-ex

    Reduced Precision Strategies for Deep Learning: A High Energy Physics Generative Adversarial Network Use Case

    Authors: Florian Rehm, Sofia Vallecorsa, Vikram Saletore, Hans Pabst, Adel Chaibi, Valeriu Codreanu, Kerstin Borras, Dirk Krücker

    Abstract: Deep learning is finding its way into high energy physics by replacing traditional Monte Carlo simulations. However, deep learning still requires an excessive amount of computational resources. A promising approach to make deep learning more efficient is to quantize the parameters of the neural networks to reduced precision. Reduced precision computing is extensively used in modern deep learning a…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: Submitted at ICPRAM 2021; from CERN openlab - Intel collaboration

    Journal ref: ICPRAM 2021

  3. arXiv:2004.03454  [pdf]

    cs.CE

    Deep-learning enhancement of large scale numerical simulations

    Authors: Caspar van Leeuwen, Damian Podareanu, Valeriu Codreanu, Maxwell X. Cai, Axel Berg, Simon Portegies Zwart, Robin Stoffer, Menno Veerman, Chiel van Heerwaarden, Sydney Otten, Sascha Caron, Cunliang Geng, Francesco Ambrosetti, Alexandre M. J. J. Bonvin

    Abstract: Traditional simulations on High-Performance Computing (HPC) systems typically involve modeling very large domains and/or very complex equations. HPC systems allow running large models, but limits in performance increase that have become more prominent in the last 5-10 years will likely be experienced. Therefore new approaches are needed to increase application performance. Deep learning appears to…

    Submitted 30 March, 2020; originally announced April 2020.

    Comments: White paper consists of 36 pages and 15 figures

  4. arXiv:1905.04035  [pdf, other]

    cs.LG cs.CL cs.DC

    Densifying Assumed-sparse Tensors: Improving Memory Efficiency and MPI Collective Performance during Tensor Accumulation for Parallelized Training of Neural Machine Translation Models

    Authors: Derya Cavdar, Valeriu Codreanu, Can Karakus, John A. Lockman III, Damian Podareanu, Vikram Saletore, Alexander Sergeev, Don D. Smith II, Victor Suthichai, Quy Ta, Srinivas Varadharajan, Lucas A. Wilson, Rengan Xu, Pei Yang

    Abstract: Neural machine translation - using neural networks to translate human language - is an area of active research exploring new neuron types and network topologies with the goal of dramatically improving machine translation performance. Current state-of-the-art approaches, such as the multi-head attention-based transformer, require very large translation corpuses and many epochs to produce models of…

    Submitted 10 May, 2019; originally announced May 2019.

    Comments: 18 pages, 10 figures, accepted at the 2019 International Supercomputing Conference

  5. arXiv:1711.04291  [pdf, other]

    stat.ML cs.LG

    Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train

    Authors: Valeriu Codreanu, Damian Podareanu, Vikram Saletore

    Abstract: For the past 5 years, the ILSVRC competition and the ImageNet dataset have attracted a lot of interest from the Computer Vision community, allowing for state-of-the-art accuracy to grow tremendously. This should be credited to the use of deep artificial neural network designs. As these became more complex, the storage, bandwidth, and compute requirements increased. This means that with a non-distr…

    Submitted 15 November, 2017; v1 submitted 12 November, 2017; originally announced November 2017.

    Comments: 10 pages, 4 figures, 13 tables

  6. arXiv:1703.06503  [pdf, other]

    cs.PF cs.AI cs.DC

    CLTune: A Generic Auto-Tuner for OpenCL Kernels

    Authors: Cedric Nugteren, Valeriu Codreanu

    Abstract: This work presents CLTune, an auto-tuner for OpenCL kernels. It evaluates and tunes kernel performance of a generic, user-defined search space of possible parameter-value combinations. Example parameters include the OpenCL workgroup size, vector data-types, tile sizes, and loop unrolling factors. CLTune can be used in the following scenarios: 1) when there are too many tunable parameters to explor…

    Submitted 19 March, 2017; originally announced March 2017.

    Comments: 8 pages, published in MCSoC '15, IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2015