Showing 1–2 of 2 results for author: Juan, P S

  1. arXiv:2105.09187

    cs.DC cs.AR cs.PF

    High performance and energy efficient inference for deep learning on ARM processors

    Authors: Adrián Castelló, Sergio Barrachina, Manuel F. Dolz, Enrique S. Quintana-Ortí, Pau San Juan

    Abstract: We evolve PyDTNN, a framework for distributed parallel training of Deep Neural Networks (DNNs), into an efficient inference tool for convolutional neural networks. Our optimization process on multicore ARM processors involves several high-level transformations of the original framework, such as the development and integration of Cython routines to exploit thread-level parallelism; the design and d…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 13 pages, 7 figures

  2. arXiv:2005.06410

    cs.PF

    High Performance and Portable Convolution Operators for ARM-based Multicore Processors

    Authors: Pablo San Juan, Adrián Castelló, Manuel F. Dolz, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí

    Abstract: The considerable impact of Convolutional Neural Networks on many Artificial Intelligence tasks has led to the development of various high performance algorithms for the convolution operator present in this type of network. One of these approaches leverages the im2col transform followed by a general matrix multiplication (GEMM) in order to take advantage of the highly optimized realizations of the…

    Submitted 13 May, 2020; originally announced May 2020.

    ACM Class: B.8; C.4; I.2; I.4