
Showing 1–16 of 16 results for author: Jacquelin, M

  1. arXiv:2405.20101  [pdf, other]

    cs.SD cs.CL eess.AS

    Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting

    Authors: Ihab Asaad, Maxime Jacquelin, Olivier Perrotin, Laurent Girin, Thomas Hueber

    Abstract: Most speech self-supervised learning (SSL) models are trained with a pretext task which consists in predicting missing parts of the input signal, either future segments (causal prediction) or segments masked anywhere within the input (non-causal prediction). Learned speech representations can then be efficiently transferred to downstream tasks (e.g., automatic speech or speaker recognition). In th…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2304.11274  [pdf, other]

    cs.MS

    Massively Distributed Finite-Volume Flux Computation

    Authors: Ryuichi Sai, Mathias Jacquelin, François P. Hamon, Mauricio Araya-Polo, Randolph R. Settgast

    Abstract: Designing large-scale geological carbon capture and storage projects and ensuring safe long-term CO2 containment - as a climate change mitigation strategy - requires fast and accurate numerical simulations. These simulations involve solving complex PDEs governing subsurface fluid flow using implicit finite-volume schemes widely based on Two-Point Flux Approximation (TPFA). This task is computation…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: 10 pages excl. bibliography. Submitted to SuperComputing 2023

  3. Wafer-Scale Fast Fourier Transforms

    Authors: Marcelo Orenes-Vera, Ilya Sharapov, Robert Schreiber, Mathias Jacquelin, Philippe Vandermersch, Sharan Chetlur

    Abstract: We have implemented fast Fourier transforms for one, two, and three-dimensional arrays on the Cerebras CS-2, a system whose memory and processing elements reside on a single silicon wafer. The wafer-scale engine (WSE) encompasses a two-dimensional mesh of roughly 850,000 processing elements (PEs) with fast local memory and equally fast nearest-neighbor interconnections. Our wafer-scale FFT (wsFF…

    Submitted 29 September, 2022; originally announced September 2022.

    Journal ref: Proceedings of the 37th International Conference on Supercomputing 2023

  4. arXiv:2204.03775  [pdf, other]

    cs.MS

    Massively scalable stencil algorithm

    Authors: Mathias Jacquelin, Mauricio Araya-Polo, Jie Meng

    Abstract: Stencil computations lie at the heart of many scientific and industrial applications. Unfortunately, stencil algorithms perform poorly on machines with cache-based memory hierarchy, due to low re-use of memory accesses. This work shows that for stencil computation a novel algorithm that leverages a localized communication strategy effectively exploits the Cerebras WSE-2, which has no cache hierarc…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: 10 pages excl. bibliography. Submitted to SuperComputing 2022

  5. arXiv:2004.12023  [pdf, other]

    physics.chem-ph physics.comp-ph

    NWChem: Past, Present, and Future

    Authors: E. Aprà, E. J. Bylaska, W. A. de Jong, N. Govind, K. Kowalski, T. P. Straatsma, M. Valiev, H. J. J. van Dam, Y. Alexeev, J. Anchell, V. Anisimov, F. W. Aquino, R. Atta-Fynn, J. Autschbach, N. P. Bauman, J. C. Becca, D. E. Bernholdt, K. Bhaskaran-Nair, S. Bogatko, P. Borowski, J. Boschen, J. Brabec, A. Bruner, E. Cauët, Y. Chen, et al. (89 additional authors not shown)

    Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials…

    Submitted 26 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: This article appeared in volume 152, issue 18, page 184102 of the Journal of Chemical Physics. It can be found at https://doi.org/10.1063/5.0004997

    Journal ref: J. Chem. Phys., 152, 184102 (2020)

  6. arXiv:1912.13403  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci

    ELSI -- An Open Infrastructure for Electronic Structure Solvers

    Authors: Victor Wen-zhe Yu, Carmen Campos, William Dawson, Alberto García, Ville Havu, Ben Hourahine, William P Huhn, Mathias Jacquelin, Weile Jia, Murat Keçeli, Raul Laasner, Yingzhou Li, Lin Lin, Jianfeng Lu, Jonathan Moussa, Jose E Roman, Álvaro Vázquez-Mayagoitia, Chao Yang, Volker Blum

    Abstract: Routine applications of electronic structure theory to molecules and periodic systems need to compute the electron density from given Hamiltonian and, in case of non-orthogonal basis sets, overlap matrices. System sizes can range from few to thousands or, in some examples, millions of atoms. Different discretization schemes (basis sets) and different system geometries (finite non-periodic vs. infi…

    Submitted 4 July, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Journal ref: Computer Physics Communications 256 (2020) 107459

  7. arXiv:1805.05278  [pdf, ps, other]

    cs.DC

    A 3D Parallel Algorithm for QR Decomposition

    Authors: Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight

    Abstract: Interprocessor communication often dominates the runtime of large matrix computations. We present a parallel algorithm for computing QR decompositions whose bandwidth cost (communication volume) can be decreased at the cost of increasing its latency cost (number of messages). By varying a parameter to navigate the bandwidth/latency tradeoff, we can tune this algorithm for machines with different c…

    Submitted 14 May, 2018; originally announced May 2018.

  8. arXiv:1708.04539  [pdf, ps, other]

    cs.MS math.NA

    PSelInv - A Distributed Memory Parallel Algorithm for Selected Inversion: the non-symmetric Case

    Authors: Mathias Jacquelin, Lin Lin, Chao Yang

    Abstract: This paper generalizes the parallel selected inversion algorithm called PSelInv to sparse non-symmetric matrices. We assume a general sparse matrix A has been decomposed as PAQ = LU on a distributed memory parallel machine, where L, U are lower and upper triangular matrices, and P, Q are permutation matrices, respectively. The PSelInv method computes selected elements of A^{-1}. The selection is con…

    Submitted 13 August, 2017; originally announced August 2017.

    Comments: arXiv admin note: text overlap with arXiv:1404.0447

  9. arXiv:1705.11191  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci

    ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

    Authors: Victor Wen-zhe Yu, Fabiano Corsetti, Alberto García, William P. Huhn, Mathias Jacquelin, Weile Jia, Björn Lange, Lin Lin, Jianfeng Lu, Wenhui Mi, Ali Seifitokaldani, Álvaro Vázquez-Mayagoitia, Chao Yang, Haizhao Yang, Volker Blum

    Abstract: Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access dif…

    Submitted 31 May, 2017; originally announced May 2017.

    Comments: 55 pages, 14 figures, 2 tables

    Journal ref: Computer Physics Communications 222 (2018) 267-285

  10. arXiv:1610.08128  [pdf, other]

    cs.DC cs.MS math.NA

    The Reverse Cuthill-McKee Algorithm in Distributed-Memory

    Authors: Ariful Azad, Mathias Jacquelin, Aydin Buluc, Esmond G. Ng

    Abstract: Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct solvers, maximize locality in iterative solvers, and improve performance in graph algorithms. Except for naturally parallelizable ordering methods such as nested dissection, many important ordering methods have not been efficiently mapped to distributed-memory architectures. In this paper, we present t…

    Submitted 25 October, 2016; originally announced October 2016.

  11. arXiv:1608.00044  [pdf, other]

    cs.MS

    An Asynchronous Task-based Fan-Both Sparse Cholesky Solver

    Authors: Mathias Jacquelin, Yili Zheng, Esmond Ng, Katherine Yelick

    Abstract: Systems of linear equations arise at the heart of many scientific and engineering applications. Many of these linear systems are sparse; i.e., most of the elements in the coefficient matrix are zero. Direct methods based on matrix factorizations are sometimes needed to ensure accurate solutions. For example, accurate solution of sparse linear systems is needed in shift-invert Lanczos to compute in…

    Submitted 23 August, 2016; v1 submitted 29 July, 2016; originally announced August 2016.

  12. arXiv:1604.02528  [pdf, other]

    cs.MS

    A Left-Looking Selected Inversion Algorithm and Task Parallelism on Shared Memory Systems

    Authors: Mathias Jacquelin, Lin Lin, Weile Jia, Yonghua Zhao, Chao Yang

    Abstract: Given a sparse matrix $A$, the selected inversion algorithm is an efficient method for computing certain selected elements of $A^{-1}$. These selected elements correspond to all or some nonzero elements of the LU factors of $A$. In many ways, the type of matrix updates performed in the selected inversion algorithm is similar to that performed in the LU factorization, although the sequence of opera…

    Submitted 9 April, 2016; originally announced April 2016.

    Comments: 9 pages, 7 figures, submitted to SuperComputing 2016

  13. arXiv:1504.04714  [pdf, other]

    cs.DC cs.MS

    Enhancing the scalability and load balancing of the parallel selected inversion algorithm via tree-based asynchronous communication

    Authors: Mathias Jacquelin, Lin Lin, Nathan Wichmann, Chao Yang

    Abstract: We develop a method for improving the parallel scalability of the recently developed parallel selected inversion algorithm [Jacquelin, Lin and Yang 2014], named PSelInv, on massively parallel distributed memory machines. In the PSelInv method, we compute selected elements of the inverse of a sparse matrix A that can be decomposed as A = LU, where L is lower triangular and U is upper triangular. Up…

    Submitted 18 April, 2015; originally announced April 2015.

  14. arXiv:1404.0447  [pdf, ps, other]

    math.NA cs.DC

    PSelInv -- A Distributed Memory Parallel Algorithm for Selected Inversion: the Symmetric Case

    Authors: Mathias Jacquelin, Lin Lin, Chao Yang

    Abstract: We describe an efficient parallel implementation of the selected inversion algorithm for distributed memory computer systems, which we call \texttt{PSelInv}. The \texttt{PSelInv} method computes selected elements of a general sparse matrix $A$ that can be decomposed as $A = LU$, where $L$ is lower triangular and $U$ is upper triangular. The implementation described in this paper focuses on the cas…

    Submitted 28 May, 2015; v1 submitted 1 April, 2014; originally announced April 2014.

  15. arXiv:1303.5837  [pdf, other]

    cs.DC

    Multilevel communication optimal LU and QR factorizations for hierarchical platforms

    Authors: Laura Grigori, Mathias Jacquelin, Amal Khabou

    Abstract: This study focuses on the performance of two classical dense linear algebra algorithms, the LU and the QR factorizations, on multilevel hierarchical platforms. We first introduce a new model called Hierarchical Cluster Platform (HCP), encapsulating the characteristics of such platforms. The focus is set on reducing the communication requirements of studied algorithms at each level of the hierarchy…

    Submitted 23 March, 2013; originally announced March 2013.

    Report number: RR-8270

  16. arXiv:1104.4475  [pdf, ps, other]

    cs.DC

    Tiled QR factorization algorithms

    Authors: Henricus Bouwmeester, Mathias Jacquelin, Julien Langou, Yves Robert

    Abstract: This work revisits existing algorithms for the QR factorization of rectangular matrices composed of p-by-q tiles, where p >= q. Within this framework, we study the critical paths and performance of algorithms such as Sameh and Kuck, Modi and Clarke, Greedy, and those found within PLASMA. Although neither Modi and Clarke nor Greedy is optimal, both are shown to be asymptotically optimal for all mat…

    Submitted 22 April, 2011; originally announced April 2011.